September 2020
tl;dr: Calculate the effective number of samples for each class to get a better-weighted loss.
This paper reminds me of the effective receptive field paper from Uber ATG, which basically says that the effective RF only grows with sqrt(N) as the net gets deeper (N being the number of layers).
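This paper starts from a few basic assumptions and derives a general equation for the effective number of samples, which is then used to weight the loss. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula, (1 - beta^n) / (1 - beta), where n is the number of samples and beta in [0, 1) is a hyperparameter.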
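A minimal sketch of how the effective number could be turned into per-class loss weights, assuming the formula is E_n = (1 - beta^n) / (1 - beta) with beta in [0, 1). The function names, the example counts and the normalization (weights rescaled to sum to the number of classes) are illustrative assumptions, not something stated in this note.

```python
from typing import List, Sequence


def effective_number(n: int, beta: float) -> float:
    """Effective number of samples, E_n = (1 - beta**n) / (1 - beta)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    if n < 1:
        raise ValueError("n must be at least 1")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts: Sequence[int], beta: float) -> List[float]:
    """Per-class weights proportional to 1 / E_n.

    Assumption: weights are rescaled to sum to the number of classes,
    so that the overall loss scale stays roughly unchanged.
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


if __name__ == "__main__":
    counts = [5000, 500, 50]  # hypothetical per-class sample counts
    for beta in (0.0, 0.99, 0.999):
        print(beta, class_balanced_weights(counts, beta))
```

With beta = 0 every class has an effective number of 1, so there is no reweighting. As beta approaches 1, the effective number approaches n and the weights approach plain 1/N weighting.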
People seem to have noticed this effect and use simple heuristics to counter it. For example, PyrOccNet noticed that weighting by 1/N would bias the loss too far toward the minority classes, and thus simply uses 1/sqrt(N) as the weighting factor.
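To make the difference between the two heuristics concrete, here is a small sketch comparing 1/N and 1/sqrt(N) weighting. The helper name, the example counts and the normalization to the number of classes are assumptions for illustration only, not taken from PyrOccNet.

```python
from typing import List, Sequence


def inverse_power_weights(counts: Sequence[int], power: float) -> List[float]:
    """Weights proportional to 1 / N**power.

    power=1.0 gives plain inverse-frequency weighting (1/N),
    power=0.5 gives the softer 1/sqrt(N) heuristic.
    Assumption: weights are rescaled to sum to the number of classes.
    """
    if any(n < 1 for n in counts):
        raise ValueError("every class needs at least one sample")
    raw = [float(n) ** (-power) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


if __name__ == "__main__":
    counts = [10000, 100]  # hypothetical majority / minority counts
    for name, power in (("1/N", 1.0), ("1/sqrt(N)", 0.5)):
        w = inverse_power_weights(counts, power)
        print(name, w, "minority/majority ratio:", w[1] / w[0])
```

For a 100:1 count imbalance, 1/N weights the minority class about 100 times more than the majority class, while 1/sqrt(N) brings this down to about 10 times.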
- Summaries of the key ideas
- Summary of technical details
- Questions and notes on how to improve/revise the current work