# Rethinking ImageNet Pre-training

March 2020

tl;dr: ImageNet pretraining speeds up convergence but does not necessarily increase final accuracy.

## Overall impression

Tons of ablation studies. Another solid work from FAIR.

We should start exploring group normalization.

## Key ideas

- ImageNet pretraining does not necessarily improve final accuracy, unless the training set is very small: below roughly 10k COCO images (with ~7 objects per image; for PASCAL VOC, with ~2 objects per image, overfitting appears even with 15k images). ImageNet pretraining does not give better regularization and does not help reduce overfitting.
- ImageNet pretraining is still useful in that it shortens research cycles by speeding up convergence (see the sketch below).
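
As a concrete anchor for the comparison above, here is a minimal PyTorch sketch of the two backbone initializations, ImageNet-pretrained vs. from scratch. The torchvision ResNet-50 and its weights API (torchvision >= 0.13) are illustrative assumptions, not the paper's actual Detectron-style Mask R-CNN setup:

```python
from torchvision.models import ResNet50_Weights, resnet50

# Fine-tuning setup: backbone initialized from ImageNet weights.
# Per the paper's findings, this mainly buys faster convergence.
pretrained_backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)

# From-scratch setup: random initialization. With enough data
# (e.g. full COCO) and a longer training schedule it reaches
# comparable final accuracy; on very small datasets it overfits.
scratch_backbone = resnet50(weights=None)
```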

## Technical details

- GroupNorm is used to cope with the small batch size of 2 images per GPU x 8 GPUs, where per-GPU BatchNorm statistics are unreliable (see the sketch below).
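
A minimal sketch of the BatchNorm-to-GroupNorm swap this implies, in PyTorch; the bn_to_gn helper and the 32-group setting are illustrative assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn as nn

def bn_to_gn(module: nn.Module, num_groups: int = 32) -> nn.Module:
    """Recursively replace BatchNorm2d with GroupNorm.

    With only 2 images per GPU, per-GPU BatchNorm statistics are too
    noisy; GroupNorm normalizes over channel groups and is independent
    of batch size. num_channels must be divisible by num_groups.
    """
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            gn = nn.GroupNorm(num_groups=num_groups,
                              num_channels=child.num_features)
            setattr(module, name, gn)
        else:
            bn_to_gn(child, num_groups)
    return module

# Example: a ResNet-style stem normalized with GN instead of BN.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)
stem = bn_to_gn(stem)
x = torch.randn(2, 3, 224, 224)  # batch of 2 per GPU, as in the paper
print(stem(x).shape)             # torch.Size([2, 64, 112, 112])
```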

## Notes

- Questions and notes on how to improve/revise the current work