In this paper, the authors systematically study model scaling and show that carefully balancing network depth, width, and resolution leads to better performance. Building on this observation, they propose a new scaling method that uniformly scales all three dimensions using a simple yet highly effective compound coefficient.
The paper proposes a simple yet effective compound scaling method, described below:
Source: arXiv (https://arxiv.org/abs/1905.11946)
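The compound scaling rule from the paper fixes constants α (depth), β (width), and γ (resolution) with a small grid search on the base network, then scales each dimension by α^φ, β^φ, and γ^φ for a user-chosen compound coefficient φ, under the constraint α·β²·γ² ≈ 2 so that total FLOPS grow roughly by 2^φ. A minimal sketch of that rule, assuming the coefficients reported in the paper (the helper name and the base-network numbers are illustrative, not from the authors' code):

```python
import math

# Coefficients reported in the EfficientNet paper, found by grid search
# on the B0 base model, constrained so alpha * beta**2 * gamma**2 ≈ 2.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi, base_depth, base_width, base_resolution):
    """Scale depth, width, and resolution with one compound coefficient phi.

    The base_* arguments describe a hypothetical base network; values are
    rounded to usable integers.
    """
    depth = math.ceil(base_depth * ALPHA ** phi)             # layers
    width = math.ceil(base_width * BETA ** phi)              # channels
    resolution = int(round(base_resolution * GAMMA ** phi))  # input size
    return depth, width, resolution

# Each unit increase in phi roughly doubles the compute budget,
# since alpha * beta**2 * gamma**2 ≈ 1.92 ≈ 2.
print(compound_scale(1, base_depth=18, base_width=64, base_resolution=224))
```

Because a single φ drives all three dimensions, choosing a larger model reduces to choosing a larger compute budget, rather than hand-tuning depth, width, and resolution independently.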
Scaling any single dimension (width, depth, or resolution) improves accuracy, but the gain diminishes as the model grows larger. Hence, it is critical to balance all three dimensions of a network during CNN scaling to improve both accuracy and efficiency.
As a result, the compound scaling method consistently improves accuracy and efficiency when scaling up existing models such as MobileNet (+1.4% ImageNet accuracy) and ResNet (+0.7%), compared to conventional single-dimension scaling methods.
Scaling does not change the layer operations; instead, the authors obtained their base network via a Neural Architecture Search (NAS) that jointly optimizes accuracy and FLOPS. The scaled EfficientNet models consistently reduce parameters and FLOPS by an order of magnitude (up to 8.4x fewer parameters and up to 16x fewer FLOPS) compared to existing ConvNets such as ResNet-50 and DenseNet-169.
EfficientNets also achieved state-of-the-art accuracy on 5 of 8 transfer-learning datasets, such as CIFAR-100 (91.7%) and Flowers (98.8%), with an order of magnitude fewer parameters (up to 21x reduction), suggesting that EfficientNets also transfer well.