In this article, I present 84 papers and articles published in 2020 that I found particularly interesting. For clarity, I have divided them into 11 sections. My personal summary of 2020 is as follows.
In 2020, Transformer models made a huge leap forward. In natural language processing, GPT-3, a large-scale Transformer, achieved high accuracy on many tasks. In image classification, a Transformer trained with a large amount of data and a large number of parameters surpassed Big Transfer, which previously held the highest accuracy.
Fractal image datasets, which are free of discriminatory elements and copyright issues, may become very important as the ethics of AI draws more scrutiny. They will also be welcomed by people in industry who do not have access to ImageNet.
There have been many publications on label-free self-supervised learning that rivals the accuracy of supervised learning.
Deepfakes have become a social problem, and detection methods using biometric signals have been proposed. The same technology is also being used positively, for example to protect the privacy of victims.
Numerical simulation combined with machine learning is emerging. By learning input/output patterns, simulations can be made dramatically faster, which may lead to wider use by companies.
1. Image/video classification tasks
2. Unsupervised learning / self-supervised learning
3. Natural language processing
4. Sparse models / model compression / inference speedup
5. Optimization / loss functions / data augmentation
6. Deep fake
7. Generative models
8. Machine learning with natural sciences
9. Analysis of deep learning
10. Other research
11. Real-world applications
The Transformer model has finally made a breakthrough in image classification, surpassing CNN-based models to achieve the highest accuracy on ImageNet. However, since it requires a dataset on the scale of JFT-300M (300 million images) and more than 600 million parameters (about 10 times more than EfficientNet-B7), it is not yet easy to use. In 2021, there may be research that surpasses CNN-based models with a lightweight Transformer. With fractal image datasets, you don't have to worry about copyright or discriminatory elements; this is also great news for people in industry who don't have easy access to ImageNet.
Adapted from the following paper
Fixing the train-test resolution discrepancy: FixEfficientNet
https://arxiv.org/abs/2003.08237
EfficientNet improves accuracy by increasing the input resolution, but there is a gap between the apparent object sizes seen during training and during inference. By fine-tuning the top layers at the target resolution after training, they closed this gap and achieved better results than Noisy Student on ImageNet without using external data (state of the art at the time).
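The two-phase recipe can be sketched schematically as follows (the layer names, resolutions, and helper function here are illustrative assumptions, not taken from the authors' code): train normally at a low resolution, then freeze the backbone and fine-tune only the top layers at the higher test resolution.

```python
# Sketch of a FixRes-style schedule: full training at a low resolution,
# then a cheap adaptation phase at the test resolution in which only the
# top layers (e.g. batch-norm statistics and the classifier) are updated.

TRAIN_RES = 224   # resolution used for the main training run (illustrative)
TEST_RES = 320    # higher resolution used at inference time (illustrative)

def finetune_plan(layer_names, top_layers=("batchnorm", "classifier")):
    """Mark which layers stay trainable during the resolution-adaptation
    phase: only the top layers are updated, the backbone is frozen."""
    return {name: (name in top_layers) for name in layer_names}

layers = ["stem", "stage1", "stage2", "stage3", "batchnorm", "classifier"]
plan = finetune_plan(layers)

# Phase 1: train all layers at TRAIN_RES.
# Phase 2: fine-tune at TEST_RES, updating only layers where plan[name] is True.
frozen = [name for name, trainable in plan.items() if not trainable]
print(frozen)  # → ['stem', 'stage1', 'stage2', 'stage3']
```

Since most parameters are frozen, the second phase costs far less than the initial training run.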