Our team addressed the lack of datasets with people wearing masks by using data augmentation techniques. You can enhance your existing datasets (or publicly available ones) by overlaying masks on people’s faces. Training facial analysis models, such as face detection and sex prediction, becomes an easier task once you do this.
A researcher at Georgia Tech is managing an open-source project called “Mask The Face.” The source code is available on GitHub and can be used to convert face datasets into masked-face datasets.
Running the software package on your images is a cinch!
cd MaskTheFace
# Generic
python mask_the_face.py --path <path-to-file-or-dir> --mask_type <type-of-mask> --verbose --write_original_image
# Example
python mask_the_face.py --path 'data/office.jpg' --mask_type 'N95' --verbose --write_original_image
Source code and further details can be found on the GitHub project page.
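To increase mask diversity, the same command can be run once per mask type. The following is a minimal sketch that wraps the command line shown above with Python's `subprocess` module. The helper names `build_command` and `mask_dataset` are hypothetical, and only 'N95' appears in this article; the other mask type names are assumptions that should be checked against the project README.

```python
import subprocess
import sys

# Only 'N95' is taken from the example above; the other names are assumptions
# to verify against the MaskTheFace README before use.
MASK_TYPES = ["surgical", "N95", "cloth"]


def build_command(image_path, mask_type, script="mask_the_face.py"):
    """Build the argument list for one run, using only the flags shown above."""
    return [
        sys.executable, script,
        "--path", str(image_path),
        "--mask_type", mask_type,
        "--verbose",
        "--write_original_image",
    ]


def mask_dataset(image_path, mask_types=MASK_TYPES, repo_dir="MaskTheFace"):
    """Run the script once per mask type from inside the repository directory."""
    for mask_type in mask_types:
        cmd = build_command(image_path, mask_type)
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `mask_dataset("data/office.jpg")` would then repeat the example command for each mask type, with the path interpreted relative to the repository directory.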
Extensive use of the software is expected to produce some inconsistent results. In some cases, it works extremely well and handles variations in head pose and lighting conditions. In other cases, the algorithm misses faces or misplaces the mask when head pose or lighting varies strongly. These failures stem from the performance of the face detector used in the project.
Even though some faces in the training set will not be masked, this is generally okay because the overall dataset should comprise both masked and unmasked faces (see the training section below). In addition, one may swap in a more robust face detector for better results. All in all, the method is resilient and practical.
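One way to combine masked and unmasked faces into a single training list is sketched below. The function name `build_training_list`, the `masked_fraction` parameter, and the 50/50 default are illustrative assumptions, not settings from our pipeline.

```python
import random


def build_training_list(unmasked, masked, masked_fraction=0.5, seed=0):
    """Return shuffled (sample, is_masked) pairs with roughly the requested masked share.

    All unmasked samples are kept; masked samples are drawn at random until
    they make up about `masked_fraction` of the combined list.
    """
    if not 0.0 <= masked_fraction < 1.0:
        raise ValueError("masked_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_masked / (n_masked + n_unmasked) = masked_fraction for n_masked.
    n_masked = round(len(unmasked) * masked_fraction / (1.0 - masked_fraction))
    # Never request more masked samples than are available.
    n_masked = min(n_masked, len(masked))
    chosen = rng.sample(list(masked), n_masked)
    samples = [(s, False) for s in unmasked] + [(s, True) for s in chosen]
    rng.shuffle(samples)
    return samples
```

Fixing the seed keeps the split reproducible between training runs.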
As you may see in the corresponding sample pictures, the referenced method produces realistic results. One may also choose from a variety of different masks to increase the diversity of face coverings in the dataset. The options include different patterns, colors, and intensity values.
There are many ways to train facial analysis models, whether for detection, recognition, or the prediction of sex, age group, or sentiment. For the purposes of this guide, we will focus on sex prediction, assuming the images have already been cropped and aligned by a detection module that is itself robust to occluded faces.
We trained our own classification task head, which received feature maps from a battle-tested backbone. Treat any code for this head as illustrative: it would be a small but important component of a much larger system. Nonetheless, the principles remain the same and allowed us to achieve high accuracy across a wide range of tasks.
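As a rough illustration of what such a task head computes, here is a dependency-free sketch: global average pooling over the backbone's feature maps, followed by a single linear layer with a sigmoid output, trained with per-example gradient steps on binary cross-entropy. This architecture, the class name `BinaryHead`, and the hyperparameters are assumptions chosen for brevity. It is not the head we trained, and a real one would be built in a deep learning framework.

```python
import math
import random


def global_average_pool(feature_maps):
    """Collapse C feature maps of shape H x W into a C-dimensional vector."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


class BinaryHead:
    """Linear layer plus sigmoid over pooled backbone features (hypothetical)."""

    def __init__(self, num_channels, seed=0):
        rng = random.Random(seed)
        # Small random initial weights, one per backbone channel.
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(num_channels)]
        self.bias = 0.0

    def predict_proba(self, feature_maps):
        """Return the predicted probability of the positive class."""
        pooled = global_average_pool(feature_maps)
        logit = sum(w * x for w, x in zip(self.weights, pooled)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

    def sgd_step(self, feature_maps, label, lr=0.1):
        """One gradient step on binary cross-entropy for a single example."""
        pooled = global_average_pool(feature_maps)
        # For sigmoid + cross-entropy, d(loss)/d(logit) = probability - label.
        error = self.predict_proba(feature_maps) - label
        self.weights = [w - lr * error * x for w, x in zip(self.weights, pooled)]
        self.bias -= lr * error
```

The backbone stays frozen in this sketch; only the head's weights and bias are updated.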