Hands-on Practice of Using Dataset Object
In image recognition, training sets commonly range from gigabytes to terabytes, and individual images keep getting bigger, so preloading every image into memory before training is often not an option. In this article, we will look at some handy functions in Keras that load images in batches without exhausting your RAM.
We will work on a concrete data set from a Kaggle competition and learn how to:
- Organize the data set into formatted directories
- Load batched images directly from the directories
- Train and evaluate a model
The purpose of the competition is to detect distracted drivers, and the images come well organized in training and testing folders.
Inside /imgs/train and /imgs/test are the training and testing images in jpg format, with 10 different classes named c0, c1, ..., c9.
To load images directly from a directory, we need to reorganize the folders a bit: we create a training folder and a validation folder, each containing one subfolder per class.
In this case, we split the images in /imgs/train into a training set and a validation set and organize them in the following way:
|
|_ driver_split
    |_ train
    |   |_ c0
    |   |_ c1
    |   ...
    |   |_ c9
    |_ valid
        |_ c0
        |_ c1
        ...
        |_ c9
Our images are nested in path_to_{train/valid}/{c0-c9}. To load images from a directory, the paths need to follow this pattern:
path_to_folder/{data_set}/{class_labels}/{actual_images}
In our example, this becomes
driver_split/train/c0/img1.jpg
...
Some preprocessing code is omitted for simplicity. The idea is that we iterate over all the images, split them into training and validation sets, and create, for each image, a symlink in the organized folder that points to the original file. The result is the driver_split layout shown above, now populated with links to the images.
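Since the preprocessing code is omitted, here is a minimal sketch of what such a split could look like, using only the Python standard library. It assumes a plain random split inside each class folder; the function name split_with_symlinks, the valid_fraction and seed parameters, and the 80/20 ratio are all hypothetical choices for illustration, and the original preprocessing may split the images differently.

```python
import os
import random
from pathlib import Path


def split_with_symlinks(src_dir, dst_dir, valid_fraction=0.2, seed=0):
    """Symlink the images of every class folder into dst_dir/train and dst_dir/valid.

    Hypothetical helper: a per-class random split, not the article's omitted code.
    Returns the number of images assigned to each subset.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    counts = {"train": 0, "valid": 0}

    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        images = sorted(class_dir.glob("*.jpg"))
        rng.shuffle(images)
        n_valid = int(len(images) * valid_fraction)

        for i, image in enumerate(images):
            subset = "valid" if i < n_valid else "train"
            target_dir = dst_dir / subset / class_dir.name
            target_dir.mkdir(parents=True, exist_ok=True)
            link = target_dir / image.name
            # lexists also detects dangling links, so re-running is safe
            if not os.path.lexists(link):
                os.symlink(image.resolve(), link)
            counts[subset] += 1

    return counts
```

Calling split_with_symlinks("imgs/train", "driver_split") would then create the train and valid folders with one subfolder per class. Symlinks keep the split cheap, because no image data is copied.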
Now that the images sit neatly in their designated folders, we can load them with the function image_dataset_from_directory.
We pass in the path to our folder, and the function automatically detects the labels and images and loads them in batches. This is critical because it never loads all the images at once, which would exhaust your memory.
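A call along these lines would do the loading. This is a sketch rather than the exact code behind this article: the wrapper name load_split is hypothetical, the batch size of 32 and the image size of 224 x 224 are assumed values, and it uses the tf.keras.utils entry point, which recent TensorFlow 2 releases provide for the function documented under tf.keras.preprocessing.

```python
import tensorflow as tf


def load_split(directory, batch_size=32, image_size=(224, 224)):
    """Build a batched tf.data.Dataset from a directory with one subfolder per class.

    Hypothetical wrapper; batch_size and image_size are assumed defaults.
    """
    return tf.keras.utils.image_dataset_from_directory(
        directory,
        labels="inferred",         # class label = name of the parent folder
        label_mode="categorical",  # labels come back as one-hot vectors
        batch_size=batch_size,     # images are read from disk batch by batch
        image_size=image_size,     # every image is resized to (height, width)
    )


# Example calls, using the folder layout from this article:
# train_ds = load_split("driver_split/train")
# valid_ds = load_split("driver_split/valid")
```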
Key Arguments
- labels: inferred means that the labels are inferred from the parent folder names, in our case c0, c1, ..., c9.
- label_mode: categorical means that the labels are one-hot encoded.
- batch_size: the number of images loaded per batch.
- image_size: the size that the images in the folder are resized to.
For more arguments, you can refer to the official docs [2].
After the loading finishes, you will see this in your console,
Found 18047 files belonging to 10 classes.
Found 4377 files belonging to 10 classes.
indicating the number of files and classes detected.
We can also check our class names and take out 1 example:
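One way to do this check is sketched below. The helper name peek is hypothetical; it relies on the class_names attribute that image_dataset_from_directory attaches to the dataset it returns, and on the standard tf.data method take.

```python
def peek(dataset):
    """Return the class names and the shapes of one (images, labels) batch."""
    # class_names is attached by image_dataset_from_directory,
    # in alphanumeric order of the subfolder names.
    class_names = dataset.class_names
    # take(1) yields a single batch, not a single image.
    images, labels = next(iter(dataset.take(1)))
    return class_names, tuple(images.shape), tuple(labels.shape)
```

For a dataset loaded with a batch size of 32 and an image size of 224 x 224 (assumed values), the two shapes would be (32, 224, 224, 3) and (32, 10): one batch of RGB images and one one-hot label vector per image.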
For completeness, we will train a simple ResNet here, but I won’t explain it in much detail.
First up, our images have pixel values ranging from 0 to 255, so we need to add an extra normalization layer in front of the model.
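One common way to do this is the Keras Rescaling layer, sketched below. Whether this is the exact layer used in the full code is an assumption, as is the target range of 0 to 1; the layer lives at tf.keras.layers.Rescaling in recent TensorFlow 2 releases.

```python
import tensorflow as tf

# Rescaling multiplies every input value by `scale`, mapping [0, 255] to [0, 1].
normalization_layer = tf.keras.layers.Rescaling(1.0 / 255)

# A tiny made-up input to show the effect: minimum, midpoint, and maximum pixel value.
pixels = tf.constant([[0.0, 127.5, 255.0]])
scaled = normalization_layer(pixels)  # approximately 0.0, 0.5, 1.0
```

Putting the rescaling inside the model, rather than in the input pipeline, means the same preprocessing is applied automatically at inference time.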
Next, let’s build a simple ResNet.
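As an illustration of the idea, here is a small residual network written with the Keras functional API. It is a sketch, not the architecture from the repo: the names residual_block and build_small_resnet, the filter counts, and the 224 x 224 input size are all assumptions.

```python
import tensorflow as tf

layers = tf.keras.layers


def residual_block(x, filters):
    """Two 3x3 convolutions with a skip connection around them."""
    shortcut = x
    if x.shape[-1] != filters:
        # 1x1 convolution so the shortcut matches the new channel count
        shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    y = layers.Conv2D(filters, 3, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Add()([shortcut, y])  # the residual (skip) connection
    return layers.ReLU()(y)


def build_small_resnet(input_shape=(224, 224, 3), num_classes=10):
    """Hypothetical small ResNet with the rescaling layer as its first layer."""
    inputs = tf.keras.Input(shape=input_shape)
    x = layers.Rescaling(1.0 / 255)(inputs)  # pixel values 0..255 -> 0..1
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(x)
    for filters in (32, 64):
        x = residual_block(x, filters)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    # One softmax output per class, matching the one-hot labels
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```

The skip connection adds the block input back onto its output, which is what distinguishes a residual block from a plain stack of convolutions.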
Lastly, let’s compile and train our model! For the complete coding example, please check my repo.
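The compile and training step could look like the sketch below. The helper name compile_and_train, the Adam optimizer, and the epoch count are assumptions; the categorical cross-entropy loss is chosen because it matches the one-hot labels produced by label_mode categorical.

```python
def compile_and_train(model, train_ds, valid_ds, epochs=5):
    """Compile a Keras model, fit it on batched datasets, and evaluate it.

    Hypothetical helper; the optimizer and epoch count are assumed values.
    """
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=["accuracy"],
    )
    # The datasets are already batched, so fit() needs no batch_size argument.
    history = model.fit(train_ds, validation_data=valid_ds, epochs=epochs)
    scores = model.evaluate(valid_ds, return_dict=True)
    return history, scores
```

Because fit pulls one batch at a time from the dataset, only the current batch of images has to be in memory during training.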
Reference
[1] https://www.kaggle.com/kweonwooj/kc03-day03-driversplit
[2] https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image_dataset_from_directory