Saliency Map for Visualizing Deep Learning Model Using PyTorch

In this section, we will implement the saliency map using PyTorch. The deep learning model that we will use was trained for a Kaggle competition called Plant Pathology 2020 - FGVC7. You can download the dataset from the competition page on Kaggle.

Here are the steps we will follow:

Set up the deep learning model
Open the image
Preprocess the image
Retrieve the gradient
Visualize the result

The first thing we have to do is set up the model. In this case, I will load my own trained weights into a ResNet-18 architecture. I have also set up the model so that it runs on the GPU when one is available. The code looks like this:

With the model ready, we can prepare the image. We will use PIL to open it and torchvision to transform it into a tensor. The code looks like this:

After transforming the image, we have to add a batch dimension, because the model expects a 4-dimensional tensor of shape (batch size, channels, height, width). The code looks like this:

# Reshape the image, because the model expects a
# 4-dimensional tensor (batch_size, channels, height, width)
image = image.reshape(1, 3, 224, 224)

After reshaping the image, we move it to the same device as the model, which is the GPU when one is available. The code looks like this:

# Set the device for the image
image = image.to(device)

Then, we have to mark the image as requiring gradients, so that backpropagation computes the gradient with respect to its pixels. The code looks like this:

# Enable gradient tracking on the image so we can retrieve its gradient
image.requires_grad_()

After that, we can obtain the gradient by passing the image through the model and backpropagating from the output score. The code looks like this:

Now we can visualize the gradient using matplotlib. There is one more task to do first. The gradient has three channels, just like the image, so we have to take the maximum value across those channels at each pixel position to get a single 2-dimensional map. Finally, we can visualize the result with the code below:
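A minimal sketch of the channel reduction and the plot follows. Taking the absolute value before the maximum is an assumption (it is the common convention for saliency maps; the text above only mentions the maximum). The function names and the `hot` colormap are choices made for this example.

```python
import torch


def saliency_from_gradient(grad):
    """Collapse a (1, 3, H, W) input gradient into an (H, W) saliency map."""
    # Magnitude of each gradient, then the largest one across the channels
    saliency, _ = grad.abs().max(dim=1)
    return saliency[0]


def plot_saliency(image, saliency):
    """Show an RGB image (3, H, W, values in [0, 1]) next to its saliency map."""
    # Imported here so the saliency computation has no plotting dependency
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, 2, figsize=(8, 4))
    axes[0].imshow(image.detach().cpu().permute(1, 2, 0).numpy())
    axes[0].set_title("Image")
    axes[1].imshow(saliency.detach().cpu().numpy(), cmap="hot")
    axes[1].set_title("Saliency map")
    for ax in axes:
        ax.axis("off")
    return fig
```

If the image was normalized during preprocessing, it needs to be mapped back to the [0, 1] range before it is passed to `plot_saliency`. Otherwise the left panel will show distorted colors.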

Here is what the visualization looks like:

As you can see from the image above, the left side is the original image, and the right side is the saliency map. Recall from its definition that the saliency map shows how strongly each pixel contributes to the final output.

In this case, the leaf in this image has a disease called rust, which you can see as yellow spots on it. If you look carefully, some pixels in the saliency map are brighter than the others. This indicates that those pixels contribute strongly to the final result, and they correspond to the rust itself.

Therefore, we can say that the model made its prediction by looking at the right information in the image, namely the rust spots.
