
Importing Necessary Libraries
# import resources
%matplotlib inline
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.optim as optim
from torchvision import transforms, models
Load VGG19
VGG19 is split into two portions:
- vgg19.features, which holds all the convolutional and pooling layers.
- vgg19.classifier, which holds the three linear classifier layers at the end.
For now, we only need the features portion.
# get the "features" portion of VGG19 (we will not need the "classifier" portion)
vgg = models.vgg19(pretrained=True).features
# freeze all VGG parameters since we're only optimizing the target image
for param in vgg.parameters():
    param.requires_grad_(False)
Moving the Model to GPU, if Available
# move the model to GPU, if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
vgg.to(device)
Load in Content and Style Images
def load_image(img_path, max_size=400, shape=None):
    ''' Load in and transform an image, making sure the image
        is <= 400 pixels in the x-y dims.'''
    image = Image.open(img_path).convert('RGB')
    # large images will slow down processing
    if max(image.size) > max_size:
        size = max_size
    else:
        size = max(image.size)
    if shape is not None:
        size = shape
    in_transform = transforms.Compose([
        transforms.Resize(size),
        transforms.ToTensor(),
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))])
    # discard the transparent, alpha channel (that's the :3) and add the batch dimension
    image = in_transform(image)[:3, :, :].unsqueeze(0)
    return image
Load Images from Local Files
# load in content and style image
content = load_image('call-of-duty-ghosts.jpg').to(device)
# resize style to match content, makes code easier
style = load_image('starrynight.jpg', shape=content.shape[-2:]).to(device)
Utility Functions
# helper function for un-normalizing an image
# and converting it from a Tensor image to a NumPy image for display
def im_convert(tensor):
    """ Display a tensor as an image. """
    image = tensor.to("cpu").clone().detach()
    image = image.numpy().squeeze()
    image = image.transpose(1, 2, 0)
    # un-normalize: multiply by the std and add back the mean used above
    image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))
    image = image.clip(0, 1)
    return image
# display the content and style images side by side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))
ax1.imshow(im_convert(content))
ax1.set_title("Content Image", fontsize=20)
ax2.imshow(im_convert(style))
ax2.set_title("Style Image", fontsize=20)
plt.show()
VGG19 Layers
# print out VGG19 structure so you can see the names of various layers
print(vgg)
Content and Style Features
def get_features(image, model, layers=None):
    """ Run an image forward through a model and get the features for
        a set of layers. Default layers are for VGGNet matching Gatys et al (2016).
    """
    # map PyTorch's VGGNet layer indices to the layer names used in the paper;
    # these are the layers we need for the content and style representations of an image
    if layers is None:
        layers = {'0': 'conv1_1',
                  '5': 'conv2_1',
                  '10': 'conv3_1',
                  '19': 'conv4_1',
                  '21': 'conv4_2',  # content representation
                  '28': 'conv5_1'}
    features = {}
    x = image
    # model._modules is a dictionary holding each module in the model
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x
    return features
Gram Matrix
The output of every convolutional layer is a Tensor with dimensions associated with the batch_size, a depth d, and some height and width (h, w). The Gram matrix of a convolutional layer can be calculated as follows:
- Get the batch size, depth, height, and width of the tensor with batch_size, d, h, w = tensor.size()
- Reshape that tensor so that the spatial dimensions are flattened
- Calculate the Gram matrix by multiplying the reshaped tensor by its transpose
def gram_matrix(tensor):
    """ Calculate the Gram Matrix of a given tensor
        Gram Matrix: https://en.wikipedia.org/wiki/Gramian_matrix
    """
    # get the batch_size, depth, height, and width of the Tensor
    _, d, h, w = tensor.size()
    # reshape so we're multiplying the features for each channel
    tensor = tensor.view(d, h * w)
    # calculate the gram matrix
    gram = torch.mm(tensor, tensor.t())
    return gram
Putting It All Together
# get content and style features only once before training
content_features = get_features(content, vgg)
style_features = get_features(style, vgg)
# calculate the gram matrices for each layer of our style representation
style_grams = {layer: gram_matrix(style_features[layer]) for layer in style_features}
# create a third "target" image and prep it for change
# it is a good idea to start off with the target as a copy of our *content* image
# and then iteratively change its style
target = content.clone().requires_grad_(True).to(device)
Loss and Weights
Individual Layer Style Weights
Below, you are given the option to weight the style representation at each relevant layer. It's suggested that you use a range between 0 and 1 to weight these layers. By weighting the earlier layers (conv1_1 and conv2_1) more, you can expect to get larger style artifacts in your resulting target image. Should you choose to weight the later layers, you'll get more emphasis on smaller features. This is because each layer is a different size, and together they create a multi-scale style representation!
Content and Style Weights
Just like in the paper, we define an alpha (content_weight) and a beta (style_weight). This ratio will affect how stylized your final image is. It's recommended that you leave content_weight = 1 and set style_weight to achieve the ratio you want.
# weights for each style layer
# weighting earlier layers more will result in *larger* style artifacts
# notice we are excluding `conv4_2`, our content representation
style_weights = {'conv1_1': 1.,
                 'conv2_1': 0.75,
                 'conv3_1': 0.2,
                 'conv4_1': 0.2,
                 'conv5_1': 0.2}

content_weight = 1  # alpha
style_weight = 1e9  # beta
Updating the Target and Calculating Losses
Content Loss
The content loss will be the mean squared difference between the target and content features at layer conv4_2. This can be calculated as follows:
content_loss = torch.mean((target_features['conv4_2'] - content_features['conv4_2'])**2)
Style Loss
You'll calculate the gram matrix for the target image, target_gram, and the style image, style_gram, at each of these layers and compare those gram matrices, calculating the layer_style_loss. Later, you'll see that this value is normalized by the size of the layer.
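For a single layer (say conv2_1), that comparison might look like the sketch below. It reuses gram_matrix, style_grams, style_weights, and get_features from above; dividing by d * h * w is one way to normalize by the size of the layer.
# sketch: style loss for one layer, using the helpers defined above
layer = 'conv2_1'
target_feature = get_features(target, vgg)[layer]    # target's feature map at this layer
_, d, h, w = target_feature.shape
target_gram = gram_matrix(target_feature)            # gram matrix of the target features
style_gram = style_grams[layer]                      # precomputed gram matrix of the style image
layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram)**2)
layer_style_loss = layer_style_loss / (d * h * w)    # normalize by the size of the layer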
Total Loss
Finally, you'll create the total loss by adding up the style and content losses and weighting them with your specified alpha and beta!
Intermittently, we'll print out this loss; don't be alarmed if the loss is very large. It takes some time for an image's style to change, and you should focus on the appearance of your target image rather than any loss value. Still, you should see that this loss decreases over some number of iterations.
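Putting the content, style, and total losses into an update loop might look like the following sketch. It builds on the functions and weights defined above; the choice of the Adam optimizer, its learning rate, the number of steps, and the show_every interval are assumed values here, so tune them to taste.
# sketch of the update loop; the optimizer choice, learning rate, steps, and
# show_every are assumed values, not prescribed ones
show_every = 400
optimizer = optim.Adam([target], lr=0.003)
steps = 2000  # how many iterations to update the target image

for ii in range(1, steps + 1):
    # get the features of the current target image
    target_features = get_features(target, vgg)

    # content loss: mean squared difference at conv4_2
    content_loss = torch.mean((target_features['conv4_2'] - content_features['conv4_2'])**2)

    # style loss: weighted, normalized gram-matrix differences across the style layers
    style_loss = 0
    for layer in style_weights:
        target_feature = target_features[layer]
        _, d, h, w = target_feature.shape
        target_gram = gram_matrix(target_feature)
        style_gram = style_grams[layer]
        layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram)**2)
        style_loss += layer_style_loss / (d * h * w)

    # total loss, weighted by alpha (content_weight) and beta (style_weight)
    total_loss = content_weight * content_loss + style_weight * style_loss

    # update the target image
    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()

    # intermittently print the loss and show the in-progress target image
    if ii % show_every == 0:
        print('Total loss: ', total_loss.item())
        plt.imshow(im_convert(target))
        plt.show()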
Display the Target Image
# display the content image and the final, stylized target image
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 15))
ax1.imshow(im_convert(content))
ax1.set_title("Content Image", fontsize=20)
ax2.imshow(im_convert(target))
ax2.set_title("Stylized Target Image", fontsize=20)
ax1.grid(False)
ax2.grid(False)
# hide the axes ticks
ax1.set_xticks([])
ax1.set_yticks([])
ax2.set_xticks([])
ax2.set_yticks([])
plt.show()