Let’s get started and create our first cloud function. Head over to your GCP Console and, from the sidebar, select Cloud Functions. Once you are in that view, click Create Function.
On the next screen you can name your cloud function and choose its trigger type. I am going to name my function predict-mnist-digit and keep the trigger type as HTTP. I will go with the default advanced settings, except for the memory allocation. For the sake of this tutorial I will allow unauthenticated invocations, but you can change this to suit your needs.
Once you have finished editing, click Save and hit Next. You will be taken to the screen below, where you can start implementing your cloud function.
You might have to enable the Cloud Build API if it is not enabled already.
I will be using Python as the runtime, and we can see that two files have already been created for us: we specify the package requirements in the requirements.txt file and write our code in the main.py file. We can also specify the entry point of our cloud function, which in this default template is the hello_world function.
Now we just need to list the required packages in the requirements.txt file and write the code to serve predictions. For predictions, I will be using the model we uploaded to the Cloud Storage bucket in the first part of this series.
You could also use a model deployed on TensorFlow Serving or AI Platform and use the cloud function as a middle layer that pre-processes the data, since doing such operations on the client side is neither ideal nor fast.
Go ahead and update the requirements for the cloud function; I will be using the two packages below.
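Given that the function downloads the model from Cloud Storage and loads it with TensorFlow, a reasonable guess at the requirements.txt (the exact packages and versions are an assumption, pin them to your needs) would be:

```text
# requirements.txt — assumed contents
google-cloud-storage
tensorflow
```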
Next, we need to implement our cloud function, which will be responsible for:
- Downloading the model from the Cloud Storage bucket
- Loading the model with TensorFlow
- Handling incoming requests and responding with predictions
Let’s first write the code to download the model.
Here we first create a download directory for the model; then we list the files in the bucket and download them.
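A minimal sketch of that download step might look like the following. The bucket name and model prefix are placeholders (assumptions), so replace them with your own values from the first part of the series; /tmp is used because it is the only writable path inside a Cloud Function.

```python
import os

BUCKET_NAME = "your-bucket-name"   # assumption: replace with your bucket
MODEL_PREFIX = "mnist-model/"      # assumption: folder holding the SavedModel files
LOCAL_DIR = "/tmp/mnist-model"     # /tmp is the only writable path in Cloud Functions


def download_model(bucket_name=BUCKET_NAME, prefix=MODEL_PREFIX, local_dir=LOCAL_DIR):
    """Download every file under `prefix` from the bucket into `local_dir`."""
    from google.cloud import storage  # imported lazily inside the function

    os.makedirs(local_dir, exist_ok=True)
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    for blob in bucket.list_blobs(prefix=prefix):
        if blob.name.endswith("/"):  # skip folder placeholder objects
            continue
        # Re-create the bucket's folder layout under local_dir
        local_path = os.path.join(local_dir, os.path.relpath(blob.name, prefix))
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        blob.download_to_filename(local_path)
    return local_dir
```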
Next, we need to load the downloaded model.
Note that we use a global model variable because we don’t want the model to be downloaded on every client request. A request that has to download and load the model first is known as a cold invocation; we want more warm invocations, which reuse the cached model instead of downloading and loading it each time. Having more warm invocations not only reduces response time but also reduces cost. This article from Google Cloud explains the concept very nicely; do give it a read for a better understanding.
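The caching pattern described above can be sketched like this, assuming a hypothetical download_model() helper that fetches the SavedModel files from the bucket into a local directory (the name is an illustration, not the article’s actual code):

```python
model = None  # global: survives across warm invocations of the same instance


def get_model():
    """Load the model once and reuse it on subsequent (warm) invocations."""
    global model
    if model is None:
        # Cold invocation: fetch and load the model.
        import tensorflow as tf  # lazy import keeps module load cheap

        local_dir = download_model()  # hypothetical helper: downloads the SavedModel
        model = tf.keras.models.load_model(local_dir)
    return model
```

Because the global persists while the function instance stays warm, only the first request on each instance pays the download-and-load cost.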
Lastly, we need to handle incoming requests and respond with the model’s predictions.
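A sketch of the HTTP entry point might look like the following. The request shape ({"instances": [...]}), the 28×28 input size, and the [0, 1] pixel scaling are assumptions about the MNIST model from the first part; get_model() is a hypothetical cached-loader helper.

```python
import numpy as np


def preprocess(instances):
    """Turn the raw JSON list into the (batch, 28, 28) float tensor the model expects."""
    x = np.array(instances, dtype="float32").reshape(-1, 28, 28)
    return x / 255.0  # assumption: the model was trained on [0, 1] scaled pixels


def predict_mnist_digit(request):
    """HTTP entry point. get_model() is a hypothetical helper that returns the cached model."""
    data = request.get_json(silent=True) or {}
    x = preprocess(data.get("instances", []))
    probs = get_model().predict(x)  # assumed shape: (batch, 10) class probabilities
    return {"predictions": probs.argmax(axis=1).tolist()}
```

Remember to set predict_mnist_digit as the entry point in the console instead of the default hello_world.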
And that’s it: we have a cloud function up and running that will automatically scale with the load. To test it quickly, make an HTTP POST request to the cloud function and pass in the data; below is a sample Python snippet you can use to invoke it.
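A minimal client sketch, assuming the JSON request shape used above; the trigger URL is a placeholder you should replace with the one shown in your function’s Trigger tab:

```python
FUNCTION_URL = "https://REGION-PROJECT.cloudfunctions.net/predict-mnist-digit"  # assumption: your trigger URL


def build_payload(image):
    """Flatten a 28x28 image (list of rows) into the JSON body the function expects."""
    flat = [pixel for row in image for pixel in row]
    return {"instances": [flat]}


if __name__ == "__main__":
    import requests  # pip install requests

    image = [[0] * 28 for _ in range(28)]  # dummy all-black digit for a smoke test
    resp = requests.post(FUNCTION_URL, json=build_payload(image))
    print(resp.json())
```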
And that’s it! I hope this series has helped you. For any queries or suggestions, leave a comment, and feel free to connect with me on LinkedIn.