It is very similar to the MovieLens ratings dataset, but simplified for development purposes. You can apply this schema to almost anything, including Google Analytics page views or any other product- or content-related user activity.
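For reference, the training data is assumed here to follow the MovieLens-style column layout, i.e. a header along these lines (illustrative; check data/ratings_small.csv in the repo for the exact columns):
userId,movieId,rating,timestamp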
Step 1. After you have installed Docker, you need to authenticate it. Use gcloud as the credential helper for Docker:
gcloud auth configure-docker
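You can verify that the helper was registered by inspecting Docker's client config (assuming the default location):
grep -A 3 credHelpers ~/.docker/config.json
You should see gcr.io mapped to the gcloud helper.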
Step 2. Create your Cloud Storage bucket and set your local environment variables:
export BUCKET_NAME="your_bucket_name"
export REGION=us-central1
gsutil mb -l $REGION gs://$BUCKET_NAME
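To double-check that the bucket landed in the intended region, inspect its metadata:
gsutil ls -L -b gs://$BUCKET_NAME
The Location constraint field should show US-CENTRAL1.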
Hint: Keep the bucket, the container image, and the training job in the same project and the same region; it avoids cross-region data transfer and permission headaches.
Step 3. Clone the repo.
cd Documents/code/
git clone git@github.com:mshakhomirov/recommendation-trainer-customEnvDocker.git
cd recommendation-trainer-customEnvDocker/wals_ml_engine
Step 4. Write a Dockerfile.
The Dockerfile is already included in this repo.
This bit is very important; otherwise your training instance won't be able to save the model to Cloud Storage:
# Make sure gsutil will use the default service account
RUN echo '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg
With this Dockerfile you will build an image with these custom environment dependencies:
tensorflow==1.15
numpy==1.16.6
pandas==0.20.3
scipy==0.19.1
sh
These pinned dependency versions are the main reason I'm using a custom container.
Google AI Platform's runtime version 1.15 ships TensorFlow 1.15 but a different pandas version, which is not acceptable for my use case, where the pandas version must be exactly 0.20.3.
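For reference, a Dockerfile along these lines might look like the sketch below. This is a minimal sketch, assuming a Python 3.6 base image and a gsutil installation via the Cloud SDK; the actual file in the repo is the authoritative version:
# Sketch only — see the Dockerfile in the repo for the real one.
FROM python:3.6
# Install the Cloud SDK so the trainer can copy artifacts to GCS with gsutil.
RUN apt-get update && apt-get install -y curl && \
    curl -sSL https://sdk.cloud.google.com | bash -s -- --disable-prompts
ENV PATH $PATH:/root/google-cloud-sdk/bin
# Make sure gsutil will use the default service account.
RUN echo '[GoogleCompute]\nservice_account = default' > /etc/boto.cfg
# Pin the exact dependency versions the model needs.
RUN pip install tensorflow==1.15 numpy==1.16.6 pandas==0.20.3 scipy==0.19.1 sh
# Copy the training code and make task.py the entry point.
COPY trainer /trainer
ENTRYPOINT ["python", "/trainer/task.py"]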
Step 5. Build your Docker image.
export PROJECT_ID=$(gcloud config list project --format "value(core.project)")
export IMAGE_REPO_NAME=recommendation_bespoke_container
export IMAGE_TAG=tf_rec
export IMAGE_URI=gcr.io/$PROJECT_ID/$IMAGE_REPO_NAME:$IMAGE_TAG
docker build -f Dockerfile -t $IMAGE_URI ./
Test it locally:
docker run $IMAGE_URI
Output would be:
task.py: error: argument --job-dir is required
And this is alright, because this image will be used as our custom environment, with trainer/task.py as the entry point.
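Since anything you pass after the image name is forwarded to the entry point, you can supply the missing flag directly for a quick local smoke test (the job directory below is illustrative):
docker run $IMAGE_URI --job-dir gs://$BUCKET_NAME/smoke-test
Locally this may still fail on other required arguments or missing credentials, but it confirms that arguments reach trainer/task.py.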
For example, after we push our image we will be able to run this command locally:
gcloud ai-platform jobs submit training ${JOB_NAME}
--region $REGION
--scale-tier=CUSTOM
--job-dir ${BUCKET}/jobs/${JOB_NAME}
--master-image-uri $IMAGE_URI
--config trainer/config/config_train.json
--master-machine-type complex_model_m_gpu
--
${ARGS}
The --master-image-uri parameter is what replaces the usual --runtime-version flag: the job runs inside our custom container instead of a stock AI Platform runtime. Check mltrain.sh in the repo for more details.
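Note that the command above assumes a few environment variables are already set. A minimal, illustrative setup (names are my assumptions, not from the repo) could be:
export BUCKET=gs://$BUCKET_NAME
# Job names must be unique per submission; a timestamp suffix is a common trick.
export JOB_NAME=wals_train_$(date +%Y%m%d_%H%M%S)
# ARGS holds whatever gets forwarded to trainer/task.py after the bare "--";
# see mltrain.sh for the exact flags it builds.
export ARGS="--data-type user_ratings"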
Step 6. Push the image to the Docker repository (Container Registry).
docker push $IMAGE_URI
Output should be:
The push refers to repository [gcr.io/<your-project>/recommendation_bespoke_container]
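You can confirm that the image and its tag are in Container Registry with:
gcloud container images list-tags gcr.io/$PROJECT_ID/$IMAGE_REPO_NAME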
Step 7. Submit the training job.
Run the script included in the repo:
./mltrain.sh train_custom gs://$BUCKET_NAME data/ratings_small.csv --data-type user_ratings
Output: