Why we need TensorFlow Extended (TFX) in 3 diagrams

There are two main considerations when adopting TFX: value and cost. This post demonstrates the value: TFX supports production-level experimentation and brings established standards for model and data validation.

The cost of using TFX is that it locks your Machine Learning work, including model training and data transformation, into the TensorFlow stack. Moving to TensorFlow from in-memory tools such as Pandas, scikit-learn, and RStudio can be a non-trivial endeavor, so it is important to see the potential value of TFX clearly. So, here it is!

Two TFX examples:
1. With Kubeflow: https://github.com/kubeflow/pipelines/tree/743746b96e6efc502c33e1e529f2fe89ce09481c/samples/core/parameterized_tfx_oss
2. Local mode (no Kubeflow installed): https://github.com/romankazinnik/romankazinnik_blog/tree/master/TFX_KFP

Data Science Experiments versus Large-Scale Production Experiments, Training, and Inference

WORLD WITHOUT TFX — DATA SCIENTISTS AND ENGINEERS, MANY PER SINGLE PRODUCT

Data Scientists (DS) and Engineers:

– DS: run experiments locally to find high-accuracy features and models
– DS: find data transformations
– DS: work across Machine Learning paradigms such as supervised ML, unsupervised ML, active learning, and reinforcement learning
– Engineers: often create a custom production workflow to manage models and data in Live deployment

By adopting TensorFlow and TFX as the standard, Data Scientists create ML pipelines ready for Live deployment:
– DS experiments: can be run locally and on a cluster
– Model management: training and inference are handled by TFX
– Kubeflow: TFX can run with or without Kubeflow
– Cluster: integrated with TFX
– Value: minimize the disconnect between DS and Engineers, and make the eternal ML complaint “BUT IT WORKS ON MY LAPTOP” irrelevant
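
To make the "runs locally and on a cluster" point concrete, here is a minimal sketch of a TFX pipeline that is defined once and then handed to a runner. It assumes the TFX 1.x Python API (`from tfx import v1 as tfx`) and a folder of CSV files. The function names `local_paths`, `create_pipeline`, and `run_locally`, the directory layout, and the choice of components are my own illustrative assumptions; they are not taken from the two example repositories above.

```python
import os


def local_paths(pipeline_name, base_dir="."):
    """Return (pipeline_root, metadata_path) for a local run.

    The directory layout is a hypothetical convention, not a TFX requirement.
    """
    pipeline_root = os.path.join(base_dir, "pipelines", pipeline_name)
    metadata_path = os.path.join(
        base_dir, "metadata", pipeline_name, "metadata.db")
    return pipeline_root, metadata_path


def create_pipeline(pipeline_name, data_root, pipeline_root, metadata_path):
    """Define the pipeline once; the runner decides where it executes."""
    # Imported lazily so this file can be loaded without TFX installed.
    from tfx import v1 as tfx

    # Ingest the CSV files found under data_root.
    example_gen = tfx.components.CsvExampleGen(input_base=data_root)
    # Compute statistics over the ingested examples.
    statistics_gen = tfx.components.StatisticsGen(
        examples=example_gen.outputs["examples"])
    # Infer a schema from those statistics.
    schema_gen = tfx.components.SchemaGen(
        statistics=statistics_gen.outputs["statistics"])
    # Check the statistics against the schema to flag data anomalies.
    example_validator = tfx.components.ExampleValidator(
        statistics=statistics_gen.outputs["statistics"],
        schema=schema_gen.outputs["schema"])

    return tfx.dsl.Pipeline(
        pipeline_name=pipeline_name,
        pipeline_root=pipeline_root,
        components=[
            example_gen, statistics_gen, schema_gen, example_validator],
        metadata_connection_config=(
            tfx.orchestration.metadata.sqlite_metadata_connection_config(
                metadata_path)),
    )


def run_locally(pipeline_name, data_root, base_dir="."):
    """Run the pipeline on the local machine, with no Kubeflow installed."""
    from tfx import v1 as tfx

    pipeline_root, metadata_path = local_paths(pipeline_name, base_dir)
    tfx.orchestration.LocalDagRunner().run(
        create_pipeline(
            pipeline_name, data_root, pipeline_root, metadata_path))
```

Calling `run_locally("tfx_demo", "data/")` would execute the data-validation steps on a laptop. The design choice worth noticing is that `create_pipeline` knows nothing about where it runs: in principle only the runner changes when the same definition moves to a cluster, and TFX ships Kubeflow runners for that purpose (see the Kubeflow sample linked above for their configuration).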


Enjoyed it or hated it? Let me know with a comment, or get in touch on Twitter and follow me on Medium.
