Several years ago I set out to learn more about AI and machine learning (ML) and how I could apply it at my own companies, or at least speak intelligently with our data science and engineering teams.
I took online courses, watched videos, and passed exams but did not feel like I really understood how to apply and integrate machine learning into my business. So many questions were left unanswered. If this sounds familiar, read on.
To inspire you to read this article and part 2 of this series, please sit back and first watch this demo of a working machine learning pipeline.
Most describe an ML pipeline as the series of steps to organize and refine data, use the data for training machine learning models, and then serve, or use, those models within some application. Don’t worry if these terms don’t make sense yet.
This demo creates its own structured data from user input, so it simplifies the process for easier understanding — at least I hope so. I’ll share the source code and approach to designing and building it in my next article.
Most people like me with a software engineering or product background speak a familiar language with terms like class, method, function, parameter, input, request, variable, loop, output, return, and response.
In the machine learning world, there are analogous terms like observation, model, dimension, feature, fit, train, test, and inference. Add in mathematical Greek symbols like theta and your head begins to spin. It can be overwhelming at first but we will ease into learning these terms as we go.
In its simplest form, a machine learning model is a math function that when given numerical inputs, returns a numerical output.
I believe a problem with current teachings is that they take too long to show you how to actually use the technology, so we’ll begin with these questions:
- How do I use this function in real life (a.k.a. serving)?
- How do I create this function (a.k.a. training)?
- What techniques are best for certain problems?
Let’s get right to it, learning the answers in the reverse order from how most courses teach them.
1. Serving your models (using your function in an app)
Machine learning models, a.k.a. functions, are stored in files. Software libraries can open and run them, accept inputs, and return results.
Think of these model files as mini “spreadsheets” with cells that contain formulas. When you pass your input to the “spreadsheet,” it places your values in cells (like the yellow cells below) and computes a return value, which is your prediction (like the green cell below).
Notice in my training data (we’ll discuss this more below) that having 1 kitchen or 2 kitchens didn’t change the price. The weight (multiplier) for that feature is therefore nearly 0, because the number of kitchens isn’t important in determining the price.
Now when I changed “Beds” to “4”, the prediction got close to the 250,000 value. ML algorithms adjust the weights in tiny increments to get the predictions as close to accurate for all inputs as possible, which is why 100% accuracy is very rare.
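To make that “spreadsheet” idea concrete, a trained linear model boils down to a weighted sum. The weight values and feature names below are invented for illustration, not learned values (note the near-zero weight on kitchens):

```python
# A trained linear model is essentially a weighted sum, like spreadsheet
# formulas. These weights are made up for illustration, not real learned values.
weights = {"beds": 50_000, "baths": 25_000, "kitchens": 10, "sqft": 30}
bias = 20_000  # the model's base value (its intercept)

def predict_price(features):
    """Multiply each input by its weight, add them all up, then add the bias."""
    return bias + sum(weights[name] * value for name, value in features.items())

print(predict_price({"beds": 4, "baths": 2, "kitchens": 1, "sqft": 1500}))
# 20_000 + 200_000 + 50_000 + 10 + 45_000 = 315_010
```

Training is simply the process of finding weight values that make this sum come out right for the examples you already have; going from 1 kitchen to 2 only moves this prediction by 10.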
A popular model file format is called pickle, and a common tool for saving and loading these files is called joblib. Other frameworks like TensorFlow and PyTorch have their own formats, and there are attempts at universal formats to make models more portable.
To use your model in your application (serving) as illustrated above, you import the compatible library version used to create the file, load() it, then predict() just like any other function in your software programs. I will share a working example later but that’s really it!
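Here is a sketch of that save → load → predict cycle using scikit-learn and joblib; the model, data, and filename are toy examples I made up, not the article’s demo:

```python
# Save → load → predict: a model file behaves like any other function.
import joblib
from sklearn.linear_model import LinearRegression

# Pretend this training happened earlier, in a separate process...
features = [[3, 2], [4, 2], [2, 1]]            # beds, baths
prices = [200_000, 250_000, 150_000]
joblib.dump(LinearRegression().fit(features, prices), "house_model.joblib")

# ...then in your application you load() the file and predict().
model = joblib.load("house_model.joblib")
print(model.predict([[4, 2]])[0])  # prediction close to 250,000
```

The one real rule: load the file with a compatible version of the library that created it, or the formats may not match.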
2. Training your models (building your function)
We’ll get into more detail later, but assume you want a function that takes a food as input and predicts whether your child will like it; we all know where this is going! 😉
Traditionally you would write a function by hand like this with known rules:
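The original code isn’t reproduced here, but a hand-written rule function might look like this sketch, using a few of the features below; every threshold and encoding is a guess I invented:

```python
# Hand-coded rules: each cutoff is a guess you would have to tune yourself.
def will_child_like(smell, spice, temperature, sugar, color):
    """Return "good" or "bad" for a food; all inputs encoded 0-10."""
    if spice > 3:        # too spicy?
        return "bad"
    if sugar >= 5:       # sweet usually wins
        return "good"
    if color == 4:       # let's say 4 encodes "green"
        return "bad"
    return "good"

print(will_child_like(smell=2, spice=1, temperature=6, sugar=1, color=4))
# "bad" — low sugar, and it's green
```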
Your inputs (features) may be attributes about the food like type, smell, spice, temperature, sugar content, or color. Your predicted output (inference) is either “good” or “bad”.
The problem with manually writing the function above is that you don’t really know what the ideal temperature, sweetness, or smell is, or which factors matter most to a child deciding whether a food is “good” or “bad”. This approach is trial and error at best.
Now imagine the data you have is ten thousand records. Your brain cannot possibly remember and correlate all those values to determine where that cutoff point for spice level, temperature, smell, and sweetness is. If you use a computer and pass in all that data, it can figure it out, typically within seconds, and define a math function that will return the correct predictions. That’s what machine learning helps us with.
Let’s keep it simple for now with the code below:
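The code from the original post isn’t shown here, so here is a minimal stand-in using scikit-learn’s LogisticRegression; the records, labels, and 0–10 feature encodings are toy values I made up:

```python
# Instead of hand-writing rules, let an algorithm find the weights from
# labeled examples. The data and feature encoding are invented toy values.
from sklearn.linear_model import LogisticRegression

# Each row: [smell, spice, temperature, sugar, color], encoded 0-10.
foods = [
    [2, 1, 6, 8, 1],   # sweet fruit
    [3, 7, 8, 0, 2],   # spicy dish
    [1, 0, 5, 1, 4],   # green vegetable
    [2, 0, 6, 9, 3],   # dessert
]
labels = ["good", "bad", "bad", "good"]  # what the child actually said

model = LogisticRegression(max_iter=1000).fit(foods, labels)  # "training"
print(model.predict([[2, 0, 6, 7, 2]])[0])  # sweet and mild: "good"
```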
This is a simplistic example meant to “demystify” machine learning. Instead of writing the function and its rules yourself, you use an algorithm that iteratively adjusts the weights applied to each input value until the function returns the correct answers most of the time (remember the spreadsheet above).
Assume you input 100 records and a pattern emerges: every time the color of the food is 4, the result is 0 (“bad”). The function begins to write itself as the algorithm discovers these rules, just as your manual function had rules in it. The response (inference) is mostly correct because you created the function (trained your model) from data you already had with correct answers (labels).
Imagine a month later your child says they like broccoli. It’s green, so you have to figure out how to rewrite your first function to account for it. As data changes, this gets harder and harder. By using machine learning, you simply feed in the new information and retrain your model until it has accurate predictions. That is the power it offers.
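Using a made-up food encoding ([smell, spice, temperature, sugar, color], each 0–10), a retraining step is just appending the new example and fitting again; this is a sketch, not the author’s actual pipeline:

```python
# When new labeled data arrives (broccoli is "good" now!), you don't rewrite
# any rules: you append the record and retrain. All values here are invented.
from sklearn.linear_model import LogisticRegression

foods = [
    [2, 1, 6, 8, 1],   # [smell, spice, temperature, sugar, color]
    [3, 7, 8, 0, 2],
    [1, 0, 5, 1, 4],
    [2, 0, 6, 9, 3],
]
labels = ["good", "bad", "bad", "good"]

foods.append([1, 0, 6, 1, 4])   # broccoli: low sugar, green (color 4)
labels.append("good")           # the child changed their mind

model = LogisticRegression(max_iter=1000).fit(foods, labels)  # retrain
```

The data changed, so the weights change; no hand-edited rules required.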
3. What techniques are best for certain problems?
We are going to skip this. Most AI courses begin with categories of ML algorithms and I think it complicates the learning process. For now, ignore it. First, gain an understanding and later enhance your knowledge.