Different flavors
Not all kinds of AI can recognize speech like Siri does, or recommend the best series based on what you’ve watched on Netflix before. We need to start thinking of this technology as a tool that we can use for different purposes.
As you can see, AI encompasses the field of Machine Learning (ML), which in turn encompasses Deep Learning.
Now that we have a basic idea of what AI is about, what we can tell about ML is the following:
Machine Learning consists of training a computer with data so it can make its own decisions, learn by itself, and improve without being explicitly programmed for it.
Let’s learn about other ways of categorizing Artificial Intelligence before getting too deep into ML and talking about deep learning 😉.
Even though this non-biological intelligence is already able to diagnose cancer pretty accurately, drive cars with a low probability of causing accidents, and much more, experts know that this is only the beginning.
So far, we have been working and interacting with something called Artificial Narrow Intelligence. The “narrow” here means that it specializes in only one thing at a time.
In the future, though, we could be seeing Artificial General Intelligence (AGI), which can do whatever a human being can. This could sound either interesting or scary, but it’s not the end of the road either.
Artificial Super Intelligence will surpass the capabilities of a human being, and I’m not sure that many people (including experts) know what we could achieve by leveraging such a thing.
Types of problems
So AI is a tool that can be used for a lot of different things, and Machine Learning is a kind of AI that can learn and improve (on its task) based on data and experience. So, what kind of things can Machine Learning do?
To think of examples, I first like to think of the kinds of things that ML can do overall, and these are: classifying, clustering and predicting data.
The difference between classifying and clustering can be blurry. The simplest explanation I can find is that classification is about taking an individual piece of data and putting it into a predefined bucket depending on its characteristics, whereas clustering is about grouping similar data together without any predefined buckets.
Predicting is a little different, but definitely easier to explain. It’s not like we’ll have a magic ball to literally predict what will happen in the world, but we can forecast a continuous quantity, with a fair amount of accuracy.
The ML algorithms in charge of doing this are called regression algorithms.
On to some examples…
Classification: spam detection, determining if an image shown is one of a dog or a cat, classifying handwritten numbers
Clustering: search engines clustering websites that have similarities so they appear together, marketing efforts to group customers into “kinds”
Regression: forecast the price of a stock, weather, or a house
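To make the three problem types concrete, here’s a minimal sketch using scikit-learn on made-up toy data (the points, labels, and prices below are invented purely for illustration):

```python
# Minimal sketch: classification vs. clustering vs. regression (toy data).
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Classification: points come WITH labels (0 = "cat", 1 = "dog")
X = [[1, 1], [1, 2], [8, 8], [9, 8]]
y = [0, 0, 1, 1]
clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(clf.predict([[2, 1]]))    # -> [0], the new point lands in the "cat" bucket

# Clustering: the same points, but with NO labels
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(clusters)                 # e.g. [0 0 1 1], groups found by similarity alone

# Regression: forecasting a continuous quantity (house price vs. size)
sizes = [[50], [80], [120]]     # square meters
prices = [100, 160, 240]        # thousands of dollars
reg = LinearRegression().fit(sizes, prices)
print(reg.predict([[100]]))     # -> [200.], a continuous prediction
```

Notice that the classifier gets the labels y, while the clustering algorithm works on the points alone. That’s the whole difference in a nutshell.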
Machine Learning algorithms
You knew this was coming. So yes, we can work with different kinds of algorithms in ML, depending on the data available, the kind of problem, and what we want to achieve.
These all have pretty fancy names, but just pay attention to the explanation. They’re not that hard, trust me.
For supervised learning, we should first of all know that it’s called “supervised” because the data that we work with is going to be labeled.
This means that if we are classifying cats and dogs, we will start with images that are already classified.
The kinds of problems that we can solve with supervised learning are regression and classification problems.
Unsupervised learning is the opposite, since we have unlabeled data. Thus, we may infer that the algorithm will be looking for similarities in the data. That is, clustering problems.
Last but not least: reinforcement learning. The name may sound intuitive. It is about having the algorithm experience what it should do, and get rewards when it does something right.
A very interesting application of this are self-driving cars — yes, they’re also an example of AI 😅.
If this was the first article I read about Artificial Intelligence, I’d be confused at this point. What does Sofia mean by “training” the algorithm?
Don’t worry. In this section, we’ll go behind the scenes and learn what these algorithms are and how they are coded.
Code like a toddler
Machine Learning is like a toddler. Toddlers don’t know much about the world, let alone specific tasks. However, with the right training and enough practice, they can actually master the skill that they want.
An example of this is learning how to walk. We weren’t born learning how to do that. We had to practice, and we’d often fall, but now we’re experts 🙌.
Train an algorithm with data so it can work on it, improve, and be ready to make decisions
The process of building an ML model is simple at the surface level. Here are the most important steps to follow (with a short code sketch after the list):
- Define the problem
– What do you want the algorithm to do?
– What kind of ML problem is this?
– What data will you be working with?
– What are some important features of that data?
- Gather the data
– If you’re working on an algorithm to classify cats and dogs, you’ll want to get lots of images of those
– Due to the advancements made in the field, there are websites like Kaggle where you can find many different kinds of data sets
- Prepare the data
– This step is often referred to as “data cleaning” as well. It means getting rid of what we don’t need
– There will be times in which we have spreadsheets with commas, spaces, or characters that won’t let us use the data properly. That’s what we’ll clean
- Explore the data
– You need to know and understand the data yourself before starting to actually work with it
- Build the model
– Model: it uses the ML algorithm to create an output. In other words, it is what makes sense of the input data, and classifies, clusters, or predicts the output
– This is where the coding takes place
- Evaluate the model
– This is basically running the code and seeing what’s right and what’s wrong with it
– Based on your evaluation, you’ll need to do some parameter tuning (adjustments and corrections)
- Test the model
– At this point, the code (model) should be ready to be used
– If you were previously using labeled data to train the model, you can now use unlabeled data, since the algorithm has learned how to classify, cluster, or predict data correctly
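Here’s what those steps can look like in code. This is a minimal sketch using pandas and scikit-learn, where the file name (pets.csv) and the column names are hypothetical stand-ins for whatever data you actually gather:

```python
# Condensed version of the steps above, on a hypothetical labeled dataset.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Gather the data (hypothetical file, e.g. downloaded from Kaggle)
data = pd.read_csv("pets.csv")

# Prepare (clean) the data: get rid of what we don't need
data = data.dropna()                      # drop rows with missing values

# Explore the data before working with it
print(data.describe())

# Build the model, keeping some data aside for the final test
X = data[["weight", "ear_length"]]        # hypothetical feature columns
y = data["species"]                       # labels, e.g. "cat" / "dog"
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = DecisionTreeClassifier().fit(X_train, y_train)

# Evaluate the model; tune parameters and repeat if the score is poor
print(accuracy_score(y_test, model.predict(X_test)))

# Test the model: predict on new, unlabeled measurements
print(model.predict([[4.2, 6.5]]))
```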
Let’s do some math
I bet that few of your math teachers told you a truly awesome application of algebra and calculus. Well, in this section we will understand that the so-called algorithms that we’ve been talking about are based on mathematical expressions.
From my perspective, this means that AI is that combination between coding and math.
I’m gonna be honest: I haven’t completely understood the math behind AI. Some of it I learned at school, but there are words, like “convolution”, that I had never heard of.
Thus, I’ll only be going through the most important points of supervised learning algorithms. If everything goes right, I’ll write a more detailed article soon 🙂
- Linear regression: y is dependent on x. The model is y = β₀ + β₁x + ε, where y is the dependent variable, β₀ is the y-intercept, β₁ is the slope, x is the independent variable, and ε is the error (the gap between the points and the fitted line). There’s a short code sketch of this one right after the list.
- Logistic regression: for categorical outcomes. y still depends on x, but the output is always a value between 0 and 1, which can be read as a probability. This is modeled using a sigmoid function (an S-shaped curve).
- Decision trees: pretty much what it sounds like. You have an initial question (the root node) that is connected by branches to internal nodes, which can be either answers or further questions, until you get to the leaf nodes, which are the outcomes. You first select the best attribute of the data to be the root node, make each remaining attribute an internal node, and assign classification labels to the leaf nodes. If everything is classified, stop; otherwise, continue iterating.
- Random forest: a collection of decision trees, prevents overfitting the data and is more accurate.
- Naive Bayes classifier: classifies based on the probability of something happening again given the data it’s shown, using Bayes’ theorem. It’s “naive” because it treats the input variables as independent of one another.
- K-nearest neighbours (KNN): classifies a data point depending on the features of its neighbouring data points. Can be used for both classification and regression.
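As promised, here’s a minimal sketch of linear regression with NumPy. The “true” values β₀ = 2 and β₁ = 3 are made up, and we add random noise to play the role of the error ε:

```python
# Minimal linear regression sketch: fit y = b0 + b1*x + error to noisy data.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1, size=x.size)   # true b0 = 2, b1 = 3

# Least-squares estimates: polyfit returns [slope, intercept] for degree 1
b1, b0 = np.polyfit(x, y, deg=1)
print(b0, b1)              # should come out close to 2 and 3

# The fitted line can now forecast new values
print(b0 + b1 * 12.0)
```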
The last, yet the most exciting portion of this article will be dedicated to that specific area of Machine Learning called Deep Learning.
You see, traditional ML doesn’t handle high-dimensional data well, and while it can work with features in data, it can’t extract those features by itself. Other tasks, such as computer vision, would seem too hard if it wasn’t for DL.
In my experience, Deep Learning has actually been easier to understand because of the level of similarity that it holds with the human brain. In fact, this area of AI is all about something called neural networks.
Perceptrons
If you wanted to go way deeper into this concept, I definitely recommend checking out this free online book. The purpose of this article in particular (in a Nutshell) is to gain more breadth than depth of knowledge.
Perceptrons were developed in the 1950s and 1960s by the scientist Frank Rosenblatt. They came before neural networks, and in fact, neural networks are made up of a series of perceptrons.
A perceptron takes several binary inputs, and produces a single binary output. The neuron’s output is determined by whether the weighted sum of the inputs is less than or greater than a given threshold value.
In biology terms, we can say that our brains are made up of neurons that connect with each other. We sometimes have environmental inputs, such as the sound of someone saying “hello!” to us. If our neurons consider that the input is important enough, they will communicate with each other: they will be activated! ⚡
So we already have some new concepts:
– Input: whatever data we use to train our networks
– Weight: this is the value for “how important” each type of data will be
– Threshold: the minimum value that the weighted inputs have to reach in order to activate the neuron
– Output: the result of everything that happens inside each perceptron
– Weighted input: an input value times its corresponding weight value
In a simple example of a perceptron, we have the following values:
– Sum of the weighted inputs: 20
– Threshold: 15
– Output: 1
The explanation: the sum of the weighted inputs (20) is greater than the threshold (15), so the perceptron is activated and produces a positive response (1).
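Here’s that example as a tiny code sketch, with made-up inputs and weights chosen so that the weighted sum works out to 20:

```python
# A perceptron in a few lines: binary inputs, weights, and a threshold.
def perceptron(inputs, weights, threshold):
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0   # fires only past the threshold

inputs = [1, 1, 0]      # binary inputs
weights = [12, 8, 5]    # "how important" each input is
print(perceptron(inputs, weights, threshold=15))   # weighted sum = 20, so -> 1
```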
Neural networks
A neural network is made up of layers and layers of perceptrons. In a simple network, we would have three layers: the input layer, a middle layer, and the output layer. The first layer would weigh the input evidence, while the middle layer would make a decision.
Leveling up the concepts, we can write an equation for the decisions that each of our perceptrons is going to be making. The output will be 1 if w·x + b ≥ 0, and 0 otherwise, where x is the input, w is the weight, and b is the bias.
The bias is a measure of how easy it is for each perceptron to fire. It essentially replaces the threshold (b plays the role of a negative threshold) and connects the input to how easy it will be to get a 1 as an output.
Since a small change in the weights could result in a completely different output, we would need to adjust the weights until having the right model. This can get tedious, but Sigmoid functions can help us.
Sigmoid neurons are some of the most widely used type of neurons. They are modified so that small changes in their weights and bias cause only a small change in their output. That’s the crucial fact which will allow a network of sigmoid neurons to learn.
Just like perceptrons, Sigmoid neurons also take inputs. The difference is that these inputs can be values between 0 and 1 (they can have a decimal point).
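To see the difference in code, here’s a minimal sketch of a sigmoid neuron (the inputs, weights, and bias are made up):

```python
# Sigmoid neuron sketch: output changes smoothly instead of jumping 0 -> 1.
import math

def sigmoid_neuron(inputs, weights, bias):
    z = sum(i * w for i, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))          # sigma(z) = 1 / (1 + e^(-z))

print(sigmoid_neuron([0.5, 0.9], [1.0, -2.0], bias=0.5))   # ~0.31
# A small change in a weight now nudges the output only slightly:
print(sigmoid_neuron([0.5, 0.9], [1.1, -2.0], bias=0.5))   # ~0.32
```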
Don’t worry if this looks like too much math. The reality is that these days we don’t need to know all of it to build an algorithm that works. Later in the article, we’ll talk about programs that can help a ton.
Architecture of a neural network
If you’ve heard about Neural Networks before, chances are that you’ve also seen a diagram like the one below. We’ll now look at an example of how Neural Networks can recognize handwritten digits.
In this simple example, the first layer is called the input layer, the two middle layers are called the hidden layers, and the last layer is called the output layer.
In this case, the input would be photos of a handwritten number. This means pixels. So if we have a 28 by 28 pixel image, the input layer contains 28×28 = 784 neurons (each processing one pixel).
For simplicity’s sake, there are not that many neurons here.
If the input pixels are greyscale, each can have a value of 0 (white) or 1 (black). In-between values represent gradually darkening shades of grey.
The “thought-process” of this work would be something like:
- The first layer takes in the input
- The first neuron in the hidden layer will detect whether or not an image like the one on the left is present. To put it simply, we could say that the other neurons in the hidden layer are detecting whether or not the other parts of the number are present
- The first output neuron would try to decide whether or not the digit is a 0 by weighing up evidence from the hidden layer
That’s the reason why it’s better to have as many neurons as possible. If each of them processes a small part of the number, we can have high accuracy even if not all the neurons fire. So if our network is well trained, it will tolerate some variation.
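To make the shapes concrete, here’s a minimal sketch of one forward pass through such a network. I use a single hidden layer for brevity, and random untrained weights, so the “guess” is meaningless; the point is only to show how the layers connect:

```python
# One forward pass through a 784 -> 30 -> 10 network with random weights.
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.random(784)                  # a flattened 28x28 greyscale image

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hidden layer (30 neurons) and output layer (10 neurons, one per digit)
W1, b1 = rng.normal(size=(30, 784)), rng.normal(size=30)
W2, b2 = rng.normal(size=(10, 30)), rng.normal(size=10)

hidden = sigmoid(W1 @ pixels + b1)        # each hidden neuron sees every pixel
output = sigmoid(W2 @ hidden + b2)        # evidence for each digit, 0 through 9
print(output.argmax())                    # the (untrained) network's guess
```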
Types of neural networks
More classifications for Artificial Intelligence! We’ve gone through different types of problems, algorithms, and now it’s time to consider different types of neural networks.
If you do your research, you’ll find out that there are a lot more neural networks than the ones that I mention here. So, these are just the most common/popular ones.
Feedforward Neural Network
– Simplest type of artificial neural network
– Information only moves forward (just as we’ve seen so far)
Recurrent Neural Network (RNN)
– Works with sequential data or time series data
– They allow previous outputs to be used as inputs while having hidden states
– Useful for tasks like connected handwriting recognition or speech recognition
– This is what Siri and Google Translate are made up of! 🤭
Variational autoencoder (VAE)
– Applications include facial recognition, understanding the semantic meaning of words, speech generation, image colorization, creation of fictional faces, and high-resolution digital artwork.
– Its architecture consists of an encoder, a decoder, and a loss function
Convolutional Neural Network (CNN)
– Applications in image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain-computer interfaces, and financial time series
– Usually regularized versions of multilayer perceptrons: networks in which each neuron in one layer is connected to every neuron in the next layer
– They resemble the organization of the animal visual cortex
Generative Adversarial Network (GAN)
– Unsupervised learning
– It discovers and learns the regularities or patterns in input data to generate new examples
– We can have 2 models: the generator model (creates new examples), and the discriminator model (classifies examples as either real or fake, meaning generated)
– Applications for generating human faces, face aging (remember that app?), creation of art
The code
Although there are some tools to interact with Machine Learning without coding, the deeper you get, the better. Coding allows you to truly “personalize” which neural networks, algorithms, and problems you want to work with.
That said, you’ll find that the most recommended programming language for this is Python. Not only for ML, but for AI in general, as well as many other purposes. Python is simple to learn, and has everything we need.
To make things easier, we’ll be working with something called frameworks. These are (normally) Python libraries that allow us to build ML models without getting too deep into the algorithms (the math).
There are a lot of different frameworks one can use. Your choice should depend on the level of complexity that you want to handle, and what kind of project you’ll be mostly building. These are the most common frameworks:
- TensorFlow: focused on training and inference of deep neural networks. It allows developers to create dataflow graphs
- PyTorch: used for applications such as computer vision and natural language processing
- Keras: high-level. It’s built on top of TensorFlow, which makes it extremely user-friendly and easier than TensorFlow itself, great for fast prototyping (see the sketch after this list)
- Scikit-Learn: preferable for classic supervised and unsupervised learning algorithms
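To tie this back to the handwritten-digit example, here’s a minimal Keras sketch. It assumes TensorFlow is installed, and the layer sizes and epoch count are illustrative, not a recipe:

```python
# Minimal Keras sketch: the digit-recognition network from earlier sections.
import tensorflow as tf

# Gather and prepare the data: MNIST ships with Keras
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to 0-1

# Build the model: 28x28 input pixels, one hidden layer, 10 output neurons
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(30, activation="sigmoid"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train, then evaluate on images the model has never seen
model.fit(x_train, y_train, epochs=3)
model.evaluate(x_test, y_test)
```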
Finally: I started learning about AI when I discovered that it could be such a powerful tool for biotech. Again, I won’t go deep into the technicalities, but rather go through some of the most exciting applications of AI in biotech, and companies that are working on this.
Diagnostics
This is probably the most exploited application at the moment. Since we discovered that algorithms can diagnose diseases with an even higher degree of accuracy than medical doctors with decades of experience, we knew this was huge.
The first and most common example is using AI to diagnose skin cancer (melanoma). As we now know, the algorithm would be trained with lots of data at first, so it learns whether features like color and distinct boundaries can mean cancer.
Apps like SkinVision and MoleMapper already allow you to take photos of your moles and track them over time. Constant monitoring can be a great advantage here.
The same happens with radiology. To my surprise, though, RNNs can also be applied to analyze radiology reports, which are sequential data.
Of course, being images, one of the best ways to analyze them is with CNNs.
Proactive medicine
You’ve maybe heard this before. Medicine has always been more reactive than proactive, meaning that we try to treat diseases after they appear (late). AI is radically changing this by constantly monitoring data, and analyzing our DNA.
Predicting heart attacks?! Data mining processes large amounts of healthcare data, analyses it, and helps predict heart disease. Algorithms such as Naïve Bayes, decision trees, K-nearest neighbours, and random forests are used for this (Devansh Shah et al.).
On to precision medicine: we can definitely say that the future is knowing your DNA like the palm of your hand. ML can be used to help find patterns in your genome that can be expressed as a phenotypic (visible) change. This includes Single Nucleotide Polymorphisms (SNPs), mutations in regulatory sequences, and other areas.
Dry lab
So far, we’ve talked about scientists working with AI. But what if AI was the scientist? 😳
Researchers at Aberystwyth University in Wales and England’s University of Cambridge designed an AI system embodied as a robot that performs basic biology experiments with minimal human intervention. They called it Adam.
Adam could make hypotheses, test them with experiments, and interpret the results. It used its robotic arms to pass the samples from one place to another (Greenemeier, 2009).
Now, another huge problem is how long researchers have to study possible candidates for drugs. It’s a tough process, since they are trying to predict the unpredictable (until now): how proteins fold.
Although this may not sound as mind-blowing as Adam, it’s revolutionary. Researchers at DeepMind created AlphaFold: an algorithm that can predict how proteins fold with almost 90% accuracy. It’s going to completely change the way we discover new drugs.
Disruptive companies
I’ve said this before. If it’s in the industry, it’s something that people want, and it could be huge.
You just need to see their website to fall in love with their vision and mission. They’re doc.ai: “A Palo Alto-based artificial intelligence company for a new age of healthcare”.
What I find fascinating about them is that they’re going for breadth as much as depth by building a series of apps in different areas of healthcare:
Genewall lets you compare your genetic traits with your family and loved ones — without your data ever leaving your phone.
The doc.ai app makes it easy to collect and aggregate all your health data on your smartphone so you can get insights into your personal health.
Serenity is a mental health chatbot that provides you with expert, confidential, and judgment-free mental health counseling and support.
Diving deeper into genetics, 23andMe is a genomics and biotechnology company founded by Anne Wojcicki. It brings genomics mainstream through genetic testing kits: you receive a tube to spit into, send it back to them, they analyze your genetic information, and you receive your results in the app.
What are the results? Ancestry data, meaning what your genetic roots are, as well as the likelihood of getting a certain disease (being predisposed to it). I personally find the concept to be fascinating.
We’ve also talked about DeepMind before, but I really don’t think that people are emphasizing enough the importance of their progress. As I was discussing with a good friend called Mukundh, this doesn’t mean that clinical trials won’t take years, but it does mean that everything leading up to them will be ready much sooner.
This company has also done work in diagnosing kidney disease and breast cancer, saving 30% of the energy used in the normal process.
Down the road, I want to shout out Higia Technologies, a startup with a technology called EVA that diagnoses breast cancer in a non-invasive way, (you guessed it)… using AI!
Last, but definitely not least: Ginkgo Bioworks, the organism company. They are a platform that enables other companies to create biotechnological products. They know the power that experimental data can hold, and so they are storing it and analyzing it as well.
With this said, I really hope this article has been helpful for you to understand not only the surface-level tech behind Artificial Intelligence, but also the crucial role that it already has and will continue to have in the decades to come.
This article was aimed at those curious people who don’t necessarily have a background in either biotech or technology, so please let me know if you truly learned something new 🙂
Now that you’ve finished reading, what do you think: will this be humanity’s last invention?