A detailed plan based on experience
This is a chapter from my upcoming book, Meta-Learning: Powerful Mental Models for Deep Learning and Thriving in the Digital Age. It is currently available for pre-order and you can pick it up at 50% off.
Before you can learn to do deep learning, you must become a developer. Being a developer is not only about programming. In the larger scheme of things, being able to write code is just a small part of what a developer can do.
If I were to start all over, I would optimize this part of the journey for fun. I would want to get into the tinkering mindset and start feeling comfortable messing around on my computer. There is no better way to achieve this than by working on something you truly enjoy.
I would take it easy. When it comes to learning, consistency beats intensity every time. When you are just starting something, there is no better way to stay consistent than to not exert yourself too much. I would still make quick progress, because this time around, I would know what to focus on. This is what the beginning of this chapter will cover.
How to get started
My first stop would be to complete one of the programming MOOCs. They don’t teach you a lot about being a developer, but they do teach you some things about writing code. I would probably go for Harvard CS50 on edX. And, as I want to have fun, I might choose the game programming track.
Harvard CS50 has one great thing going for it: on top of teaching you how to program, it also teaches you about the computer and how it works! Two birds with one stone. Still, regardless of which MOOC or programming book you choose, it is hard to go wrong. Programming has been taught for many generations, and many good teachers are out there. The biggest mistake you can make at this stage is to do more than a single beginner programming course. Striving to learn any particular programming language really well would also be a mistake.
It doesn’t make sense to invest all of your time into learning calligraphy if you have nothing to write about! Likewise, it doesn’t make a lot of sense to work on building a table but focus all your effort on a single leg and then complain that the table will not stand!
All we need at this stage is some familiarity with programming concepts — enough so that we can perform meaningful online searches. The programming skills we need are understanding the value of Stack Overflow, documentation, and how to reach both. This might sound tongue in cheek, but these skills are genuinely at the core of a programming career. Add knowing a few basic things about programming, and you have a very firm platform on which to build.
How to move around in code is an essential skill
The next leg of the table we are assembling is our ability to move effectively in code. This is not something that most programming courses cover, but it is an essential skill to have.
This boils down to learning to use a code editor really well. Any major code editor will do. Some people swear by TextMate, Sublime Text, Atom, VS Code, Emacs, or Vim. It doesn’t matter which one you choose. Give learning it the same attention you would give to writing code and you will be alright. If you are absolutely lost as to which one to pick, choose VS Code. Do not worry about the choice too much; it is not one you must stick with forever. I do not know of a single developer who has not tried multiple editors in the course of their career. However, I also don’t know a single developer worth their salt who doesn’t believe that learning to use an editor is an important skill.
Version control can give you superpowers
To work on our next leg, we move to version control. Code is written in small iterations, often by more than one person. Additionally, code is so expensive to produce that we want to make sure we don’t lose our work and that we can explore new directions without destroying the work we have done up to now. All this is very hard to achieve without specialized tools. Thankfully, we have a software solution that addresses all the needs listed above and then some!
The tool goes by the name of git. It is a small command line program that can change any directory into a git repository. In the process, it endows the directory with superpowers, such as an ability to travel back in time or create alternate timelines!
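To make the "time travel" and "alternate timelines" ideas a little more concrete, here is a minimal sketch that drives git from Python. It assumes git is installed and on your PATH; the file name, commit messages, branch name, and the demo identity passed with `-c` are all made up for illustration. In day-to-day work you would type the same git commands directly into a terminal.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return its trimmed output."""
    cmd = [
        "git",
        # Throwaway identity so commits work on a machine with no git config.
        "-c", "user.name=Demo",
        "-c", "user.email=demo@example.com",
        "-c", "commit.gpgsign=false",
        *args,
    ]
    done = subprocess.run(cmd, cwd=repo, check=True, capture_output=True, text=True)
    return done.stdout.strip()


def demo() -> dict:
    """Create a scratch repository, then look into its past and branch off."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        notes = repo / "notes.txt"

        # Turn an ordinary directory into a git repository.
        git(repo, "init", "-q")

        # Record two snapshots (commits) of the same file.
        notes.write_text("first draft\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", "-m", "First draft")

        notes.write_text("second draft\n")
        git(repo, "add", "notes.txt")
        git(repo, "commit", "-q", "-m", "Second draft")

        # Time travel: read the file as it was one commit ago.
        previous = git(repo, "show", "HEAD~1:notes.txt")

        # Alternate timeline: start a branch to experiment on.
        git(repo, "checkout", "-q", "-b", "experiment")
        branch = git(repo, "rev-parse", "--abbrev-ref", "HEAD")

        return {"previous": previous, "branch": branch}


# Only run the demo when git is actually available on this machine.
if shutil.which("git"):
    print(demo())
```

The older version of the file stays retrievable even after it has been overwritten, and work on the `experiment` branch leaves the original line of history untouched.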
Git has a popular online counterpart called GitHub, a separate hosting service built around git repositories. While git excels at managing the source code on your computer, GitHub is great for code sharing and collaborating with others. Pushing your code to GitHub also provides you with a backup. If some accident should befall your PC, your work will be preserved at another location.
Git is a marvel of software engineering but, unfortunately, the interface it exposes is not very human-friendly. Thankfully, GitHub created its own command line tool, the GitHub CLI (gh). It has a much gentler learning curve than git and covers common GitHub tasks such as cloning repositories and opening pull requests.
Being able to use the computer is where it’s at
The last leg of our four-legged table is probably the hardest to construct. The challenge lies solely in the lack of good learning resources. This part deals with learning how to use the computer in the context of writing code. For deep learning, this goes beyond your local setup. How do you spin up a cloud VM? How do you ssh into it? How do you move data around? How do you build your own home rig for experimentation? The good news is that you do not have to learn all of this at once. A great starting point might be to learn how to navigate the file system on the operating system of your choice. As you get more comfortable using the command line, you can gradually add more commands to your arsenal.
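As a small illustration of what navigating the file system involves, the sketch below mirrors a few common shell commands (`mkdir`, `ls`, `cd ..`, `cat`) using Python's standard `pathlib` module. The directory and file names are invented for the example, and everything happens inside a temporary directory so nothing on your machine is changed.

```python
import tempfile
from pathlib import Path


def explore(root: Path) -> dict:
    """Walk through a few basic file system operations under `root`."""
    # Like `mkdir -p projects/demo`: create nested directories in one go.
    project = root / "projects" / "demo"
    project.mkdir(parents=True, exist_ok=True)

    # Create a file and write one line into it.
    notes = project / "notes.txt"
    notes.write_text("hello\n")

    return {
        # Like `ls`: names of the entries directly inside the directory.
        "listing": sorted(p.name for p in project.iterdir()),
        # Like `cd ..`: the directory one level up.
        "parent": project.parent.name,
        # Like `cat notes.txt`: the contents of the file.
        "contents": notes.read_text(),
    }


with tempfile.TemporaryDirectory() as tmp:
    print(explore(Path(tmp)))
```

The shell equivalents are shorter to type, but the concepts (paths, parents, listings, file contents) are the same ones you will use on a local machine or a cloud VM.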
The one resource that stands out for learning how to use a computer is The Missing Semester of Your CS Education by MIT.
Maybe a day will come when a single course will cover all the basics of being a developer. For now, unfortunately, we must piece together our own curriculum. The good news is that it is enough to get just a taste of each of the four disciplines I outline above; there is no need to become an expert at any of them.
How to become a machine learning expert
All we have been doing thus far is preparing ourselves for taking the “Practical Deep Learning for Coders” course by fast.ai. If you have a rudimentary understanding of programming and can move around on your computer, the course can take you from there.
And why a fast.ai course and not something else? No other resource out there will teach you everything you need to know to do machine learning at a high level. Many good offerings focus on a particular set of tools or some aspect of the machine learning pipeline, but no other complete package exists. This is the only course I am aware of that will teach you the end-to-end process of arriving at a good solution to a machine learning problem.
In fact, for many, the course goes beyond showing one how to be a good deep learning practitioner. Through taking the fast.ai courses, I learned how to learn. I also started to share my work and became an active member of the deep learning community. The courses not only taught me nearly everything practical that I know about machine learning but also showed me how to thrive in the digital age.
If you give the lectures a proper go and become an active member of the fast.ai forums or the fast.ai Discord, you will be alright. Nonetheless, the courses propose a way of learning that might be new to many. While the course explains the approach, a written summary from my perspective, with the entire progression explained in one place, might still be helpful. This is what we will be looking at in the remainder of this chapter.
An effective way to learn
It all begins with watching a lecture. The next step is to open the lecture notebook and figure out how all the pieces fit together. The idea is to run each line of code and look at the outputs. If you come across a function you have not previously encountered, this is a good time to read its documentation. How will the performance differ if you pass in different arguments or slightly tweak the hyperparameters?
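Here is one way this kind of poking around might look in a notebook cell. The sketch uses `json.dumps` from the Python standard library purely as a stand-in for whatever unfamiliar function you meet in a lecture notebook; the `record` dictionary is made up. In Jupyter you can also append `?` to a name, or call `help()` on it, to see the same documentation.

```python
import inspect
import json

# Signature: which arguments does the function accept, and with what defaults?
signature = inspect.signature(json.dumps)
print(signature)

# Docstring: the first line is usually a one-sentence summary.
doc = inspect.getdoc(json.dumps) or ""
print(doc.splitlines()[0] if doc else "(no docstring available)")

# Now tweak one argument and compare the outputs side by side.
record = {"b": 1, "a": 2}
default_output = json.dumps(record)
sorted_output = json.dumps(record, sort_keys=True)
print(default_output)  # {"b": 1, "a": 2}
print(sorted_output)   # {"a": 2, "b": 1}
```

The same loop of reading the signature, changing one argument, and comparing results carries over directly to model and training functions, where the thing you compare is a metric instead of a string.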
Once you understand the bigger picture, it’s time to reproduce the results. To do so, open a new notebook and try to recreate the training pipeline that the lecture demonstrated. It’s an open-book exercise — it’s okay if you need to look back at the lecture notebook, but the less you do so, the better. On the other hand, there are no limitations on reading the documentation or searching online. Both of these activities are encouraged and they closely resemble what programming feels like when it’s done completely on your own.
Once everything is up and running, and you have obtained a result similar to that in the lecture, it is time to go a step further. This time, we search for a similar dataset and test drive the technique we just learned about. Your imagination is the only limit here. You could seek out a dataset online or you could put one together yourself. Alternatively, the fast.ai library provides access to many datasets used for research. You can download them to your machine with a single command.
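As a rough sketch of that last point, the function below fetches one of the small datasets bundled with the fastai library. It assumes fastai (version 2) is installed and that you have a network connection; `download_sample_dataset` is a hypothetical helper name, and `URLs.MNIST_SAMPLE` is just one of the available datasets, picked here because it is small.

```python
from pathlib import Path


def download_sample_dataset() -> Path:
    """Download and extract a small fast.ai dataset, returning its local path."""
    # Imported inside the function so this file still loads where fastai
    # is not installed; calling the function does require fastai.
    from fastai.data.external import URLs, untar_data

    # untar_data downloads the archive if it is not already cached locally,
    # extracts it, and returns the path of the extracted directory.
    return untar_data(URLs.MNIST_SAMPLE)
```

Calling `download_sample_dataset()` in a notebook and listing the returned directory is a quick way to see how a dataset is laid out before you build a training pipeline around it.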
The idea is to systematically grow your skills. You start in a controlled setting of the lecture notebook, learning the ins and outs of the technique that the lecture discussed. With every new exercise, more of the training wheels come off, to the point where you’re doing everything on your own from start to finish.
This approach to learning works like a charm. There are no hard and fast rules here and you are more than welcome to modify it as you go. Being creative and coming up with exercises that are challenging and interesting to you is a very valuable skill in its own right; acquiring it will take your learning to the next level. The goal is always the same: to arrive at a place where you can comfortably apply the techniques you learn in the lectures to problems you haven’t previously seen.
While practice is the only way to improve your ability to solve real-life machine learning problems, moving toward this way of learning does not require you to revolutionize your life. A couple of minutes a day, over multiple days, is all you need to get started. With every consecutive day of learning, you build momentum. Soon you start to notice the results, and the ability to do things you thought were out of reach not too long ago creates a lot of excitement. Very soon, you will find yourself losing track of time as you explore this or that technique.
Where to go next
In the broader picture, going through lecture materials is a starting point. It gives us a platform upon which we can build. Even before you finish the course, it is a good idea to start coming up with and working on mini-projects of your own. If you are low on inspiration, you can check out this amazing forum thread with projects that fast.ai students worked on during one of the courses. There, you will find an assortment of ideas: blog posts, howtos, forum posts, GitHub repositories, explainer videos, and Kaggle competitions.
Summary
The way of learning I describe above was completely new to me. However, after experiencing what it can do, I have never ceased being a fast.ai student. Any other approach feels like a waste of time.
It wasn’t always like that. Initially, I thought that the fast.ai way of learning could not work. The exercises proposed in the lectures didn’t feel like learning to me. How could I make any progress spending so much time doing? To me, learning was this process in which you sat down with a book for an hour or two and came out on the other end being able to discuss something highly complex and abstract.
I was very lucky: when I found fast.ai, I was at the end of my rope. Nothing was working, so I was ready to try something completely new.
Stumbling across fast.ai turned out to be the adventure of my life.
Thank you very much for reading! If you enjoyed this and would like to learn more, connect with me on Twitter!