And how you could too
“You can’t connect the dots looking forward; you can only connect them looking backwards.” Steve Jobs
I landed my first Data Science job without a degree in Data Science and without an internship in the field. In retrospect, I can identify what helped me. If you are trying to get your first Data Science job, this article may help you.
Of course, depending on your personal situation, your background and the country you live in, your path may be a bit different, but you can still use some of the advice here.
So let’s connect the dots.
This is probably the most complicated part of all, and the most important: why do you want to become a Data Scientist? I can’t answer for you, obviously, but you must have a clear answer in your head.
It is not the easiest job on the planet, nor the one with the highest salaries. If it is just to follow the hype, you should probably reconsider.
This is not just a philosophical question; a clear answer can really help you prepare if you’re willing to go for it, which takes us to the next point.
Try to read as much as you can about the day-to-day job of a Data Scientist. If you have a friend working in the field, ask them what they do every day. For me, it was one of my closest friends, who had worked for four years as a Data Scientist at one of the most famous data consultancy companies here in France. He was able to give me a really deep view of the work he does every day.
If you don’t have such a friend, you can always contact Data Scientists on LinkedIn. I found that most people are willing to help if you ask. Try to speak with people from the entire spectrum of data jobs (Data Scientists, NLP engineers, AI engineers, ML engineers, Data Engineers, Computer Vision engineers).
Ten years ago, Data Scientists were doing everything, from statistical analysis to NLP and computer vision. Now that the technology is more mature and complex, you can’t learn everything very well.
So my advice is to choose a specific branch of data and focus on it. It can be NLP, Computer Vision, Time Series, MLOps or something else. But avoid getting too scattered; otherwise you’ll end up with only high-level knowledge of everything, and you will not be able to sell it very well.
So choose a branch and explore it deeply. Understand what it is all about: the day-to-day tasks and the models used. Learn what you can about these models and try to get an intuition for them. The next point can help you with this.
Now that you have identified the subject you want to focus on, I suggest you start experimenting immediately. A lot of people will tell you that you must have a certification (from Coursera, for instance). Even though I believe these can be good for you, a certification is really just another degree, which is outside the scope of this article.
I am a strong supporter of the “learn by doing” philosophy, so I suggest you just start on small projects. The most obvious way to start is with Kaggle competitions: you’ll have the data and the required environment. Start with the “Playground” or “Getting Started” competitions, like Digit Recognizer for computer vision or Titanic for classification. This is what I personally did:
- Copy the best-rated notebook and try to understand what the author did.
- For each model the author used, read about it and watch YouTube videos on it: how the model works and when it is good to use, to get an intuition. I did the same for the metrics used.
- Make my own notebook from what I learned, starting with very simple models and metrics, then submit.
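To give an idea of what “very simple models and metrics” can look like, here is a minimal sketch in plain Python. It is not the notebook I used: the rows are made-up toy data in the spirit of the Titanic task, and the helper names (`fit_majority`, `fit_one_rule`, `accuracy`) are hypothetical. It compares a majority-class baseline with a one-feature rule using accuracy as the metric.

```python
from collections import Counter

# Hypothetical toy rows: (sex, passenger_class, survived).
# Made up for illustration; NOT taken from the real Kaggle dataset.
train = [
    ("female", 1, 1), ("female", 2, 1), ("female", 3, 0), ("female", 3, 1),
    ("male", 1, 0), ("male", 1, 1), ("male", 2, 0), ("male", 3, 0), ("male", 3, 0),
]
valid = [("female", 1, 1), ("female", 3, 1), ("male", 2, 0), ("male", 3, 1)]


def accuracy(y_true, y_pred):
    """Metric: fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_majority(rows):
    """Baseline model: always predict the most common training label."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def fit_one_rule(rows, feature_index):
    """One-feature model: most common label for each value of that feature."""
    labels_by_value = {}
    for row in rows:
        labels_by_value.setdefault(row[feature_index], []).append(row[-1])
    return {
        value: Counter(labels).most_common(1)[0][0]
        for value, labels in labels_by_value.items()
    }


majority = fit_majority(train)
rule = fit_one_rule(train, feature_index=0)  # feature 0 is "sex"

y_true = [row[-1] for row in valid]
baseline_pred = [majority for _ in valid]
# Fall back to the majority label for feature values never seen in training.
rule_pred = [rule.get(row[0], majority) for row in valid]

baseline_acc = accuracy(y_true, baseline_pred)
rule_acc = accuracy(y_true, rule_pred)
print(f"majority baseline accuracy: {baseline_acc:.2f}")
print(f"one-rule model accuracy:    {rule_acc:.2f}")
```

Starting from a baseline like this makes it easy to tell whether a more complex model from a top-rated notebook is actually buying you anything.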
Working on personal projects like Kaggle competitions will not only help you progress in Data Science; you can also add them to your portfolio when applying for a job.
When I was applying for Data Science jobs, I always highlighted the Kaggle competitions I had done and how I approached the problems.
This is the trickiest part. Whether in your resume or in an interview, you should not say that you don’t have any Data Science experience. Let me explain why.
We are too influenced by titles. If you can get past them, you’ll find that a Data Scientist is just a scientist working with data. Every machine learning model is, in the end, just a bunch of mathematical theories and laws, now implemented in a bunch of Python libraries.
Now if you look back at your past experience with this new eye, you’ll find that you did some data science in your life after all. Let me tell you how I was able to sell my own profile as a Data Scientist.
I have a Master’s degree in Computer Networking and a PhD in Telecommunications. These are the things I had mastered by the end of my PhD:
- Probability and some linear algebra: these are the mathematical basics of Data Science.
- Mathematical modeling, linear regression, Markov chains, mathematical simulation: these can be seen as Machine Learning models.
- Plots like line plots, scatter plots, bar charts and others, which are today’s data visualization techniques.
- Python / C++ / C programming: some of the most used languages in Data Science.
During the interview
So during my interviews, I could have said that I didn’t have any experience in Data Science because I came from a completely different field. Instead, I chose to highlight what I did during my PhD and Master’s that is close to what every Data Scientist should know nowadays.
Your resume
Formulate your experience using Data Science keywords. I personally used keywords like mathematical modeling, data visualization, data processing and pipeline building. Use keywords that appear in the job post. If they are asking for Python and sklearn, there is no need to put HTML everywhere. If they are looking for a person to build a product from scratch, highlight the personal projects or past experience where you were able to do it. You probably did it once.
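If you like to automate things, the keyword check can be sketched in a few lines of Python. This is only an illustration, not a tool I used: the vocabulary, the job post and the resume text below are all hypothetical, and `keywords_in` is a made-up helper that does a simple case-insensitive whole-word match.

```python
import re

# Hypothetical vocabulary of terms worth tracking; adapt it to your field.
VOCABULARY = [
    "python", "sklearn", "sql", "html",
    "data visualization", "mathematical modeling", "pipeline building",
]

# Hypothetical texts, made up for illustration.
job_post = (
    "Data Scientist wanted: Python, sklearn and SQL required. "
    "Experience with data visualization and pipeline building is a plus."
)
resume = (
    "PhD in Telecommunications. Mathematical modeling and data "
    "visualization in Python. Built websites in HTML."
)


def keywords_in(text, vocabulary):
    """Return the vocabulary terms found in text (case-insensitive, whole words)."""
    lowered = text.lower()
    found = set()
    for term in vocabulary:
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(term)
    return found


wanted = keywords_in(job_post, VOCABULARY)
offered = keywords_in(resume, VOCABULARY)

covered = sorted(wanted & offered)     # asked for and already in the resume
missing = sorted(wanted - offered)     # asked for but absent from the resume
off_topic = sorted(offered - wanted)   # in the resume but not asked for

print("covered:  ", covered)
print("missing:  ", missing)
print("off topic:", off_topic)
```

The “missing” list tells you which parts of your experience to reformulate or bring forward (only if you really have them), and the “off topic” list shows what you can trim for this particular application.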
I also customized my resume for each job application. I rarely used the same resume twice.
This is what we, as engineers and scientists, tend to forget easily. A company’s primary objective is to make money. It makes money by providing enough value to the client, who is willing to pay for part of the value received. Depending on the company, this value could come from selling gray matter (as in consulting) or a specific product.
A lot of companies recruit a Data Scientist because they think they need one, because they have a lot of data and want to get some value from it, or just to follow the current hype. Whatever their goal, you should clearly understand what their business is and how a Data Scientist can bring them more value than what they invest in paying you. These are some things you can do before or during an interview:
- Understand the overall business: is the company selling shoes, cars, books?
- Work out what kind of data the company might have: for an e-commerce company, for instance, you can imagine that the data consists of all the transactions on its websites, client visit history, products listed, …
- Ask what a Data Scientist would be doing there. Remember that, after all, the company just wants to increase its revenue. In our e-commerce example, a Data Scientist will probably build recommendation systems, analyze client interests (why people buy, and which products) and automatically classify new products into categories. If the company runs Google Ads campaigns, it probably wants to maximize their impact while minimizing its bids (this really came up in one of my interviews with a famous e-commerce company).
- Once you understand all of this, prepare questions to steer the interview onto this ground, so you can speak about what you understood of the business and how you could solve the specific problems you already identified. Of course, in most interviews you won’t need to do this, as the interviewer will ask these questions unprompted.
It cannot be said often enough: the best way to learn something is by doing it.
Again, I am a proponent of this philosophy. Before getting my first Data Scientist job, I applied to hundreds of jobs and got dozens of interviews at companies across different industries (Consulting, Software, Healthcare, Telecommunications, …). Of course, I was systematically rejected at the beginning. But I did exactly what a neural network does to learn: adjust after every error. Each time I got rejected, I used it to:
- Understand the business better and better.
- Note the questions asked and identify patterns in them.
- Most importantly, take note of my weak points in each interview and work on them for the next one.
You will probably learn more from ten interviews than from reading a thousand books.
To summarize, these are the steps to follow to land a first Data Science job:
- Be sure to know why you want to be a Data Scientist, and that you really want it.
- Understand what a Data Scientist’s job is. Read about it, and ask people for details about their day-to-day work.
- Accept that you can’t master everything. Choose a field (Computer Vision, NLP, …) and try to learn as much as you can about it.
- Learn by practicing, with personal projects or Kaggle competitions. It will also be a good point in your portfolio to show that you know what you’re talking about.
- Forget about the titles and focus on functions: look through your experience and find where you did things related to “Data”. Highlight those in your resume, and show how these experiences help you as a Data Scientist.
- For each application, understand what the company’s needs are and why they need a Data Scientist. Prove that you understand the business and show how you can bring value to the company as a Data Scientist.
- Finally, just apply. You’ll learn more from applying and drawing your own conclusions than from this article.
I would like to stress that these are conclusions from my personal experience. This is what “I” did to land my first Data Scientist job. Each career path is different, but you can probably use some of this advice to succeed in your next job applications.