- Introduction
- Data Scientist
- Computer Scientist
- Similarities and Differences
- Summary
- References
Data Science and Computer Science often go hand in hand, but what really makes them different? What do they have in common? After working in several Data Science roles at various companies, I have noticed general themes in the Data Science process, along with how Computer Science is incorporated into that process. It is important to note the differences between these two positions, as well as when one requires the other. Usually, a Data Scientist will benefit from learning Computer Science first, and then specializing in Machine Learning algorithms. However, some Data Scientists start with statistics before learning how to code, focusing on the theory of Data Science and Machine Learning algorithms. That was my approach, with Computer Science and programming coming afterward. That being said, does a Data Scientist need to know Computer Science? The short answer is yes. While Computer Science can encompass Data Science, especially in artificial intelligence, I believe the main theme of Computer Science is software engineering. Keep reading if you would like to learn more about the differences between these two roles, as well as their similarities. I will also dive deeper into the focus of each position, including common tools, skills, languages, steps, and concepts.
So what does a Data Scientist actually do? We hear the buzzwords often in the tech industry, but are those actually the keywords we employ in our everyday work? The answer is yes and no. There are certainly a handful of core tools and languages that I, at least, use daily. As a Data Scientist, I am required to explore the data of the company while also connecting how data affects a product. Ultimately, a Data Scientist will be encouraged to study current data, find new data, and solve business and product issues, all with the use of Machine Learning algorithms (e.g., Random Forest). Some of the same problems can be solved by Computer Scientists as well. But as far as the job title goes, it is essential to have someone who is solely focused on Machine Learning algorithms as the way to make an otherwise manual process not only more efficient, but also more accurate.
Here are some of the steps of the Data Science process that a Data Scientist can expect to employ:
- exploring current data, as well as finding new data
- using SQL to query and understand the company data
- using Python or R to explore data in a dataframe (or something similar)
- performing exploratory data analysis (using libraries like pandas_profiling)
- isolating the business question and possible impact a model should have for success
- searching and running base Machine Learning algorithms to compare against the null or current process
- optimizing the final or ensemble of algorithms for the best results
- displaying results with some type of visualization (e.g., seaborn, Tableau)
- working alongside a Computer Scientist, perhaps, or an MLOps Engineer, to deploy and predict with your final model in the company ecosystem
- finally, summarizing your improvements
As you can see, this process can sometimes be shared with other roles, such as an Artificial Intelligence Engineer, Data Engineer, Computer Scientist, MLOps Engineer, Software Engineer, and so on. What makes the Data Scientist role unique is the focus on Machine Learning theory and its effect on a business problem.
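As a rough illustration of a few steps from the list above, here is a minimal sketch in Python that uses only the standard library. The `orders` table, its columns, and the spend threshold are all hypothetical, made up for this example, and a simple threshold rule stands in for a real Machine Learning algorithm such as Random Forest. The point is only the shape of the workflow: query with SQL, summarize the data, then compare a candidate against the null baseline.

```python
import sqlite3
import statistics

# Hypothetical in-memory table standing in for "company data".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL, churned INTEGER)"
)
rows = [
    (1, 20.0, 1), (2, 25.0, 1), (3, 40.0, 1), (4, 35.0, 0),
    (5, 80.0, 0), (6, 90.0, 0), (7, 95.0, 0), (8, 110.0, 0),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Step: use SQL to query and understand the data.
data = conn.execute(
    "SELECT amount, churned FROM orders ORDER BY customer_id"
).fetchall()
conn.close()

amounts = [amount for amount, _ in data]
labels = [churned for _, churned in data]

# Step: a very small exploratory summary.
summary = {
    "n": len(data),
    "mean_amount": statistics.mean(amounts),
    "churn_rate": statistics.mean(labels),
}


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# Null baseline: always predict the most common class.
majority_class = statistics.mode(labels)
null_predictions = [majority_class] * len(labels)

# Candidate (a stand-in for a real model): flag churn when spend is low.
# The threshold is an assumption chosen by eye for this toy data.
THRESHOLD = 32.0
rule_predictions = [1 if amount < THRESHOLD else 0 for amount in amounts]

null_acc = accuracy(null_predictions, labels)
rule_acc = accuracy(rule_predictions, labels)

print(summary)
print(f"null baseline accuracy: {null_acc:.3f}")
print(f"candidate rule accuracy: {rule_acc:.3f}")
```

On this toy data the candidate rule scores higher than the baseline that always predicts the majority class. In a real project you would evaluate on held-out data rather than the rows used to pick the threshold, and you would swap the rule for an actual model.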
Here are some of the tools that a Data Scientist can expect to employ:
- SQL
- R, SAS
- Python
- Tableau
- Jupyter Notebook
- PySpark
- Docker
- Kubernetes
- Airflow
- AWS/Google
While the Data Science process is more ‘set in stone’, much like the scientific method, the tools that a Data Scientist uses are more open to interpretation. That being said, I would say most Data Scientists focus on using SQL, Python, and a Jupyter Notebook (or something similar), because these tools and languages can be applied to any business. However, some companies will have preferences or requirements that lead you to use Google Data Studio over Tableau, for example. Next, we will talk specifically about the Computer Scientist role.
While the field of Computer Science is more popular than the job title of Computer Scientist, there are still roles out there that carry this title. That being said, Computer Science in the workforce tends to lean towards Software Engineering specifically. Other roles that could fall under Computer Scientist include, but are not limited to: Database Administrator, Hardware Engineer, Systems Analyst, Network Architect, Web Developer, and a plethora of IT roles. This variety makes a Computer Scientist a little more difficult to define, much like how Data Science can include Machine Learning operations, Data Engineering, Data Analytics, and so on. Ultimately, it will be up to you and the company you work for to define your role in Computer Science. A job description, of course, is an easy way to find out what the specific sub-role is like.
Here are some of the steps of the Computer Science process a Computer Scientist can expect to employ:
- understanding the business, data, products, and of course, software
- defining the requirements for a specific problem
- understanding and designing the system and software
- implementing the process, as well as unit testing
- determining how the software will be integrated and how it affects the system
- and lastly, handling operations and maintenance
Although this process is not exactly like that of a specialized Data Scientist, it does share some of the broader aspects of a technical process, including, but not limited to, understanding the software and data, implementing an improvement, and then analyzing and reporting on its effect.
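To make the "define the requirements, implement, and unit test" steps above a little more concrete, here is a minimal sketch in Python using the standard `unittest` module. The requirement itself (clean up an email address) and the function name `normalize_email` are hypothetical, chosen only to keep the example small; the validation is deliberately simplistic.

```python
import unittest


def normalize_email(raw: str) -> str:
    """Hypothetical requirement: trim whitespace, lowercase the address,
    and reject input that does not contain exactly one '@'."""
    cleaned = raw.strip().lower()
    if cleaned.count("@") != 1:
        raise ValueError(f"invalid email address: {raw!r}")
    return cleaned


class NormalizeEmailTests(unittest.TestCase):
    """Each test pins down one part of the stated requirement."""

    def test_trims_and_lowercases(self):
        self.assertEqual(
            normalize_email("  Ada@Example.COM "), "ada@example.com"
        )

    def test_rejects_missing_at_sign(self):
        with self.assertRaises(ValueError):
            normalize_email("not-an-email")


def run_tests() -> unittest.TestResult:
    """Load and run the test case, returning the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        NormalizeEmailTests
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Writing the tests next to the implementation keeps the requirement and the code in step, which helps later during integration and maintenance.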
Here are some of the tools and languages of Computer Science a Computer Scientist can expect to employ:
- IDEs
- testing software
- Python, and other Object-Oriented Programming languages
- Slack
- Amazon
- Notes
- Atom
- Visual Studio
- Microsoft Azure
- GitHub
- Atlassian
There are countless tools and languages that a Computer Scientist can expect to employ. Once again, it really depends on what you are focusing on: is it Software Engineering, Network Analysis, or IT? Hopefully, there is a role out there for you that you are not only qualified for, but also prefer to work in. Next, I will dive deeper into the similarities and differences between the Data Scientist and Computer Scientist positions.
Now that we have discussed the main themes and expectations of these two roles, Data Scientist and Computer Scientist, we will highlight both the similarities and differences between them. Of course, there are more points to be discussed, but these are the ones that stand out most from my experience.
Here are the similarities that you can expect between the two roles:
- both require an understanding of the business and its products
- both require working knowledge of the data at the company
- both roles usually require fluency with Git or GitHub
- both overall follow a systematic approach to the scientific process
- both are expected to be leaders in technology
- both usually are proficient in at least one programming language
- both can start in one role and later switch to the other
- both are cross-functional
The similarities between the roles highlight the field of technology that these roles are within.
Here are the differences that you can expect between the two roles:
- Data Scientists focus more on Machine Learning algorithms
- Computer Scientists focus more on software design
- the Computer Scientist role is more encompassing, with more variety
- education usually differs between the two: a Computer Science degree versus a Data Science degree
- Data Scientists have a background in statistics
- Computer Scientists have a background in Computer Engineering
- Computer Scientists are more focused on automation and object-oriented programming
- Data Scientists often work with Product Managers or other business-facing roles more
Because both roles encompass many sub-roles, they can differ vastly from one another at one company, and overlap surprisingly at another.
As you can see, these positions require different skills, tools, and languages; however, they also share some of them. The main focus of a Data Scientist is solving business problems using Machine Learning algorithms, while the main focus of a Computer Scientist goes in one of two directions: object-oriented programming and Software Engineering, or IT, which requires a general working knowledge of everything about a computer.
Overall, here are both roles summarized:
- Data Scientists: business analysis, research, data, statistics, and Machine Learning algorithms
- Computer Scientists: programming, Software Engineering, productionalization, DevOps, automation, IT, Networking, Database Administration, Hardware, Systems Analytics, and Web Development
I hope you found this article both interesting and useful. Keep in mind that this article is based on my opinion and personal experiences with both roles. Whether you agree or disagree, feel free to comment down below with your reasons and the specific things you would add. Do you like being a Data Scientist more, or a Computer Scientist more? Do you think they should be consolidated into one role? Is there really a difference? As a Computer Scientist, what is your focus? Is it IT or something like Networking? It would be interesting to hear from others so that everyone can learn and arrive at the best representation of the similarities and differences between Data Science and Computer Science. Thank you for reading, and feel free to contact me if you have any questions about this or any of my other articles.
If you would like to learn more about Data Science, feel free to check out my profile, as well as my other, similar article on Data Scientist vs Machine Learning Ops Engineer [5]. It highlights the differences and similarities between Data Science and MLOps, which share plenty of tools and experiences while also differing.
Thank you!
[1] Photo by Oğuzhan Akdoğan on Unsplash, (2019)
[2] Photo by Markus Spiske on Unsplash, (2018)
[3] Photo by Kari Shea on Unsplash, (2017)
[4] Photo by Eric Prouzet on Unsplash, (2020)
[5] M. Przybyla, Data Scientist vs Machine Learning Ops Engineer. Here’s the Difference., (2021)