If you’re thinking about which programming language to learn, something that is widely used should be your starting point. There’s no escaping this, and it’s a good thing. Python is used by everyone, data scientists and non-data scientists alike.
If you look at Google or Stack Overflow trends, Python is super hot. More jobs also require Python coders than R coders.
If you want to be in with the crowd, Python is a good place to start.
If you’re ever stuck on a problem or you can’t figure something out, the online community of coders has to be one of the nicest places out there. Knowledge sharing in the coding community is altruistic and frequent, with many blogging and coding websites specifically targeting coders who are learning or are struggling to complete a task.
This community makes it so much easier to tackle harder problems, and a bigger community lets coders solve their problems more quickly. I’ve relied heavily on the community and you will too.
When you make a piece of software or a nifty little tool, what do you do with it then? Do you want to keep it locally to look at or do you want to deploy it to fix the problem?
As a coder you should always be looking for problems and moreover, you should always be thinking about solutions. Is my code too slow? Is my code too clunky? Are my data tasks too bulky?
Regardless of what it is, you should be able to come up with a solution and integrate it easily. As Python can essentially do the whole vertical when it comes to coding, if you’ve made some statistical code, it’s easier to integrate it with Python than it is with R.
You have to wear multiple hats when you’re a data scientist. You have to be able to manage large data sets and you have to be able to clean them. Then you have to be able to analyse them and deduce something important. After that, you have to be able to integrate your findings into the business to improve an objective (sales, efficiency, etc.).
Being able to do all of this in R just isn’t possible.
With a larger online community meaning more resources to learn from, Python has one other advantage over R. On Udacity, they actually say that R is easier to pick up if you’ve learned something like C++ or Java before.
Whilst I’m not fully convinced by this logic, I can say for sure that R is less obvious in its coding syntax. Having more commands and more libraries is part of the problem, but so is having operators like <- and if statements like:
if (test_expression) {
statement
}
It definitely isn’t as easy as Python.
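For comparison, here is a minimal sketch of the same kind of assignment and if statement in Python. The variable names and values are hypothetical and only there to make the snippet runnable; the point is the shape of the syntax, not the logic.

```python
# Hypothetical values, chosen only for illustration.
threshold = 10  # Python assigns with "=", where R code typically uses "<-"
value = 12

# No parentheses are needed around the condition and no braces around the body:
# the colon and the indentation mark the block.
if value > threshold:
    result = "above threshold"
else:
    result = "at or below threshold"

print(result)
```

Whether this reads as easier is partly a matter of taste, but there is less punctuation to keep track of.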
Python has a few key libraries that really do the bulk of the work:
Now these four libraries are open-source, but they’ve been developed by a solid group of volunteers. They’ve been developed in a professional way, so that all the documentation is complete, most functions come with tests (to ensure functionality) and the documentation links to the academic references that the methods follow.
On the other hand, R packages are generally made by academics and specialists who have little time and therefore may not be able to produce documentation of as high a quality as we’d like.
You can read the following discussion on Reddit for more information:
Or you can look at common R libraries and see what you think.
Quick answer => Python!
But in reality, you shouldn’t use either of these languages for speed.
Empirical data is often the most informative. Looking at indeed.com as of right now (04/02/2020), there are 4,100 Python Developer jobs and only 291 jobs under the category of R Developer.
That means there are roughly 14x more Python jobs going than R jobs on this one website.
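As a quick check of the arithmetic, dividing the two counts quoted above gives a ratio of roughly 14. The snippet below is just that division, using the Indeed figures from the text; the variable names are my own.

```python
# Job counts quoted in the text for indeed.com on 04/02/2020.
python_jobs = 4100
r_jobs = 291

# How many Python Developer listings there are per R Developer listing.
ratio = python_jobs / r_jobs

print(f"Python listings per R listing: {ratio:.1f}")
```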
In light of that, a statistician would tell you that you’re also more likely to get a higher paid job, though you could argue R is more specialised and so commands a higher salary. According to Business Insider, R has a slightly higher average salary.
From my personal experience though, every company I’ve worked with has generally opted for a Python-based infrastructure, and if they haven’t got it yet, they’re working towards it.