Decide Which Programming Language Is Better for Your Application

There are many different programming languages for various applications, such as data science, machine learning, signal processing, numerical optimization, and web development. Therefore, it is essential to know how to decide which programming language is more suitable for your application.

In this article, I discuss the advantages and disadvantages of Python, R, and Matlab, and explain when and for which applications each language is most suitable. The outline follows areas that are popular in research and widely used in real-world work.

Following is the outline of this article:

- Generic Programming Tasks
- Machine Learning
- Graphical and Probabilistic Modeling
- Causal Inference
- Time-Series Analysis
- Signal Processing and Digital Communication
- Control and Dynamical Systems
- Optimization and Numerical Analysis
- Web Development
- Pros and Cons of Each Language
- Conclusion

Generic programming tasks are problems that are not specific to any application: for example, reading and saving data to a file, preprocessing CSV or text files, writing scripts or functions for basic problems such as counting the number of occurrences of an event, plotting data, and performing basic statistical tasks such as computing the mean, median, and standard deviation.
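As a rough illustration of such tasks, here is a minimal Python sketch that parses CSV text, counts event occurrences, and computes basic statistics. It uses only the standard library; the `SAMPLE_CSV` data and the `summarize` helper are made up for this example.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical sample data standing in for a CSV file on disk.
SAMPLE_CSV = """event,duration
login,1.5
click,0.2
login,2.5
logout,0.8
click,0.4
"""


def summarize(csv_text):
    """Count events and compute basic statistics for the duration column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row["event"] for row in rows)
    durations = [float(row["duration"]) for row in rows]
    return {
        "counts": dict(counts),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "stdev": statistics.stdev(durations),  # sample standard deviation
    }


summary = summarize(SAMPLE_CSV)
print(summary["counts"])
print(summary["mean"], summary["median"], summary["stdev"])
```

To read from a real file, `io.StringIO(csv_text)` would be replaced by an opened file object.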

For these problems, any of Python, R, or Matlab can be used without difficulty. Python and Matlab are broadly comparable in speed, and both are generally faster than R.

Matlab has better built-in visualization than R or Python and provides native support for matrix and vector manipulation. On the other hand, Python has better support for reading, saving, and manipulating CSV and text data, thanks to the pandas library.

Machine learning is the area where Python and R have a clear advantage over Matlab. They both have access to numerous libraries and packages for both classical models (random forest, regression, SVM, etc.) and modern models (deep learning and neural networks such as CNNs and RNNs). However, Python is the most widely used language for modern machine learning research in industry and academia. It is the number one language for natural language processing (NLP), computer vision (CV), and reinforcement learning, thanks to many available packages such as NLTK, OpenCV, and OpenAI Gym.

Python is also the number one language for most research or work involving neural networks and deep learning, thanks to many available libraries and platforms such as TensorFlow, PyTorch, and Keras.

Probabilistic graphical models (PGMs) are a class of models for inference and learning on graphs. They are divided into undirected graphical models, sometimes referred to as Markov random fields, and directed graphical models, or Bayesian networks.

Python, R, and Matlab all support PGMs. However, Python and R outperform Matlab in this area. Matlab supports static and dynamic Bayesian networks thanks to BNT (Bayes Net Toolbox) by Kevin Murphy. Matlab's Statistics and Machine Learning Toolbox (hmmtrain) supports discrete hidden Markov models (HMMs), a well-known class of dynamic Bayesian networks. Matlab also supports conditional random fields (CRFs) through crfChain (by Mark Schmidt and Kevin Swersky) and UGM (by Mark Schmidt).

Python has excellent support for PGMs thanks to hmmlearn (full support for discrete and continuous HMMs), pomegranate, bnlearn (a Python counterpart to R's bnlearn, built on pgmpy), pypmc, bayespy, pgmpy, etc. It also has better support for CRFs than Matlab, through sklearn-crfsuite.
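To make the discrete HMM mentioned above more concrete, here is a minimal sketch of the forward algorithm in plain Python, which computes the likelihood of an observation sequence. It does not use the API of hmmlearn or any other package listed here; the two-state model and all of its probabilities are made up for illustration.

```python
def forward(observations, start_prob, trans_prob, emit_prob):
    """Return P(observations) under a discrete HMM using the forward algorithm."""
    n_states = len(start_prob)
    # alpha[i] = P(observations seen so far, current hidden state = i)
    alpha = [start_prob[i] * emit_prob[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans_prob[i][j] for i in range(n_states)) * emit_prob[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Made-up parameters: hidden states 0 = "rainy", 1 = "sunny";
# observations 0 = "walk", 1 = "shop", 2 = "clean".
start_prob = [0.6, 0.4]
trans_prob = [[0.7, 0.3], [0.4, 0.6]]
emit_prob = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]

likelihood = forward([0, 1, 2], start_prob, trans_prob, emit_prob)
print(likelihood)
```

For long sequences, a practical implementation would work in log space or rescale `alpha` at each step to avoid numerical underflow.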

R has excellent support for PGMs, both in structure learning (discussed in the next section) and in parameter learning and inference. It has numerous packages such as bnlearn, bnstruct, and depmixS4. CRFs are supported through the CRF and crfsuite packages.

R is by far the strongest programming language for causal inference. It is the most widely used language for this purpose in industry and research, along with SAS and Stata (R is free, while the other two are not). It has libraries such as bnlearn and bnstruct for causal discovery (structure learning), that is, learning the DAG (directed acyclic graph) from data. It also has libraries and functions for techniques such as outcome regression, inverse probability of treatment weighting (IPTW), and g-estimation.

Python, thanks to the DoWhy package from Microsoft Research, can combine Pearl's causal graphical model framework with Rubin's potential outcomes framework, and it provides an easy interface for causal inference modeling.
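To give a feel for one of the techniques named above, here is a small sketch of IPTW in plain Python. It is not the DoWhy API or any R package; the dataset is made up so that the true treatment effect is 2, and the propensity score is simply the observed treatment rate within each level of a single binary confounder.

```python
from collections import defaultdict

# Made-up records: (confounder x, treatment t, outcome y), built as y = 2*t + 3*x.
data = [
    (0, 1, 2.0), (0, 0, 0.0), (0, 0, 0.0), (0, 0, 0.0),
    (1, 1, 5.0), (1, 1, 5.0), (1, 1, 5.0), (1, 0, 3.0),
]


def propensity_by_stratum(records):
    """Estimate P(t = 1 | x) as the treated fraction within each value of x."""
    treated, total = defaultdict(int), defaultdict(int)
    for x, t, _ in records:
        treated[x] += t
        total[x] += 1
    return {x: treated[x] / total[x] for x in total}


def naive_difference(records):
    """Difference in mean outcome between treated and control, ignoring x."""
    treated = [y for _, t, y in records if t == 1]
    control = [y for _, t, y in records if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def iptw_ate(records):
    """Average treatment effect using normalized inverse-probability weights."""
    e = propensity_by_stratum(records)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in records:
        if t == 1:
            w = 1.0 / e[x]
            num1, den1 = num1 + w * y, den1 + w
        else:
            w = 1.0 / (1.0 - e[x])
            num0, den0 = num0 + w * y, den0 + w
    return num1 / den1 - num0 / den0


print(naive_difference(data), iptw_ate(data))
```

In this toy dataset the naive comparison overstates the effect because treated units are concentrated where the confounder is high, while the weighted estimate recovers the effect built into the data.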

R is also the strongest and by far the most widely used language for time series analysis and forecasting. Numerous books have been written about time series forecasting with R, and many libraries implement algorithms such as ARIMA, Holt-Winters, and exponential smoothing. For example, the forecast package by Rob Hyndman is one of the most widely used packages for time series forecasting.

Python, thanks to neural networks and especially the LSTM, receives a lot of attention in time series forecasting ¹. Furthermore, the Prophet package by Facebook, available for both R and Python, provides excellent automated support for time series analysis and forecasting.
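Of the classical algorithms mentioned in this section, simple exponential smoothing is the easiest to write by hand. The sketch below is plain Python rather than the forecast or Prophet API; the `demand` series and the smoothing factor are made up.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step-ahead forecast."""
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialize the level with the first observation
    levels = [level]
    for value in series[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels


demand = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(demand, alpha=0.5)
forecast = levels[-1]
print(levels, forecast)
```

A larger `alpha` makes the forecast react faster to recent observations, while a smaller one smooths more heavily.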

Signal processing and digital communication is the area where Matlab is strongest, and it is used often in research and industry. The Communications Toolbox provides the functionality needed to implement a complete communication system: well-known modulation schemes, channel and source coding, equalizers, and the decoding and detection algorithms needed in the receiver. The DSP System Toolbox provides the functionality to design IIR (infinite impulse response), FIR (finite impulse response), and adaptive filters. Matlab also supports the FFT (fast Fourier transform), IFFT, wavelets, etc.

Although Python is not as capable as Matlab in this area, it supports digital communication algorithms through the CommPy and Komm packages.
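As a small illustration of what an FIR filter does, the following plain-Python sketch applies a 4-point moving average by direct convolution. It does not use CommPy, Komm, or any Matlab toolbox; the tap values and test signals are made up.

```python
def fir_filter(signal, taps):
    """Apply an FIR filter: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    output = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        output.append(acc)
    return output


taps = [0.25, 0.25, 0.25, 0.25]  # 4-point moving average (a simple low-pass filter)

# Feeding in a unit impulse returns the taps themselves, then zeros:
# the impulse response is finite, hence the name.
impulse_response = fir_filter([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], taps)

# A signal alternating at the highest possible rate is removed once the filter fills up.
alternating = fir_filter([1.0, -1.0] * 4, taps)
print(impulse_response, alternating)
```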

Matlab is still the most widely used language for implementing control and dynamical system algorithms, thanks to the Control System Toolbox. It has extensive support for well-known methods such as PID controllers, state-space design, root locus, transfer functions, pole-zero diagrams, the Kalman filter, and many more. However, Matlab's main strength comes from Simulink, its excellent and versatile graphical editor. Simulink lets you simulate real-world systems using drag-and-drop blocks (similar to LabVIEW). The Simulink output can then be imported into Matlab for further analysis.

Python supports control and dynamical systems through the Python Control Systems Library (python-control).
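To show what a PID controller looks like in code, here is a minimal discrete-time sketch in plain Python. It does not use any control library; the gains, the time step, and the first-order plant dx/dt = -x + u are assumptions chosen only to make the example short and stable.

```python
class PID:
    """A textbook discrete PID controller (no anti-windup or derivative filtering)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # Skip the derivative term on the first call to avoid a start-up spike.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller, setpoint=1.0, steps=1000, dt=0.01):
    """Simulate the assumed plant dx/dt = -x + u with forward Euler integration."""
    x = 0.0
    history = []
    for _ in range(steps):
        u = controller.update(setpoint, x)
        x += dt * (-x + u)
        history.append(x)
    return history


history = simulate(PID(kp=2.0, ki=2.0, kd=0.1, dt=0.01))
print(history[-1])
```

The integral term is what removes the steady-state error here: with proportional control alone, this plant would settle below the setpoint.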

All three programming languages have excellent support for optimization problems such as linear programming (LP), convex optimization, and nonlinear optimization with and without constraints.

Matlab supports optimization and numerical analysis through the Optimization Toolbox, which covers linear programming (LP), mixed-integer linear programming (MILP), quadratic programming (QP), second-order cone programming (SOCP), nonlinear programming (NLP), constrained linear least squares, nonlinear least squares, nonlinear equations, etc. CVX, written by Stephen Boyd and his Ph.D. student Michael Grant, is another strong Matlab package for convex optimization.

Python supports optimization through various packages such as CVXOPT, pyOpt (nonlinear optimization), PuLP (linear programming), and CVXPY (a Python counterpart to CVX for convex optimization problems).
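The packages above hide the solver behind a modeling interface. To show what optimization with and without constraints means at the lowest level, here is a plain-Python sketch of (projected) gradient descent on a small convex quadratic. It is not the API of CVXOPT, pyOpt, PuLP, or CVXPY; the objective, step size, and bounds are made up.

```python
def projected_gradient_descent(grad, start, step, iterations, bounds=None):
    """Minimize a function from its gradient, optionally clipping to box bounds."""
    point = list(start)
    for _ in range(iterations):
        gradient = grad(point)
        point = [p - step * g for p, g in zip(point, gradient)]
        if bounds is not None:
            # Projecting onto a box is a coordinate-wise clip.
            point = [min(max(p, lo), hi) for p, (lo, hi) in zip(point, bounds)]
    return point


def grad_f(point):
    """Gradient of the made-up objective f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = point
    return [2.0 * (x - 3.0), 2.0 * (y + 1.0)]


unconstrained = projected_gradient_descent(grad_f, [0.0, 0.0], step=0.25, iterations=100)
constrained = projected_gradient_descent(
    grad_f, [0.0, 0.0], step=0.25, iterations=100, bounds=[(0.0, 2.0), (0.0, 2.0)]
)
print(unconstrained, constrained)
```

Without constraints the iterates approach the minimizer (3, -1); with the box 0 ≤ x, y ≤ 2 they stop at the nearest feasible point, (2, 0).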

R supports optimization through CVXR (similar to CVX and CVXPY, for convex problems), optimx (quasi-Newton and conjugate gradient methods), and ROI (linear, quadratic, and conic optimization problems).

Web development is an area where Python outperforms R and Matlab by a large margin. In practice, neither R nor Matlab is used for general-purpose web development.

Python, thanks to Django and Flask, is a compelling language for backend development. Many existing websites, such as YouTube, Pinterest, and Instagram, use Python in their backends.

Django is a full-stack framework that gives you everything you need out of the box ("batteries included"). It also supports most well-known databases. Flask, on the other hand, is a lightweight framework that is mainly used for less complex websites.
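Both Django and Flask applications can be served through WSGI, Python's standard interface between web servers and applications. The sketch below is a bare WSGI app written with the standard library only; it is not Django or Flask code, and the routes, the message text, and the `serve` helper are made up. It gives a rough idea of the plumbing these frameworks build on.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """A bare WSGI application that routes on the request path."""
    path = environ.get("PATH_INFO", "/")
    if path == "/":
        status, body = "200 OK", b"Hello from a minimal WSGI app\n"
    else:
        status, body = "404 Not Found", b"Not found\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve(port=8000):
    """Serve the app locally until interrupted (call this manually to try it)."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8000/` in a browser should show the greeting. A framework adds routing, templates, database access, and much more on top of this interface.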

This section discusses the pros and cons of each programming language and summarizes the previous sections.

## Matlab

**Advantage:**

- Many excellent libraries; the number one choice in signal processing, communication systems, and control theory.
- Simulink, one of the best tools in the Matlab ecosystem, is used extensively in control and dynamical system applications.
- Many robust packages for optimization, control, and numerical analysis.
- Good plotting tools (for producing polished graphs) and native support for matrix and vector manipulation.
- Easy to learn and has a user-friendly interface.

**Disadvantage:**

- Proprietary, neither free nor open-source, which makes collaboration harder.
- Lack of good packages and libraries for machine learning, AI, time series analysis, and causal inference.
- Limited in scope: not suited to web development or app design.
- Object-oriented programming is supported (through classdef) but is far less central than in Python.
- Smaller user community compared to Python.

## Python

**Advantage:**

- Many wonderful libraries in machine learning, AI, web development, and optimization.
- Number one language for deep learning and machine learning in general.
- Open-source and free.
- A large community of users across GitHub, Stack Overflow, and …
- It can be used for applications beyond engineering, unlike Matlab. For example, GUI (Graphical User Interface) development using Tkinter and PyQt.
- Object-oriented language.
- Easy to learn and user-friendly syntax.

**Disadvantage:**

- Lack of good packages for signal processing and communication (still behind for engineering applications).
- Harder to master than Matlab, since it is a general-purpose, object-oriented programming (OOP) language.
- Requires more time and expertise to set up and install the working environment.

## R

**Advantage:**

- Many excellent libraries in statistics and machine learning.
- Open-source and free.
- Number one language for time series analysis, causal inference, and PGM.
- A large community of researchers, especially in academia.
- Ability to create web applications, for example, through the Shiny package.

**Disadvantage:**

- Slower compared to Python and Matlab.
- More limited scope than Python (for example, it is not used for game development or as a general backend for web development).
- Object-oriented programming is available (S3, S4, and R6 classes) but is less uniform than in Python.
- Lack of good packages for signal processing and communication (still behind for engineering applications).
- Smaller user community compared to Python.
- Harder to learn and less user-friendly than Python and Matlab.

To summarize, Python is the most popular language for machine learning, AI, and web development, and it also provides excellent support for PGMs and optimization. Matlab is a clear winner for engineering applications and has many good libraries for numerical analysis and optimization. Its biggest disadvantage is that it is neither free nor open-source. R is a clear winner for time series analysis, causal inference, and PGMs. It also has excellent support for machine learning and data science applications.

In this article, I discussed the pros and cons of using Python, R, and Matlab. I also discussed when and for what applications each programming language is more suitable.