Customise Jupyter Lab, an IDE tool, for yourself
If you are a Data Scientist or a Data Engineer using Python as your primary programming language, you have almost certainly used Jupyter Notebook. As the “next-generation” web-based interface for Jupyter Notebook, Jupyter Lab provides many more convenient features than its older brother. One of them is extensions.
Even the Jupyter Lab development team is excited to have such a robust and thriving third-party extension community. In this article, I’ll introduce 10 Jupyter Lab extensions that I have found very useful for dramatically improving the productivity of a typical data scientist or data engineer.
Most online resources will tell you to run a command like the following to install a Jupyter Lab extension.
jupyter labextension install @jupyterlab/...
Using the command line is my favourite approach too. However, if you are a user of VS Code, Sublime or Atom, you might prefer to search for what you want to install in a “manager”. Jupyter Lab does provide this feature.
As shown in the screenshot, you can go to the 4th tab on the left navigation, which is the extension manager. There, you can search for whatever you need and find the right extension, so you don’t even need to know the extension’s name beforehand.
Now, let’s have a look at the recommended extensions!
We all love Jupyter because of its interactivity. However, sometimes a debugger is necessary for coding. For example, we may want to run a for-loop step by step to see exactly what is happening inside. Most IDE tools support this with “step over” and “step into”, but unfortunately Jupyter does not out of the box.
@jupyterlab/debugger is an extension that supplies this missing feature in Jupyter Lab.
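To make the for-loop case concrete, here is a minimal sketch of the kind of code where stepping helps. The function name and the input values are made up for illustration; the comment marks where a breakpoint would be useful.

```python
def running_totals(values):
    """Return the cumulative sums of `values`, one entry per input item."""
    totals = []
    current = 0
    for value in values:
        # A breakpoint on the next line lets you watch `current`
        # change on every iteration while stepping over the loop body.
        current += value
        totals.append(current)
    return totals


print(running_totals([3, 1, 4, 1, 5]))
```

Stepping over the loop body a few times shows `current` and `totals` growing one item at a time, which is much easier than sprinkling print statements through the loop.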
Have a long notebook? Want to make your notebook more beautiful for a presentation? Or just want to have a table of contents for your notebook? @jupyterlab/toc comes to help.
With this extension, the table of contents will be automatically generated based on the markdown cells with headings (make sure to use the hash signs, such as ##, to specify your heading levels). Structuring notebooks this way is also a good habit that makes your work more systematic and organised.
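To illustrate the idea of deriving a table of contents from markdown headings, here is a rough sketch in plain Python. It is not how the extension is implemented; it only assumes the notebook file layout in which every cell has a `cell_type` and a `source` that is either a string or a list of lines. The function name and the sample cells are hypothetical.

```python
import re


def build_toc(cells):
    """Collect (level, title) pairs from the markdown cells of a notebook."""
    toc = []
    for cell in cells:
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # Notebook files may store the source as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for line in source.splitlines():
            # One to six hash signs followed by a space mark a heading.
            match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
            if match:
                toc.append((len(match.group(1)), match.group(2)))
    return toc


# Hypothetical cells, shaped like the "cells" list of an .ipynb file.
cells = [
    {"cell_type": "markdown", "source": "# Sales analysis\nSome intro text."},
    {"cell_type": "code", "source": "# not a heading, just a comment"},
    {"cell_type": "markdown", "source": ["## Load data\n", "## Clean data\n"]},
]

for level, title in build_toc(cells):
    print("  " * (level - 1) + title)
```

This simplification ignores edge cases such as hash signs inside fenced code within a markdown cell, but it shows why consistent heading levels matter: the outline is only as good as the headings you write.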
Diagrams.net (formerly Draw.io) is my favourite tool for drawing diagrams. If you are one of my followers, I can tell you that most of the diagrams in my articles are drawn using it. It is indeed a perfect open-source alternative to MS Visio.
Now, with jupyterlab-drawio, we can bring this perfect tool into Jupyter Lab. Please go check it out and enjoy drawing 🙂
One of the amazing features of Jupyter Notebook/Lab is that it provides lots of useful magic commands. For example, we can use %timeit to test how long our code takes to run. It will run our code snippet hundreds or thousands of times and average the timings to give a fair and accurate result.
However, sometimes we don’t need to be so scientific. It would also be good to know how long each cell takes to run. In that case, using %timeit for every cell is absolute overkill.
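For readers who want to see the repeat-and-measure idea outside of a magic command, the standard-library `timeit` module offers something similar. The sketch below is only an illustration; the function being timed and the repeat counts are arbitrary choices.

```python
import timeit


def build_squares():
    """A small workload to time: square the first 1000 integers."""
    return [n * n for n in range(1000)]


# Five measurements, each one calling the function 1000 times.
runs = timeit.repeat(build_squares, number=1000, repeat=5)

# Convert each total into seconds per single call.
per_call = [total / 1000 for total in runs]
print(f"best of 5: {min(per_call) * 1e6:.1f} microseconds per call")
```

Running the workload 5000 times is fine for a tiny function, but it is exactly what makes this approach too heavy for a cell that already takes several seconds.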
jupyterlab-execute-time can help in this case.
As shown in the screenshot, it shows not only the elapsed time of executing the cell, but also when it was last executed. I promise you that this is a very convenient way to see the order in which those cells were executed.
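To show what those two pieces of information amount to, here is a hand-rolled sketch that times a single run and records when it finished. The helper name `timed_run` is hypothetical and has nothing to do with the extension's internals; the extension simply saves you from writing this around every cell.

```python
import time
from datetime import datetime


def timed_run(func, *args, **kwargs):
    """Run `func` once and return (result, elapsed seconds, finish time)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    finished_at = datetime.now()
    return result, elapsed, finished_at


result, elapsed, finished_at = timed_run(sum, range(1_000_000))
print(f"Last executed at {finished_at:%Y-%m-%d %H:%M:%S} in {elapsed * 1000:.0f}ms")
```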
As a Data Scientist or Data Engineer, you will have to deal with spreadsheets sometimes. However, Jupyter Lab cannot natively display Excel files, which forces us to switch between Jupyter for coding and Excel for viewing.
jupyterlab-spreadsheet solves this problem perfectly. It embeds an xls/xlsx spreadsheet viewer in Jupyter Lab, so we can have all we need in a single place.
Python is not the most execution-efficient programming language, which means that it may consume more CPU/memory resources compared to others. Also, one of the most common use cases for Python is Data Science, where workloads tend to be heavy. Therefore, we might want to monitor our system hardware resources so that we notice before our Python code freezes the operating system.
jupyterlab-topbar-extension is the extension you may want to have. It displays the CPU and memory usage on a top bar of the Jupyter Lab UI so that we can monitor them in real time.
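If you only want a quick number for one snippet rather than a live display, the standard-library `tracemalloc` module can report how much memory your Python code allocated. This is a different and narrower measurement than the system-wide CPU and memory figures in the top bar, and the workload below is made up for illustration.

```python
import tracemalloc

tracemalloc.start()

# A made-up workload: 100 lists of 1000 integers each.
data = [list(range(1000)) for _ in range(100)]

# Read the counters before stopping, because stopping clears them.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")
```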
While I love Jupyter, it does not do code auto-completion as well as classic IDE tools do. The built-in auto-completion is very limited and slow.
You may have heard of Kite, which is a free AI-powered code completion service. It is available in almost all the popular IDEs such as Sublime, VS Code and PyCharm. In fact, you can use this service in Jupyter Lab, too.
With this extension, we can code in Jupyter Lab more fluently.
If you are a Data Scientist who switched from RStudio or MATLAB, you might be very familiar with the variable inspector that these tools provide. This feature is unfortunately not available in Jupyter Lab by default. However, the extension jupyterlab-variableInspector brings it to Jupyter Lab.
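As a rough idea of what a variable inspector gives you, the sketch below lists the name, type and size of each variable in a namespace. It is a simplified stand-in, not the extension's implementation; the function name and the sample namespace are hypothetical.

```python
import sys
import types


def inspect_variables(namespace):
    """Return (name, type name, size in bytes) for each plain variable."""
    rows = []
    for name, value in sorted(namespace.items()):
        # Skip private names, functions, classes and imported modules.
        if name.startswith("_"):
            continue
        if callable(value) or isinstance(value, types.ModuleType):
            continue
        rows.append((name, type(value).__name__, sys.getsizeof(value)))
    return rows


# A hypothetical namespace standing in for the notebook's variables.
example_namespace = {
    "threshold": 0.5,
    "labels": ["a", "b"],
    "_hidden": 1,
    "helper": len,
}

for name, type_name, size in inspect_variables(example_namespace):
    print(f"{name:<12}{type_name:<8}{size} bytes")
```

In a notebook you could pass `globals()` instead of the sample dictionary, but you would have to rerun it after every change, whereas the extension keeps the list up to date in a side panel.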
Matplotlib is a must-learn Python library if you are a Data Scientist. It is a basic but powerful tool for Data Visualisation in Python. However, when we use Jupyter Lab, the interactive features are gone.
The jupyter-matplotlib extension can make your Matplotlib interactive again. Simply enable it using the magic command %matplotlib widget, and your fancy 3D chart will become interactive.
While Matplotlib is the most basic and powerful library for Data Visualisation, Plotly is my favourite library in this area. It wraps many common chart types so that we can generate amazing charts in a few lines of code.
To make Jupyter Lab seamlessly support and be able to display interactive Plotly charts, jupyterlab-plotly needs to be installed.
Of course, there are many more wonderful Jupyter Lab extensions available in the community. In this article, I have introduced 10 of them based on my preferences. These extensions have helped me improve my productivity a lot over the past several years, and now I recommend them to you!
Life is short, use Python!