In this tutorial, you will learn the proper way to extract live personal data from your LinkedIn account and use Python to analyze and draw useful insights from it.
First things first…
If you don’t have a LinkedIn account, please run as fast as you can to the LinkedIn page and create one.
Actually, it’s not a good habit to be without a LinkedIn account in this modern world…lol
LinkedIn is one of the biggest social networks out there, and chances are you are a proud LinkedIn member (if not, create an account now, please).
LinkedIn gives you access to your data, and you can download and analyze this data to draw insights from it.
LinkedIn has a clear guide on how to download your data. I have included the most important part of this guide below for your reference:
You can initiate a download of your LinkedIn data from your Settings & Privacy page.
To request a download of your data:
- Click the
- Select Settings & Privacy from the dropdown.
- Click Data Privacy on the left rail.
- Under the How LinkedIn uses your data section, click Get a copy of your data.
- Select the data that you’re looking for and Request archive.
You’ll have the option to either select a specific category of information or a larger download. If you select a specific type of content, you’ll receive an email within minutes with a link. If you select the larger data archive, you’ll receive an email within 24 hours. Use the link provided in the email to download the information you requested. The data archive will be available for download for 72 hours.
You can select the type of data you are interested in downloading for your analysis. In this tutorial, we will be using the Connections data. Feel free to also analyse anything else that you are curious about.
You can download your data as soon as it is ready.
We will load our downloaded data using Pandas and store it in a variable called lkd (use any variable name of your choice). We will then look at the first 20 records of our LinkedIn data.
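Here is a minimal sketch of this step, assuming the export is a CSV file with the columns used later in this tutorial. The file name Connections.csv and the few made-up rows below are assumptions standing in for the real download, so the snippet can run on its own.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the downloaded file (hypothetical data).
SAMPLE_CSV = """First Name,Last Name,Company,Position,Connected On
Ada,Okoro,Tata Consultancy Services,Data Scientist,17 Apr 2020
Ben,Mensah,Tata Consultancy Services,Data Analyst,17 Apr 2020
Chen,Li,Amazon,Software Engineer,17 Apr 2020
Dina,Roy,Amazon,Data Scientist,10 Jun 2020
Eli,Katz,Accenture,Data Scientist,11 Jun 2020
Fay,Nunez,Tata Consultancy Services,Machine Learning Engineer,11 Jun 2020
"""

# With the real export you would pass the file path instead, for example:
# lkd = pd.read_csv("Connections.csv")  # assumed file name
lkd = pd.read_csv(io.StringIO(SAMPLE_CSV))

# head(20) returns at most the first 20 rows
print(lkd.head(20))
```

If your own export begins with a few lines of notes before the header row, the skiprows parameter of read_csv can be used to skip them.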
As we can see, the above is an output of my LinkedIn connections.
Let’s see how many connections I have on LinkedIn:
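Since each row of the export is one connection, counting rows is enough. A small sketch, where the lkd DataFrame is a hypothetical stand-in for the loaded data:

```python
import pandas as pd

# Hypothetical stand-in for the lkd DataFrame loaded from the export
lkd = pd.DataFrame({
    "First Name": ["Ada", "Ben", "Chen"],
    "Company": ["Amazon", "Amazon", "Accenture"],
})

# Each row is one connection, so the row count is the connection count
n_connections = len(lkd)
print(n_connections)

# shape gives (rows, columns) in a single call
print(lkd.shape)
```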
So, as you can see, at the time of writing this tutorial I have 8000+ connections on LinkedIn. That’s not bad.
Let’s find out how my connection activity has looked so far, that is, how I have been sending and accepting connection requests over time.
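One way to get at this is to count connections per day. The sketch below assumes the 'Connected On' column holds dates written like "17 Apr 2020"; the sample rows and the helper name plot_daily_connections are made up for illustration.

```python
import pandas as pd

# Hypothetical sample of the 'Connected On' column (assumed date format)
lkd = pd.DataFrame({
    "Connected On": [
        "17 Apr 2020", "17 Apr 2020", "17 Apr 2020",
        "10 Jun 2020", "11 Jun 2020", "11 Jun 2020",
    ]
})

# Turn the text dates into real datetimes so they sort chronologically
lkd["Connected On"] = pd.to_datetime(lkd["Connected On"], format="%d %b %Y")

# Number of connections made on each day
daily = lkd.groupby("Connected On").size().rename("connections")
busiest_day = daily.idxmax()
print(daily)
print("Busiest day:", busiest_day.date())


def plot_daily_connections(daily_counts):
    """Line chart of connections per day (needs Plotly installed)."""
    # Imported here so the counting above still runs without Plotly
    import plotly.express as px

    return px.line(
        daily_counts.reset_index(),
        x="Connected On",
        y="connections",
        title="Connections per day",
    )
```

Calling plot_daily_connections(daily).show() in a notebook would render the chart.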
As we can see, there are spikes in my connection activity: there are some periods when I made a lot of connections (e.g. on 17-April-2020), while on other days my connections dropped (e.g. on 10-Jun-2020).
Now that we know our connection activity over time, let’s find out where these connections are working.
I hope one of my connections is working at Tesla, I need to drive one of the new Tesla models…lol
We will analyse the Company column in order to identify the companies our connections are working at.
On its own, this does not give us a clear picture of what we want.
Let’s therefore use the groupby() function to group our data by Company and the count() function to count how many of our connections work at each company.
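The grouping step could look roughly like this. The rows are made-up sample data, and sorting by the 'Connected On' count is my assumption about how the table was ordered.

```python
import pandas as pd

# Hypothetical sample of the connections data
lkd = pd.DataFrame({
    "Company": [
        "Tata Consultancy Services", "Tata Consultancy Services", "Amazon",
        "Amazon", "Accenture", "Tata Consultancy Services",
    ],
    "Position": [
        "Data Scientist", "Data Analyst", "Software Engineer",
        "Data Scientist", "Data Scientist", "Machine Learning Engineer",
    ],
    "Connected On": [
        "17 Apr 2020", "17 Apr 2020", "17 Apr 2020",
        "10 Jun 2020", "11 Jun 2020", "11 Jun 2020",
    ],
})

# count() reports, per company, the number of non-missing values in each other column
company_counts = lkd.groupby("Company").count()

# sort_values() defaults to ascending=True, so the smallest counts come first
smallest_first = company_counts.sort_values(by="Connected On")
print(smallest_first)
```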
This gives us the number of connections working at each company. When the counts are sorted with sort_values(), the default is ascending order (ascending=True), from the smallest count to the largest. For example, I have 1 connection working at Vsolv Engineering India Pvt Ltd.
Let’s reverse the table so that the companies with the highest counts come first. We do this by calling the sort_values() function with ascending=False, sorting by the count in the ‘Connected On’ column.
You can also sort by the count in any other column, such as First Name, Last Name or Position.
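A short sketch of the descending sort, again on made-up rows rather than my real export:

```python
import pandas as pd

# Hypothetical sample of the connections data
lkd = pd.DataFrame({
    "Company": [
        "Tata Consultancy Services", "Tata Consultancy Services", "Amazon",
        "Amazon", "Accenture", "Tata Consultancy Services",
    ],
    "Connected On": [
        "17 Apr 2020", "17 Apr 2020", "17 Apr 2020",
        "10 Jun 2020", "11 Jun 2020", "11 Jun 2020",
    ],
})

# ascending=False puts the companies with the most connections at the top
largest_first = (
    lkd.groupby("Company")
    .count()
    .sort_values(by="Connected On", ascending=False)
)
print(largest_first)
```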
This one looks better. We can see that most of my connections are working at Tata Consultancy Services followed by Amazon, Accenture, Cognizant and so on…
Let’s use Plotly to visualise our data for better insights.
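A bar chart is one reasonable choice here. This is a sketch using Plotly Express on made-up rows; the column name Connections and the helper plot_company_bar are names I introduce for illustration.

```python
import pandas as pd

# Hypothetical sample of the connections data
lkd = pd.DataFrame({
    "Company": [
        "Tata Consultancy Services", "Tata Consultancy Services", "Amazon",
        "Amazon", "Accenture", "Tata Consultancy Services",
    ],
    "Connected On": [
        "17 Apr 2020", "17 Apr 2020", "17 Apr 2020",
        "10 Jun 2020", "11 Jun 2020", "11 Jun 2020",
    ],
})

# One row per company with its number of connections, largest first
company_counts = (
    lkd.groupby("Company")["Connected On"]
    .count()
    .sort_values(ascending=False)
    .reset_index(name="Connections")
)


def plot_company_bar(counts):
    """Bar chart of connections per company (needs Plotly installed)."""
    # Imported here so the aggregation above still runs without Plotly
    import plotly.express as px

    return px.bar(counts, x="Company", y="Connections",
                  title="Connections per company")
```

In a notebook, plot_company_bar(company_counts).show() displays the figure.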
As we can see, Plotly makes it clearer which companies my connections work at and how many of them work at each one. For instance, I can see that most of my connections work at Tata Consultancy Services.
NB: If you get an error, try upgrading Plotly using the command below:
pip install --upgrade plotly
Plotting Tree Map
A treemap gives us a better view. The size of each company box represents the number of connections working at that particular company.
When you plot the treemap, you can hover over the boxes to get a better view of the individual companies and the number of connections working there.
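A treemap along these lines can be sketched with Plotly Express. The sample counts and the helper name plot_company_treemap are assumptions for illustration.

```python
import pandas as pd

# Hypothetical per-company counts (made-up numbers)
tree_data = pd.DataFrame({
    "Company": ["Tata Consultancy Services", "Amazon", "Accenture"],
    "Connections": [3, 2, 1],
})


def plot_company_treemap(counts):
    """Treemap with one box per company (needs Plotly installed)."""
    # Imported here so the data preparation above still runs without Plotly
    import plotly.express as px

    # path defines the boxes, values defines how large each box is drawn
    return px.treemap(counts, path=["Company"], values="Connections",
                      title="Connections by company")
```

Calling plot_company_treemap(tree_data).show() renders the interactive figure with the hover behaviour described above.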
Let’s now try to find out which specific positions our connections are occupying.
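Counting job titles can be done with value_counts(). A small sketch on made-up positions:

```python
import pandas as pd

# Hypothetical sample of the Position column
lkd = pd.DataFrame({
    "Position": [
        "Data Scientist", "Data Analyst", "Software Engineer",
        "Data Scientist", "Data Scientist", "Machine Learning Engineer",
    ]
})

# value_counts() counts each distinct title and sorts the most common first
position_counts = lkd["Position"].value_counts()
print(position_counts)
```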
From the above, we can see that most of my connections are Data Scientists, Data Analysts, Software Engineers, Senior Data Scientists and Machine Learning Engineers.
The output is truncated after Machine Learning Engineer, so I can’t really see all the positions. Let’s do something about it.
Next, I am going to count the connections in each position, find the percentage each position accounts for, and add a condition to make the selection (e.g. I can find all the positions that account for more than 20% of my connections).
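As a rough sketch of these two steps, assuming the percentage is each position's share of all connections and using the 20% cut-off from the example (the rows are made up):

```python
import pandas as pd

# Hypothetical sample of the Position column
lkd = pd.DataFrame({
    "Position": [
        "Data Scientist", "Data Analyst", "Software Engineer",
        "Data Scientist", "Data Scientist", "Machine Learning Engineer",
    ]
})

# Step 1: share of connections per position, as a percentage
position_share = lkd["Position"].value_counts(normalize=True) * 100
print(position_share.round(1))

# Step 2: the condition on its own yields True/False for every position
is_common = position_share > 20
print(is_common)
```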
I am not interested in the True or False values themselves, so let me combine the two pieces of code above into one to get the job titles together with their actual counts.
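Combining the count and the condition might look like this. The 20% threshold and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample of the Position column
lkd = pd.DataFrame({
    "Position": [
        "Data Scientist", "Data Analyst", "Software Engineer",
        "Data Scientist", "Data Scientist", "Machine Learning Engineer",
    ]
})

position_counts = lkd["Position"].value_counts()
position_share = position_counts / position_counts.sum() * 100

# Use the True/False mask to keep only the positions that pass the condition
common_positions = position_counts[position_share > 20]
print(common_positions)
```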
Now it looks much better!!
Let’s visualise this with Plotly.
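A horizontal bar chart works well for long job titles. This sketch uses made-up rows, and plot_position_bar is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical sample of the Position column
lkd = pd.DataFrame({
    "Position": [
        "Data Scientist", "Data Analyst", "Data Analyst",
        "Data Scientist", "Data Scientist", "Machine Learning Engineer",
    ]
})

# One row per position with its number of connections
position_table = (
    lkd["Position"]
    .value_counts()
    .rename_axis("Position")
    .reset_index(name="Connections")
)


def plot_position_bar(table):
    """Horizontal bar chart of connections per position (needs Plotly installed)."""
    # Imported here so the counting above still runs without Plotly
    import plotly.express as px

    return px.bar(table, x="Connections", y="Position", orientation="h",
                  title="Connections per position")
```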
We can see the various positions and the number of connections holding these positions.
What if we get into the cloud?
Let’s use a WordCloud to have a better view.
We define a function called CreateWordCloud, which takes some text and generates a word cloud from it.
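A possible shape for such a function, assuming the third-party wordcloud package is installed. The image size, background colour and sample positions are my own choices for illustration, not necessarily the settings used for the picture in this tutorial.

```python
import pandas as pd


def CreateWordCloud(text):
    """Build a word cloud from a string of text (needs the wordcloud package)."""
    # Imported here so the text preparation below runs without the package
    from wordcloud import STOPWORDS, WordCloud

    return WordCloud(
        width=800,
        height=400,
        background_color="white",
        stopwords=STOPWORDS,
    ).generate(text)


# Hypothetical sample of the Position column, including one missing value
lkd = pd.DataFrame({
    "Position": ["Data Scientist", "Data Analyst", None, "Data Scientist"]
})

# Join all job titles into one string, dropping missing positions first
text = " ".join(lkd["Position"].dropna())
print(text)

# To render and save the image:
# cloud = CreateWordCloud(text)
# cloud.to_file("positions_wordcloud.png")
```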
This looks much better, right? Yeah, I think so too.
In this tutorial, we have used the Connections data, analysed it and drawn some insights from it.
You can download any other type of your LinkedIn data and perform a similar analysis.
If you enjoyed this tutorial, please give it a clap. That’s enough appreciation to make my day.
Thanks for clapping.