Customizing visualizations
Altair is a statistical visualization library for Python. Its syntax is clean and easy to understand, as we will see in the examples, and it makes creating interactive visualizations simple.
Altair is also highly flexible in terms of data transformations. We can apply many different kinds of transformations while creating a visualization, which makes the library well suited to exploratory data analysis.
This is the fourth part of a series on Altair.
In this article, we will see different ways of customizing visualizations with Altair. Creating informative visualizations that reveal the underlying structure of the data or the relationships among variables is a critical part of data science.
It is also important to make them look appealing, so it is worth spending some time customizing their appearance.
We will be using an insurance dataset that is available on Kaggle. Let’s start by importing the libraries and reading the dataset into a Pandas dataframe.
import numpy as np
import pandas as pd
import altair as alt
insurance = pd.read_csv("/content/insurance.csv")
insurance.head()
The dataset contains some measures (i.e. features) about the customers of an insurance company and the amount that is charged for the insurance.
We can create a scatter plot to inspect the relationship between body mass index (bmi) and insurance cost (charges). The smoker column can be used to color the points so that smokers and non-smokers are distinguished.
(alt.
Chart(insurance).
mark_circle().
encode(x='charges', y='bmi', color='smoker').
properties(height=400, width=500))
All of the bmi values are higher than 15, so the plot would look better if the y-axis did not start at zero. We can use the scale property to customize the y-axis.
(alt.
Chart(insurance).
mark_circle().
encode(
alt.X('charges'),
alt.Y('bmi', scale=alt.Scale(zero=False)),
alt.Color('smoker')).
properties(height=400, width=500))
In order to use the scale property, we specify the column with the Y encoding object ( alt.Y('bmi') ) instead of passing a string ( y='bmi' ). The zero parameter is set to False so that the axis is not forced to start at zero.
Here is the updated visualization:
We can also use the domain parameter to specify a custom axis range. Let’s also change the size of the visualization using the properties method.
(alt.
Chart(insurance).
mark_circle().
encode(
alt.X('charges'),
alt.Y('bmi', scale=alt.Scale(domain=(10, 60))),
alt.Color('smoker')
).
properties(height=300, width=400))
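Instead of hard-coding the domain, it can be derived from the data. The helper below is a rough sketch and not part of Altair: `rounded_domain` is a hypothetical name, and the sample values are made up for illustration. It rounds the minimum down and the maximum up to the nearest multiple of a step so that the axis ends on round numbers.

```python
import math

def rounded_domain(values, step=5):
    """Return [low, high], rounded outward to multiples of `step`."""
    low = math.floor(min(values) / step) * step
    high = math.ceil(max(values) / step) * step
    return [low, high]

# Illustrative bmi values (not read from the actual dataset).
bmi_sample = [15.96, 22.7, 30.4, 53.13]
domain = rounded_domain(bmi_sample)
print(domain)  # prints [15, 55]
```

With the real data, the result of `rounded_domain(insurance['bmi'])` could be passed as `alt.Scale(domain=...)` in place of the fixed (10, 60) range.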