3 Pandas Functions That Will Make Your Life Easier

1. Convert_dtypes

For an efficient data analysis process, it is essential to use the most appropriate data types for variables.

Some functions require a specific data type. For instance, we cannot reliably perform mathematical operations on a column with the object data type. In some cases, the string data type is preferred over the object data type because it makes certain operations clearer and safer.
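As a small illustration of the point above, the sketch below uses a made-up series of numeric strings (the name prices is hypothetical and not part of the sample data used later). With the object data type, sum falls back to string concatenation instead of adding numbers. After converting with pd.to_numeric, it behaves as a numeric column should.

```python
import pandas as pd

# Hypothetical example data: numbers stored as strings with the object dtype
prices = pd.Series(['1', '2', '3'], dtype='object')

# On the object dtype, sum uses Python's + operator, which joins the strings
object_sum = prices.sum()

# After conversion to a numeric dtype, sum adds the values
numeric_sum = pd.to_numeric(prices).sum()

print(object_sum, numeric_sum)
```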

Pandas offers many options to handle data type conversions. The convert_dtypes function converts columns to the best possible data type. It is clearly more practical than converting each column separately.

Let’s create a sample dataframe that contains columns with object data type.

import numpy as np
import pandas as pd

# The numeric and boolean columns are deliberately stored with the object dtype
name = pd.Series(['John','Jane','Emily','Robert','Ashley'])
height = pd.Series([1.80, 1.79, 1.76, 1.81, 1.75], dtype='object')
weight = pd.Series([83, 63, 66, 74, 64], dtype='object')
enroll = pd.Series([True, True, False, True, False], dtype='object')
team = pd.Series(['A','A','B','C','B'])

df = pd.DataFrame({
    'name': name,
    'height': height,
    'weight': weight,
    'enroll': enroll,
    'team': team
})

df (image by author)

The data type of every column is object, which is not the optimal choice.

df.dtypes
name      object
height    object
weight    object
enroll    object
team      object
dtype: object

We can use the convert_dtypes function as below:

df_new = df.convert_dtypes()
df_new.dtypes
name       string
height    float64
weight      Int64
enroll    boolean
team       string
dtype: object
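For comparison, here is a minimal sketch of what converting each column separately could look like, next to the single convert_dtypes call. It rebuilds the same sample dataframe so it runs on its own. The names df_manual and df_auto are introduced here for illustration, and the target types in the astype mapping are simply the ones from the output above. Note that more recent pandas versions may report Float64 rather than float64 for the height column after convert_dtypes.

```python
import pandas as pd

# Rebuild the sample dataframe from the article
df = pd.DataFrame({
    'name': pd.Series(['John', 'Jane', 'Emily', 'Robert', 'Ashley']),
    'height': pd.Series([1.80, 1.79, 1.76, 1.81, 1.75], dtype='object'),
    'weight': pd.Series([83, 63, 66, 74, 64], dtype='object'),
    'enroll': pd.Series([True, True, False, True, False], dtype='object'),
    'team': pd.Series(['A', 'A', 'B', 'C', 'B']),
})

# Manual alternative: name a target dtype for every column
df_manual = df.astype({
    'name': 'string',
    'height': 'float64',
    'weight': 'Int64',
    'enroll': 'boolean',
    'team': 'string',
})

# Automatic alternative: let pandas infer the dtypes in one call
df_auto = df.convert_dtypes()

print(df_manual.dtypes)
print(df_auto.dtypes)
```

The manual version needs one entry per column and has to be updated whenever a column is added, which is the overhead convert_dtypes avoids.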

The data types are converted to the best possible option. A useful feature of the convert_dtypes function is that we can have the boolean values represented as 1 and 0 instead, which can be more convenient for numerical analysis.

We just need to set the convert_boolean parameter to False.

df_new = df.convert_dtypes(convert_boolean=False)

(image by author)
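If you would rather not rely on how convert_dtypes treats the column when convert_boolean is False, an explicit cast is one alternative. The following is a minimal sketch, assuming the same enroll values as in the sample dataframe. The name enroll_int is introduced here for illustration only.

```python
import pandas as pd

# Same values as the enroll column in the sample dataframe
enroll = pd.Series([True, True, False, True, False], dtype='object')

# Explicit two-step cast: object -> nullable boolean -> nullable integer
enroll_int = enroll.astype('boolean').astype('Int64')

print(enroll_int.tolist())
```

Spelling out the cast makes the intent visible to the reader and keeps the other columns untouched.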
