Master The Loc Method in Pandas

Boolean Array

Lastly, we can use an array of boolean values. However, this array of boolean values must have the same length as the axis we are using it on. For example, our ufo dataframe has a shape of (18241, 5) according to the shape attribute we used above, meaning it has 18241 rows and 5 columns.

So if we want to use a boolean array to specify our rows, it needs to have 18241 elements. If we want to use a boolean array to specify the columns, it needs to have 5 elements. The most common way of creating this boolean array is with a conditional, which returns a boolean series that shares the dataframe's index and can therefore be aligned with it.
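To make the length rule concrete, here is a minimal sketch. It assumes a tiny made-up stand-in for the ufo dataframe (5 rows, 3 columns) rather than the real 18241-row data; only the City column name comes from this article, and every other name and value is hypothetical.

```python
import pandas as pd

# Tiny hypothetical stand-in for the ufo dataframe (5 rows, 3 columns).
# Only the City column name comes from the article; the other columns
# and all of the values are made up for illustration.
ufo = pd.DataFrame(
    {
        "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "Abilene"],
        "State": ["NY", "NJ", "CO", "KS", "TX"],
        "Shape": ["TRIANGLE", "OTHER", "OVAL", "DISK", "LIGHT"],
    }
)

row_mask = [False, False, False, True, True]  # one value per row (5)
col_mask = [True, True, False]                # one value per column (3)

# Keep the rows and columns where the matching mask value is True.
selected = ufo.loc[row_mask, col_mask]
print(selected)

# A boolean list whose length does not match the axis is rejected.
try:
    ufo.loc[[True, False], :]
    wrong_length_rejected = False
except IndexError as err:
    wrong_length_rejected = True
    print("IndexError:", err)
```

With the real ufo dataframe the same idea applies, only the row mask would need 18241 values and the column mask 5.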

For example, let’s say we wanted to select only the rows that include Abilene as the city in which the ufo sightings took place. We can start with the following condition:

ufo.City == 'Abilene'

Note how this returns a pandas series (an array-like object) that has a length of 18241 and is made up of boolean values (True or False). That is exactly the number of values we need in order to use this boolean array to specify our rows with the loc method!
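The properties of that condition can be checked directly. The sketch below again assumes a small hypothetical stand-in for ufo with made-up City values, so the length it reports is 5 rather than 18241.

```python
import pandas as pd

# Hypothetical stand-in for the ufo dataframe; the values are made up.
ufo = pd.DataFrame(
    {"City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "Abilene"]}
)

# The comparison is evaluated element by element.
mask = ufo.City == "Abilene"

print(type(mask))                    # a pandas Series
print(mask.dtype)                    # boolean dtype
print(len(mask), len(ufo))           # same length as the dataframe
print(mask.index.equals(ufo.index))  # same index labels as the dataframe
```

Because the series has one boolean per row and carries the same index as the dataframe, it can be passed to loc as the row specifier.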

Imagine we are overlaying this series of True and False values over the index of our ufo dataframe. Wherever there is a True boolean value in this series, that specific row will be selected and will show up in our dataframe.

We can see above that the first True value shows up in the 4th row, with label 3, which means that the first row we will see once we use this array of boolean values with our loc method is the row with the label 3 (or 4th row in our ufo dataframe).
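One way to check where the True values fall is to filter the boolean series by itself and look at the labels that remain. This is a sketch on a hypothetical stand-in dataframe whose made-up rows are arranged so that the first Abilene entry has label 3, mirroring the description above.

```python
import pandas as pd

# Hypothetical stand-in for the ufo dataframe; the values are made up.
ufo = pd.DataFrame(
    {"City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "Abilene"]}
)

mask = ufo.City == "Abilene"

# Index labels of the rows where the mask is True.
true_labels = mask[mask].index
first_label = true_labels[0]

# Zero-based position of that label in the dataframe's index.
first_position = ufo.index.get_loc(first_label)

print(list(true_labels))
print(first_label, first_position + 1)  # label, then 1-based row number
```

With the default integer index, the label is always one less than the 1-based row number, which is why label 3 corresponds to the 4th row.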

ufo.loc[ufo.City == 'Abilene', :]

ufo sightings in Abilene

And that is exactly what we see! We have specified the rows we want using an array of boolean values with a length equal to the number of rows in our original dataframe.
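For readers without the dataset at hand, the whole selection can be reproduced end to end on a small example. The dataframe below is a hypothetical stand-in with made-up values (and an assumed State column), not the real ufo data.

```python
import pandas as pd

# Hypothetical stand-in for the ufo dataframe; the values are made up.
ufo = pd.DataFrame(
    {
        "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "Abilene"],
        "State": ["NY", "NJ", "CO", "KS", "TX"],
    }
)

# Rows where the condition is True, all columns.
abilene = ufo.loc[ufo.City == "Abilene", :]
print(abilene)

# The number of selected rows equals the number of True values.
n_true = int((ufo.City == "Abilene").sum())
print(len(abilene), n_true)
```

Note that the selected rows keep their original index labels instead of being renumbered from 0.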
