Analyzing Melbourne House Prices with R

A comprehensive practice guide for data analysis.

There are several libraries and packages that provide data scientists, analysts, or anyone interested in data with functions to perform efficient data analysis. Most of them are well documented, so you can easily find out what a function does.

However, the best way to learn such libraries is through practice. It is not enough to know what a function does; we should be able to recall and use it at the right time and place. Thus, I highly recommend learning a package or library by practicing with it.

In this article, we will use R packages to explore and gain insight into the Melbourne housing dataset available on Kaggle.

For data analysis and manipulation, we will be using R's data.table package. Let’s import it and read the CSV file.

> library(data.table)
> house_prices <- fread("Downloads/melb_data.csv")
> head(house_prices)

(image by author)
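If the Kaggle file is not at hand, the import step can still be tried on a throwaway file. The following is a minimal sketch, assuming a tiny made-up table: the region labels and prices are placeholders, and only the column names Regionname and Price are borrowed from the dataset.

```r
library(data.table)

# Hypothetical toy rows; the values are placeholders, not Kaggle data
toy <- data.table(
  Regionname = c("Region A", "Region B", "Region A"),
  Price = c(850000L, 1200000L, 910000L)
)

# Write the toy table to a temporary CSV file, then read it back with fread
tmp <- tempfile(fileext = ".csv")
fwrite(toy, tmp)
toy_read <- fread(tmp)

# fread returns a data.table and infers the column types from the file
class(toy_read)
str(toy_read)
head(toy_read, 2)
```

The same pattern applies to the real file: fread takes the path, detects the separator and column types, and returns a data.table that head can preview.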

The dataset contains several attributes of the houses in Melbourne along with their prices. Since the focus of this dataset is the price, it is better to get an overview of the price column first.

> house_prices[, summary(Price)]
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  85000  650000  903000 1075684 1330000 9000000

We use the summary function on the Price column to get an overview in terms of basic statistics. The average house price is approximately 1.08 million.
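Note that in the summary above the mean (1,075,684) is well above the median (903,000), which suggests that a few expensive houses pull the average up. The sketch below reproduces that pattern on a small table of made-up prices (the values are assumptions, not taken from the dataset) and shows how the same statistics can be returned as named columns.

```r
library(data.table)

# Hypothetical prices, chosen so that one expensive house skews the mean
toy <- data.table(Price = c(400000, 650000, 900000, 1300000, 2500000))

# The j expression is evaluated inside the table,
# so this is equivalent to summary(toy$Price)
toy[, summary(Price)]

# Individual statistics as a one-row data.table with named columns
stats <- toy[, .(min_price = min(Price),
                 median_price = median(Price),
                 mean_price = mean(Price),
                 max_price = max(Price))]
stats
```

Wrapping the expressions in .() returns a data.table rather than a summary object, which is convenient when the statistics feed into further steps.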

We now know the overall average house price. We might also want to compare house prices across regions. This is a group-by task and can easily be done by passing the name of the grouping column to the by argument.

> house_prices[, mean(Price), by = Regionname]

(image by author)

The aggregated column is represented as “V1” which is not very informative. We can assign a name to the aggregated column with a slight change in the syntax. Let’s also calculate the number of houses in each region along with the average house price.

> house_prices[, .(avg_price = mean(Price), number_of_houses = .N), by = Regionname]

(image by author)
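To see both forms of the grouped aggregation side by side without the full dataset, here is a minimal sketch on a made-up table. The region labels and prices are placeholders, and the final ordering step is an optional extra that is not part of the queries above.

```r
library(data.table)

# Hypothetical rows; region labels and prices are placeholders
toy <- data.table(
  Regionname = c("Region A", "Region A", "Region B", "Region B", "Region B"),
  Price = c(800000, 1000000, 500000, 600000, 700000)
)

# Unnamed aggregation: the result column gets the default name V1
unnamed <- toy[, mean(Price), by = Regionname]
unnamed

# Named aggregation with .(); .N counts the rows in each group.
# The chained [order(-avg_price)] sorts regions from most to least expensive.
by_region <- toy[, .(avg_price = mean(Price), number_of_houses = .N),
                 by = Regionname][order(-avg_price)]
by_region
```

Chaining a second pair of brackets keeps the whole query in one expression, so the grouped result can be sorted without creating an intermediate variable.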
