
Analyzing Melbourne House Prices with R

March 6, 2021 by systems

Soner Yıldırım

A comprehensive practice guide for data analysis.

Photo by Hamza Zaidi on Unsplash

There are several libraries and packages that provide data scientists, analysts, or anyone interested in data with functions to perform efficient data analysis. Most of them are well documented, so you can easily find out what a function does.

However, the best way to learn such libraries is through practice. It is not enough to know what a function does. We should be able to recall and use it at the right time and place. Thus, I highly recommend practicing as the way to learn a package or library.

In this article, we will use R packages to explore and gain insight into the Melbourne housing dataset available on Kaggle.

For data analysis and manipulation, we will be using the data.table package of R. Let’s load it and read the CSV file.

> library(data.table)
> house_prices <- fread("Downloads/melb_data.csv")
> head(house_prices)
(image by author)
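If you do not have the Kaggle file at hand, the same read-and-inspect step can be tried on a tiny stand-in file. This is a minimal sketch: the temporary CSV and its two rows are made up for illustration and are not part of the Melbourne dataset.

```r
library(data.table)

# Hypothetical stand-in for melb_data.csv: a two-row CSV in a temp file
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Regionname,Price",
    "North,500000",
    "South,900000"),
  csv_path
)

# fread detects the separator and column types automatically
toy <- fread(csv_path)

# Show the first rows and the detected column types
print(head(toy))
str(toy)
```

fread returns a data.table, so every operation shown in the rest of the article can be applied to the result directly.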

The dataset contains several attributes of the houses in Melbourne along with their prices. Since the focus of this dataset is the price, it is better to get an overview of the price column first.

> house_prices[, summary(Price)]
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  85000  650000  903000 1075684 1330000 9000000

We use the summary function on the price column to get an overview in terms of basic statistics. The average house price is approximately 1.08 million.

We now know the overall average house price. We might also want to compare house prices across regions. This is a group-by task and can easily be done by passing the name of the grouping column to the by argument.
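To see how a column expression behaves inside the square brackets, here is a small sketch on a made-up table. The toy table and its values are assumptions for illustration only; just the column names Regionname and Price mirror the real dataset.

```r
library(data.table)

# Hypothetical toy data, not taken from the Kaggle file
toy <- data.table(
  Regionname = c("North", "North", "South", "South", "South"),
  Price      = c(500000, 700000, 900000, 1100000, 1300000)
)

# Leaving the first argument empty means "use all rows";
# the second argument is evaluated with the columns in scope
price_summary <- toy[, summary(Price)]
print(price_summary)

# Any function of a column works the same way
avg_price <- toy[, mean(Price)]
print(avg_price)
```

Because the expression returns a plain vector rather than a list, the result is not wrapped in a data.table.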

> house_prices[, mean(Price), by = Regionname]
(image by author)
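The grouping step can be checked by hand on a small made-up table. This is only a sketch: the toy rows are assumptions, chosen so that the group means are easy to verify.

```r
library(data.table)

# Hypothetical toy data, not taken from the Kaggle file
toy <- data.table(
  Regionname = c("North", "North", "South", "South", "South"),
  Price      = c(500000, 700000, 900000, 1100000, 1300000)
)

# mean(Price) is computed once per distinct value of Regionname
by_region <- toy[, mean(Price), by = Regionname]
print(by_region)

# The unnamed result column receives the default name V1
print(names(by_region))
```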

The aggregated column is named “V1” by default, which is not very informative. We can assign a name to the aggregated column with a slight change in the syntax. Let’s also calculate the number of houses in each region along with the average house price.

> house_prices[, .(avg_price = mean(Price), number_of_houses = .N), by = Regionname]
(image by author)
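The named-aggregation syntax can be sketched on a made-up table as follows. The toy rows are assumptions for illustration. The final sorting step is an optional extension that is not part of the original walkthrough; it chains a second pair of brackets to order the result by average price.

```r
library(data.table)

# Hypothetical toy data, not taken from the Kaggle file
toy <- data.table(
  Regionname = c("North", "North", "South", "South", "South"),
  Price      = c(500000, 700000, 900000, 1100000, 1300000)
)

# .() is an alias for list(); each named element becomes a column.
# .N holds the number of rows in the current group.
region_stats <- toy[, .(avg_price = mean(Price), number_of_houses = .N),
                    by = Regionname]

# Optional: chain a second [] to sort by average price, highest first
region_stats <- region_stats[order(-avg_price)]
print(region_stats)
```

Chaining keeps the aggregation and the sort in one readable expression without creating an intermediate variable.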
