While this metaphor does have its strengths, ultimately data as a resource is very different from oil, and this comparison grossly oversimplifies the nature of data. In order to take advantage of data when solving business problems, we need to understand what kind of resource data really is.
Oil is a finite resource, while data is virtually infinite
While there may be many undiscovered oil reserves in the world, there is a finite amount of oil left on our planet. At some point, we will run out of oil and be forced to transition to other forms of energy. In 2019, the U.S. alone consumed an average of 20.54 million barrels of petroleum per day. Data, by contrast, keeps growing: sources from as early as 2018 claim that 2.5 quintillion bytes of data are produced globally each day.
With the number of internet users continuing to grow, we can safely say that data is practically infinite. We will never really run out of data; in fact, we will keep creating more of it indefinitely. This leads to the next point.
Oil is consumed, but data is created
When oil is used as fuel, it is consumed once and permanently destroyed. Data, on the other hand, is created and does not have to be destroyed even after we use it for analytics. In the information age, ordinary human actions generate data every day. Here are a few examples:
- When someone creates a Facebook profile, they are creating data.
- When someone accepts a friend request on Facebook, they create data that Facebook can use for friend suggestions.
- When you watch a movie on Netflix, you are creating data for the movie recommendation algorithm.
- When you buy something on Amazon, you are creating data for Amazon’s recommendation system.
- When you search for something on Google, you are creating data in the form of your search history.
What this means is that data is an asset that doesn’t have to go away and can remain useful for a long time. Technology companies can keep collecting data about customer behavior for years in order to build more robust models that can provide a better experience for customers. Just imagine how much more sophisticated Amazon’s product recommendation system will be after learning patterns in another ten years of online shopping. By updating and improving algorithms with the arrival of new data, companies can turn data into an asset that keeps adding value.
Privacy and ethics come into play when collecting data
So far, data sounds like the ultimate resource for any company. The fact that data is virtually infinite and continues to be created every day seems too good to be true. And truthfully, there are some caveats. Not all of the data that the world produces is directly accessible to businesses. In fact, a significant amount of potentially useful data may be protected by privacy guidelines and laws. Ethical concerns can also arise when using data collected from customers. Companies that produce digital products and collect customer data may have to keep the following questions in mind:
- What kinds of customer data can they legally collect?
- What data must remain private if it is collected?
- How can the company protect private customer data from data breaches?
- Is it ethical to use the data collected from real customers for analytics?
These are very real issues, and failing to consider them can have serious consequences for companies. Take the famous Facebook–Cambridge Analytica scandal of 2018, for example. In this data scandal, Cambridge Analytica, a British political consulting firm, collected personal information from millions of Facebook users without their consent for the purpose of political advertising. The scandal was so serious that it led to the downfall of Cambridge Analytica and caused Facebook’s market cap to fall by over $100 billion in just a few days.
Although there are ethical issues involved in drilling for oil, the privacy concerns that apply to data do not apply to oil. Data is powerful because it is abundant and fuels analytics and artificial intelligence, but with great power comes great responsibility.
Invest in data infrastructure
Like oil, data is a resource that requires both collection and storage infrastructure. If you are part of a company that plans to take advantage of analytics or data mining, you need to make sure you have data infrastructure in place to manage your data. Whether your data management solution lives in the cloud or on a physical server that your company owns, you need to make sure it is available, fault-tolerant, and cost-effective.
Collect quality data that is actually useful
The quality of any practical analytics or AI solution is dependent on the data used to build it. High-quality data leads to high-quality analytics. Low-quality data leads to low-quality analytics. If your raw data contains missing or inaccurate information, you may have to refine it until it reaches the level of quality that you need for analytics.
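To make the idea of refining raw data concrete, here is a minimal sketch in plain Python. The records, field names, and cleaning rules (drop duplicates, drop rows with a negative spend, fill missing ages with the median) are all hypothetical choices made for illustration; a real project would pick rules that match its own data.

```python
from statistics import median

# Hypothetical raw customer records; None marks a missing value.
raw_records = [
    {"customer_id": 1, "age": 34, "monthly_spend": 120.0},
    {"customer_id": 2, "age": None, "monthly_spend": 80.5},   # missing age
    {"customer_id": 3, "age": 29, "monthly_spend": -15.0},    # inaccurate: negative spend
    {"customer_id": 4, "age": 41, "monthly_spend": 99.9},
    {"customer_id": 4, "age": 41, "monthly_spend": 99.9},     # duplicate row
]


def refine(records):
    """Return a cleaned copy of the records without modifying the input."""
    seen_ids = set()
    valid = []
    for rec in records:
        # Skip repeated customer IDs (keep the first occurrence).
        if rec["customer_id"] in seen_ids:
            continue
        # Skip rows whose spend is missing or clearly invalid.
        if rec["monthly_spend"] is None or rec["monthly_spend"] < 0:
            continue
        seen_ids.add(rec["customer_id"])
        valid.append(dict(rec))

    # Fill missing ages with the median of the known ages, if any exist.
    known_ages = [r["age"] for r in valid if r["age"] is not None]
    fill_age = median(known_ages) if known_ages else None
    for rec in valid:
        if rec["age"] is None:
            rec["age"] = fill_age
    return valid


clean_records = refine(raw_records)
for rec in clean_records:
    print(rec)
```

Whether to drop or repair a bad row is a judgment call. Dropping is simpler but throws information away, while filling in values keeps more rows at the cost of introducing estimates into the data.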
Data can be an asset that keeps adding value
While more oil will not necessarily make a combustion engine perform better, more data has the potential to produce more robust predictive models. Having a system that allows you to collect and store more and more data for training and refining models allows you to turn data into an asset that keeps adding value to your business.
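One way to see why more data can help is a small simulation. The sketch below is a toy example, not a real business model: it assumes a hypothetical linear relationship with a true slope of 2.0 and Gaussian noise, fits a one-feature least-squares line to samples of different sizes, and reports how far the fitted slope lands from the true one on average. The function names and parameter values are made up for illustration.

```python
import random


def fit_slope(xs, ys):
    """Ordinary least-squares slope for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def mean_slope_error(n_samples, true_slope=2.0, noise=1.0, trials=200, seed=0):
    """Average absolute error of the fitted slope over repeated simulated datasets."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    total = 0.0
    for _ in range(trials):
        xs = [rng.uniform(0, 1) for _ in range(n_samples)]
        ys = [true_slope * x + rng.gauss(0, noise) for x in xs]
        total += abs(fit_slope(xs, ys) - true_slope)
    return total / trials


# Compare the average error for small, medium, and large datasets.
errors = {n: mean_slope_error(n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"{n:>5} samples -> mean slope error {err:.3f}")
```

In this simplified setting the average error shrinks as the sample grows. Real models are messier, and extra data only helps when it is relevant and of good quality, which ties back to the previous point.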
Be aware of the ethical issues involved in data analytics
Data analytics is powerful, but with great power comes great responsibility. Data, especially customer data, is a resource that must be handled ethically and responsibly. Always consider the ethical and legal implications of your work if you plan to use customer data or otherwise private data for analytics.
- Data is similar to oil because it acts as the fuel for analytics and artificial intelligence.
- Like oil, data requires infrastructure in order to collect, store, and maintain it.
- While data is similar to oil, it is a more complex resource: it is created rather than destroyed, and it can keep adding value as more of it becomes available.
- Unlike oil, collecting data comes with issues of privacy and ethics that must be carefully considered.
- Although data is valuable like oil, we need to view it differently to understand its potential as a resource for advancing businesses.
- R. K. Ragan and T. Strasser, Big Data: The New Oil Fields, (2020), Credit Union Times.
- L. Adamson, Is Data the New Oil?, (2019), LinkedIn Pulse.
- Wikipedia, Facebook–Cambridge Analytica data scandal, (updated 2020), Wikipedia, the free Encyclopedia.