Today, a new generation of information technology, represented by big data, the Internet, and artificial intelligence, is developing rapidly and is having a far-reaching impact on social progress, national economic development, and people’s lives.
What is big data
In May 2011, the McKinsey Global Institute published a report, Big data: The next frontier for innovation, competition, and productivity, which for the first time gave a relatively clear definition of big data: “Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.”
Big data is a data collection with large capacity, many types, fast access speed, and high application value. It is also a new generation of information technology and service format: huge volumes of data from scattered sources and in varied formats are collected, stored, and correlated in order to discover new knowledge, create new value, and build new capabilities. Academia and government organizations have proposed various definitions of big data, but they largely agree on its inherent characteristics, namely Volume (large volume), Variety (various types), Velocity (fast speed), and Value (high value).
What changes big data brings to us
Big data is a rich mine of information and knowledge, which contains unlimited business opportunities and huge benefits. A study by the University of Texas on the effectiveness of data shows that companies can significantly improve their business performance by improving their own data usage and data quality.
If the data usage rate of enterprises increases by 10%, the output per capita of industries such as retail, consulting, and aviation will increase by 49%, 39% and 21% respectively. If the data usage rate of the median company in the Fortune 1000 increases by 10%, it can increase operating income by $2 billion each year, and output per capita will increase by 14%. The improvement of data quality will have a more significant impact on enterprises.
If the quality of corporate data is improved by 10%, the income of utilities, aviation, telecommunications, petrochemical, and other industries will increase significantly, return on equity will increase by more than 200%, and the return on net assets of the median Fortune 1000 company will increase by about 76%.
The volume of big data
First, let’s take a look at the changes in the total amount of global data in recent years. In 2004, the total amount of global data was 30EB; in 2005, it reached 50EB; in 2006, it reached 161EB; by 2015, it reached an astonishing 7900EB; by 2020, it is expected to reach 35,000EB.
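To put these figures in perspective, the implied yearly growth can be worked out from the numbers quoted above. The following is a minimal sketch: the helper name `cagr` is ours, and it assumes smooth compound growth between the quoted years, which is a simplification.

```python
# Global data volume figures quoted in the text, in exabytes (EB).
# The 2020 value is the forecast given in the text, not a measurement.
volume_eb = {2004: 30, 2005: 50, 2006: 161, 2015: 7900, 2020: 35000}


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two observations."""
    return (end_value / start_value) ** (1 / years) - 1


# Growth rate between each pair of consecutive data points.
observed_years = sorted(volume_eb)
for first, last in zip(observed_years, observed_years[1:]):
    rate = cagr(volume_eb[first], volume_eb[last], last - first)
    print(f"{first}-{last}: about {rate:.0%} per year")

# Overall growth rate across the measured period.
overall = cagr(volume_eb[2004], volume_eb[2015], 2015 - 2004)
print(f"2004-2015 overall: about {overall:.0%} per year")
```

Under this assumption, the quoted figures correspond to the total volume growing by roughly two thirds every year between 2004 and 2015.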
By source, this data falls into three groups: data generated by a small number of enterprise applications, such as data in relational databases and data warehouses; data generated by a large population of users, such as people using social software, entertainment software, online shopping platforms, and enterprise software; and data generated by a huge number of machines, such as application server logs, sensor data, image and video surveillance data, and QR code and barcode scan data.
According to the industry in which the data is generated, it can be divided into five types.
Internet companies
Data generated by Internet companies. Take eBay as an example. The total amount of eBay’s data exceeds a thousand petabytes. The data covers web pages, website promotion, and website logs, and includes a huge volume of search data.
Service Providers
Data generated by telecommunications, finance, insurance, power, and petrochemical systems. Take the telecommunications industry as an example. Its data includes users’ Internet records, calls, messages, and geographic locations. The amount of data owned by operators is nearly 100 petabytes, and user data grows by more than 10% per year.
Safety and medical care
Data generated in the fields of public safety, medical care, and transportation. As a simple example, a large city can generate 300 million traffic checkpoint records in a month.
Forecast and predictions
Data generated in the fields of meteorology, geography, and government affairs. Take global meteorological research as an example: the amount of data stored is nearly 100 PB, and it grows by thousands of TB each year.
Traditional industries
Data generated by manufacturing and other traditional industries. The amount of data in these industries is also increasing rapidly, but it is still in the accumulation period and the overall volume is not large, ranging from tens or hundreds of TB up to the PB level at most.
According to the type of data generated, it can be divided into structured and unstructured data. Big data is not only a huge amount of data, but also many types of data. Of this massive volume, about 20% is structured data and the remaining 80% is unstructured data.
Big data application use cases
The use of big data has brought significant benefits to all walks of life. There are two main aspects of big data applications in the retail industry. One aspect is that the retail industry can conduct precise marketing of products and reduce marketing costs by understanding customers’ consumption preferences and consumption trends.
For example, a retailer can record customers’ buying habits and, through targeted advertising, remind them to buy daily necessities shortly before they run out.
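One simple way to time such a reminder is to estimate how often a customer rebuys a product. The sketch below is illustrative only: the function name `next_reminder`, the `lead_days` parameter, and the sample dates are hypothetical, and it assumes that the average gap between past purchases predicts when the product will run out.

```python
from datetime import date, timedelta


def next_reminder(purchase_dates, lead_days=3):
    """Suggest a reminder date from one customer's purchases of one product.

    Assumption: the average gap between consecutive purchases approximates
    how long the product lasts before it runs out.
    """
    if len(purchase_dates) < 2:
        return None  # not enough history to estimate an interval
    ordered = sorted(purchase_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    average_gap = sum(gaps) / len(gaps)
    expected_run_out = ordered[-1] + timedelta(days=round(average_gap))
    # Remind the customer a few days before the expected run-out date.
    return expected_run_out - timedelta(days=lead_days)


# Hypothetical purchase history: the same product bought every 30 days.
history = [date(2023, 1, 1), date(2023, 1, 31), date(2023, 3, 2)]
print(next_reminder(history))
```

A production system would likely also account for package size, seasonality, and changes in household consumption; the average gap is only a starting point.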
The other aspect is to expand sales by recommending additional products that customers are likely to buy, based on what they have already purchased. For example, customer purchase records reveal preferences for related products, so retailers can sell related products together with the products customers already buy and increase the sales of those related products.
In addition, the retail industry can use big data to anticipate future consumer trends, which helps with purchasing best-selling products and clearing out-of-season stock.
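The idea of finding related products from purchase records can be sketched with simple co-purchase counting. This is a minimal illustration, not a description of any particular retailer’s system: the function names and the sample baskets are hypothetical, and real recommenders usually use richer signals than raw counts.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each pair of products appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is stored in one canonical order.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def related_products(product, counts, top_n=3):
    """Return the products most often bought together with `product`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == product:
            scores[b] += n
        elif b == product:
            scores[a] += n
    return [item for item, _ in scores.most_common(top_n)]


# Hypothetical purchase records, one list per customer basket.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "cereal"],
]
counts = co_purchase_counts(baskets)
print(related_products("bread", counts))
```

In this toy data, butter is bought with bread more often than anything else, so it would be the first candidate to place or promote alongside bread.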
The big data application scenarios in the financial industry are relatively extensive. Typical application scenarios include banking data application scenarios, insurance data application scenarios, and securities data application scenarios.
Banks have relatively rich data application scenarios, concentrated mainly on user management, risk control, product design, and decision support. For example, banks can use the consumption records of POS machines to identify high-end wealth management prospects, offer them customized wealth management solutions, win them as wealth management customers, and increase sales of deposits and wealth management products.
Secondly, insurance data application scenarios generally revolve around products and customers. Typical scenarios include using user behavior data to set auto insurance prices, and using customers’ external behavior data to understand their needs and recommend products to target customers. For example, insurance companies can find auto insurance customers based on personal information data, data from external car maintenance apps, and so on.
Finally, the types of data owned by the securities industry include personal attribute data (name, contact information, home address, and so on), asset data, transaction data, and income data. Securities companies can use this data to build business scenarios, screen target customers, and provide users with suitable products to increase revenue per individual customer.