Every data science project can be seen as a sequence of steps, where each step depends on the previous ones.

Before learning more about the data science life cycle, I suggest you first take a look at what data science is and what its goals are by reading this article: **Data Science: The Buzzword**.

Data science is a multidisciplinary field that uses methods, algorithms, mathematics, statistics, programming and logic to extract insights and relevant ideas from structured and unstructured data.

For a better understanding of **What is Data Science?**, let’s explore its life cycle and understand each stage.

For the first step, we have to clearly define the problem that we should solve.

*Example 1: Suppose M. Ahmad owns a car agency, and his goal is to improve his company’s sales by identifying the drivers of sales.*

*To accomplish his objective, he needs to answer the following questions:*

- *How can the car price be estimated?*
- *How are the in-store promotions working?*
- *Are the car placements effectively deployed?*

*His primary aim is to answer these questions, which will strongly influence the outcome of the project. So, he appoints you as a Data Scientist. Let’s solve his problem using the Data Science process.*

This is the first essential step before starting any data science project. As its name indicates, it is about understanding the application field.

*Suppose that M. Ahmad wants to answer the first question: being able to estimate a car’s price.*

*So first, we have to get familiar with the application field, which is car sales.*

For every problem, we should know where the data comes from. Data can be discovered from various sources, and it could be:

- In an unstructured format, like videos or images.
- In a structured format, like text files.
- From relational database systems.
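To make the structured cases concrete, here is a minimal sketch of reading data from a text file and from a relational database using only the Python standard library. The column names (`brand`, `year`, `price`) and the sample values are assumptions made up for illustration; an in-memory string and an in-memory SQLite database stand in for a real file and a real database server.

```python
import csv
import io
import sqlite3

# Structured text: CSV content. An in-memory string stands in for a real
# file here; the columns and values are hypothetical.
csv_text = "brand,year,price\nDacia,2015,85000\n"
rows_from_csv = list(csv.DictReader(io.StringIO(csv_text)))

# Relational database: an in-memory SQLite database stands in for a real
# database system; the table layout is hypothetical as well.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (brand TEXT, year INTEGER, price INTEGER)")
conn.execute("INSERT INTO cars VALUES (?, ?, ?)", ("Renault", 2018, 120000))
rows_from_db = conn.execute("SELECT brand, year, price FROM cars").fetchall()
conn.close()

print(rows_from_csv)
print(rows_from_db)
```

Note that the CSV reader returns every value as a string, while the database keeps the column types, so some type conversion is usually needed before the two sources can be combined.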

*To solve the problem given in the previous example, we will use web scraping (extracting content from websites) to collect data from an e-commerce website (like **www.avito.ma**) and get the cars’ features.*
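As a rough illustration of what scraping involves, the sketch below parses a small HTML snippet and turns each listing into a dictionary of features. The markup, the class names (`listing`, `brand`, `year`, `price`) and the values are all hypothetical: a real website will have a different structure, and in practice the page would first be downloaded (respecting the site’s terms of use) before being parsed.

```python
from html.parser import HTMLParser

# Hypothetical listing markup, invented for this example.
SAMPLE_HTML = """
<div class="listing">
  <span class="brand">Dacia</span>
  <span class="year">2015</span>
  <span class="price">85000</span>
</div>
<div class="listing">
  <span class="brand">Renault</span>
  <span class="year">2018</span>
  <span class="price">120000</span>
</div>
"""

FIELDS = {"brand", "year", "price"}


class ListingParser(HTMLParser):
    """Collects one dict per <div class="listing"> block."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict of features per car
        self._field = None  # name of the field currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "listing":
            self.rows.append({})      # a new car listing starts
        elif tag == "span" and css_class in FIELDS:
            self._field = css_class   # the next text belongs to this field

    def handle_data(self, data):
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None        # stop capturing text


def parse_listings(html):
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.rows


cars = parse_listings(SAMPLE_HTML)
print(cars)
```

The standard-library parser is used here only to keep the sketch self-contained; dedicated scraping libraries are a common choice for real projects.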

Once the data identification step is completed, the data file will look something like this, a set of rows and columns containing the cars’ features: