Data — The Most Important Aspect of Any Data Science Project

80% of the Work in Data Science Is Data Preparation; the Remaining 20% Is Coding

Carla Martins
4 min read · Aug 9, 2022

Even if you are new to the world of Data Science, you have certainly heard or read somewhere that 80% of a Data Scientist’s work is data preparation. It may not sound like a noble task for a prestigious and sexy Data Scientist, but remember that data preparation is difficult and time-consuming, and it requires a lot of knowledge and expertise. In this article, I discuss the importance of data for every Data Science project.

What is data?

Data is information that can be found, stored, and labeled. It comes in several formats, such as numbers, text, video, and sound, and it can be stored digitally (the form that interests us) as well as on paper, in analog form, or in other media.

→ In the IT world, data is always in digital format, which allows users and researchers to perform logical (mathematical) operations on it.

→ Computers made it easier for humans to store, read, and operate on data. None of these advantages would be possible without modern computers.
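
To make the two points above concrete, here is a minimal sketch in Python. The sample values, the UTF-8 encoding, and the 4-byte integer layout are assumptions chosen only for illustration; they are not the only way to store data digitally. The sketch shows text and a number turned into bytes, and then a couple of logical and mathematical operations applied to them.

```python
import struct

# Hypothetical sample values, chosen only for illustration.
text = "data"
number = 42

# Text becomes bytes through a character encoding (UTF-8 is assumed here).
text_bytes = text.encode("utf-8")

# An integer becomes bytes through a fixed-width binary layout
# (a 4-byte little-endian signed integer is assumed here).
number_bytes = struct.pack("<i", number)

print(list(text_bytes))    # [100, 97, 116, 97]
print(list(number_bytes))  # [42, 0, 0, 0]

# Once the data is digital, logical and mathematical operations apply to it.
same_size = len(text_bytes) == len(number_bytes)  # logical comparison
doubled = struct.unpack("<i", number_bytes)[0] * 2  # arithmetic on the decoded value

print(same_size)  # True
print(doubled)    # 84
```

Both values end up as four bytes, which is what lets a computer store, read, and compare them with the same basic operations.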

Where is the data?


