71.3k views
0 votes
A car company asked a data scientist to determine what type of customers are more likely to purchase their vehicles. However, the data comes from several sources and is in a relatively "raw format". What kind of processing can the data scientist perform on the data to prepare it for the Modeling stage?

1 Answer

3 votes

Answer:

In order for data to be useful, it must be prepared so that it can be analyzed and processed. Preparing data usually takes several steps:

  • First, the data scientist must edit the data, correcting incomplete, inconsistent, or ambiguous answers.
  • Next, the data scientist must assign codes to the possible answers (for example, a number for each category) so that they can be processed statistically.
  • The data must then be transcribed into a format that the analysis software can read.
  • Before starting any statistical processing, it is a good idea to resolve remaining inconsistencies and to check extreme values (outliers), removing those that would distort the analysis.
  • Finally, the data scientist must decide which data analysis strategy to use, based on the previous steps and the characteristics of the data itself.
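
As a rough illustration of the editing, coding, transcription, and outlier steps above, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for the example: the records, the field names (`age`, `income`, `segment`), the helper names (`edit`, `encode`, `drop_outliers`), the choice to fill a missing age with the median, and the 1.5 × IQR rule for extreme values. A real project would pick these based on the actual data.

```python
from statistics import median, quantiles

# Hypothetical raw records merged from several sources (all values are strings).
raw_records = [
    {"age": "34", "income": "52000", "segment": "Sedan "},
    {"age": "41", "income": "61000", "segment": "suv"},
    {"age": "", "income": "58000", "segment": "SUV"},
    {"age": "29", "income": "47000", "segment": "sedan"},
    {"age": "38", "income": "990000", "segment": "Truck"},
    {"age": "45", "income": "64000", "segment": "truck"},
]

def edit(records):
    """Editing and transcription: fix inconsistent labels, fill missing
    ages with the median, and convert strings to numeric types."""
    known_ages = [int(r["age"]) for r in records if r["age"].strip()]
    fallback_age = median(known_ages)
    return [
        {
            "age": int(r["age"]) if r["age"].strip() else fallback_age,
            "income": float(r["income"]),
            "segment": r["segment"].strip().lower(),
        }
        for r in records
    ]

def encode(records, field):
    """Coding: map each distinct category to an integer code."""
    labels = sorted({r[field] for r in records})
    codes = {label: i for i, label in enumerate(labels)}
    coded = [{**r, field + "_code": codes[r[field]]} for r in records]
    return coded, codes

def drop_outliers(records, field, k=1.5):
    """Outlier handling: keep rows within k * IQR of the quartiles."""
    q1, _, q3 = quantiles([r[field] for r in records], n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in records if low <= r[field] <= high]

edited = edit(raw_records)
coded, segment_codes = encode(edited, "segment")
prepared = drop_outliers(coded, "income")
```

In this sketch the record with an income of 990000 is dropped as an extreme value, and the remaining rows carry a numeric `segment_code` that a model can consume. Whether dropping, capping, or keeping such a row is right depends on whether it is an error or a genuine customer.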
by User Talus (4.7k points)