Final answer:
Data cleansing is the process of removing inaccurate or duplicate data, while filtering selects only the needed columns from a table so that analysis is more efficient. Analyzing raw data is essential for deciding whether it supports a hypothesis and for deriving meaningful insights, often with statistical measures such as the median and variation.
Step-by-step explanation:
To complete the sentences related to the activities performed in the transformation phase when moving data to a data warehouse:
Data cleansing is the process of deleting inaccurate or duplicate data. It is a crucial step for ensuring the quality of the data used in analysis. Filtering, on the other hand, is the process of selecting only the necessary columns from a table instead of the entire table, which allows for more focused and efficient analysis.
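The two transformation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the column names, the rule that a negative age counts as inaccurate, and the helper names cleanse and filter_columns are all hypothetical and do not come from any particular data warehouse tool.

```python
# Hypothetical raw rows extracted from a source table (illustrative data only).
raw_rows = [
    {"id": 1, "name": "Ana", "age": 34, "city": "Lima"},
    {"id": 2, "name": "Ben", "age": -5, "city": "Oslo"},   # inaccurate: negative age
    {"id": 1, "name": "Ana", "age": 34, "city": "Lima"},   # exact duplicate of the first row
    {"id": 3, "name": "Chen", "age": 41, "city": "Pune"},
]


def cleanse(rows):
    """Data cleansing: drop rows that fail a validity check or repeat an earlier row."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the whole row
        if row["age"] < 0 or key in seen:  # assumed validity rule: age must not be negative
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def filter_columns(rows, columns):
    """Filtering: keep only the requested columns of each row."""
    return [{column: row[column] for column in columns} for row in rows]


cleaned = cleanse(raw_rows)
selected = filter_columns(cleaned, ["id", "name"])
print(selected)
```

Cleansing runs before filtering here so that the validity check can still see the age column before it is dropped.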
Data scientists must analyze and interpret raw data they gather in their investigations. Using statistics, they can make sense of raw data to decide whether it supports a hypothesis. Similarly, in other scenarios such as real estate, data like house prices can be described using measures like median price and variation to present information more meaningfully.
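As a small illustration of the house price example, the sketch below uses Python's standard statistics module to compute the median and one common measure of variation, the sample standard deviation. The prices are made-up values chosen only to show why the median can describe a typical price better than the mean when one sale is unusually expensive.

```python
import statistics

# Hypothetical sale prices; the fourth value is a deliberate outlier.
house_prices = [250_000, 310_000, 275_000, 1_200_000, 295_000]

median_price = statistics.median(house_prices)  # middle value of the sorted prices
mean_price = statistics.mean(house_prices)      # arithmetic average, pulled up by the outlier
spread = statistics.stdev(house_prices)         # sample standard deviation (variation)

print(f"median: {median_price}")
print(f"mean:   {mean_price}")
print(f"stdev:  {spread:.0f}")
```

With these values the median is 295000 while the mean is 466000, so the median sits much closer to what most of the houses actually sold for.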
Analysis is also key to judging the validity of sources, which involves collecting data and applying specific criteria to it in order to make interpretive claims about it. Applying these methods helps ensure that the insights extracted from the data are reliable and valuable.