Combining data from two or more relational database tables is an example of ________.

collation
response
review
detection
Which of the following is NOT an example of data scrubbing as described in the text?
handling missing data
handling inconsistent data
reducing data
handling marginal data
True or false: Having zeroes in the data is not the same as having missing data.
True or False
Descriptive statistics that give insight into norms in a data set are called measures of _______.
similarity
central tendency
norms
commonality
Missing values in a data set mean that ________.
the data contains errors
the data set is unusable
the data set contains outliers
none of the above
Removing records that contain missing or inconsistent data from a data set before analysis is an example of _________.
data mining
reduction
missing values
purging
Selecting some subset of records from a data set is called ________ the data.
morphing
modifying
correcting
sampling
A value of "middle-aged" in an attribute that otherwise contains people's ages in number of years would be an example of _______.
inconsistent data
modified data
aged data
alphabetic data
Removing columns from a data set because they are not useful for a certain type of data analysis is an example of _________.
document reduction
observation reduction
content reduction
attribute reduction
Data analysis processes in RapidMiner are built using rectangular building blocks called _________.
objects
operators
clicks
streams
To remove unwanted attributes from a data set in RapidMiner, use the _______ operator.
Sample
Select Attributes
Select Examples
Filter Examples
To remove unwanted observations from a data set in RapidMiner, use the _______ operator.
Sample
Aggregate
Select Attributes
Filter Examples
True or false: In RapidMiner, data can be either imported or read into the software from CSV, text, and spreadsheet files.
True or False
True or false: RapidMiner requires all data attributes to have either a data type or a role, but not both.
True or False
True or false: In R, it is possible to read a data set directly from a text file located on disk, on a file server, or on a web server.
True or False
When a data set is imported into R, it is stored in an object called a _________.
value store
data cart
data frame
data store
In R, assuming the existence of a data frame called Employees, the command Employees [1:100, 4:9] would show ___________.
columns 4 through 9, 100 times
columns 1 through 100 for employee records 4 through 9
employee records 4 through 9, 100 times
columns 4 through 9 for the first 100 employee records
The R command for randomly retrieving some number of rows from an imported data set is _______.
sample
get
fetch
retrieve
To avoid having to retype the name of a data frame when referring to an attribute in a data set in R, use the ________ command.
connect()
data()
attach()
set()

User Eir Nym
by
7.4k points

1 Answer

3 votes

Answer:

1. collation

2. reducing data

3. True

4. central tendency

5. none of the above

6. purging

7. sampling

8. inconsistent data

9. attribute reduction

10. operators

11. Select Attributes

12. Filter Examples

13. True

14. False

15. True

16. data frame

17. columns 4 through 9 for the first 100 employee records

18. sample

19. attach()

Step-by-step explanation:

Combining data from two or more relational database tables is an example of:

- collation (Of the options listed, this is the best fit. In standard relational database terminology, combining data from multiple tables is more commonly called a "join".)
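
A join of this kind can be sketched in R with base R's `merge()`. The data frames `customers` and `orders`, and their columns, are invented for this illustration and do not come from the textbook:

```r
# Hypothetical tables; names and values are made up for illustration.
customers <- data.frame(id = c(1, 2, 3),
                        name = c("Ann", "Bob", "Cy"))
orders <- data.frame(id = c(1, 1, 3),
                     amount = c(20, 35, 50))

# Inner join on the shared "id" column:
# only ids that appear in both tables are kept.
combined <- merge(customers, orders, by = "id")
print(combined)
```

Customer 2 has no orders, so it drops out of the combined table. Passing `all.x = TRUE` to `merge()` would keep it, with `NA` in the `amount` column.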

The following statement is NOT an example of data scrubbing as described in the text:

- reducing data (Data scrubbing typically refers to the process of cleaning and correcting inconsistent or erroneous data, handling missing values, and resolving data quality issues. Reducing data is not specifically a part of data scrubbing.)

True or false: Having zeroes in the data is not the same as having missing data.

- True (Zero is a valid recorded value, while missing data means that no value was recorded for a particular observation, so the two are not the same.)
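
The difference between a zero and a missing value shows up directly in R, where missing values are represented by `NA`. A minimal sketch, using a made-up vector named `sales`:

```r
# Hypothetical sales figures: one true zero and one missing value (NA).
sales <- c(10, 0, NA, 30)

n_zero    <- sum(sales == 0, na.rm = TRUE)  # counts the real zero only
n_missing <- sum(is.na(sales))              # counts the missing entry only

# The zero takes part in the mean; the NA has to be skipped explicitly,
# otherwise mean(sales) itself comes back as NA.
mean_known <- mean(sales, na.rm = TRUE)
```

Replacing the `NA` with 0 would change the mean, which is why the two should not be treated as interchangeable.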

Descriptive statistics that give insight into norms in a data set are called measures of:

- central tendency
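
The usual measures of central tendency are the mean, median, and mode. A small sketch in R, assuming an invented vector `ages`:

```r
ages <- c(23, 35, 35, 41, 58)  # hypothetical values

mean_age   <- mean(ages)    # arithmetic average
median_age <- median(ages)  # middle value of the sorted data

# Base R's mode() reports an object's storage type, not the statistical mode,
# so tabulate the values and take the most frequent one instead.
counts   <- table(ages)
mode_age <- as.numeric(names(counts)[which.max(counts)])
```

If several values tie for most frequent, `which.max()` returns only the first of them, so this is a simplification.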

Missing values in a data set mean that:

- none of the above (Missing values in a data set simply indicate that certain observations or attributes do not have a recorded value. This does not necessarily imply errors, unusability, or the presence of outliers.)

Removing records that contain missing or inconsistent data from a data set before analysis is an example of:

- purging
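
In R, one way to drop records with missing values is `complete.cases()`. This is a sketch on a made-up data frame called `survey`, not the textbook's own procedure:

```r
# Hypothetical survey data with two incomplete records.
survey <- data.frame(age    = c(34, NA, 51, 29),
                     income = c(52000, 61000, NA, 48000))

# complete.cases() is TRUE only for rows with no NA in any column.
purged <- survey[complete.cases(survey), ]
```

`na.omit(survey)` keeps the same rows. Either way, the removed records are lost to the analysis, so it is worth checking how many rows are dropped.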

Selecting some subset of records from a data set is called:

- sampling

A value of "middle-aged" in an attribute that otherwise contains people's ages in number of years would be an example of:

- inconsistent data
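
One simple way to spot such entries in R is to try converting the column to numbers and see which values fail. The vector `ages_raw` is invented for this sketch:

```r
# Hypothetical raw column: ages stored as text, with one inconsistent entry.
ages_raw <- c("34", "middle-aged", "51", "29")

# Values that cannot be parsed as numbers become NA (with a warning,
# which is suppressed here because the NA is exactly what we look for).
ages_num <- suppressWarnings(as.numeric(ages_raw))

inconsistent <- ages_raw[is.na(ages_num)]
print(inconsistent)
```

This check assumes the raw column has no genuinely missing entries. Any value that was already `NA` would be flagged as well.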

Removing columns from a data set because they are not useful for a certain type of data analysis is an example of:

- attribute reduction
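
In R, attribute reduction amounts to keeping only the columns you need. A minimal sketch with a made-up data frame `people`:

```r
# Hypothetical data set; favorite_color is assumed irrelevant to the analysis.
people <- data.frame(id = 1:3,
                     age = c(34, 51, 29),
                     favorite_color = c("red", "blue", "green"))

# Keep only the attributes of interest; all rows are retained.
reduced <- people[, c("id", "age")]
```

The same effect can be had by removing a single column with `people$favorite_color <- NULL`.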

Data analysis processes in RapidMiner are built using rectangular building blocks called:

- operators

To remove unwanted attributes from a data set in RapidMiner, use the Select Attributes operator.

To remove unwanted observations from a data set in RapidMiner, use the Filter Examples operator.

True or false: In RapidMiner, data can be either imported or read into the software from CSV, text, and spreadsheet files.

- True

True or false: RapidMiner requires all data attributes to have either a data type or a role, but not both.

- False (In RapidMiner, attributes can have both a data type and a role assigned to them.)

True or false: In R, it is possible to read a data set directly from a text file located on disk, on a file server, or on a web server.

- True

When a data set is imported into R, it is stored in an object called a:

- data frame
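
The two points above can be seen together in a short R sketch. It writes a tiny CSV file to a temporary location (the contents are invented) and reads it back with `read.csv()`, which returns a data frame:

```r
# Create a small throwaway CSV file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("id,age", "1,34", "2,51"), path)

# read.csv() returns a data frame.
employees <- read.csv(path)
class(employees)

# read.csv() also accepts a URL string in place of a local path, e.g.
# read.csv("https://example.com/data.csv")  # placeholder URL, not a real data set
```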

In R, assuming the existence of a data frame called Employees, the command Employees [1:100, 4:9] would show:

- columns 4 through 9 for the first 100 employee records
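
The indexing rule is `data_frame[rows, columns]`. A quick check in R, using a stand-in `Employees` data frame filled with made-up numbers:

```r
# Stand-in data frame: 150 rows and 10 columns (named V1..V10 by default).
Employees <- as.data.frame(matrix(seq_len(1500), nrow = 150, ncol = 10))

# Rows 1 through 100, columns 4 through 9.
subset_view <- Employees[1:100, 4:9]
dim(subset_view)
```

The result has 100 rows and 6 columns, since 4 through 9 inclusive is six columns.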

The R command for randomly retrieving some number of rows from an imported data set is:

- sample
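
In practice, `sample()` is usually applied to the row numbers rather than to the data frame itself, because calling it directly on a data frame picks columns. A sketch of the common idiom, with an invented `Employees` data frame:

```r
set.seed(42)  # only to make the random draw repeatable

# Hypothetical data: 20 employee records.
Employees <- data.frame(id = 1:20,
                        salary = seq(30000, 49000, by = 1000))

# Draw 5 distinct row numbers from 1..20, then use them as a row index.
picked_rows <- sample(nrow(Employees), 5)
random_rows <- Employees[picked_rows, ]
```

By default `sample()` draws without replacement, so no record is selected twice.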

To avoid having to retype the name of a data frame when referring to an attribute in a data set in R, use the attach() command.
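
A short sketch of what `attach()` does, using a made-up `Employees` data frame:

```r
Employees <- data.frame(salary = c(40000, 52000, 61000))  # hypothetical data

# Without attach(), the column must be qualified with the data frame name.
mean(Employees$salary)

# After attach(), columns can be referred to by name alone.
attach(Employees)
avg_salary <- mean(salary)
detach(Employees)  # remove it from the search path when done

# with() gives the same convenience for a single expression.
with_avg <- with(Employees, mean(salary))
```

Many R users prefer `with()` because `attach()` can silently mask other objects that have the same name as a column.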

User Matt Koala
by
8.2k points