0 votes
Which of the following are potential training data inadequacies that you should communicate to your users in support of transparency?

1) The amount of time it took to collect the data.
2) Any missing values in the dataset and how they were handled.
3) Any known bias in the sample data.
4) The relatively large size of a dataset used in training.

by User Jurevert (7.3k points)

1 Answer

4 votes

Final answer:

Options 2 and 3. Being transparent about training data means telling users how missing values were handled and what biases are known in the dataset, since both can affect the model's performance and reliability.

Step-by-step explanation:

Potential training data inadequacies that should be communicated to users in support of transparency include:

  • Any missing values in the dataset and how they were handled (option 2). Incomplete data can skew results, so both the gaps and the way they were treated should be disclosed.
  • Any known bias in the sample data (option 3). Bias can distort a model's predictions and the decisions based on them, so users need to understand this limitation.

The time taken to collect the data (option 1) and the relatively large size of the dataset (option 4) may be of interest, but neither is inherently an inadequacy of the training data. What users need to be told about are issues that directly affect the model's performance and generalizability.
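
To make the two reportable items concrete, here is a minimal Python sketch of how missing-value rates and group balance could be summarized before being written up for users. Everything in it is an assumption for illustration: the function name `summarize_inadequacies`, the field names, the toy records, and the convention that `None` marks a missing value. It is not taken from the question or from any particular library.

```python
from collections import Counter


def summarize_inadequacies(rows, group_key):
    """Summarize missing-value rates per field and the share of each group.

    rows: list of dicts; a value of None (or an absent key) counts as
          missing. This convention is an assumption of the sketch.
    group_key: hypothetical field used to check how balanced the sample is.
    """
    total = len(rows)
    if total == 0:
        raise ValueError("cannot summarize an empty dataset")

    # Collect every field name that appears in at least one record.
    fields = sorted({name for row in rows for name in row})

    # Fraction of records in which each field is missing.
    missing_rate = {
        name: sum(1 for row in rows if row.get(name) is None) / total
        for name in fields
    }

    # Fraction of records belonging to each group; a lopsided split is
    # one possible sign of sampling bias worth disclosing.
    group_counts = Counter(row.get(group_key) for row in rows)
    group_share = {group: count / total for group, count in group_counts.items()}

    return {"rows": total, "missing_rate": missing_rate, "group_share": group_share}


# Toy records, invented purely for illustration.
sample = [
    {"age": 34, "income": 52000, "region": "north"},
    {"age": None, "income": 61000, "region": "north"},
    {"age": 29, "income": None, "region": "north"},
    {"age": 41, "income": 48000, "region": "south"},
]

report = summarize_inadequacies(sample, group_key="region")
print(report)
```

A summary like this only surfaces the numbers. How the missing values were actually handled (dropped, imputed, and so on) and why the sample is skewed still have to be explained in words alongside it.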

by User Kikap (7.5k points)