One of your colleagues is trying to build a predictive model for a rare disease that affects approximately 1 in every 10,000 people. He builds the model and reports 99% classification accuracy on a set of test data. Briefly explain why you should be skeptical about this claim.

1 Answer

Final answer:

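You should be skeptical because 99% accuracy says very little when the disease is this rare: a model that never predicts the disease would already be about 99.99% accurate. The data are heavily imbalanced, so the claim needs to be backed by the false positive and false negative rates and by metrics such as precision and recall.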

Step-by-step explanation:

You should be skeptical about your colleague's claim of 99% classification accuracy for a rare disease that affects approximately 1 in every 10,000 people because:

  1. The high accuracy may simply reflect imbalanced data, where the majority class (healthy people) overwhelms the rare class (people with the disease). A model that labels everyone as healthy is right for about 9,999 of every 10,000 people, which is 99.99% accuracy, without detecting a single case. A reported 99% is therefore lower than this trivial baseline.
  2. Accuracy hides the false positive and false negative rates. Healthy people outnumber sick people by roughly 10,000 to 1, so even a 1% false positive rate produces far more false alarms than there are true cases, and every false negative is a patient the model missed.
  3. Accuracy alone does not show how well the model identifies the rare class. Precision (the share of predicted cases that are real), recall (the share of real cases that are found), and the F1 score (their harmonic mean) give a more complete picture, and none of them has been reported.
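
To make the three points concrete, here is a small Python sketch with made-up numbers. It assumes a hypothetical population of 1,000,000 people at the stated prevalence and compares two hypothetical models: one that always predicts "healthy", and one that is right on 99% of the sick and 99% of the healthy. The function name `metrics` and both models are illustrative assumptions, not anything your colleague actually built.

```python
def metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when the model never predicts the disease.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Assumed population for illustration: 1 in 10,000 people has the disease.
POPULATION = 1_000_000
SICK = POPULATION // 10_000      # 100 people with the disease
HEALTHY = POPULATION - SICK      # 999,900 people without it

# Hypothetical model A: always predicts "healthy".
always_healthy = metrics(tp=0, fp=0, fn=SICK, tn=HEALTHY)

# Hypothetical model B: correct on 99% of the sick and 99% of the healthy.
tp = SICK * 99 // 100            # 99 cases found
fn = SICK - tp                   # 1 case missed
fp = HEALTHY // 100              # 9,999 healthy people flagged
tn = HEALTHY - fp                # 989,901 healthy people cleared
balanced_99 = metrics(tp=tp, fp=fp, fn=fn, tn=tn)

for name, m in [("always healthy", always_healthy), ("99%/99% model", balanced_99)]:
    print(f"{name}: accuracy={m['accuracy']:.4f} "
          f"precision={m['precision']:.4f} recall={m['recall']:.4f} "
          f"f1={m['f1']:.4f}")
```

Under these assumptions the always-healthy model beats 99% accuracy while finding no cases at all, and the second model reaches 99% accuracy even though only about 1 in 100 of its positive predictions is a real case. That is why the accuracy figure by itself does not tell you whether the model is useful.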

by User Reynoldsnlp (7.2k points)