If we use a large dataset with roughly equal numbers of spam and ham examples to train both the new model and the original Naive Bayes model, which of the following effects are expected to be true?

A. The entropy of the posterior P(Class | W) will on average be lower in the new model.
B. The accuracy on the training data will be higher with the new model.
C. The accuracy on the test data will be higher with the new model.
D. None of the above.

asked by Harsha R (7.2k points)

1 Answer


Final answer:

How the new Bayesian model compares with the original Naive Bayes model depends on how prior knowledge is integrated. The new model may reach higher training accuracy if its priors are informative, but its test accuracy depends on how well it generalizes, and the entropy of its posterior is shaped by both the prior distribution and the fit to the data.

Step-by-step explanation:

The question compares a new Bayesian model with the original Naive Bayes model when both are trained on a large dataset containing roughly equal numbers of spam and ham examples. Under Bayesian principles, the new model integrates prior knowledge about its parameters through a prior distribution, which can lead to different behavior from the standard Naive Bayes model.
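
For reference, the posterior that option A asks about comes from Bayes' rule, and its entropy measures how confident the classifier is in its prediction (the notation below is a standard restatement, not quoted from the original exercise):

    P(Class | W) ∝ P(Class) · Π_i P(w_i | Class)        (Naive Bayes factorization)
    H[P(Class | W)] = − Σ_c P(c | W) · log P(c | W)

A lower average posterior entropy only means the model concentrates its probability mass on one class more often; by itself it says nothing about whether those confident predictions are correct.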

  • Entropy of the posterior may not necessarily be lower for the new model as it is influenced by prior information and the flexibility of the model to fit data.
  • Accuracy on the training data might be higher for the new model if the priors are informative and correctly specified since they provide additional information.
  • Whether or not accuracy on the test data will be higher for the new model depends on factors such as the appropriateness of the prior information and how well the model generalizes to unseen data.
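
A minimal sketch of how one could measure these three quantities empirically, assuming a balanced synthetic dataset and scikit-learn, with BernoulliNB as the Naive Bayes baseline and LogisticRegression standing in for the "new" model (the exercise does not specify that model, so this pairing and all parameter values are purely illustrative):

```python
from scipy.stats import entropy
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.linear_model import LogisticRegression

# Balanced binary-feature dataset standing in for bag-of-words spam/ham data.
X, y = make_classification(n_samples=5000, n_features=50, n_informative=10,
                           weights=[0.5, 0.5], random_state=0)
X = (X > 0).astype(int)  # binarize so BernoulliNB is a sensible baseline
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "naive bayes": BernoulliNB(),
    "new model": LogisticRegression(max_iter=1000),  # illustrative stand-in only
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    posterior = model.predict_proba(X_tr)        # P(Class | W) per training example
    avg_entropy = entropy(posterior.T).mean()    # average posterior entropy (option A)
    print(f"{name}: train acc = {model.score(X_tr, y_tr):.3f} (B), "
          f"test acc = {model.score(X_te, y_te):.3f} (C), "
          f"mean posterior entropy = {avg_entropy:.3f} (A)")
```

Depending on the data and the priors, the more flexible model can fit the training set better and produce more confident (lower-entropy) posteriors without improving test accuracy, which is why the bullets above hedge each option.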

answered by Rufanov (8.0k points)