Your classifier should choose between three (3) categories (say: positive | neutral | negative). How would you use a Naive Bayes classifier to do so?

- Replace the first 50% of the NEUTRAL sample labels with POSITIVE and the other 50% with NEGATIVE. Build a binary Naive Bayes classifier on this modified data set.
- Build THREE binary Naive Bayes classifiers: a) one that decides between POSITIVE/OTHER [NEUTRAL+NEGATIVE], b) one that decides between NEUTRAL/OTHER [POSITIVE+NEGATIVE], c) one that decides between NEGATIVE/OTHER [POSITIVE+NEUTRAL].
- You can't use a Naive Bayes classifier in such a context.
- Build ONE Naive Bayes classifier with three sets of parameters (for the POSITIVE, NEGATIVE, and NEUTRAL classes).

asked by User Thindery (7.6k points)

1 Answer


Final answer:

To classify data into three categories with a Naive Bayes classifier, construct one classifier with three sets of parameters for the positive, neutral, and negative classes, instead of creating separate binary classifiers. Train this classifier using representative training data labeled with all three classes.

Step-by-step explanation:

To use a Naive Bayes classifier for multi-class classification, first note that a standard Naive Bayes classifier can discriminate between more than two classes directly. The question, however, lists workarounds that recast the three-class problem as binary ones: either by relabeling the neutral class, or by training three separate one-vs-rest binary classifiers. These workarounds are possible, but less efficient than the direct multi-class approach. Therefore, instead of creating three separate binary classifiers, you should build one Naive Bayes classifier with three sets of parameters, each corresponding to one of the classes: positive, neutral, and negative.
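As a minimal sketch of the direct multi-class approach (assuming scikit-learn is available; the texts and labels below are made up for illustration), a single `MultinomialNB` model handles all three classes at once, internally keeping one set of parameters per class:

```python
# One Naive Bayes classifier, three classes -- no binary workarounds needed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up training set labeled with all three classes.
texts = ["great product, love it", "terrible, waste of money",
         "it arrived on tuesday", "really happy with this",
         "awful experience", "the box was brown"]
labels = ["positive", "negative", "neutral",
          "positive", "negative", "neutral"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

clf = MultinomialNB()   # a single model; fit() sees all three labels
clf.fit(X, labels)

print(clf.predict(vectorizer.transform(["love this great product"])))
```

Nothing special is required to go beyond two classes: the model simply estimates a prior and per-feature likelihoods for each label it sees during `fit`.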

The Naive Bayes classifier is probabilistic: it computes the probability of each class given the observed features and assigns the instance to the class with the highest probability. In multi-class classification, this means computing a posterior for each class separately and then picking the class whose posterior is largest.
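The decision rule above can be shown with a few hypothetical numbers (all priors and likelihoods below are invented for illustration): compute an unnormalized log-posterior per class and take the argmax.

```python
import math

# Made-up class priors P(c) and per-feature likelihoods P(x_i | c).
priors = {"positive": 0.40, "neutral": 0.35, "negative": 0.25}
likelihoods = {
    "positive": [0.30, 0.10],
    "neutral":  [0.05, 0.20],
    "negative": [0.02, 0.40],
}

def log_posterior(cls):
    # log P(c) + sum_i log P(x_i | c), using the naive independence assumption.
    return math.log(priors[cls]) + sum(math.log(p) for p in likelihoods[cls])

best = max(priors, key=log_posterior)
print(best)  # the class with the highest (log-)posterior
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together; the argmax is unchanged because log is monotonic.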

To implement this, ensure that the training data is correctly labeled with all three classes. Then train the classifier by estimating the class-conditional probability of each feature: count how often each feature value occurs in examples from each class, and apply Bayes' theorem to compute the probability of a class given an observation of the feature values.

For a well-functioning multi-class Naive Bayes classifier, it's crucial to have a representative set of training data for each of the classes. If a significant imbalance exists, techniques such as under-sampling the majority class or over-sampling the minority class might be necessary.
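One simple way to handle such an imbalance is random over-sampling of the minority class, sketched here by hand on stand-in data (the class sizes are invented; in practice a library such as imbalanced-learn offers more principled resampling):

```python
import random
from collections import Counter

random.seed(0)
# Imbalanced made-up labels: neutral is badly under-represented.
labels = ["positive"] * 8 + ["negative"] * 8 + ["neutral"] * 2
data = list(range(len(labels)))  # stand-ins for feature vectors

by_class = {}
for x, y in zip(data, labels):
    by_class.setdefault(y, []).append(x)

# Over-sample each minority class with replacement up to the majority size.
target = max(len(items) for items in by_class.values())
balanced = []
for cls, items in by_class.items():
    resampled = items + random.choices(items, k=target - len(items))
    balanced.extend((x, cls) for x in resampled)

print(Counter(cls for _, cls in balanced))  # equal counts per class
```

Over-sampling duplicates examples rather than creating information, so it mainly rebalances the estimated class priors; under-sampling the majority classes is the mirror-image option when data is plentiful.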

answered by User KevinHJ (7.7k points)