Final answer:
The incorrect statement about confusion matrices is that they can only be created for the training set and not for the test set. In reality, a confusion matrix can be built for both, helping assess a model's performance on seen and unseen data; it compares actual categorical outcomes with predicted ones rather than with raw probabilities.
Step-by-step explanation:
The statement that is NOT true about confusion matrices is: a. We can only create a confusion matrix for our training set, we cannot create a confusion matrix for our test set.
Contrary to this statement, a confusion matrix can be created for both the training set and the test set of a machine learning model. A confusion matrix is a tabular summary of a classification algorithm's performance: it counts how often the model's predictions matched (or did not match) the actual target values. It is therefore an essential tool for evaluating predictive accuracy, not just on the training set but also on the test set, which indicates how well the model is likely to perform on unseen data.
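A minimal sketch of this idea, assuming scikit-learn and a synthetic dataset (names such as X_train, X_test, and model are illustrative, not from the original question): the same confusion_matrix call works for either split, because each split simply pairs actual labels with predicted labels.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Synthetic binary-classification data, split into training and test sets
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)

# A confusion matrix can be built for either split by comparing
# the actual labels with the model's predicted labels
print(confusion_matrix(y_train, model.predict(X_train)))  # training set
print(confusion_matrix(y_test, model.predict(X_test)))    # test set
```

The test-set matrix is the one that speaks to generalization, but nothing about the matrix itself restricts it to training data.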
Furthermore, while binary classification problems often encode predictions and actual values as zeros and ones, a confusion matrix does not strictly require this format: the values can be any categorical labels, as long as each predicted category can be matched against an actual category. Probabilities may be used to make predictions, but for a confusion matrix those predictions are typically converted to categorical outcomes (e.g., yes/no, 1/0) before being compared with the actual categories.
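A brief sketch of that conversion step, assuming NumPy and scikit-learn; the arrays and the 0.5 cutoff are illustrative assumptions, since the confusion matrix itself does not dictate a particular threshold.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 1, 0, 1, 0])               # actual categorical outcomes
y_prob = np.array([0.2, 0.9, 0.4, 0.1, 0.8, 0.6])   # predicted probabilities

# Convert probabilities to categorical predictions before building the matrix
y_pred = (y_prob >= 0.5).astype(int)

print(confusion_matrix(y_true, y_pred))
```

The same call also accepts string labels (e.g., "yes"/"no"), which is why the categories need not be zeros and ones.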