Select all the statements that are correct about K-means:

a. Good initialization strategies can improve the performance of K-means
b. K-means objective has a unique global optimum, however, it is hard to find this optimum
c. The number of clusters is a hyper-parameter
d. Increasing the number of clusters always decreases the K-means cost function

User Kdazzle
by
8.3k points

1 Answer

Final answer:

Statements a, c, and d are correct: good initialization strategies can improve the result K-means converges to, the number of clusters is a hyper-parameter, and increasing the number of clusters lowers the optimal value of the cost function (although a very large number of clusters overfits the data). Statement b is incorrect.

Step-by-step explanation:

The statements that are correct about K-means are:

  • a. Good initialization strategies can improve the performance of K-means
  • c. The number of clusters is a hyper-parameter
  • d. Increasing the number of clusters always decreases the K-means cost function

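For statement b, the claim of a unique global optimum is wrong. The K-means objective is non-convex: a minimum cost value always exists, but the clustering that attains it need not be unique (symmetric data can admit several different partitions with the same minimal cost), and the objective usually has many local optima as well. The standard K-means (Lloyd's) algorithm converges to whichever local optimum its initial centroids lead to, which is why good initialization, such as k-means++ or several random restarts, can be crucial for performance. The second half of b is accurate: finding the global optimum is NP-hard in general.

As for statement d, increasing the number of clusters lowers the optimal value of the K-means objective (the inertia, i.e. the sum of squared distances from each point to its assigned centroid). Any K-cluster solution can be turned into a cheaper (K+1)-cluster solution by moving one point that does not coincide with its centroid into a cluster of its own. The decrease stops only once the cost reaches zero, when every distinct point has its own centroid, and a single run that gets stuck in a poor local optimum can still report a higher cost at a larger K. Because the cost keeps falling, it cannot be used by itself to choose K: beyond a certain point, additional clusters overfit the data and generalize poorly.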
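The points about initialization and about the cost falling with K can be checked on a tiny example. The following is a minimal sketch in plain Python, not a reference implementation: the function names (`sq_dist`, `assign`, `lloyd`), the four-point data set, and the hand-picked starting centroids are all assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for each point (ties go to the lowest index)."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def lloyd(points, init_centroids, max_iter=100):
    """Run Lloyd's iterations from the given start; return (centroids, cost)."""
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the cluster.
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    cost = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, cost


# Hypothetical data: the corners of a 4 x 1 rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Same K = 2, two different starts, two different converged solutions.
_, cost_good = lloyd(points, [(0, 0), (4, 0)])   # splits left / right
_, cost_bad = lloyd(points, [(0, 0), (0, 1)])    # splits bottom / top

# Cost reached from one reasonable hand-picked start per K.
starts = {
    1: [(0, 0)],
    2: [(0, 0), (4, 0)],
    3: [(0, 0), (0, 1), (4, 0)],
    4: points,
}
cost_by_k = {k: lloyd(points, init)[1] for k, init in starts.items()}

print("K=2, good start:", cost_good)   # 1.0
print("K=2, bad start: ", cost_bad)    # 16.0
print("cost by K:      ", cost_by_k)   # {1: 17.0, 2: 1.0, 3: 0.5, 4: 0.0}
```

Both K = 2 runs stop at a fixed point of the algorithm, yet one costs 16 times more than the other, which is the local-optimum behaviour described for statement b. The per-K costs fall from 17.0 to 0.0 as K grows to the number of points, which illustrates statement d and also why the lowest cost alone does not identify a sensible K.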

User Olivier De Rivoyre
by
7.9k points