5 votes
Suppose that for a data set
• there are m points and K clusters,
• half the points and clusters are in "more dense" regions,
• half the points and clusters are in "less dense" regions, and
• the two regions are well-separated from each other.

Which of the following should occur to minimize the squared error when finding K clusters?
(a) Centroids should be equally distributed between more dense and less dense regions.
(b) More centroids should be allocated to the less dense region.
(c) More centroids should be allocated to the denser region.

Explain.

by User Sdu (5.4k points)

1 Answer

2 votes

Answer: (b)

Step-by-step explanation:

The correct option is (b): more centroids should be allocated to the less dense region, and fewer to the denser region.

Reason:

In the denser region, many points are packed into a small area, so any centroid placed there is already very close to the points it represents. That region contributes little squared error even with fewer centroids.

In the less dense region, the points are spread over a large area, so the distances from points to their centroid are large. Each extra centroid placed there shortens those distances and lowers the total squared error far more than an extra centroid in the dense region would.
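The reasoning above can be checked with a small numeric sketch. Everything in it is an illustrative assumption and not part of the original question: the data is one-dimensional, each region holds 120 evenly spaced points, the dense region spans [0, 1], the sparse region spans [100, 200], and K = 6. Instead of running k-means, the hypothetical helper `region_sse` splits a region into contiguous groups of equal size and uses each group's mean as its centroid.

```python
def region_sse(lo, hi, n_points, k):
    """Squared error for n_points evenly spaced in [lo, hi], split into
    k contiguous groups of equal size (assumes k divides n_points)."""
    step = (hi - lo) / (n_points - 1)
    points = [lo + i * step for i in range(n_points)]
    size = n_points // k
    sse = 0.0
    for g in range(k):
        group = points[g * size:(g + 1) * size]
        centroid = sum(group) / len(group)  # group mean minimizes squared error
        sse += sum((p - centroid) ** 2 for p in group)
    return sse


def total_sse(k_dense, k_sparse, n_points=120):
    """Total squared error over both regions for a given centroid split."""
    dense = region_sse(0.0, 1.0, n_points, k_dense)       # small area
    sparse = region_sse(100.0, 200.0, n_points, k_sparse)  # large area
    return dense + sparse


K = 6  # total number of centroids to distribute (assumed value)
results = {(kd, K - kd): total_sse(kd, K - kd) for kd in range(1, K)}
best = min(results, key=results.get)

for (kd, ks), sse in results.items():
    print(f"dense={kd} sparse={ks} total SSE={sse:.2f}")
print("lowest SSE at (dense, sparse) =", best)
```

Under these assumptions the lowest total comes from the split that gives the sparse region the most centroids, and the even split sits in between, which matches option (b). How lopsided the best split is depends on how different the two densities are.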

by User Masoud Gheisari (5.4k points)