In k-means or kNN, we use Euclidean distance to calculate the distance between nearest neighbors. Why not Manhattan distance?

asked by Pery Mimon (8.5k points)

1 Answer


Final answer:

In k-means and kNN, Euclidean distance is commonly chosen because it matches our understanding of straight-line distance. The Manhattan distance can be used when movements are restricted to grid-like paths. The choice of distance metric depends on the nature of the data and application requirements.

Step-by-step explanation:

In both k-means and k-Nearest Neighbors (kNN), the choice of distance metric can greatly influence the outcome of the algorithm. Euclidean distance is often used because it corresponds to the shortest straight-line path between two points, which aligns with our intuitive understanding of distance in geometric space. However, the Manhattan distance, which measures the sum of absolute differences between coordinates, can also be used; it is particularly natural in settings where movement is constrained to grid-like paths, as in a city street layout.
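To make the two definitions concrete, here is a minimal NumPy sketch computing both metrics for two sample feature vectors (the vectors are illustrative values, not taken from the question):

```python
import numpy as np

# Two illustrative feature vectors.
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

# Euclidean: square root of the sum of squared coordinate differences.
euclidean = np.sqrt(np.sum((a - b) ** 2))  # sqrt(9 + 16 + 0) = 5.0

# Manhattan: sum of absolute coordinate differences.
manhattan = np.sum(np.abs(a - b))          # 3 + 4 + 0 = 7.0
```

Note that Manhattan distance is always at least as large as Euclidean distance for the same pair of points, since walking along grid lines can never be shorter than the straight-line path.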

Euclidean distance is calculated as the square root of the sum of the squared differences between corresponding components, giving the direct 'as-the-crow-flies' distance. Manhattan distance sums the absolute differences between components and is akin to how one would travel in a city laid out on a grid. The choice between these metrics depends on the nature of the data and the specific requirements of the application. For example, if data points lie on a grid and movement between them is restricted to grid lines, Manhattan distance can be more appropriate.
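The choice of metric can actually change which neighbor is "nearest". The following sketch (with hand-picked illustrative points) shows a query point whose nearest neighbor differs depending on whether Euclidean or Manhattan distance is used:

```python
import numpy as np

def euclidean(p, q):
    # Straight-line distance: sqrt of sum of squared differences.
    return np.sqrt(np.sum((p - q) ** 2))

def manhattan(p, q):
    # Grid distance: sum of absolute differences.
    return np.sum(np.abs(p - q))

query = np.array([0.0, 0.0])
points = np.array([
    [2.2, 0.0],   # index 0: axis-aligned point
    [1.5, 1.5],   # index 1: diagonal point
])

nearest_euclid = int(np.argmin([euclidean(p, query) for p in points]))
nearest_manhattan = int(np.argmin([manhattan(p, query) for p in points]))

# Euclidean: 2.2 vs sqrt(4.5) ≈ 2.12, so the diagonal point (index 1) wins.
# Manhattan: 2.2 vs 3.0, so the axis-aligned point (index 0) wins.
```

This is why the same kNN classifier can assign different labels under different metrics: diagonal moves are "cheap" under Euclidean distance but fully charged under Manhattan distance.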

answered by Mahadi Hassan (8.3k points)