Closest-points problem

Consider a set of points P = {p1, p2, p3, ..., pn}; find the two points closest to each other.

Brute force: The brute force method calculates the lengths of all lines for all unique pairs of points and returns the pair forming the shortest line. This is a solution that runs in quadratic time, O(n2). Can we do better?

Divide-and-conquer: Consider if the set of points is sorted according to their x-value. There is some mid-point along the x-axis that divides the set of points in half (except in special circumstances---such as colinear points). The closest pair of points must either both be in the left half of the sorted points, or both in the right half, or they straddle the midpoint such that one point is in the left half and the other in the right.

Finding the closest pair in the left half can be computed recursively, as can the closest pair in the right, so we need only concern ourselves with the case where the closest pair are endpoints to a line that straddles the midpoint.

Assume dl and dr are the distances for the closest points in the left and right halves, respectively. Let δ = min(dl, dr). Let dc be the distance between the closest points forming a line that straddles the midpoint, and therefore is potentially less that δ. That is, if dc > δ then δ is our solution, otherwise dc will be our solution. Obviously only those points in the left half of our set of points whose distance from the midpoint is less than δ need be considered as part of the solution to dc because all other points would be too far away from the nearest point in the right half; and, similarly, only those in the right half whose distance from the midpoint is less than δ need be considered. We shall call the regions within δ of the midpoint the left-strip and the right-strip, respectively, and dc will be the distance between one point in the left-strip and one point in the right-strip.

dc is the Euclidean distance between two points calculated using Pythagoras' Theorem. Only pairs of points whose difference between x-values is less than δ and whose difference between y-values is less than δ need be considered. Therefore, we can compute dc efficiently using the following algorithm:

for i=1 to (numpoints in left-strip)
  for j=1 to (numpoints in right-strip)
     if (both x and y coordinate differences for pi and pj < δ)
        if(dist(pi,pj)< δ)
	    δ = dist(pi, pj)

Note that if the points in the left-strip and the points in the right-strip are ordered by their y-value, the inner loop can exit early once the difference in the y-values exceeds δ.