A Near-Linear Time Algorithm for the Chamfer Distance (via zoom)

Abstract: I will describe a fast algorithm to approximate the Chamfer distance, a commonly used proxy for the more computationally demanding Earth-Mover (Optimal Transport) Distance. For any two point sets A and B of size n, the Chamfer distance from A to B is defined as CH(A, B) = sum over x in A of d(x, B), where d(x, B) is the minimum distance from x to any point in B. In other words, it is the sum of nearest neighbor distances from A to B. The Chamfer distance is a popular measure of dissimilarity between point clouds, used in many machine learning, computer vision, and graphics applications, and admits a straightforward O(n^2)-time brute force algorithm. However, the quadratic dependence on n in the running time makes the naive approach intractable for large datasets. I will present a (1+epsilon)-approximate algorithm for estimating the Chamfer distance with a near-linear running time. Our experiments demonstrate that it is both accurate and fast on large high-dimensional datasets.
This is joint work with Ainesh Bakshi, Piotr Indyk, Rajesh Jayaram, and Erik Waingarten, and will appear at NeurIPS 23. 
Bio: Sandeep is a final year PhD student at MIT, advised by Piotr Indyk. His interests are broadly in efficient algorithm design. Recently, he has been working in the intersection of machine learning and algorithms by designing provable algorithms in various ML settings, such as efficient algorithms for processing large datasets, as well as using ML to inspire and motivate algorithm design.