Timothy M. Chan's Publications

We study geometric set cover problems in dynamic settings, allowing insertions and deletions of points and objects. We present the first dynamic data structure that can maintain an O(1)-approximation in sublinear update time for set cover for axis-aligned squares in 2D. More precisely, we obtain randomized update time O(n^{2/3+delta}) for an arbitrarily small constant delta > 0. Previously, a dynamic geometric set cover data structure with sublinear update time was known only for unit squares by Agarwal, Chang, Suri, Xiao, and Xue [SoCG 2020]. If only an approximate size of the solution is needed, then we can also obtain sublinear amortized update time for disks in 2D and halfspaces in 3D. As a byproduct, our techniques for dynamic set cover also yield an optimal randomized O(n log n)-time algorithm for static set cover for 2D disks and 3D halfspaces, improving our earlier O(n log n (log log n)^{O(1)}) result [SoCG 2020].

Simple multi-pass streaming algorithms for skyline points and extreme points

(with Saladi Rahul)

In this paper, we present simple randomized multi-pass streaming algorithms for fundamental computational geometry problems of finding the skyline (maximal) points and the extreme points of the convex hull. For the skyline problem, one of our algorithms occupies O(h) space and performs O(log n) passes, where h is the number of skyline points. This improves the space bound of the currently best known result by Das Sarma, Lall, Nanongkai, and Xu [VLDB'09] by a logarithmic factor. For the extreme points problem, we present the first non-trivial result for any constant dimension greater than two: an O(h log^{O(1)}n) space and O(log^d n) pass algorithm, where h is the number of extreme points. Finally, we argue why randomization seems unavoidable for these problems, by proving lower bounds on the performance of deterministic algorithms for a related problem of finding maximal elements in a poset.

In Proc. 38th Symposium on Theoretical Aspects of Computer Science (STACS), pages 22:1-22:14, 2021
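For readers unfamiliar with the skyline problem in the entry above, the following is a minimal in-memory sketch for two dimensions. It is not the multi-pass streaming algorithm from the paper: it assumes all points fit in memory and simply sorts them. The function name `skyline_2d` and the tuple representation of points are illustrative assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def skyline_2d(points: List[Point]) -> List[Point]:
    """Return the maximal (skyline) points of a 2D point set.

    A point is maximal if no other point is at least as large in both
    coordinates. This is a sort-based baseline, not a streaming algorithm.
    """
    best_y = float("-inf")
    result = []
    # Scan by decreasing x (ties broken by decreasing y). A point is maximal
    # exactly when its y-coordinate exceeds every y seen so far.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result
```

This baseline uses O(n) space and O(n log n) time, whereas the abstract above concerns algorithms whose space depends on the output size h.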

(Near-)linear-time randomized algorithms for row minima in Monge partial matrices and related problems

We revisit classical problems about searching in totally monotone and Monge matrices, which have many applications in computational geometry and other areas. We present a number of new results, including the following:

A randomized algorithm that finds the row minima in an n x n Monge staircase matrix in O(n) expected time; this improves a longstanding O(n alpha(n)) bound by Klawe and Kleitman (1990) for totally monotone staircase matrices.
A randomized algorithm that reports the K smallest elements (in an arbitrary order) in an n x n Monge (complete or staircase) matrix in O(n + K) expected time; this improves and extends a previous O(n + K log n) algorithm by Kravets and Park [SODA'90].
A randomized algorithm that reports the K smallest elements (in an arbitrary order) in an n x n totally monotone (complete) matrix in O(n + K log^* n) expected time.
A randomized algorithm that reports the k_i smallest elements in the i-th row, for every i, in an n x n totally monotone (complete) matrix in O((n + K) log^* n) expected time, where K = sum_i k_i.
A randomized algorithm that finds the row minima in an n x n totally monotone "v-matrix" in O(n alpha(n) log^* n loglog n) expected time; this answers an open question by Klawe [SODA'90]. The log^* n factor can be removed in the Monge case.
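As background for the results listed above, here is a small sketch of the Monge condition and of the classical divide-and-conquer method for row minima in a complete Monge matrix. It is a textbook baseline that takes roughly O((n + m) log n) time on an n x m matrix, not one of the randomized algorithms from the paper, and it does not handle staircase or partial matrices. The names `is_monge` and `row_minima` are illustrative.

```python
from typing import List, Sequence


def is_monge(M: Sequence[Sequence[float]]) -> bool:
    """Check M[i][j] + M[i+1][j+1] <= M[i][j+1] + M[i+1][j] for all adjacent cells."""
    return all(
        M[i][j] + M[i + 1][j + 1] <= M[i][j + 1] + M[i + 1][j]
        for i in range(len(M) - 1)
        for j in range(len(M[0]) - 1)
    )


def row_minima(M: Sequence[Sequence[float]]) -> List[int]:
    """Return the column of the leftmost minimum in every row of a Monge matrix.

    Relies on the fact that these columns are non-decreasing from top to bottom.
    """
    n, m = len(M), len(M[0])
    argmin = [0] * n

    def solve(r_lo: int, r_hi: int, c_lo: int, c_hi: int) -> None:
        if r_lo > r_hi:
            return
        mid = (r_lo + r_hi) // 2
        # min() returns the first (leftmost) minimizer in the column range.
        best = min(range(c_lo, c_hi + 1), key=lambda j: M[mid][j])
        argmin[mid] = best
        solve(r_lo, mid - 1, c_lo, best)   # rows above: minima lie at or left of best
        solve(mid + 1, r_hi, best, c_hi)   # rows below: minima lie at or right of best

    solve(0, n - 1, 0, m - 1)
    return argmin
```

For example, the matrix with entries (i - j)^2 is Monge, and its row minima lie on the diagonal.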

Near-optimal randomized algorithms for selection in totally monotone matrices

We revisit classical problems about searching in totally monotone matrices, which have many applications in computational geometry and other areas. In a companion paper, we gave new (near-)linear-time algorithms for a number of such problems. In the present paper, we describe new subquadratic results for more basic problems, including the following:

A randomized algorithm to select the K-th smallest element in an n x n totally monotone matrix in O(n^{4/3} polylog n) expected time; this improves previous O(n^{3/2} polylog n) algorithms by Alon and Azar [SODA'92], Mansour, Park, Schieber, and Sen (1993), and Agarwal and Sen (1996).
A near-matching lower bound of Omega(n^{4/3}) for the problem (which holds even for Monge matrices).
A similar result for selecting the k_i-th smallest in the i-th row for all i.
In the case when all k_i's are the same, an improvement of the running time to O(n^{6/5} polylog n).
Variants of all these bounds that are sensitive to K (or sum_i k_i).

These matrix searching problems are intimately related to problems about arrangements of pseudo-lines. In particular, our selection algorithm implies an O(n^{4/3} polylog n) algorithm for computing incidences between n points and n pseudo-lines in the plane. This improves, extends, and simplifies a previous method by Agarwal and Sharir [SODA'02].

Improved upper and lower bounds for LR Drawings of binary trees

(with Zhengcheng Huang)

In SODA'99, Chan introduced a simple type of planar straight-line upward order-preserving drawings of binary trees, known as LR drawings: such a drawing is obtained by picking a root-to-leaf path, drawing the path as a straight line, and recursively drawing the subtrees along the paths. Chan proved that any binary tree with n nodes admits an LR drawing with O(n^{0.48}) width. In SODA'17, Frati, Patrignani, and Roselli proved that there exist families of n-node binary trees for which any LR drawing has Omega(n^{0.418}) width. In this note, we improve Chan's upper bound to O(n^{0.437}) and Frati et al.'s lower bound to Omega(n^{0.429}).

Further results on colored range searching

(with Qizheng He and Yakov Nekrich)

We present a number of new results about range searching for colored (or "categorical") data:

For a set of n colored points in three dimensions, we describe randomized data structures with O(n polylog n) space that can report the distinct colors in any query orthogonal range (axis-aligned box) in O(k polyloglog n) expected time, where k is the number of distinct colors in the range, assuming that coordinates are in {1,...,n}. Previous data structures require O(log n / loglog n + k) query time. Our result also implies improvements in higher constant dimensions.
Our data structures can be adapted to halfspace ranges in three dimensions (or circular ranges in two dimensions), achieving O(k log n) expected query time. Previous data structures require O(k log^2 n) query time.
For a set of n colored points in two dimensions, we describe a data structure with O(n polylog n) space that can answer colored "type-2" range counting queries: report the number of occurrences of every distinct color in a query orthogonal range. The query time is O(log n / loglog n + k loglog n), where k is the number of distinct colors in the range. Naively performing k uncolored range counting queries would require O(k log n / loglog n) time.

Our data structures are designed using a variety of techniques, including colored variants of randomized incremental construction (which may be of independent interest), colored variants of shallow cuttings, and bit-packing tricks.

In Proc. 36th Symposium on Computational Geometry (SoCG), pages 28:1-28:15, 2020

Faster approximation algorithms for geometric set cover

We improve the running times of O(1)-approximation algorithms for the set cover problem in geometric settings, specifically, covering points by disks in the plane, or covering points by halfspaces in three dimensions. In the unweighted case, Agarwal and Pan [SoCG 2014] gave a randomized O(n log^4 n)-time, O(1)-approximation algorithm, by using variants of the multiplicative weight update (MWU) method combined with geometric data structures. We simplify the data structure requirement in one of their methods and obtain a deterministic O(n log^3 n log log n)-time algorithm. With further new ideas, we obtain a still faster randomized O(n log n (log log n)^{O(1)})-time algorithm. For the weighted problem, we also give a randomized O(n log^4 n log log n)-time, O(1)-approximation algorithm, by simple modifications to the MWU method and the quasi-uniform sampling technique.

In Proc. 36th Symposium on Computational Geometry (SoCG), pages 27:1-27:14, 2020

Approximating pattern-to-text Hamming distances

(with Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, and Ely Porat)

We revisit a fundamental problem in string matching: given a pattern of length m and a text of length n, both over an alphabet of size sigma, compute the Hamming distance (i.e., the number of mismatches) between the pattern and the text at every location. Several randomized (1+eps)-approximation algorithms have been proposed in the literature (e.g., by Karloff (Inf. Proc. Lett., 1993), Indyk (FOCS 1998), and Kopelowitz and Porat (SOSA 2018)), with running time of the form O(eps^{-O(1)} n log n log m), all using fast Fourier transform (FFT). We describe a simple randomized (1+eps)-approximation algorithm that is faster and does not need FFT. Combining our approach with additional ideas leads to numerous new results (all Monte-Carlo randomized) in different settings:

We design the first truly linear-time approximation algorithm for constant eps; the running time is O(eps^{-2}n). In fact, the time bound can be made slightly sublinear in n if the alphabet size sigma is small (by using bit packing tricks).
We apply our approximation algorithms to design a faster exact algorithm computing all Hamming distances up to a threshold k; its runtime of O(n + min(nk sqrt{log m} / sqrt{m}, nk^2/m)) improves upon previous results by logarithmic factors and is linear for k <= sqrt{m}.
We alternatively design approximation algorithms with better eps-dependence, by using fast rectangular matrix multiplication. In fact, the time bound is O(n polylog n) when the pattern is sufficiently long, i.e., m >= eps^{-c} for a specific constant c. Previous algorithms with the best eps-dependence require O(eps^{-1} n polylog n) time.
When k is not too small, we design a truly sublinear-time algorithm to find all locations with Hamming distance approximately (up to a constant factor) less than k, in O((n/k^{Omega(1)} + occ) n^{o(1)}) time, where occ is the output size. The algorithm leads to a property tester for pattern matching that costs O~(delta^{-1/3}n^{2/3} + delta^{-1}n/m) time and, with high probability, returns true if an exact match exists and false if the Hamming distance is more than delta*m at every location.
We design a streaming algorithm that approximately computes the Hamming distance for all locations with the distance approximately less than k, using O(eps^{-2} sqrt{k}) space. Previously, streaming algorithms were known for the exact problem with O(k) space (which is tight up to the polylog n factor) or for the approximate problem with O~(eps^{-O(1)} sqrt{m}) space.
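To make the problem statement concrete, the sketch below computes all pattern-to-text Hamming distances exactly by direct comparison in O(nm) time. It is only a reference baseline against which approximate answers could be checked, not any of the algorithms described above. The function name `hamming_distances` is an illustrative assumption.

```python
from typing import List


def hamming_distances(pattern: str, text: str) -> List[int]:
    """Exact Hamming distance between the pattern and every length-m window of the text."""
    m, n = len(pattern), len(text)
    return [
        # Count mismatching positions between the pattern and text[i : i + m].
        sum(1 for a, b in zip(pattern, text[i:i + m]) if a != b)
        for i in range(n - m + 1)
    ]
```

A (1+eps)-approximation algorithm would be allowed to return, at each location, any value within a factor 1+eps of the corresponding entry of this list.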

Reducing 3SUM to Convolution-3SUM

Given a set S of n numbers, the 3SUM problem asks to determine whether there exist three elements a,b,c in S such that a+b+c = 0. The related Convolution-3SUM problem asks to determine whether there exist a pair of indices i,j such that A[i]+A[j] = A[i+j], where A is a given array of n numbers.
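The two definitions above can be written down directly as brute-force checks. This is only a sketch of the problem statements, not of any reduction: it assumes that the three elements in 3SUM occupy distinct positions and that the array in Convolution-3SUM is 0-indexed, neither of which is specified in the abstract.

```python
from itertools import combinations
from typing import Sequence


def three_sum(S: Sequence[int]) -> bool:
    """Cubic-time check: do three elements at distinct positions sum to zero?"""
    return any(a + b + c == 0 for a, b, c in combinations(S, 3))


def convolution_three_sum(A: Sequence[int]) -> bool:
    """Quadratic-time check: is A[i] + A[j] == A[i + j] for some valid indices i, j?"""
    n = len(A)
    # j ranges only over values that keep i + j inside the array.
    return any(A[i] + A[j] == A[i + j] for i in range(n) for j in range(n - i))
```

Convolution-3SUM has a trivial quadratic-time solution because fixing i and j determines the third index, which is part of what makes it a convenient starting point for reductions.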

When the numbers are integers, a randomized reduction from 3SUM to Convolution-3SUM was given in a seminal paper by Patrascu [STOC 2010], which was later improved by Kopelowitz, Pettie, and Porat [SODA 2016] with an O(log n) factor slowdown. In this paper, we present a simple deterministic reduction from 3SUM to Convolution-3SUM for integers bounded by U. We also describe additional ideas to obtain further improved reductions, with only a (loglog n)^{O(1)} factor slowdown in the randomized case, and a (log U)^{O(1)} factor slowdown in the deterministic case.

In Proc. 3rd SIAM Symposium on Simplicity in Algorithms (SOSA), pages 1-7, 2020 (SOSA best paper)

On the change-making problem

Given a set of n non-negative integers representing a coin system, the change-making problem seeks the fewest number of coins that sum up to a given value t, where each type of coin can be used an unlimited number of times. This problem is a popular homework exercise in dynamic programming, where the textbook solution runs in O(nt) time.

It is not hard to solve this problem in O(t polylog t) time by using convolution. In this paper, we present a simple deterministic O(t log t loglog t) time algorithm, and later improve the running time to O(t log t) by randomization.
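The textbook O(nt) dynamic program mentioned above can be sketched as follows. This is the standard homework solution, not the convolution-based algorithm from the paper. The function name `min_coins` and the choice to return `None` when no combination sums to t are illustrative assumptions.

```python
from typing import Optional, Sequence


def min_coins(coins: Sequence[int], t: int) -> Optional[int]:
    """Fewest coins (with unlimited repetition) summing to t, or None if impossible."""
    INF = float("inf")
    # best[v] = fewest coins summing to exactly v.
    best = [0] + [INF] * t
    for v in range(1, t + 1):
        for c in coins:
            # Zero-valued coins never help, so they are skipped.
            if 0 < c <= v and best[v - c] + 1 < best[v]:
                best[v] = best[v - c] + 1
    return None if best[t] == INF else int(best[t])
```

For instance, with coins 1, 3, and 4 the value 6 needs two coins (3 + 3), whereas a greedy choice of the largest coin first would use three.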

Dynamic generalized closest pair: Revisiting Eppstein's technique

Eppstein (1995) gave a technique to transform any data structure for dynamic nearest neighbor queries into a data structure for dynamic closest pair, for any distance function; the transformation increases the time bound by two logarithmic factors. We present a similar, simple transformation that is just as good, and can avoid the extra logarithmic factors when the query and update time of the given structure exceed n^epsilon for some constant epsilon > 0.

Consequently, in the case of an arbitrary distance function, we obtain an optimal O(n)-space data structure to maintain the dynamic closest pair of n points in O(n) amortized time plus O(n) distance evaluations per update.

Faster deterministic and Las Vegas algorithms for offline approximate nearest neighbors in high dimensions

(with Josh Alman and Ryan Williams)

We present a deterministic, truly subquadratic algorithm for offline (1+epsilon)-approximate nearest or farthest neighbor search (in particular, the closest pair or diameter problem) in Hamming space in any dimension d < n^delta, for a sufficiently small constant delta > 0. The running time of the algorithm is roughly n^{2-epsilon^{1/2+O(delta)}} for nearest neighbors, or n^{2-Omega(sqrt{epsilon}/log(1/epsilon))} for farthest. The algorithm follows from a simple combination of expander walks, Chebyshev polynomials, and rectangular matrix multiplication.

We also show how to eliminate errors in the previous Monte Carlo randomized algorithm of Alman, Chan, and Williams [FOCS'16] for offline approximate nearest or farthest neighbors, and obtain a Las Vegas randomized algorithm with expected running time n^{2-Omega(epsilon^{1/3}/log(1/epsilon))}.

Finally, we note a simplification of Alman, Chan, and Williams' method and obtain a slightly improved Monte Carlo randomized algorithm with running time n^{2-Omega(epsilon^{1/3}/log^{2/3}(1/epsilon))}.

As one application, we obtain improved deterministic and randomized (1+epsilon)-approximation algorithms for MAX-SAT.

Better data structures for colored orthogonal range reporting

(with Yakov Nekrich)

Range searching on categorical, or "colored", data has been studied extensively for over two decades. In this paper, we obtain the current best results for perhaps the most basic, and most often studied, version of the geometric problem: colored orthogonal range reporting.

Given n colored points in two-dimensional space [U]^2, we present a data structure with O(n log^{3/4+epsilon} n) space, for an arbitrarily small constant epsilon > 0, so that all k distinct colors in any axis-aligned query rectangle can be reported in (optimal) O(loglog U + k) time; this is the first method to break the O(n log n) space barrier.

In three dimensions, we present a data structure with O(n log^{9/5+epsilon} n) space and O(log n/loglog n + k) time; this improves the previous space bound of O(n log^4 n).

Range closest-pair search in higher dimensions

(with Saladi Rahul and Jie Xue)

Range closest-pair (RCP) search is a range-search variant of the classical closest-pair problem, which aims to store a given set S of points into some space-efficient data structure such that when a query range Q is specified, the closest pair in S \cap Q can be reported quickly. RCP search has received attention over the years, but the primary focus was only on R^2. In this paper, we study RCP search in higher dimensions. We give the first nontrivial RCP data structures for orthogonal, simplex, halfspace, and ball queries in R^d for any constant d. Furthermore, we prove a conditional lower bound for orthogonal RCP search for d >= 3.

Orthogonal range reporting and rectangle stabbing for fat rectangles

(with Yakov Nekrich and Michiel Smid)

In this paper we study two geometric data structure problems in the special case when input objects or queries are fat rectangles. We show that in this case a significant improvement compared to the general case can be achieved. We describe data structures that answer two- and three-dimensional orthogonal range reporting queries in the case when the query range is a fat rectangle. Our two-dimensional data structure uses O(n) words and supports queries in O(loglog U + k) time, where n is the number of points in the data structure, U is the size of the universe and k is the number of points in the query range. Our three-dimensional data structure needs O(n log^eps U) words of space and answers queries in O(loglog U + k) time. We also consider the rectangle stabbing problem on a set of three-dimensional fat rectangles. Our data structure uses O(n) space and answers stabbing queries in O(log U log log U + k) time.

Dynamic geometric data structures via shallow cuttings

We present new results on a number of fundamental problems about dynamic geometric data structures:

We describe the first fully dynamic data structures with sublinear amortized update time for maintaining (i) the number of vertices or the volume of the convex hull of a 3D point set, (ii) the largest empty circle for a 2D point set, (iii) the Hausdorff distance between two 2D point sets, (iv) the discrete 1-center of a 2D point set, (v) the number of maximal (i.e., skyline) points in a 3D point set. The update times are near n^{11/12} for (i) and (ii), n^{7/8} for (iii) and (iv), and n^{2/3} for (v). Previously, sublinear bounds were known only for restricted "semi-online" settings [Chan, SODA 2002].
We slightly improve previous fully dynamic data structures for answering extreme point queries for the convex hull of a 3D point set and nearest neighbor search for a 2D point set. The query time is O(log^2 n), and the amortized update time is O(log^4 n) instead of O(log^5 n) [Chan, SODA 2006; Kaplan et al., SODA 2017].
We also improve previous fully dynamic data structures for maintaining the bichromatic closest pair between two 2D point sets and the diameter of a 2D point set. The amortized update time is O(log^4 n) instead of O(log^7 n) [Eppstein 1995; Chan, SODA 2006; Kaplan et al., SODA 2017].

Smallest k-enclosing rectangle revisited

(with Sariel Har-Peled)

Given a set of n points in the plane, and a parameter k, we consider the problem of computing the minimum (perimeter or area) axis-aligned rectangle enclosing k points. We present the first near quadratic time algorithm for this problem, improving over the previous near-O(n^{5/2})-time algorithm by Kaplan et al. [ESA 2017]. We provide an almost matching conditional lower bound, under the assumption that (min,+)-convolution cannot be solved in truly subquadratic time. Furthermore, we present a new reduction (for either perimeter or area) that can make the time bound sensitive to k, giving near O(nk) time. We also present a near linear time (1+eps)-approximation algorithm to the minimum area of the optimal rectangle containing k points. In addition, we study related problems including the 3-sided, arbitrarily oriented, weighted, and subset sum versions of the problem.
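To pin down the problem definition above, here is a brute-force sketch that tries every rectangle whose sides pass through input coordinates. It runs in roughly O(n^5) time and is meant only as a correctness reference for tiny inputs, not as the near-quadratic algorithm from the paper. The function name, the `measure` parameter, and the return format are illustrative assumptions.

```python
from itertools import combinations_with_replacement
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def smallest_k_enclosing_rectangle(
    points: List[Point], k: int, measure: str = "area"
) -> Optional[Tuple[float, Rect]]:
    """Return (cost, rectangle) for a minimum axis-aligned rectangle containing >= k points."""
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    best = None
    # An optimal rectangle can be shrunk until each side touches an input point,
    # so it suffices to try sides at input coordinates.
    for x1, x2 in combinations_with_replacement(xs, 2):
        for y1, y2 in combinations_with_replacement(ys, 2):
            inside = sum(1 for x, y in points if x1 <= x <= x2 and y1 <= y <= y2)
            if inside >= k:
                w, h = x2 - x1, y2 - y1
                cost = w * h if measure == "area" else 2 * (w + h)
                if best is None or cost < best[0]:
                    best = (cost, (x1, y1, x2, y2))
    return best
```

Passing `measure="perimeter"` switches the objective; the function returns `None` when fewer than k points are given.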

Computing Shapley values in the plane

(with Sergio Cabello)

We consider the problem of computing Shapley values for points in the plane, where each point is interpreted as a player, and the value of a coalition is defined by the area of usual geometric objects, such as the convex hull or the minimum axis-parallel bounding box.

For sets of n points in the plane, we show how to compute in roughly O(n^{3/2}) time the Shapley values for the area of the minimum axis-parallel bounding box and the area of the union of the rectangles spanned by the origin and the input points. When the points form an increasing or decreasing chain, the running time can be improved to near-linear. In all these cases, we use linearity of the Shapley values and algebraic methods.

We also show that Shapley values for the area of the convex hull or the minimum enclosing disk can be computed in O(n^2) and O(n^3) time, respectively. These problems are closely related to the model of stochastic point sets considered in computational geometry, but here we have to consider random insertion orders of the points instead of a probabilistic existence of points.
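The random-insertion-order view mentioned above suggests a direct, exponential-time way to compute Shapley values for small inputs: average each point's marginal contribution over all insertion orders. The sketch below does this for the bounding-box area. It illustrates the definition only and is unrelated to the efficient algorithms in the paper. The names `bbox_area` and `shapley_values` are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from typing import Callable, List, Sequence, Tuple

Point = Tuple[int, int]


def bbox_area(pts: Sequence[Point]) -> int:
    """Area of the minimum axis-parallel bounding box (0 for the empty set)."""
    if not pts:
        return 0
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def shapley_values(
    points: Sequence[Point], value: Callable[[Sequence[Point]], int] = bbox_area
) -> List[Fraction]:
    """Average marginal contribution of each point over all n! insertion orders."""
    n = len(points)
    totals = [Fraction(0)] * n
    count = 0
    for order in permutations(range(n)):
        prefix: List[Point] = []
        prev = value(prefix)
        for i in order:
            prefix.append(points[i])
            cur = value(prefix)
            totals[i] += cur - prev  # marginal gain from inserting point i
            prev = cur
        count += 1
    return [total / count for total in totals]
```

The values always sum to the value of the full point set, which gives a quick sanity check on any faster implementation.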

On locality-sensitive orderings and their applications

(with Sariel Har-Peled and Mitchell Jones)

For any constant d and parameter eps > 0, we show the existence of (roughly) 1/eps^d orderings on the unit cube [0,1)^d, such that any two points p,q in [0,1)^d that are close together under the Euclidean metric are "close together" in one of these linear orderings in the following sense: the only points that could lie between p and q in the ordering are points with Euclidean distance at most eps d(p,q) from p or q. These orderings are extensions of the "Z-order", and they can be efficiently computed.

Functionally, the orderings can be thought of as a replacement to quadtrees and related structures (like well-separated pair decompositions). We use such orderings to obtain surprisingly simple algorithms for a number of basic problems in low-dimensional computational geometry, including (i) dynamic approximate bichromatic closest pair, (ii) dynamic spanners, (iii) dynamic approximate minimum spanning trees, (iv) static and dynamic fault-tolerant spanners, and (v) approximate nearest neighbor search.
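For context, the plain Z-order that these orderings extend can be computed by interleaving coordinate bits. The sketch below covers only the classical two-dimensional Z-order (Morton) key for non-negative integer grid coordinates. It does not construct the family of roughly 1/eps^d orderings from the paper. The function name `morton_key` and the default of 16 bits per coordinate are illustrative assumptions.

```python
def morton_key(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into a single Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((x >> b) & 1) << (2 * b)       # bit b of x goes to even position 2b
        key |= ((y >> b) & 1) << (2 * b + 1)   # bit b of y goes to odd position 2b + 1
    return key
```

Sorting grid points by `morton_key` visits the cells of each quadtree square contiguously, which is why such orderings can stand in for quadtrees. A single Z-order can still place nearby points far apart, which is what motivates using several orderings.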

Stabbing rectangles by line segments: How decomposition reduces the shallow-cell complexity

(with Thomas C. van Dijk, Krzysztof Fleszar, Joachim Spoerhase, and Alexander Wolff)

We initiate the study of the following natural geometric optimization problem. The input is a set of axis-aligned rectangles in the plane. The objective is to find a set of horizontal line segments of minimum total length so that every rectangle is stabbed by some line segment. A line segment stabs a rectangle if it intersects its left and its right boundary. The problem, which we call Stabbing, can be motivated by a resource allocation problem and has applications in geometric network design. To the best of our knowledge, only special cases of this problem have been considered so far.

Stabbing is a weighted geometric set cover problem, which we show to be NP-hard. A constrained variant of Stabbing turns out to be even APX-hard. While for general set cover the best possible approximation ratio is Theta(log n), obtaining better ratios for geometric set cover problems is an important line of research in geometric approximation algorithms. Chan et al. [SODA'12] generalize earlier results by Varadarajan [STOC'10] to obtain sub-logarithmic performances for a broad class of weighted geometric set cover instances that are characterized by having low shallow-cell complexity. The shallow-cell complexity of Stabbing instances, however, can be high so that a direct application of the framework of Chan et al. gives only logarithmic bounds. We still achieve a constant-factor approximation by decomposing general instances into what we call laminar instances that have low enough complexity.

Our decomposition technique yields constant-factor approximations also for the variant where rectangles can be stabbed by horizontal and vertical segments and for two further geometric set cover problems.

An improved approximation algorithm for the discrete Fréchet distance

(with Zahed Rahmati)

For two sequences P and Q of n points in R^d, we compute an approximation to the discrete Fréchet distance. Our f-approximation algorithm runs in time O(n log n + n^2/f^2), for any f in [1, n/log n] and d = O(1), which improves (and, at the same time, slightly simplifies) the previous O(n log n + n^2/f)-time algorithm by Bringmann and Mulzer [SoCG'15].

Information Processing Letters, 138:72-74, 2018
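As a point of reference for the entry above, the exact discrete Fréchet distance can be computed with a standard quadratic-time dynamic program. The sketch below is that exact baseline, not the f-approximation algorithm from the paper. The function name `discrete_frechet` is illustrative, and `math.dist` requires Python 3.8 or later.

```python
from math import dist
from typing import List, Sequence


def discrete_frechet(P: Sequence[Sequence[float]], Q: Sequence[Sequence[float]]) -> float:
    """Exact discrete Fréchet distance between two non-empty point sequences."""
    n, m = len(P), len(Q)
    # D[i][j] = discrete Fréchet distance between prefixes P[:i+1] and Q[:j+1].
    D: List[List[float]] = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                D[i][j] = d
            elif i == 0:
                D[i][j] = max(D[0][j - 1], d)
            elif j == 0:
                D[i][j] = max(D[i - 1][0], d)
            else:
                # Advance on P, on Q, or on both, taking the cheapest predecessor.
                D[i][j] = max(min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1]), d)
    return D[n - 1][m - 1]
```

An f-approximation algorithm may return any value within a factor f of this quantity, in exchange for a running time below quadratic.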

Orthogonal point location and rectangle stabbing queries in 3-d

(with Yakov Nekrich, Saladi Rahul, and Konstantinos Tsakalidis)

In this work, we present a collection of new results on two fundamental problems in geometric data structures: orthogonal point location and rectangle stabbing.

Orthogonal point location. We give the first linear-space data structure that supports 3-d point location queries on n disjoint axis-aligned boxes with optimal O(log n) query time in the (arithmetic) pointer machine model. This improves the previous O(log^{3/2} n) bound of Rahul [SODA 2015]. We similarly obtain the first linear-space data structure in the I/O model with optimal query cost, and also the first linear-space data structure in the word RAM model with sub-logarithmic query time.
Rectangle stabbing. We give the first linear-space data structure that supports 3-d 4-sided and 5-sided rectangle stabbing queries in optimal O(log_w n + k) time in the word RAM model. We similarly obtain the first optimal data structure for the closely related problem of 2-d top-k rectangle stabbing in the word RAM model, and also improved results for 3-d 6-sided rectangle stabbing.

For point location, our solution is simpler than previous methods, and is based on an interesting variant of the van Emde Boas recursion, applied in a round-robin fashion over the dimensions, combined with bit-packing techniques. For rectangle stabbing, our solution is a variant of Alstrup, Brodal, and Rauhe's grid-based recursive technique (FOCS 2000), combined with a number of new ideas.

Tree drawings revisited

We make progress on a number of open problems concerning the area requirement for drawing trees on a grid. We prove that

every tree of size n (with arbitrarily large degree) has a straight-line drawing with area n2^{O(sqrt{loglog n logloglog n})}, improving the longstanding O(n log n) bound;
every tree of size n (with arbitrarily large degree) has a straight-line upward drawing with area n sqrt{log n}(loglog n)^{O(1)}, improving the longstanding O(n log n) bound;
every binary tree of size n has a straight-line orthogonal drawing with area n2^{O(log^* n)}, improving the previous O(n loglog n) bound by Shin, Kim, and Chwa (1996) and Chan, Goodrich, Kosaraju, and Tamassia (1996);
every binary tree of size n has a straight-line order-preserving drawing with area n2^{O(log^* n)}, improving the previous O(n loglog n) bound by Garg and Rusu (2003);
every binary tree of size n has a straight-line orthogonal order-preserving drawing with area n2^{O(sqrt{log n})}, improving the previous O(n^{3/2}) bound by Frati (2007).

Subquadratic encodings for point configurations

(with Jean Cardinal, John Iacono, Stefan Langerman, and Aurelien Ooms)

For many algorithms dealing with sets of points in the plane, the only relevant information carried by the input is the combinatorial configuration of the points: the orientation of each triple of points in the set (clockwise, counterclockwise, or collinear). This information is called the order type of the point set. In the dual, realizable order types and abstract order types are combinatorial analogues of line arrangements and pseudoline arrangements. Too often in the literature we analyze algorithms in the real-RAM model for simplicity, putting aside the fact that computers as we know them cannot handle arbitrary real numbers without some sort of encoding. Encoding an order type by the integer coordinates of a realizing point set is known to yield doubly exponential coordinates in some cases. Other known encodings can achieve quadratic space or fast orientation queries, but not both. In this contribution, we give a compact encoding for abstract order types that allows efficient query of the orientation of any triple: the encoding uses O(n^2) bits and an orientation query takes O(log n) time in the word-RAM model with word size w >= log n. This encoding is space-optimal for abstract order types. We show how to shorten the encoding to O(n^2 (loglog n)^2 / log n) bits for realizable order types, giving the first subquadratic encoding for those order types with fast orientation queries. We further refine our encoding to attain O(log n / loglog n) query time at the expense of a negligibly larger space requirement. In the realizable case, we show that all those encodings can be computed efficiently. Finally, we generalize our results to the encoding of point configurations in higher dimension.
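The orientation of a triple, on which the order type above is built, is the sign of a 2 x 2 determinant. The sketch below computes it with integer arithmetic and tabulates the order type naively, storing one entry per triple. This explicit table is the trivial cubic-size encoding, shown only to fix the definition, and is not the compact encoding from the paper. The names `orientation` and `order_type` are illustrative.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple

Point = Tuple[int, int]


def orientation(p: Point, q: Point, r: Point) -> int:
    """Return +1 if p, q, r turn counterclockwise, -1 if clockwise, 0 if collinear."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (det > 0) - (det < 0)


def order_type(points: Sequence[Point]) -> Dict[Tuple[int, int, int], int]:
    """Map every index triple i < j < k to the orientation of the corresponding points."""
    return {
        (i, j, k): orientation(points[i], points[j], points[k])
        for i, j, k in combinations(range(len(points)), 3)
    }
```

With integer coordinates the determinant is exact in Python, which sidesteps the real-RAM concerns raised in the abstract for this small illustration.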

Approximate shortest paths and distance oracles in weighted unit-disk graphs

(with Konstantinos Tsakalidis)

We present the first near-linear-time (1+epsilon)-approximation algorithm for the diameter of a weighted unit-disk graph of n vertices, running in O(n log^2 n) time, for any constant epsilon > 0, improving the near-O(n^{3/2})-time algorithm of Gao and Zhang [STOC 2003]. Using similar ideas, we can construct a (1+epsilon)-approximate distance oracle for weighted unit-disk graphs with O(1) query time, with a similar improvement in the preprocessing time, from near O(n^{3/2}) to O(n log^3 n). We also obtain new results for a number of other related problems in the weighted unit-disk graph metric, such as the radius and bichromatic closest pair.

As a further application, we use our new distance oracle, along with additional ideas, to solve the (1+epsilon)-approximate all-pairs bounded-leg shortest paths problem for a set of n planar points, with near O(n^{2.579}) preprocessing time, O(n^2 log n) space, and O(log log n) query time, thus improving the near-cubic preprocessing bound by Roditty and Segal [SODA 2007].

Journal of Computational Geometry, 10(2):3-20, 2019 (SoCG special issue)
In Proc. 34th International Symposium on Computational Geometry (SoCG), pages 24:1-24:13, 2018

Dynamic planar orthogonal point location in sublogarithmic time

We study a longstanding problem in computational geometry: dynamic 2-d orthogonal point location, i.e., vertical ray shooting among n horizontal line segments. We present a data structure achieving O(log n / loglog n) optimal expected query time and O(log^{1/2+epsilon} n) update time (amortized) in the word-RAM model for any constant epsilon > 0, under the assumption that the x-coordinates are integers bounded polynomially in n. This substantially improves previous results of Giyora and Kaplan [SODA 2007] and Blelloch [SODA 2008] with O(log n) query and update time, and of Nekrich (2010) with O(log n / loglog n) query time and O(log^{1+epsilon} n) update time. Our result matches the best known upper bound for simpler problems such as dynamic 2-d dominance range searching.

We also obtain similar bounds for orthogonal line segment intersection reporting queries, vertical ray stabbing, and vertical stabbing-max, improving previous bounds, respectively, of Blelloch [SODA 2008] and Mortensen [SODA 2003], of Tao (2014), and of Agarwal, Arge, and Yi [SODA 2005] and Nekrich [ISAAC 2011].

Approximation schemes for 0-1 knapsack

We revisit the standard 0-1 knapsack problem. The latest polynomial-time approximation scheme by Rhee (2015) with approximation factor 1+epsilon has running time near O~(n+(1/epsilon)^{5/2}) (ignoring polylogarithmic factors), and is randomized. We present a simpler algorithm which achieves the same result and is deterministic.

With more effort, our ideas can actually lead to an improved time bound near O~(n + (1/epsilon)^{12/5}), and still further improvement for small n.

More logarithmic-factor speedups for 3SUM, (median,+)-convolution, and some geometric 3SUM-hard problems

We present an algorithm that solves the 3SUM problem for n real numbers in O((n^2/log^2 n) (loglog n)^{O(1)}) time, improving previous solutions by about a logarithmic factor. Our framework for shaving off two logarithmic factors can be applied to other problems, such as (median,+)-convolution/matrix multiplication and algebraic generalizations of 3SUM. We also obtain the first subquadratic results on some 3SUM-hard problems in computational geometry, for example, deciding whether (the interiors of) a constant number of simple polygons have a common intersection.

PDF file
ACM Transactions on Algorithms, 16(1): 7:1-7:23, 2020 (SODA special issue)
In Proc. 29th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 881-897, 2018

Improved bounds for drawing trees on fixed points with L-shaped edges

(with Therese Biedl, Martin Derka, Kshitij Jain, and Anna Lubiw)

Let T be an n-node tree of maximum degree 4, and let P be a set of n points in the plane with no two points on the same horizontal or vertical line. It is an open question whether T always has a planar drawing on P such that each edge is drawn as an orthogonal path with one bend (an "L-shaped" edge). By giving new methods for drawing trees, we improve the bounds on the size of the point set P for which such drawings are possible to: O(n^{1.55}) for maximum degree 4 trees; O(n^{1.22}) for maximum degree 3 (binary) trees; and O(n^{1.142}) for perfect binary trees.

Drawing ordered trees with L-shaped edges is harder---we give an example that cannot be done and a bound of O(n log n) points for L-shaped drawings of ordered caterpillars, which contrasts with the known linear bound for unordered caterpillars.

In Proc. 25th International Symposium on Graph Drawing and Network Visualization (GD), pages 305-317, 2017

Faster approximate diameter and distance oracles in planar graphs

We present an algorithm that computes a (1+epsilon)-approximation of the diameter of a weighted, undirected planar graph of n vertices with non-negative edge lengths in O(n log n (log n + (1/epsilon)^5)) expected time, improving upon the O(n ((1/epsilon)^4 log^4 n + 2^{O(1/epsilon)}))-time algorithm of Weimann and Yuster [ICALP 2013]. Our algorithm makes two improvements over that result: first and foremost, it replaces the exponential dependency on 1/epsilon with a polynomial one, by adapting and specializing Cabello's recent abstract-Voronoi-diagram-based technique [SODA 2017] for approximation purposes; second, it shaves off two logarithmic factors by choosing a better sequence of error parameters during recursion.

Moreover, using similar techniques, we improve the (1+epsilon)-approximate distance oracle of Gu and Xu [ISAAC 2015] by first replacing the exponential dependency on 1/epsilon on the preprocessing time and space with a polynomial one and second removing a logarithmic factor from the preprocessing time.

All-pairs shortest paths in geometric intersection graphs

(with Konstantinos Tsakalidis)

We address the All-Pairs Shortest Paths (APSP) problem for a number of unweighted, undirected geometric intersection graphs. We present a general reduction of the problem to static, offline intersection searching (specifically detection). As a consequence, we can solve APSP for intersection graphs of n arbitrary disks in O(n^2 log n) time, axis-aligned line segments in O(n^2 loglog n) time, arbitrary line segments in O(n^{7/3} log^{1/3} n) time, d-dimensional axis-aligned boxes in O(n^2 log^{d-1.5} n) time for d >= 2, and d-dimensional axis-aligned unit hypercubes in O(n^2 loglog n) time for d=3 and O(n^2 log^{d-3} n) time for d >= 4.

In addition, we show how to solve the Single-Source Shortest Paths (SSSP) problem in unweighted intersection graphs of axis-aligned line segments in O(n log n) time, by a reduction to dynamic orthogonal point location.

Dynamic orthogonal range searching, revisited

We study a longstanding problem in computational geometry: 2-d dynamic orthogonal range reporting. We present a new data structure achieving O(log n/loglog n + k) optimal query time and O(log^{2/3+o(1)}n) update time (amortized) in the word RAM model, where n is the number of data points and k is the output size. This is the first improvement in over 10 years on Mortensen's previous result [SIAM J. Comput., 2006], which has O(log^{7/8+epsilon}n) update time for an arbitrarily small constant epsilon.

In the case of 3-sided queries, our update time reduces to O(log^{1/2+epsilon}n), improving Wilkinson's previous bound [ESA 2014] of O(log^{2/3+epsilon}n).

Orthogonal range searching in moderate dimensions: k-d trees and range trees strike back

We revisit the orthogonal range searching problem and the exact l_infinity nearest neighbor searching problem for a static set of n points when the dimension d is moderately large. We give the first data structure with near linear space that achieves truly sublinear query time when the dimension is any constant multiple of log n. Specifically, the preprocessing time/space is O(n^{1+delta}) for any constant delta > 0, and the expected query time is n^{1 - 1/O(c log c)}. The data structure is simple and is based on a new "augmented, randomized, lopsided" variant of k-d trees. It matches (in fact, slightly improves) the performance of previous combinatorial algorithms that work only in the case of offline queries [Impagliazzo, Lovett, Paturi, and Schneider (2014) and Chan (SODA'15)]. It leads to faster combinatorial algorithms for all-pairs shortest paths in general weighted graphs and rectangular Boolean matrix multiplication.

In the offline case, we show that the problem can be reduced to the Boolean orthogonal vectors problem and thus admits an n^{2 - 1/O(log c)}-time non-combinatorial algorithm [Abboud, Williams, and Yu (SODA'15)]. This reduction is also simple and is based on range trees.

Finally, we use a similar approach to obtain a small improvement to Indyk's data structure [FOCS'98] for approximate l_infinity nearest neighbor search when d=c log n.

Applications of Chebyshev polynomials to low-dimensional computational geometry

We apply the polynomial method---specifically, Chebyshev polynomials---to obtain a number of new results on geometric approximation algorithms in low constant dimensions. For example, we give an algorithm for constructing epsilon-kernels (coresets for approximate width and approximate convex hull) in close to optimal time O(n + (1/epsilon)^{(d-1)/2}), up to a small near-(1/epsilon)^{3/2} factor, for any d-dimensional n-point set. We obtain an improved data structure for Euclidean approximate nearest neighbor search with close to O(n log n + (1/epsilon)^{d/4}n) preprocessing time and O((1/epsilon)^{d/4} log n) query time. We obtain improved approximation algorithms for discrete Voronoi diagrams, diameter, and bichromatic closest pair in the L_s-metric for any even integer constant s > 2. The techniques are general and may have further applications.

PDF file (journal version)
Journal of Computational Geometry, 9(2):3-20, 2018 (SoCG special issue)
In Proc. 33rd International Symposium on Computational Geometry (SoCG), pages 26:1-26:15, 2017

All-pairs shortest paths in unit disk graphs in slightly subquadratic time

(with John Hershberger and Simon Pratt)

In this paper we study the all-pairs shortest paths problem in (unweighted) unit-disk graphs. The previous best solution for this problem required O(n^2 log n) time, by running the O(n log n)-time single-source shortest path algorithm of Cabello and Jejcic (2015) from every source vertex, where n is the number of vertices. We not only manage to eliminate the logarithmic factor, but also obtain the first (slightly) subquadratic algorithm for the problem, running in O(n^2 sqrt{loglog n/log n}) time. Our algorithm computes an implicit representation of all the shortest paths, and, in the same amount of time, can also compute the diameter of the graph.
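
The "run a single-source search from every vertex" strategy mentioned above can be illustrated with a deliberately naive sketch. It builds the graph by testing all pairs and then runs plain breadth-first search from each source, so it does not match the O(n^2 log n) bound of the earlier approach, let alone the subquadratic bound of the paper. The function name `unit_disk_apsp` and the convention that two points are adjacent when their distance is at most `radius` are assumptions for illustration.

```python
from collections import deque

def unit_disk_apsp(points, radius=1.0):
    """All-pairs hop distances in an unweighted unit-disk graph (naive baseline).

    points: list of (x, y) pairs. Returns a matrix dist with dist[s][t] equal
    to the number of edges on a shortest path, or -1 if t is unreachable from s.
    """
    n = len(points)
    r2 = radius * radius
    adj = [[] for _ in range(n)]
    for i in range(n):
        xi, yi = points[i]
        for j in range(i + 1, n):
            xj, yj = points[j]
            # Assumed adjacency rule: Euclidean distance at most `radius`.
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                adj[i].append(j)
                adj[j].append(i)
    dist = []
    for s in range(n):
        d = [-1] * n
        d[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if d[v] < 0:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist.append(d)
    return dist
```

The diameter of a connected graph is then the largest entry of the returned matrix.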

In Proc. 27th International Symposium on Algorithms and Computation (ISAAC), pages 24:1-24:13, 2016

A clustering-based approach to kinetic closest pair

(with Zahed Rahmati)

Given a set P of n moving points in fixed dimension d, where the trajectory of each point is a polynomial of degree bounded by some constant, we present a kinetic data structure (KDS) for maintenance of the closest pair on P. Assuming the closest pair distance is between 1 and Delta over time, our KDS uses O(n log Delta) space and processes O(n^2 beta log Delta log n + n^2 beta log Delta loglog Delta) events, each in worst-case time O(log^2 n + log^2 log Delta). Here, beta is an extremely slow-growing function. The locality of the KDS is O(log n + loglog Delta). Our closest pair KDS supports insertions and deletions of points. An insertion or deletion takes worst-case time O(log Delta log^2 n +log Delta log^2log Delta). Also, we use a similar approach to provide a KDS for the all epsilon-nearest neighbors in R^d. The complexities of the previous KDSs, for both closest pair and all epsilon-nearest neighbors, have polylogarithmic factors, where the number of logs depends on dimension d. Assuming Delta is polynomial in n, our KDSs obtain improvements on the previous KDSs. Our solutions are based on a kinetic clustering on P. Though we use ideas from the previous clustering KDS by Hershberger, we simplify and improve his work.

Polynomial representations of threshold functions and algorithmic applications

(with Josh Alman and Ryan Williams)

We design new polynomials for representing threshold functions in three different regimes: probabilistic polynomials of low degree, which need far less randomness than previous constructions, polynomial threshold functions (PTFs) with "nice" threshold behavior and degree almost as low as the probabilistic polynomials, and a new notion of probabilistic PTFs where we combine the above techniques to achieve even lower degree with similar "nice" threshold behavior. Utilizing these polynomial constructions, we design faster algorithms for a variety of problems:

Offline Hamming Nearest (and Furthest) Neighbors: Given n red and n blue points in d-dimensional Hamming space for d = c log n, we can find an (exact) nearest (or furthest) blue neighbor for every red point in randomized time n^{2 - 1/O(sqrt{c} log^{2/3} c)} or deterministic time n^{2 - 1/O(c log^2 c)}. These improve on a randomized n^{2 - 1/O(c log^2 c)} bound by Alman and Williams (FOCS'15), and also lead to faster MAX-SAT algorithms for sparse CNFs.
Offline Approximate Nearest (and Furthest) Neighbors: Given n red and n blue points in d-dimensional l_1 or Euclidean space, we can find a (1+eps)-approximate nearest (or furthest) blue neighbor for each red point in randomized time near dn + n^{2 - Omega(eps^{1/3}/log(1/eps))}. This improves on an algorithm by Valiant (FOCS'12) with randomized time near dn + n^{2 - Omega(sqrt{eps})}, which in turn improves previous methods based on locality-sensitive hashing.
SAT Algorithms and Lower Bounds for Circuits With Linear Threshold Functions: We give a satisfiability algorithm for AC^0[m] . LTF . LTF circuits with a subquadratic number of linear threshold gates on the bottom layer, and a subexponential number of gates on the other layers, that runs in deterministic 2^{n - n^eps} time. This strictly generalizes a SAT algorithm for ACC^0 . LTF circuits of subexponential size by Williams (STOC'14) and also implies new circuit lower bounds for threshold circuits, improving a recent gate lower bound of Kane and Williams (STOC'16). We also give a randomized 2^{n - n^eps}-time SAT algorithm for subexponential-size MAJ . AC^0 . LTF . AC^0 . LTF circuits, where the top MAJ gate and middle LTF gates have O(n^{6/5-delta}) fan-in.

Dynamic streaming algorithms for epsilon-kernels

Introduced by Agarwal, Har-Peled, and Varadarajan (2004), an epsilon-kernel of a point set is a coreset that can be used to approximate the width, minimum enclosing cylinder, minimum bounding box, and solve various related geometric optimization problems. Such coresets form one of the most important tools in the design of linear-time approximation algorithms in computational geometry, as well as efficient insertion-only streaming algorithms and dynamic (non-streaming) data structures. In this paper, we continue the theme and explore dynamic streaming algorithms (in the so-called turnstile model).

Andoni and Nguyen [SODA'12] described a dynamic streaming algorithm for maintaining a (1+epsilon)-approximation of the width using O(polylog U) space and update time for a point set in [U]^d for any constant dimension d and any constant epsilon > 0. Their sketch, based on a polynomial method, does not explicitly maintain an epsilon-kernel. We extend their method to maintain an epsilon-kernel, and at the same time reduce some of the logarithmic factors. As an application, we obtain the first randomized dynamic streaming algorithm for the width problem (and related geometric optimization problems) that supports k outliers, using poly(k, log U) space and time.

Two approaches to building time-windowed geometric data structures

Given a set of geometric objects each associated with a time value, we wish to determine whether a given property is true for a subset of those objects whose time values fall within a query time window. We call such problems time-windowed decision problems, and they have been the subject of much recent attention, for instance studied by Bokal, Cabello, and Eppstein [SoCG 2015]. In this paper, we present new approaches to this class of problems that are conceptually simpler than Bokal et al.'s, and also lead to faster algorithms. For instance, we present algorithms for preprocessing for the time-windowed 2D diameter decision problem in O(n log n) time and the time-windowed 2D convex hull area decision problem in O(n log n) time, improving Bokal et al.'s O(n log^2 n) and O(n log n loglog n) solutions respectively.

Our first approach is to reduce time-windowed decision problems to a generalized range successor problem, which we solve using a novel way to search range trees. Our other approach is to use dynamic data structures directly, taking advantage of a new observation that the total number of combinatorial changes to a planar convex hull is near linear for any FIFO update sequence, in which deletions occur in the same order as insertions. We also apply these approaches to obtain the first O(n polylog n) algorithms for the time-windowed 3D diameter decision and 2D orthogonal segment intersection detection problems.

Deterministic APSP, orthogonal vectors, and more: quickly derandomizing Razborov-Smolensky

(with Ryan Williams)

We show how to solve all-pairs shortest paths on n nodes in deterministic n^3 / 2^{Omega(sqrt{log n})} time, and how to count the pairs of orthogonal vectors among n 0-1 vectors in d = c log n dimensions in deterministic n^{2 - 1/O(log c)} time. These running times essentially match the best known randomized algorithms of (Williams, STOC'14) and (Abboud, Williams, and Yu, SODA 2015) respectively, and the ability to count was open even for randomized algorithms. By reductions, these two results yield faster deterministic algorithms for many other problems. Our techniques can also be used to count k-SAT assignments on n variable formulas in 2^{n - n/O(k)} time, roughly matching the best known running times for detecting satisfiability and resolving an open problem of Santhanam (2013).

A key to our constructions is an efficient way to deterministically simulate certain probabilistic polynomials critical to the algorithms of prior work, applying epsilon-biased sets and modulus-amplifying polynomials.

Improved deterministic algorithms for linear programming in low dimensions

At SODA'93, Chazelle and Matousek presented a derandomization of Clarkson's sampling-based algorithm for solving linear programs with n constraints and d variables in d^{(7+o(1))d} n deterministic time. The time bound can be improved to d^{(5+o(1))d} n with subsequent work by Bronnimann, Chazelle, and Matousek [FOCS'93]. We first point out a much simpler derandomization of Clarkson's algorithm that avoids epsilon-approximations and runs in d^{(3+o(1))d} n time. We then describe a few additional ideas that eventually improve the deterministic time bound to d^{(1/2+o(1))d} n.

PDF file
PDF talk slides
ACM Transactions on Algorithms, 14(3): 30:1-30:10, 2018 (SODA special issue)
In Proc. 27th ACM-SIAM Symposium on Discrete Algorithms, pages 1213-1219, 2016

Multidimensional range selection

(with Gelin Zhou)

We study the problem of supporting (orthogonal) range selection queries over a set of n points in constant-dimensional space. Under the standard word-RAM model with word size w = Omega(lg n), we present data structures that occupy O(n (lg n/lglg n)^{d-1}) words of space and support d-dimensional range selection queries using O((lg n/lglg n)^d) query time. This improves the best known data structure by a factor of lglg n in query time. To develop our data structures, we generalize the "parallel counting" technique of Brodal, Gfeller, Jørgensen, and Sanders (2011) for one-dimensional range selection to higher dimensions.

As a byproduct, we design data structures to support d-dimensional range counting queries within O(n (log_w n)^{d-2}) words of space and O((log_w n)^{d-1}) query time, for any word size w = Omega(lg n). This improves the best known result of JaJa, Mortensen, and Shi (2004) when lg w >> lglg n.

In Proc. 26th International Symposium on Algorithms and Computation (ISAAC), Lecture Notes in Computer Science, volume 9472, pages 83-92, 2015

Towards an optimal method for dynamic planar point location

(with Yakov Nekrich)

We describe a fully dynamic linear-space data structure for point location in connected planar subdivisions, or more generally vertical ray shooting among non-intersecting line segments, that supports queries in O(log n (loglog n)^2) time and updates in O(log n loglog n) time. This is the first data structure that achieves close to logarithmic query and update time simultaneously, ignoring loglog n factors. We further show how to reduce the query time to O(log n loglog n) in the RAM model with randomization. Alternatively, the query time can be lowered to O(log n) if the update time is increased to O(log^{1+eps}n) for any constant eps>0, or vice versa.

PDF file
SIAM Journal on Computing, 47:2337-2361, 2018 (FOCS special issue)
In Proc. 56th IEEE Symposium on Foundations of Computer Science (FOCS), pages 390-409, 2015

Time-windowed closest pair

(with Simon Pratt)

Given a set of points in any constant dimension, each of which is associated with a time during which that point is active, we design a data structure with O(n log n) space that can find the closest pair of active points within a query interval of time in O(loglog n) time using a quadtree-based approach in the word-RAM model.

PDF file
In Proc. 27th Canadian Conference on Computational Geometry (CCCG), pages 141-144, 2015

Approximating the minimum closest pair distance and nearest neighbor distances of linearly moving points

(with Zahed Rahmati)

Given a set of n moving points in R^d, where each point moves along a linear trajectory at arbitrary but constant velocity, we present an O~(n^{5/3})-time algorithm to compute a (1+epsilon)-factor approximation to the minimum closest pair distance over time, for any constant epsilon>0 and any constant dimension d. This addresses an open problem posed by Gupta, Janardan, and Smid (1996).

More generally, we consider a data structure version of the problem: for any linearly moving query point q, we want a (1+epsilon)-factor approximation to the minimum nearest neighbor distance to q over time. We present a data structure that requires O~(n^{5/3}) space and O~(n^{2/3}) query time, O~(n^5) space and polylogarithmic query time, or O~(n) space and O~(n^{4/5}) query time, for any constant epsilon>0 and any constant dimension d.

PDF file
Computational Geometry: Theory and Applications, 60:2-7, 2017 (CCCG special issue)
In Proc. 27th Canadian Conference on Computational Geometry (CCCG), pages 136-140, 2015

Dynamic data structures for approximate Hausdorff distance in the word RAM

(with Konstantinos Tsakalidis)

We give a fully dynamic data structure for maintaining an approximation of the Hausdorff distance between two point sets in a constant dimension d, a standard problem in computational geometry. Our solution has an approximation factor of 1+epsilon for any constant epsilon>0 and expected update time O(log U/loglog n). The result of the paper greatly improves over the previous exact method, which required O~(n^{5/6}) time and worked only in a semi-online setting. The model of computation is the word RAM model.
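
To make the quantity being maintained concrete, here is a brute-force sketch that evaluates the exact (undirected) Hausdorff distance between two static point sets directly from its definition. It is a quadratic-time illustration only and has nothing to do with the dynamic data structure of the paper; the function name `hausdorff_distance` is a hypothetical choice, and both inputs are assumed non-empty.

```python
import math

def hausdorff_distance(a, b):
    """Exact Hausdorff distance between two non-empty point sets (brute force).

    a, b: lists of coordinate tuples of equal dimension.
    """
    def directed(p_set, q_set):
        # Largest distance from a point of p_set to its nearest neighbor in q_set.
        return max(min(math.dist(p, q) for q in q_set) for p in p_set)

    return max(directed(a, b), directed(b, a))
```

For example, with a = [(0, 0), (1, 0)] and b = [(0, 0), (4, 0)], the point (4, 0) lies at distance 3 from its nearest neighbor in a, and that value is the Hausdorff distance.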

PDF file
Computational Geometry: Theory and Applications, 60:37-44, 2017 (CCCG special issue)
In Proc. 27th Canadian Conference on Computational Geometry (CCCG), pages 151-155, 2015

Fast string dictionary lookup with one error

(with Moshe Lewenstein)

A set of strings, called a string dictionary, is a basic string data structure. The most primitive query, where one seeks the existence of a pattern in the dictionary, is called a lookup query. Answering approximate lookup queries, i.e., looking up the existence of a pattern with a bounded number of errors, is a fundamental string problem. Several data structures have been proposed to do so efficiently. Almost all solutions consider a single error, as will this result. Lately, Belazzougui and Venturini (CPM 2013) raised the question whether one can construct efficient indexes that support lookup queries with one error in optimal query time, that is, O(|p|/w + occ), where p is the query, w the machine word-size, and occ the number of occurrences.

Specifically, for the problem of one mismatch and constant alphabet size, we obtain optimal query time. For a dictionary of d strings our proposed index uses O(w d log^{1+eps}d) additional bit space (beyond the dictionary which can be maintained in compressed form). Our results are parameterized for a space-time tradeoff.

We propose more results for the case of lookup queries with one insertion/deletion on dictionaries over a constant sized alphabet. These results are especially effective for large patterns.

A simpler linear-time algorithm for intersecting two convex polyhedra in three dimensions

Chazelle [FOCS'89] gave a linear-time algorithm to compute the intersection of two convex polyhedra in three dimensions. We present a simpler algorithm to do the same.

Optimal deterministic algorithms for 2-d and 3-d shallow cuttings

We present optimal deterministic algorithms for constructing shallow cuttings in an arrangement of lines in two dimensions or planes in three dimensions. Our results improve the deterministic polynomial-time algorithm of Matousek (1992) and the optimal but randomized algorithm of Ramos (1999). This leads to efficient derandomization of previous algorithms for numerous well-studied problems in computational geometry, including halfspace range reporting in 2-d and 3-d, k nearest neighbors search in 2-d, (<= k)-levels in 3-d, order-k Voronoi diagrams in 2-d, linear programming with k violations in 2-d, dynamic convex hulls in 3-d, dynamic nearest neighbor search in 2-d, convex layers (onion peeling) in 3-d, epsilon-nets for halfspace ranges in 3-d, and more. As a side product we also describe an optimal deterministic algorithm for constructing standard (non-shallow) cuttings in two dimensions, which is arguably simpler than the known optimal algorithms by Matousek (1991) and Chazelle (1993).

Clustered integer 3SUM via additive combinatorics

(with Moshe Lewenstein)

We present a collection of new results on problems related to 3SUM, including:

The first truly subquadratic algorithm for
- computing the (min,+) convolution for monotone increasing sequences with integer values bounded by O(n),
- solving 3SUM for monotone sets in 2D with integer coordinates bounded by O(n), and
- preprocessing a binary string for histogram indexing (also called jumbled indexing).
The running time is O(n^{(9+sqrt{177})/12} polylog n)=O(n^{1.859}) with randomization, or O(n^{1.864}) deterministically. This greatly improves the previous n^2 / 2^{Omega(sqrt{log n})} time bound obtained from Williams' recent result on all-pairs shortest paths [STOC'14], and answers an open question raised by several researchers studying the histogram indexing problem.
The first algorithm for histogram indexing for any constant alphabet size that achieves truly subquadratic preprocessing time and truly sublinear query time.
A truly subquadratic algorithm for integer 3SUM in the case when the given set can be partitioned into n^{1-delta} clusters each covered by an interval of length n, for any constant delta > 0.
An algorithm to preprocess any set of n integers so that subsequently 3SUM on any given subset can be solved in O(n^{13/7} polylog n) time.

All these results are obtained by a surprising new technique, based on the Balog-Szemeredi-Gowers Theorem from additive combinatorics.
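
For readers unfamiliar with the first problem in the list, the (min,+) convolution of two sequences can be computed directly from its definition as in the sketch below. This is the naive quadratic method that the truly subquadratic results improve on for monotone bounded-integer inputs; it is not the paper's algorithm. The function name `min_plus_convolution` is a hypothetical choice, and both inputs are assumed non-empty.

```python
def min_plus_convolution(a, b):
    """Naive (min,+) convolution: c[k] = min over i + j = k of a[i] + b[j]."""
    c = [float("inf")] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x + y < c[i + j]:
                c[i + j] = x + y
    return c
```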

Speeding up the Four Russians algorithm by about one more logarithmic factor

We present a new combinatorial algorithm for Boolean matrix multiplication that runs in O(n^3 (loglog n)^3 / log^3 n) time. This improves the previous combinatorial algorithm by Bansal and Williams [FOCS'09] that runs in O(n^3 (loglog n)^2 / log^{9/4} n) time. Whereas Bansal and Williams' algorithm uses regularity lemmas for graphs, the new algorithm is simple and uses entirely elementary techniques: table lookup, word operations, plus a deceptively straightforward divide-and-conquer.

Our algorithm is in part inspired by a recent result of Impagliazzo, Lovett, Paturi, and Schneider (2014) on a different geometric problem, offline dominance range reporting; we improve their analysis for that problem as well.
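
The "word operations" ingredient can be illustrated with a small sketch that packs each row of the second matrix into a single integer and combines rows with bitwise OR. This shows only the word-packing idea; it omits the table lookup and the divide-and-conquer that give the stated bound, and it is not the algorithm of the paper. The function name `bool_matmul` is a hypothetical choice, and Python's arbitrary-precision integers stand in for machine words.

```python
def bool_matmul(a, b):
    """Boolean matrix product over (OR, AND) using bit-packed rows.

    a: n x m matrix, b: m x p matrix, both as lists of lists of 0/1.
    """
    p = len(b[0]) if b else 0
    # Pack row k of b into one integer: bit j is set iff b[k][j] == 1.
    rows_b = [sum(bit << j for j, bit in enumerate(row)) for row in b]
    out = []
    for row in a:
        acc = 0
        for k, bit in enumerate(row):
            if bit:
                acc |= rows_b[k]   # OR in a whole row of b at once
        out.append([(acc >> j) & 1 for j in range(p)])
    return out
```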

Drawing partially embedded and simultaneously planar graphs

(with Fabrizio Frati, Carsten Gutwenger, Anna Lubiw, Petra Mutzel, and Marcus Schaefer)

We investigate the problem of constructing planar drawings with few bends for two related problems, the partially embedded graph (PEG) problem to extend a straight-line planar drawing of a subgraph to a planar drawing of the whole graph and the simultaneous planarity (SEFE) problem to find planar drawings of two graphs that coincide on shared vertices and edges. In both cases we show that if the required planar drawings exist, then there are planar drawings with a linear number of bends per edge and, in the case of simultaneous planarity, a constant number of crossings between every pair of edges. Our proofs provide efficient algorithms if the combinatorial embedding information about the drawing is given. Our result on partially embedded graph drawing generalizes a classic result of Pach and Wenger showing that any planar graph can be drawn with fixed locations for its vertices and with a linear number of bends per edge.

Succinct indices for path minimum with applications to path reporting

(with Meng He, J. Ian Munro, and Gelin Zhou)

In the path minimum query problem, we preprocess a tree on n weighted nodes, such that given an arbitrary path, we can locate the node with the smallest weight along this path. We design novel succinct indices for this problem; one of our index structures supports queries in O(alpha(m,n)) time, and occupies O(m) bits of space in addition to the space required for the input tree, where m is an integer greater than or equal to n and alpha(m,n) is the inverse-Ackermann function. These indices give us the first succinct data structures for the path minimum problem, and allow us to obtain new data structures for path reporting queries, which report the nodes along a query path whose weights are within a query range. We achieve three different time/space tradeoffs for path reporting by designing (a) an O(n)-word structure with O(lg^eps n + occ lg^eps n) query time, where occ is the number of nodes reported; (b) an O(n lglg n)-word structure with O(lglg n + occ lglg n) query time; and (c) an O(n lg^eps n)-word structure with O(lglg n + occ) query time. These tradeoffs match the state of the art of two-dimensional orthogonal range reporting queries which can be treated as a special case of path reporting queries. When the number of distinct weights is much smaller than n, we further improve both the query time and the space cost of these three results.

On hardness of jumbled indexing

(with Amihood Amir, Moshe Lewenstein, and Noa Lewenstein)

Jumbled indexing is the problem of indexing a text T for queries that ask whether there is a substring of T matching a pattern represented as a Parikh vector, i.e., the vector of frequency counts for each character. Jumbled indexing has garnered a lot of interest in the last four years. There is a naive algorithm that preprocesses all answers in O(n^2 |Sigma|) time allowing quick queries afterwards, and there is another naive algorithm that requires no preprocessing but has O(n log |Sigma|) query time. Despite a tremendous amount of effort there has been little improvement over these running times.

In this paper we provide good reason for this. We show that, under a 3SUM-hardness assumption, jumbled indexing for alphabets of size omega(1) requires Omega(n^{2-epsilon}) preprocessing time or Omega(n^{1-delta}) query time for any epsilon,delta>0. In fact, under a stronger 3SUM-hardness assumption, for any constant alphabet size r >= 3 there exist describable fixed constant epsilon_r and delta_r such that jumbled indexing requires Omega(n^{2-epsilon_r}) preprocessing time or Omega(n^{1-delta_r}) query time.
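
The no-preprocessing baseline mentioned above can be sketched as a sliding window over the text. The version below compares the whole window histogram at every step, so it runs in about O(n |Sigma|) time rather than the O(n log |Sigma|) quoted in the abstract; it is meant only to illustrate what a jumbled query asks. The function name `jumbled_match` and the dictionary encoding of the Parikh vector are assumptions for illustration.

```python
from collections import Counter

def jumbled_match(text, pattern_counts):
    """True iff some substring of `text` has exactly the given character counts.

    pattern_counts: dict mapping each character to its required frequency
    (a Parikh vector). No preprocessing of `text` is used.
    """
    m = sum(pattern_counts.values())
    if m == 0:
        return True            # the empty substring matches the zero vector
    if m > len(text):
        return False
    target = Counter({c: k for c, k in pattern_counts.items() if k > 0})
    window = Counter(text[:m])
    if window == target:
        return True
    for i in range(m, len(text)):
        window[text[i]] += 1       # character entering the window
        leaving = text[i - m]
        window[leaving] -= 1       # character leaving the window
        if window[leaving] == 0:
            del window[leaving]    # keep only positive counts for comparison
        if window == target:
            return True
    return False
```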

Deterministic rectangle enclosure and offline dominance reporting on the RAM

(with Peyman Afshani and Konstantinos Tsakalidis)

We revisit a classical problem in computational geometry that has been studied since the 1980s: in the rectangle enclosure problem we want to report all k enclosing pairs of n input rectangles in 2D. We present the first deterministic algorithm that takes O(n log n + k) worst-case time and O(n) space in the word-RAM model. This improves previous deterministic algorithms with O((n log n + k) loglog n) running time. We achieve the result by derandomizing the algorithm of Chan, Larsen and Patrascu [SoCG'11] that attains the same time complexity but in expectation.

The 2D rectangle enclosure problem is related to the offline dominance range reporting problem in 4D, and our result leads to the currently fastest deterministic algorithm for offline dominance reporting in any constant dimension d >= 4.

A key tool behind Chan et al.'s previous randomized algorithm is shallow cuttings for 3D dominance ranges. Recently, Afshani and Tsakalidis [SODA'14] obtained a deterministic O(n log n)-time algorithm to construct such cuttings. We first present an improved deterministic construction algorithm that runs in O(n loglog n) time in the word-RAM; this result is of independent interest. Many additional ideas are then incorporated, including a linear-time algorithm for merging shallow cuttings and an algorithm for an offline tree point location problem.

Better epsilon-dependencies for offline approximate nearest neighbor search, Euclidean minimum spanning trees, and epsilon-kernels

(with Sunil Arya)

Recently, Arya, da Fonseca, and Mount [STOC 2011, SODA 2012] made notable progress in improving the epsilon-dependencies in the space/query-time tradeoffs for (1+epsilon)-factor approximate nearest neighbor search in fixed-dimensional Euclidean spaces. However, epsilon-dependencies in the preprocessing time were not considered, and so their data structures cannot be used to derive faster algorithms for offline proximity problems. Known algorithms for many such problems, including approximate bichromatic closest pair (BCP) and approximate Euclidean minimum spanning trees (EMST), typically have factors near (1/epsilon)^{d/2 +/- O(1)} in the running time when the dimension d is a constant.

We describe a technique that breaks the (1/epsilon)^{d/2} barrier and yields new results for many well-known proximity problems, including:

an O((1/epsilon)^{d/3+O(1)} n)-time randomized algorithm for approximate BCP,
an O((1/epsilon)^{d/3+O(1)} n log n)-time algorithm for approximate EMST, and
an O(n log n + (1/epsilon)^{d/3+O(1)} n)-time algorithm to answer n approximate nearest neighbor queries on n points.

Using additional bit-packing tricks, we can shave off the log n factor for EMST, and even move most of the epsilon-factors to a sublinear term.

The improvement arises from a new time bound for exact "discrete Voronoi diagrams", which were previously used in the construction of epsilon-kernels (or extent-based coresets), a well-known tool for another class of fundamental problems. This connection leads to more results, including:

a streaming algorithm to maintain an approximate diameter in O((1/epsilon)^{d/3+O(1)}) time per point using O((1/epsilon)^{d/2+O(1)}) space, and
a streaming algorithm to maintain an epsilon-kernel in O((1/epsilon)^{d/4+O(1)}) time per point using O((1/epsilon)^{d/2+O(1)}) space.

On constant factors in comparison-based geometric algorithms and data structures

(with Patrick Lee)

Many standard problems in computational geometry have been solved asymptotically optimally as far as comparison-based algorithms are concerned, but there has been little work focusing on improving the constant factors hidden in big-Oh bounds on the number of comparisons needed. In this paper, we consider orthogonal-type problems and present a number of results that achieve optimality in the constant factors of the leading terms, including:

an algorithm for the 2D maxima problem that uses n lg h + O(n sqrt{lg h}) comparisons, where h denotes the output size;
a randomized algorithm for the 3D maxima problem that uses n lg h + O(n lg^{2/3} h) expected number of comparisons;
a randomized algorithm for detecting intersections among a set of orthogonal line segments that uses n lg n + O(n sqrt{lg n}) expected number of comparisons;
a data structure for point location among 3D disjoint axis-parallel boxes that can answer queries in (3/2)lg n + O(lg lg n) comparisons;
a data structure for point location in a 3D box subdivision that can answer queries in (4/3)lg n + O(sqrt{lg n}) comparisons.

Some of the results can be adapted to solve nonorthogonal problems, such as 2D convex hulls and general line segment intersection.

Our algorithms and data structures use a variety of techniques, including Seidel and Adamy's planar point location method, weighted binary search, and height-optimal BSP trees.
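
For readers unfamiliar with the maxima problem, the following is a minimal Python sketch of the textbook sort-and-sweep method in 2D. It is a baseline for illustration only: it makes no attempt to match the comparison counts stated above, and the function name maxima_2d is a hypothetical choice, not taken from the paper.

```python
def maxima_2d(points):
    """Return the maximal points of a 2D point set.

    A point is maximal if no other point is at least as large in both
    coordinates. Duplicates are collapsed before the sweep.
    """
    result = []
    best_y = float("-inf")  # largest y seen so far in the sweep
    # Sweep from largest x to smallest; ties in x are visited by decreasing y.
    for x, y in sorted(set(points), key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            # No point visited earlier (hence with x at least as large)
            # has y at least as large, so (x, y) is maximal.
            result.append((x, y))
            best_y = y
    return result
```

The sort dominates the cost, so this sketch uses O(n lg n) comparisons regardless of the output size h.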

PDF file
Discrete and Computational Geometry, 53:489-513, 2015 (SoCG special issue)
In Proc. 30th Symposium on Computational Geometry (SoCG), pages 40-49, 2014

Selection and sorting in the "restore" model

(with J. Ian Munro and Venkatesh Raman)

We consider the classical selection and sorting problems in a model where the initial permutation of the input has to be restored after completing the computation. While the requirement of the restoration is stringent compared to the classical versions of the problems, this model is more relaxed than a read-only memory where the input elements are not allowed to be moved within the input array.

We first show that for a sequence of n integers, selection (finding the median or more generally the k-th smallest element for a given k) can be done in O(n) time using O(lg n) words of extra space in this model. In contrast, no linear-time selection algorithm is known which uses polylogarithmic space in the read-only memory model.

For sorting n integers in this model, we first present an O(n lg n)-time algorithm using O(lg n) words of extra space. When the universe size U is polynomial in n, we give a faster O(n)-time algorithm (analogous to radix sort) which uses O(n^eps) words of extra space for an arbitrarily small constant eps>0. More generally, we show how to match the time bound of any word-RAM integer-sorting algorithm using O(n^eps) words of extra space. In sharp contrast, there is an Omega(n^2/S)-time lower bound for integer sorting using O(S) bits of space in the read-only memory model. Extension of our results to arbitrary input types beyond integers is not possible: for "indivisible" input elements, we can prove the same Omega(n^2/S) lower bound for sorting in our model.

En route, we develop linear-time in-place algorithms to extract leading bits of the input array and to compress and decompress strings with low entropy; these techniques may be of independent interest.

PDF file (conference version)
ACM Transactions on Algorithms, 14(2): 11:1-11:18, 2018
In Proc. 25th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 995-1004, 2014

Finding median in read-only memory on integer input

(with J. Ian Munro and Venkatesh Raman)

Starting with Munro and Paterson (1980), the selection or median-finding problem has been extensively studied in the read-only memory model and in streaming models. Munro and Paterson's deterministic algorithm and its subsequent refinements require at least polylogarithmic or logarithmic space, whereas the algorithms by Munro and Raman (1996) and Raman and Ramnath (1999) can be made to use just O(1) storage cells but take O(n^{1+eps}) time for an arbitrarily small constant eps>0.

In this paper, we show that faster selection algorithms in read-only memory are possible if the input is a sequence of integers. For example, one algorithm uses O(1) storage cells and takes O(n lg U) time where U is the universe size. Another algorithm uses O(1) storage cells and takes O(n lg n lglg U) time. We also describe an O(n)-time algorithm for finding an approximate median using O(lg^eps U) storage cells.

All our algorithms are simple and deterministic. Interestingly, one of our algorithms works by making multiple calls to the textbook algorithm for finding the majority of a sequence of bits, in order to find the 'centroid' of the trie of the binary representation of the sequence of integers. This technique could be of independent interest.
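
To make the O(1)-cell, O(n lg U)-time regime concrete, here is a minimal Python sketch of one straightforward way to reach such a bound: fix the bits of the answer from most significant to least significant, rescanning the read-only input once per bit. This is an illustrative assumption, not necessarily the algorithm of the paper, and the name select_readonly and its parameters are hypothetical.

```python
def select_readonly(a, k, bits):
    """Return the k-th smallest (1-indexed) of the non-negative integers in a.

    The input a is only read, never modified; every value must be < 2**bits.
    Only a constant number of counters is kept besides the input.
    """
    prefix = 0  # high-order bits of the answer decided so far
    for b in range(bits - 1, -1, -1):
        # Count elements that agree with the decided prefix and have bit b = 0.
        zeros = 0
        for x in a:
            if (x >> (b + 1)) == prefix and not (x >> b) & 1:
                zeros += 1
        if k <= zeros:
            prefix = prefix << 1        # the answer has bit b = 0
        else:
            k -= zeros                  # skip the elements with bit b = 0
            prefix = (prefix << 1) | 1  # the answer has bit b = 1
    return prefix
```

Each of the lg U rounds makes one pass over the input, which gives O(n lg U) time in total.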

Minimum length embedding of planar graphs at fixed vertex locations

(with Hella-Franziska Hoffmann, Stephen Kiazyk, and Anna Lubiw)

We consider the problem of finding a planar embedding of a graph at fixed vertex locations that minimizes the total edge length. The problem is known to be NP-hard. We give polynomial time algorithms achieving an O(n^{1/2} log n) approximation for paths and matchings, and an O(n) approximation for general graphs.

Klee's measure problem made easy

We present a new algorithm for a classic problem in computational geometry, Klee's measure problem: given a set of n axis-parallel boxes in d-dimensional space, compute the volume of the union of the boxes. The algorithm runs in O(n^{d/2}) time for any constant d >= 3. Although it improves the previous best algorithm by "just" an iterated logarithmic factor, the real surprise lies in the simplicity of the new algorithm.

We also show that it is theoretically possible to beat the O(n^{d/2}) time bound by logarithmic factors for integer input in the word RAM model, and for other variants of the problem.

With additional work, we obtain an O(n^{d/3} polylog n)-time algorithm for the important special case of orthants or unit hypercubes (which include the so-called "hypervolume indicator problem"), and an O(n^{(d+1)/3} polylog n)-time algorithm for the case of arbitrary hypercubes or fat boxes, improving a previous O(n^{(d+2)/3})-time algorithm by Bringmann.
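
As a reference point for the problem statement, the sketch below computes the volume of a union of boxes by brute force over the grid induced by the box boundaries. It is far slower than O(n^{d/2}) (up to about (2n)^d grid cells, each tested against all boxes) and is not the algorithm of the paper; the name union_volume and the input layout are assumptions made for illustration.

```python
from itertools import product


def union_volume(boxes):
    """Volume of the union of axis-parallel boxes.

    Each box is a pair (lo, hi) of coordinate tuples of equal length d.
    """
    if not boxes:
        return 0
    d = len(boxes[0][0])
    # The distinct box boundaries along each axis define a grid.
    coords = [sorted({c for lo, hi in boxes for c in (lo[i], hi[i])})
              for i in range(d)]
    total = 0
    # Every grid cell lies either fully inside or fully outside each box.
    for idx in product(*(range(len(c) - 1) for c in coords)):
        cell_lo = [coords[i][j] for i, j in enumerate(idx)]
        cell_hi = [coords[i][j + 1] for i, j in enumerate(idx)]
        covered = any(
            all(lo[i] <= cell_lo[i] and cell_hi[i] <= hi[i] for i in range(d))
            for lo, hi in boxes
        )
        if covered:
            vol = 1
            for l, h in zip(cell_lo, cell_hi):
                vol *= h - l
            total += vol
    return total
```

A brute-force routine of this kind is mainly useful as a correctness oracle when testing a faster implementation on small inputs.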

Maximum-weight planar boxes in O(n^2) time (and better)

(with Jérémy Barbay, Gonzalo Navarro, and Pablo Pérez-Lantero)

Given a set P of n points in R^d, where each point p of P is associated with a weight w(p) (positive or negative), the Maximum-Weight Box problem consists in finding an axis-aligned box B maximizing the sum of w(p) over all points p in B. We describe algorithms for this problem in two dimensions that run in the worst case in O(n^2) time, and much less on more specific classes of instances. In particular, these results imply similar ones for the Maximum Bichromatic Discrepancy Box problem. These improve by a factor of log n on the best worst-case complexity previously known for these problems, O(n^2 lg n) [Cortes et al., J. Alg., 2009; Dobkin et al., J. Comput. Syst. Sci., 1996].
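
For comparison with the bounds above, here is a minimal Python sketch of a simple cubic-time baseline for the planar case: fix the bottom and top sides of the box, accumulate weights per x-column, and run Kadane's maximum-subarray scan over the columns. This is not the O(n^2) algorithm of the paper; the function name max_weight_box and the convention that the empty box (weight 0) is allowed are assumptions made for this sketch.

```python
def max_weight_box(points):
    """Maximum total weight of the points inside an axis-aligned box.

    points is a list of (x, y, w) triples. The empty box (weight 0) is allowed.
    """
    xs = sorted({x for x, _, _ in points})
    ys = sorted({y for _, y, _ in points})
    x_rank = {x: i for i, x in enumerate(xs)}
    rows = {y: [] for y in ys}  # points grouped by y-coordinate
    for x, y, w in points:
        rows[y].append((x_rank[x], w))

    best = 0
    for i in range(len(ys)):         # bottom side of the box
        col = [0] * len(xs)          # weight per x-column within the slab
        for j in range(i, len(ys)):  # top side of the box
            for r, w in rows[ys[j]]:
                col[r] += w
            # Kadane's scan: best contiguous run of columns in this slab.
            run = 0
            for c in col:
                run = max(0, run + c)
                best = max(best, run)
    return best
```

There are O(n^2) slabs and each scan costs O(n), so the sketch runs in O(n^3) time and O(n) extra space.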

Geometric red-blue set cover for unit squares and related problems

(with Nan Hu)

We study a geometric version of the Red-Blue Set Cover problem originally proposed by Carr, Doddi, Konjevod, and Marathe (SODA 2000): given a red point set, a blue point set, and a set of objects, we want to use objects to cover all the blue points, while minimizing the number of red points covered. We prove that the problem is NP-hard even when the objects are unit squares in 2D, and we give the first PTAS for this case. The technique we use simplifies and unifies previous PTASes for the weighted geometric set cover problem and the unique maximum coverage problem for 2D unit squares.

Smart-grid electricity allocation via strip packing with slicing

(with Soroush Alamdari, Therese Biedl, Elyot Grant, Krishnam Raju Jampani, S. Keshav, Anna Lubiw, and Vinayak Pathak)

One advantage of smart grids is that they can reduce the peak load by distributing electricity-demands over multiple short intervals. Finding a schedule that minimizes the peak load corresponds to a variant of a strip packing problem. Normally, for strip packing problems, a given set of axis-aligned rectangles must be packed into a fixed-width strip, and the goal is to minimize the height of the strip. The electricity-allocation application can be modelled as strip packing with slicing: each rectangle may be cut vertically into multiple slices and the slices may be packed into the strip as individual pieces. The stacking constraint forbids solutions in which a vertical line intersects two slices of the same rectangle.

We give a fully polynomial time approximation scheme for this problem, as well as a practical polynomial time algorithm that slices each rectangle at most once and yields a solution of height at most 5/3 times the optimal height.

How to morph planar graph drawings

(with Soroush Alamdari, Patrizio Angelini, Fidel Barrera-Cruz, Giordano Da Lozzo, Giuseppe Di Battista, Fabrizio Frati, Penny Haxell, Anna Lubiw, Maurizio Patrignani, Vincenzo Roselli, Sahil Singla, Bryan T. Wilkinson)

In 1944, Cairns proved the following theorem: given any two straight-line planar drawings of a triangulation with the same outer face, there exists a morph (i.e., a continuous transformation) between the two drawings so that the drawing remains straight-line planar at all times. Cairns's original proof required exponentially many morphing steps. We prove that there is a morph that consists of O(n^2) steps, where each step is a linear morph that moves each vertex at constant speed along a straight line. Using a known result on compatible triangulations, this implies that for a general planar graph G and any two straight-line planar drawings of G with the same embedding, there is a morph between the two drawings that preserves straight-line planarity and consists of O(n^4) steps.

PDF file (conference version)
SIAM Journal on Computing, 46(2):824-852, 2017
In Proc. 24th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1656-1667, 2013

Adaptive and approximate orthogonal range counting

(with Bryan T. Wilkinson)

We present three new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All the results are in the w-bit word RAM model.

It is well known that there are linear-space data structures for 2-D orthogonal range counting with worst-case optimal query time O(log n/loglog n). We give an O(n loglog n)-space adaptive data structure that improves the query time to O(loglog n + log k/loglog n), where k is the output count. When k=O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem [Chan, Larsen, and Patrascu, SoCG 2011].
We give an O(n loglog n)-space data structure for 2-D approximate orthogonal range counting that can compute a (1+delta)-factor approximation to the count in O(loglog n) time for any fixed constant delta > 0. Again, our bounds match the state of the art for the 2-D orthogonal range emptiness problem.
Lastly, we consider the 1-D range selection problem, where a query in an array involves finding the k-th least element in a given subarray. This problem is closely related to 3-sided 2-D orthogonal range counting. Recently, Jørgensen and Larsen [SODA 2011] presented a linear-space adaptive data structure with query time O(loglog n + log k/loglog n). We give a new linear-space structure that improves the query time to O(1 + log k/loglog n), exactly matching the lower bound proved by Jørgensen and Larsen.

PDF file (conference version)
ACM Transactions on Algorithms, 12(4):45:1-45:15, 2016
In Proc. 24th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 241-251, 2013

Self-approaching graphs

(with Soroush Alamdari, Elyot Grant, Anna Lubiw, and Vinayak Pathak)

In this paper we introduce self-approaching graph drawings. A straight-line drawing of a graph is self-approaching if, for any origin vertex s and any destination vertex t, there is an st-path in the graph such that, for any point q on the path, as a point p moves continuously along the path from the origin to q, the Euclidean distance from p to q is always decreasing. This is a more stringent condition than a greedy drawing (where only the distance between vertices on the path and the destination vertex must decrease), and guarantees that the drawing is a 5.33-spanner.

We study three topics: (1) recognizing self-approaching drawings; (2) constructing self-approaching drawings of a given graph; (3) constructing a self-approaching Steiner network connecting a given set of points.

We show that: (1) there are efficient algorithms to test if a polygonal path is self-approaching in R^2 and R^3, but it is NP-hard to test if a given graph drawing in R^3 has a self-approaching uv-path; (2) we can characterize the trees that have self-approaching drawings; (3) for any given set of terminal points in the plane, we can find a linear sized network that has a self-approaching path between any ordered pair of terminals.

Linear-space data structures for range minority query in arrays

(with Stephane Durocher, Matthew Skala, and Bryan T. Wilkinson)

We consider range queries in arrays that search for low-frequency elements: least frequent elements and alpha-minorities. An alpha-minority of a query range has multiplicity no greater than an alpha fraction of the elements in the range. Our data structure for the least frequent element range query problem requires O(n) space, O(n^{3/2}) preprocessing time, and O(sqrt{n}) query time. A reduction from boolean matrix multiplication to this problem shows the hardness of simultaneous improvements in both preprocessing time and query time. Our data structure for the alpha-minority range query problem requires O(n) space and O(1/alpha) query time, and allows alpha to be specified at query time.

Conflict-free coloring of points with respect to rectangles and approximation algorithms for discrete independent set

In the conflict-free coloring problem, for a given range space, we want to bound the minimum value F(n) such that every set P of n points can be colored with F(n) colors with the property that every nonempty range contains a unique color. We prove a new upper bound O(n^{0.368}) with respect to orthogonal ranges in two dimensions (i.e., axis-parallel rectangles), which is the first improvement over the previous bound O(n^{0.382}) by Ajwani, Elbassioni, Govindarajan, and Ray [SPAA'07]. This result leads to an O(n^{1-0.632/2^{d-2}}) upper bound with respect to orthogonal ranges (boxes) in dimension d, and also an O(n^{1-0.632/(2^{d-3}-0.368)}) upper bound with respect to dominance ranges (orthants) in dimension d >= 4.

We also observe that combinatorial results on conflict-free coloring can be applied to the analysis of approximation algorithms for discrete versions of geometric independent set problems. Here, given a set P of (weighted) points and a set S of ranges, we want to select the largest(-weight) subset Q of P with the property that every range of S contains at most one point of Q. We obtain, for example, a randomized O(n^{0.368})-approximation algorithm for this problem with respect to orthogonal ranges in the plane.

Linear-space data structures for range mode query in arrays

(with Stephane Durocher, Kasper Green Larsen, Jason Morrison, and Bryan T. Wilkinson)

A mode of a multiset S is an element a in S of maximum multiplicity; that is, a occurs at least as frequently as any other element in S. Given an array A[1:n] of n elements, we consider a basic problem: constructing a static data structure that efficiently answers range mode queries on A. Each query consists of an input pair of indices (i, j) for which a mode of A[i:j] must be returned. The best previous data structure with linear space, by Krizanc, Morin, and Smid (ISAAC 2003), requires O(sqrt{n} loglog n) query time. We improve their result and present an O(n)-space data structure that supports range mode queries in O(sqrt{n / log n}) worst-case time. Furthermore, we present strong evidence that a query time significantly below sqrt{n} cannot be achieved by purely combinatorial techniques; we show that boolean matrix multiplication of two sqrt{n} by sqrt{n} matrices reduces to n range mode queries in an array of size O(n). Additionally, we give linear-space data structures for orthogonal range mode in higher dimensions (queries in near O(n^{1-1/2d}) time) and for halfspace range mode in higher dimensions (queries in O(n^{1-1/d^2}) time).
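
To illustrate what a linear-space range mode structure can look like, here is a minimal Python sketch of a standard square-root decomposition. It is a simpler and slower relative of the structures discussed above (roughly sqrt(n) log n time per query), not the data structure of the paper. The class name RangeMode, its methods, and the 0-indexed inclusive query convention are assumptions for this sketch.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from math import isqrt


class RangeMode:
    def __init__(self, a):
        self.a = a
        n = len(a)
        self.s = max(1, isqrt(n))     # block size, about sqrt(n)
        self.pos = defaultdict(list)  # value -> sorted list of positions
        for i, v in enumerate(a):
            self.pos[v].append(i)
        nb = (n + self.s - 1) // self.s  # number of blocks
        # block_mode[b][e] is a mode of blocks b..e (inclusive): O(n) entries.
        self.block_mode = [[None] * nb for _ in range(nb)]
        for b in range(nb):
            count = defaultdict(int)
            best = None
            for i in range(b * self.s, n):
                v = a[i]
                count[v] += 1
                if best is None or count[v] > count[best]:
                    best = v
                if (i + 1) % self.s == 0 or i == n - 1:
                    self.block_mode[b][i // self.s] = best

    def _freq(self, v, i, j):
        # Number of occurrences of v in a[i..j], by binary search.
        p = self.pos[v]
        return bisect_right(p, j) - bisect_left(p, i)

    def query(self, i, j):
        """Return (mode, frequency) for a[i..j], inclusive and 0-indexed."""
        s = self.s
        first, last = (i + s - 1) // s, (j + 1) // s - 1  # full blocks in range
        candidates = set()
        if first <= last:
            candidates.add(self.block_mode[first][last])
            fringe = list(range(i, first * s)) + list(range((last + 1) * s, j + 1))
        else:
            fringe = range(i, j + 1)
        # A mode is either the mode of the full blocks or occurs in the fringe.
        candidates.update(self.a[t] for t in fringe)
        return max(((v, self._freq(v, i, j)) for v in candidates),
                   key=lambda t: t[1])
```

The table of block-to-block modes has about (sqrt(n))^2 = n entries, which keeps the space linear; when several values tie, any one of them may be returned.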

PDF file
Theory of Computing Systems, 55:719-741, 2014 (STACS special issue)
In Proc. 29th Symposium on Theoretical Aspects of Computer Science (STACS), pages 290-301, 2012

Weighted capacitated, priority, and geometric set cover via improved quasi-uniform sampling

(with Elyot Grant, Jochen Koenemann, and Malcolm Sharpe)

The minimum-weight set cover problem is widely known to be O(log n)-approximable, with no improvement possible in the general case. We take the approach of exploiting problem structure to achieve better results, by providing a geometry-inspired algorithm whose approximation guarantee depends solely on an instance-specific combinatorial property known as shallow cell complexity (SCC). Roughly speaking, a set cover instance has low SCC if any column-induced submatrix of the corresponding element-set incidence matrix has few distinct rows. By adapting and improving Varadarajan's recent quasi-uniform random sampling method for weighted geometric covering problems, we obtain strong approximation algorithms for a structurally rich class of weighted covering problems with low SCC.

Our main result has several immediate consequences. Among them, we settle an open question of Chakrabarty et al. by showing that weighted instances of the capacitated covering problem with underlying network structure have O(1)-approximations. Additionally, our improvements to Varadarajan's sampling framework yield several new results for weighted geometric set cover, hitting set, and dominating set problems. In particular, for weighted covering problems exhibiting linear (or near-linear) union complexity, we obtain approximability results agreeing with those known for the unweighted case. For example, we obtain a constant approximation for the weighted disk cover problem, improving upon the 2^{O(log* n)}-approximation known prior to our work and matching the O(1)-approximation known for the unweighted variant. We also obtain an O(log log* n)-approximation for weighted fat triangle cover.

Exact algorithms and APX-hardness results for geometric set cover

(with Elyot Grant)

We study several geometric set cover problems in which the goal is to compute a minimum cover of a given set of points in Euclidean space by a family of geometric objects. We give a short proof that this problem is APX-hard when the objects are axis-aligned fat rectangles, even when each rectangle is an epsilon-perturbed copy of a single unit square. We extend this result to several other classes of objects including almost-circular ellipses, axis-aligned slabs, downward shadows of line segments, downward shadows of graphs of cubic functions, 3-dimensional unit balls, and axis-aligned cubes, as well as some related hitting set problems. Our hardness results are all proven by encoding a highly structured minimum vertex cover problem which we believe may be of independent interest.

In contrast, we give a polynomial-time dynamic programming algorithm for 2-dimensional set cover where the objects are pseudodisks containing the origin or are downward shadows of pairwise 2-intersecting x-monotone curves. Our algorithm extends to the weighted case where a minimum-cost cover is required.

Bichromatic line segment intersection counting in O(n sqrt{log n}) time

(with Bryan T. Wilkinson)

We give an algorithm for bichromatic line segment intersection counting that runs in O(n sqrt{log n}) time under the word RAM model via a reduction to dynamic predecessor search, offline point location, and offline dynamic ranking. This algorithm is the first to solve bichromatic line segment intersection counting in o(n log n) time.

Streaming and dynamic algorithms for minimum enclosing balls in high dimensions

(with Vinayak Pathak)

At SODA'10, Agarwal and Sharathkumar presented a streaming algorithm for approximating the minimum enclosing ball of a set of points in d-dimensional Euclidean space. Their algorithm requires one pass, uses O(d) space, and was shown to have approximation factor at most (1+sqrt{3})/2 + eps ~ 1.3661. We prove that the same algorithm has approximation factor less than 1.22, which brings us much closer to a (1+sqrt{2})/2 ~ 1.207 lower bound given by Agarwal and Sharathkumar.

We also apply this technique to the dynamic version of the minimum enclosing ball problem (in the non-streaming setting). We give an O(dn)-space data structure that can maintain a 1.22-approximate minimum enclosing ball in O(d log n) expected amortized time per insertion/deletion.
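
For intuition about one-pass algorithms in O(d) space, the sketch below implements a much simpler rule than the algorithm analyzed above: keep a single ball, and whenever a new point falls outside it, replace it by the smallest ball that contains both the old ball and the new point. This is explicitly not Agarwal and Sharathkumar's algorithm and the 1.22 bound does not apply to it; the function name streaming_ball is hypothetical.

```python
import math


def streaming_ball(points):
    """One-pass enclosing ball: returns (center, radius) covering all points."""
    it = iter(points)
    center = list(next(it))  # start with a zero-radius ball at the first point
    radius = 0.0
    for p in it:
        dist = math.dist(center, p)
        if dist > radius:
            # Smallest ball containing the old ball and p: its diameter runs
            # from p through the old center to the far side of the old ball.
            new_radius = (dist + radius) / 2
            shift = (new_radius - radius) / dist  # fraction of the way to p
            center = [c + shift * (q - c) for c, q in zip(center, p)]
            radius = new_radius
    return center, radius
```

Only the current center and radius are stored, so the working space is O(d) regardless of the stream length.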

Closest pair and the post office problem for stochastic points

(with Pegah Kamousi and Subhash Suri)

Given a (master) set M of n points in d-dimensional Euclidean space, consider drawing a random subset that includes each point m_i in M with an independent probability p_i. How difficult is it to compute elementary statistics about the closest pair of points in such a subset? For instance, what is the probability that the distance between the closest pair of points in the random subset is no more than l, for a given value l? Or, can we preprocess the master set M such that given a query point q, we can efficiently estimate the expected distance from q to its nearest neighbor in the random subset? These basic computational geometry problems, whose complexity is quite well-understood in the deterministic setting, prove to be surprisingly hard in our stochastic setting. We obtain hardness results and approximation algorithms for stochastic problems of this kind.

Orthogonal range searching on the RAM, revisited

(with Kasper Green Larsen and Mihai Patrascu)

We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model:

We present two data structures for 2-d orthogonal range emptiness. The first achieves O(n lglg n) space and O(lglg n) query time, assuming that the n given points are in rank space. This improves the previous results by Alstrup, Brodal, and Rauhe (FOCS'00), with O(n lg^eps n) space and O(lglg n) query time, or with O(n lglg n) space and O(lg^2lg n) query time. Our second data structure uses O(n) space and answers queries in O(lg^eps n) time. The best previous O(n)-space data structure, due to Nekrich (WADS'07), answers queries in O(lg n/lglg n) time.
We give a data structure for 3-d orthogonal range reporting with O(n lg^{1+eps}n) space and O(lglg n + k) query time for points in rank space, for any constant eps>0. This improves the previous results by Afshani (ESA'08), Karpinski and Nekrich (COCOON'09), and Chan (SODA'11), with O(n lg^3 n) space and O(lglg n + k) query time, or with O(n lg^{1+eps}n) space and O(lg^2lg n + k) query time. Consequently, we obtain improved upper bounds for orthogonal range reporting in all constant dimensions above 3.
Our approach also leads to a new data structure for 2-d orthogonal range minimum queries with O(n lg^eps n) space and O(lglg n) query time for points in rank space.
We give a randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n lg n) plus the output size. This resolves two open problems (both appeared in Preparata and Shamos' seminal book):
(a) given a set of n axis-aligned rectangles in the plane, we can report all k enclosure pairs (i.e., pairs (r_1,r_2) where rectangle r_1 completely encloses rectangle r_2) in O(n lg n + k) expected time;
(b) given a set of n points in 4-d, we can find all maximal points (points not dominated by any other points) in O(n lg n) expected time.
The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lglg n) algorithm. The best previous result on (b) was an O(n lg n lglg n) algorithm due to Gabow, Bentley, and Tarjan, from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.

Three problems about dynamic convex hulls

We present three results related to dynamic convex hulls:

A fully dynamic data structure for maintaining a set of n points in the plane so that we can find the edges of the convex hull intersecting a query line, with expected query and amortized update time O(log^{1+eps}n) for an arbitrarily small constant eps>0. This improves the previous bound of O(log^{3/2}n).
A fully dynamic data structure for maintaining a set of n points in the plane to support halfplane range reporting queries in O(log n + k) time with O(polylog n) expected amortized update time. A similar result holds for 3-dimensional orthogonal range reporting. For 3-dimensional halfspace range reporting, the query time increases to O(log^2 n/loglog n + k).
A semi-online dynamic data structure for maintaining a set of n line segments in the plane, so that we can decide whether a query line segment lies completely above the lower envelope, with query time O(log n) and amortized update time O(n^eps). As a corollary, we can solve the following problem in O(n^{1+eps}) time: given a triangulated terrain in 3-d of size n, identify all faces that are partially visible from a fixed viewpoint.

Stochastic minimum spanning trees in Euclidean spaces

(with Pegah Kamousi and Subhash Suri)

We study the complexity of geometric minimum spanning trees under a stochastic model of input: Suppose we are given a master set of points {s_1,s_2,...,s_n} in d-dimensional Euclidean space, where each point s_i is active with some independent and arbitrary but known probability p_i. We want to compute the expected length of the minimum spanning tree (MST) of the active points. This particular form of stochastic problems has not been investigated before in computational geometry to our knowledge, and is motivated by uncertainty inherent in many sources of geometric data.

We show that this stochastic MST problem is #P-hard for any dimension d >= 2.
We present a simple fully polynomial randomized approximation scheme (FPRAS) in any metric, and thus also in any Euclidean, space.
For d=2, we present two deterministic approximation algorithms: an O(n^4)-time constant-factor algorithm, and a PTAS based on a combination of shifted quadtrees and dynamic programming.
Finally, for the related problem of approximating the tail bounds of the distribution of the MST length, we observe that no polynomial algorithm with any multiplicative factor is possible for d >= 2, assuming P != NP.

In addition to this existential model of stochastic input, we also briefly consider a locational model where each point is present with certainty but its location is probabilistic.
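
To make the existential model concrete, here is a minimal Python sketch of a naive Monte Carlo estimator: draw the active subset repeatedly, compute the MST length of each draw with Prim's algorithm, and average. This plain sampling loop is an illustration of the quantity being computed, not the FPRAS of the paper, and it carries no relative-error guarantee. The function names, the samples and seed parameters, and their defaults are assumptions.

```python
import math
import random


def mst_length(points):
    """Length of the Euclidean MST of points (Prim's algorithm, O(m^2) time)."""
    if len(points) < 2:
        return 0.0
    dist = [math.dist(points[0], p) for p in points]  # distance to the tree
    in_tree = [False] * len(points)
    in_tree[0] = True
    total = 0.0
    for _ in range(len(points) - 1):
        # Attach the point outside the tree that is closest to it.
        u = min((i for i in range(len(points)) if not in_tree[i]),
                key=dist.__getitem__)
        total += dist[u]
        in_tree[u] = True
        for v in range(len(points)):
            if not in_tree[v]:
                dist[v] = min(dist[v], math.dist(points[u], points[v]))
    return total


def expected_mst_estimate(points, probs, samples=1000, seed=0):
    """Average MST length over random draws of the active subset."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Point s_i is active independently with probability p_i.
        active = [p for p, pr in zip(points, probs) if rng.random() < pr]
        total += mst_length(active)
    return total / samples
```

When every probability is 1 the estimate coincides with the ordinary MST length, which gives a quick sanity check.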

Persistent predecessor search and orthogonal point location on the word RAM

We answer a basic data structuring question (for example, raised by Dietz and Raman back in SODA 1991): can van Emde Boas trees be made persistent, without changing their asymptotic query/update time? We present a (partially) persistent data structure that supports predecessor search in a set of integers in {1,...,U} under an arbitrary sequence of n insertions and deletions, with O(loglog U) expected query time and expected amortized update time, and O(n) space. The query bound is optimal in U for linear-space structures and improves previous near-O((loglog U)^2) methods.

The same method solves a fundamental problem from computational geometry: point location in orthogonal planar subdivisions (where edges are vertical or horizontal). We obtain the first static data structure achieving O(loglog U) worst-case query time and linear space. This result is again optimal in U for linear-space structures and improves the previous O((loglog U)^2) method by de Berg, Snoeyink, and van Kreveld (1992). The same result also holds for higher-dimensional subdivisions that are orthogonal binary space partitions, and for certain nonorthogonal planar subdivisions such as triangulations without small angles. Many geometric applications follow, including improved query times for orthogonal range reporting for dimensions >= 3 on the RAM.

Our key technique is an interesting new van-Emde-Boas-style recursion that alternates between two strategies, both quite simple.

PDF file
ACM Transactions on Algorithms, 9(3):22, 2013 (SODA special issue)
In Proc. 22nd ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1131-1145, 2011

Optimal partition trees

We revisit one of the most fundamental classes of data structure problems in computational geometry: range searching. Back in SoCG'92, Matousek gave a partition tree method for d-dimensional simplex range searching achieving O(n) space and O(n^{1-1/d}) query time. Although this method is generally believed to be optimal, it is complicated and requires O(n^{1+eps}) preprocessing time for any fixed eps > 0. An earlier method by Matousek (SoCG'91) requires O(n log n) preprocessing time but O(n^{1-1/d} polylog n) query time. We give a new method that achieves simultaneously O(n log n) preprocessing time, O(n) space, and O(n^{1-1/d}) query time with high probability. Our method has several advantages:

It is conceptually simpler than Matousek's SoCG'92 method. Our partition trees satisfy many ideal properties (e.g., constant degree, optimal crossing number at almost all layers, and disjointness of the children's cells at each node).
It leads to more efficient multilevel partition trees, which are important in many data structural applications (each level adds at most one logarithmic factor to the space and query bounds, better than in all previous methods).
A similar improvement applies to a shallow version of partition trees, yielding O(n log n) time, O(n) space, and O(n^{1-1/(d/2)}) query time for halfspace range emptiness in even dimensions d >= 4.

Numerous consequences follow (e.g., improved results for computing spanning trees with low crossing number, ray shooting among line segments, intersection searching, exact nearest neighbor search, linear programming queries, finding extreme points, ...).

Counting inversions, offline orthogonal range counting, and related problems

(with Mihai Patrascu)

We give an O(n sqrt{lg n})-time algorithm for counting the number of inversions in a permutation on n elements. This improves a long-standing previous bound of O(n lg n/lg lg n) that followed from Dietz's data structure [WADS'89], and answers a question of Andersson and Petersson [SODA'95]. As Dietz's result is known to be optimal for the related dynamic rank problem, our result demonstrates a significant improvement in the offline setting. Our new technique is quite simple: we perform a "vertical partitioning" of a trie (akin to van Emde Boas trees), and use ideas from external memory. However, the technique finds numerous applications: for example, we obtain

in d dimensions, an algorithm to answer n offline orthogonal range counting queries in time O(n lg^{d-2+1/d} n);
an improved construction time for online data structures for orthogonal range counting;
an improved update time for the partial sums problem;
faster Word RAM algorithms for finding the maximum depth in an arrangement of axis-aligned rectangles, and for the slope selection problem.

As a bonus, we also give a simple (1+epsilon)-approximation algorithm for counting inversions that runs in linear time, improving the previous O(n lg lg n) bound by Andersson and Petersson.
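
For orientation, the problem itself can be stated in a few lines of code. The sketch below is the textbook merge-sort method, which runs in O(n lg n) time; it is only a baseline for comparison and is not the O(n sqrt{lg n}) algorithm of the paper. The function name count_inversions is a hypothetical choice made for this illustration.

```python
def count_inversions(a):
    """Return the number of pairs i < j with a[i] > a[j] (classic merge-sort baseline)."""
    def sort_count(xs):
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_left = sort_count(xs[:mid])
        right, inv_right = sort_count(xs[mid:])
        merged = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(a))[1]
```

For example, count_inversions([2, 0, 3, 1]) returns 3, from the pairs (2,0), (2,1), and (3,1).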

PostScript file (preliminary version)
In Proc. 21st ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 161-173, 2010

Instance-optimal geometric algorithms

(with Peyman Afshani and Jérémy Barbay)

We prove the existence of an algorithm A for computing 2-d or 3-d convex hulls that is optimal for every point set in the following sense: for every set S of n points and for every algorithm A' in a certain class C, the maximum running time of A on input s_1,...,s_n is at most a constant factor times the maximum running time of A' on s_1,...,s_n, where the maximum is taken over all permutations s_1,...,s_n of S. In fact, we can establish a stronger property: for every S and A', the maximum running time of A is at most a constant factor times the average running time of A' over all permutations of S. We call algorithms satisfying these properties instance-optimal in the order-oblivious and random-order setting. Such instance-optimal algorithms simultaneously subsume output-sensitive algorithms and distribution-dependent average-case algorithms, and all algorithms that do not take advantage of the order of the input or that assume the input is given in a random order.

The class C under consideration consists of all algorithms in a decision tree model where the tests involve only multilinear functions with a constant number of arguments. To establish an instance-specific lower bound, we deviate from traditional Ben-Or-style proofs and adopt an interesting adversary argument. For 2-d convex hulls, we prove that a version of the well known algorithm by Kirkpatrick and Seidel (1986) or Chan, Snoeyink, and Yap (1995) already attains this lower bound. For 3-d convex hulls, we propose a new algorithm.

To demonstrate the potential of the concept, we further obtain instance-optimal results for a few other standard problems in computational geometry, such as maxima in 2-d and 3-d, orthogonal line segment intersection in 2-d, finding bichromatic L_infty-close pairs in 2-d, off-line orthogonal range searching in 2-d, off-line dominance reporting in 2-d and 3-d, off-line halfspace range reporting in 2-d and 3-d, and off-line point location in 2-d.

PostScript file (preliminary version)
Journal of the ACM, 64(1): 3:1-3:38, 2017
In Proc. 50th IEEE Symposium on Foundations of Computer Science (FOCS), pages 129-138, 2009

Approximation algorithms for maximum independent set of pseudo-disks

(with Sariel Har-Peled)

We present approximation algorithms for maximum independent set of pseudo-disks in the plane, both in the weighted and unweighted cases. For the unweighted case, we prove that a local search algorithm yields a PTAS. For the weighted case, we suggest a novel rounding scheme based on an LP relaxation of the problem, that leads to a constant-factor approximation.

Most previous algorithms for maximum independent set (in geometric settings) relied on packing arguments that are not applicable in this case. As such, the analysis of both algorithms requires some new combinatorial ideas, which we believe to be of independent interest.

Optimal in-place and cache-oblivious algorithms for 3-d convex hulls and 2-d segment intersection

We describe the first optimal randomized in-place algorithm for the basic 3-d convex hull problem (and, in particular, for 2-d Voronoi diagrams). The algorithm runs in O(n log n) expected time using only O(1) extra space; this improves the previous O(n log^3 n) bound by Bronnimann, Chan, and Chen [SoCG'04]. The same approach leads to an optimal randomized in-place algorithm for the 2-d line segment intersection problem, with O(n log n + K) expected running time for output size K, improving the previous O(n log^2 n + K) bound by Vahrenhold [WADS'05]. As a bonus, we also point out a simplification of a known optimal cache-oblivious (non-in-place) algorithm by Kumar and Ramos (2002) for 3-d convex hulls, and observe its applicability to 2-d segment intersection, extending a recent result for red/blue segment intersection by Arge, Molhave, and Zeh [ESA'08]. Our results are all obtained by standard random sampling techniques, with some interesting twists.

Quake heaps: a simple alternative to Fibonacci heaps

This note describes a data structure that has the same theoretical performance as Fibonacci heaps, supporting decrease-key operations in O(1) amortized time and delete-min operations in O(log n) amortized time. The data structure is simple to explain and analyze, and may be of pedagogical value.

Comparison-based time-space lower bounds for selection

We establish the first nontrivial lower bounds on time-space tradeoffs for the selection problem. We prove that any comparison-based randomized algorithm for finding the median requires Omega(n log log_S n) expected time in the RAM model (or more generally in the comparison branching program model), if we have S bits of extra space besides the read-only input array. This bound is tight for all S >> log n, and remains true even if the array is given in a random order. Our result thus answers a 16-year-old question of Munro and Raman, and also complements recent lower bounds that are restricted to sequential access, as in the multi-pass streaming model [Chakrabarti et al., SODA 2008].

We also prove that any comparison-based, deterministic, multi-pass streaming algorithm for finding the median requires Omega(n log^* (n/s) + n log_s n) worst-case time (in scanning plus comparisons), if we have s cells of space. This bound is also tight for all s >> log^2 n. We get deterministic lower bounds for I/O-efficient algorithms as well.

All proofs in this paper involve "elementary" techniques only.

Optimal halfspace range reporting in three dimensions

We give the first optimal solution to a standard problem in computational geometry: three-dimensional halfspace range reporting. We show that n points in 3-d can be stored in a linear-space data structure so that all k points inside a query halfspace can be reported in O(log n + k) time. The data structure can be built in O(n log n) expected time. The previous methods with optimal query time required superlinear (O(n log log n)) space.

We also mention consequences, for example, to higher dimensions and to external-memory data structures. As an aside, we partially answer another open question concerning the crossing number in Matousek's shallow partition theorem in the 3-d case (a tool used in many known halfspace range reporting methods).

Dynamic connectivity: connecting to networks and geometry

(with Mihai Patrascu and Liam Roditty)

Dynamic connectivity is a well-studied problem, but so far the most compelling progress has been confined to the edge-update model: maintain an understanding of connectivity in an undirected graph, subject to edge insertions and deletions. In this paper, we study two more challenging, yet equally fundamental problems:

Subgraph connectivity asks to maintain an understanding of connectivity under vertex updates: updates can turn vertices on and off, and queries refer to the subgraph induced by on vertices. (For instance, this is closer to applications in networks of routers, where node faults may occur.) We describe a data structure supporting vertex updates in O~(m^{2/3}) amortized time, where m denotes the number of edges in the graph. This greatly improves over the previous result [STOC'02], which required fast matrix multiplication and had an update time of O(m^{0.94}). The new data structure is also simpler.

Geometric connectivity asks to maintain a dynamic set of n geometric objects, and query connectivity in their intersection graph. (For instance, the intersection graph of balls describes connectivity in a network of sensors with bounded transmission radius.) Previously, nontrivial fully dynamic results were known only for special cases like axis-parallel line segments and rectangles. We provide similarly improved update times, O~(n^{2/3}), for these special cases. Moreover, we show how to obtain sublinear update bounds for virtually all families of geometric objects which allow sublinear-time range queries. In particular, we obtain the first sublinear update time for arbitrary 2D line segments: O~(n^{9/10}); for d-dimensional simplices: O~(n^{1-1/(d(2d+1))}); and for d-dimensional balls: O~(n^{1-1/((d+1)(2d+3))}).

A (slightly) faster algorithm for Klee's measure problem

Given n axis-parallel boxes in a fixed dimension d >= 3, how efficiently can we compute the volume of the union? This standard problem in computational geometry, commonly referred to as Klee's measure problem, can be solved in time O(n^{d/2} log n) by an algorithm of Overmars and Yap (FOCS 1988). We give the first (albeit small) improvement: our new algorithm runs in time n^{d/2} 2^{O(log* n)}, where log* denotes the iterated logarithm.

For the related problem of computing the depth in an arrangement of n boxes, we further improve the time bound to near O(n^{d/2} / log^{d/2-1} n), ignoring log log n factors. Other applications and lower-bound possibilities are discussed. The ideas behind the improved algorithms are simple.
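
To make the problem statement concrete, here is a minimal brute-force sketch that computes the volume of a union of boxes by coordinate compression: it cuts space along every box boundary and adds up the grid cells covered by some box. It takes roughly O(n^{d+1}) time, so it is far slower than the bounds discussed above and is not the method of Overmars and Yap or of this paper. The name union_volume and the (lo, hi) input convention are assumptions made for this illustration.

```python
from itertools import product


def union_volume(boxes):
    """Volume of the union of axis-parallel boxes.

    boxes: list of (lo, hi) pairs, where lo and hi are d-tuples of coordinates.
    Brute-force baseline via coordinate compression.
    """
    if not boxes:
        return 0
    d = len(boxes[0][0])
    # Distinct boundary coordinates along each axis, in sorted order.
    coords = [sorted({b[k][i] for b in boxes for k in (0, 1)}) for i in range(d)]
    # Consecutive coordinates along each axis define the sides of the grid cells.
    sides = [list(zip(c, c[1:])) for c in coords]
    total = 0
    for cell in product(*sides):
        covered = any(
            all(lo[i] <= cell[i][0] and cell[i][1] <= hi[i] for i in range(d))
            for lo, hi in boxes
        )
        if covered:
            vol = 1
            for a, b in cell:
                vol *= b - a
            total += vol
    return total
```

For two cubes of side 2 with corners at (0,0,0) and (1,1,1), the function returns 8 + 8 - 1 = 15.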

Dynamic coresets

We give a dynamic data structure that can maintain an epsilon-coreset of n points, with respect to the extent measure, in O(log n) time for any constant epsilon > 0 and any constant dimension. The previous method by Agarwal, Har-Peled, and Varadarajan requires polylogarithmic update time. For points with integer coordinates bounded by U, we alternatively get O(log log U) time. Numerous applications follow, for example, on dynamically approximating the width, smallest enclosing cylinder, minimum bounding box, or minimum-width annulus. We can also use the same approach to maintain approximate k-centers in O(min{log n, log log U}) randomized amortized time for any constant k and any constant dimension. For the smallest enclosing cylinder problem, we also show that a constant-factor approximation can be maintained in O(1) randomized amortized time on the word RAM.

On levels in arrangements of curves, III: further improvements

We revisit the problem of bounding the combinatorial complexity of the k-level in a two-dimensional arrangement of n curves. We give a number of small improvements over the results from the author's previous paper (FOCS'03). For example:

For pseudo-parabolas, we obtain an upper bound of O(n^{3/2} log n), which improves the previous bound of O(n^{3/2} log^2 n).
For 3-intersecting curves, we obtain an upper bound of O(n^{2-1/(3+sqrt{7})}) = O(n^{1.823}), the first improvement over the previous bound of O(n^{11/6}) = O(n^{1.834}).
For s-intersecting curves or curve segments with s >= 3, we obtain an upper bound of O(n^{2 - 1/(2s) - delta_s}) if s is odd, and O(n^{2 - 1/(2(s-1)) - delta_s}) if s is even, for some constant delta_s > 0.
For pseudo-segments, we obtain an upper bound of O(n^{4/3} log^{1/3-delta} n) for some constant delta > 0; the previous bound was O(n^{4/3} log^{2/3} n).
For s-intersecting curve segments such that all but B pairs intersect at most once, we obtain an upper bound of O((n^{4/3} + n^{1+delta}B^{1/3-delta})log^{1/3-delta} n + B) for some constant delta > 0.

We also observe that better concrete bounds for k-levels for constant values of n could in principle lead to better asymptotic bounds for arbitrary n.

Well-separated pair decomposition in linear time?

Given a point set in a fixed dimension, we note that a well-separated pair decomposition can be found in linear time if we assume that the ratio of the farthest pair distance to the closest pair distance is polynomially bounded. Many consequences follow; for example, we can construct spanners or solve the all-nearest-neighbors problem in linear time (under the same assumption), and we can compute an approximate Euclidean minimum spanning tree in linear time (without any assumption).

PostScript file (Manuscript, 2007)
Information Processing Letters, 107:138-141, 2008

In-place 2-d nearest neighbor search

We revisit a classic problem in computational geometry: preprocessing a planar n-point set to answer nearest neighbor queries. In SoCG 2004, Bronnimann, Chan, and Chen showed that it is possible to design an efficient data structure that takes no extra space at all other than the input array holding a permutation of the points. The best query time known for such "in-place data structures" is O(log^2 n). In this paper, we break the O(log^2 n) barrier by providing a method that answers nearest neighbor queries in time

O((log n)^{log_{3/2} 2} loglog n) = O(log^{1.71} n).

The new method uses divide-and-conquer (based on planar separators) in a way that is quite unlike traditional point location methods, and extends previous 1-d data structuring techniques (specifically the van Emde Boas layout). The method has further applications, for example, in answering extreme point queries for a 3-d point set on the boundary of a convex set of constant complexity.

On the bichromatic k-set problem

We study a bichromatic version of the well-known k-set problem: given two sets R and B of points of total size n and an integer k, how many subsets of the form (R\cap h) \cup (B - h) can have size exactly k over all halfspaces h? In the dual, the problem is asymptotically equivalent to determining the worst-case combinatorial complexity of the k-level in an arrangement of n halfspaces.

Disproving a conjecture by Linhart (1993), we present the first nontrivial upper bound for all k << n in two dimensions: O(nk^{1/3} + n^{5/6-e}k^{2/3+2e} + k^2) for any fixed e > 0. In three dimensions, we obtain the bound O(nk^{3/2} + n^{0.5034}k^{2.4932} + k^3). Incidentally, this also implies a new upper bound for the original k-set problem in four dimensions: O(n^2k^{3/2} + n^{1.5034}k^{2.4932} + nk^3), which improves the best previous result for all k << n^{0.923}. Extensions to other cases, such as arrangements of disks, are also discussed.

Transdichotomous results in computational geometry, II: offline search

(with Mihai Patrascu)

We reexamine fundamental problems from computational geometry in the word RAM model, where input coordinates are integers that fit in a machine word. We develop a new algorithm for offline point location, a two-dimensional analog of sorting where one needs to order points with respect to segments. This result implies, for example, that the Voronoi diagram of n points in the plane can be constructed in (randomized) time n 2^{O(sqrt{lg lg n})}. Similar bounds hold for numerous other geometric problems, such as three-dimensional convex hulls, planar Euclidean minimum spanning trees, line segment intersection, and triangulation of non-simple polygons.

In FOCS'06, we developed a data structure for online point location, which implied a bound of O(n lg n / lg lg n) for Voronoi diagrams and the other problems. Our current bounds are dramatically better, and a convincing improvement over the classic O(n lg n) algorithms. As in the field of integer sorting, the main challenge is to find ways to manipulate information, while avoiding the online problem (in that case, predecessor search).

arXiv version
Submitted to ACM Transactions on Algorithms
In Proc. 39th ACM Symposium on Theory of Computing (STOC), pages 31-39, 2007

More algorithms for all-pairs shortest paths in weighted graphs

In the first part of the paper, we reexamine the all-pairs shortest paths (APSP) problem and present a new algorithm with running time approaching O(n^3 log^3log n / log^2 n), which improves all known algorithms for general real-weighted dense graphs.

In the second part of the paper, we use fast matrix multiplication to obtain truly subcubic APSP algorithms for a large class of "geometrically weighted" graphs, where the weight of an edge is a function of the coordinates of its vertices. For example, for graphs embedded in Euclidean space of a constant dimension d, we obtain a time bound near O(n^{3-(3-w)/(2d+4)}), where w < 2.376; in two dimensions, this is O(n^{2.922}). Our framework greatly extends the previously considered case of small-integer-weighted graphs, and incidentally also yields the first truly subcubic result (near O(n^{3-(3-w)/4}) = O(n^{2.844}) time) for APSP in real-vertex-weighted graphs, as well as an improved result (near O(n^{(3+w)/2}) = O(n^{2.688}) time) for the all-pairs lightest shortest path problem for small-integer-weighted graphs.
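
The numerical exponents quoted above follow from substituting the stated bound w < 2.376 (and d = 2 where applicable) into each expression. The arithmetic, written out as a check:

```latex
\begin{align*}
3 - \frac{3-w}{2d+4}\bigg|_{d=2,\; w=2.376} &= 3 - \frac{0.624}{8} = 3 - 0.078 = 2.922, \\
3 - \frac{3-w}{4}\bigg|_{w=2.376} &= 3 - 0.156 = 2.844, \\
\frac{3+w}{2}\bigg|_{w=2.376} &= \frac{5.376}{2} = 2.688.
\end{align*}
```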

On approximate range counting and depth

(with Hamid Zarrabi-Zadeh)

Improving previous methods by Aronov and Har-Peled (SODA'05) and Kaplan and Sharir (SODA'06), we present a randomized data structure of O(n) expected size which can answer 3D approximate halfspace range counting queries in O(log (n/k)) expected time, where k is the actual value of the count. This is the first optimal method for the problem in the standard decision tree model; moreover, unlike previous methods, the new method is Las Vegas instead of Monte Carlo. In addition, we describe new results for several related problems, including approximate Tukey depth queries in 3D, approximate regression depth queries in 2D, and approximate linear programming with violations in low dimensions.

PostScript file
Discrete and Computational Geometry, 42:3-21, 2009 (SoCG special issue)
In Proc. 23rd ACM Symposium on Computational Geometry (SoCG), pages 337-343, 2007

Transdichotomous results in computational geometry, I: Point location in sublogarithmic time

(with Mihai Patrascu)

Given a planar subdivision whose coordinates are integers bounded by U <= 2^w, we present a linear-space data structure that can answer point location queries in O(min{ lg n/lglg n, sqrt{lg U/lglg U} }) time on the unit-cost RAM with word size w. This is the first result to beat the standard Theta(lg n) bound for infinite precision models.

As a consequence, we obtain the first o(n lg n) (randomized) algorithms for many fundamental problems in computational geometry for arbitrary integer input on the word RAM, including: constructing the convex hull of a three-dimensional point set, computing the Voronoi diagram or the Euclidean minimum spanning tree of a planar point set, triangulating a polygon with holes, and finding intersections among a set of line segments. Higher-dimensional extensions and applications are also discussed.

Though computational geometry with bounded precision input has been investigated for a long time, improvements have been limited largely to problems of an orthogonal flavor. Our results surpass this long-standing limitation, answering, for example, a question of Willard (SODA'92).

PostScript file (journal version)
SIAM Journal on Computing, 39:703-729, 2009 (FOCS special issue)
The journal version combines my paper in Proc. 47th IEEE Symposium on Foundations of Computer Science (FOCS), pages 333-342, 2006 with Mihai's paper in Proc. 47th IEEE Symposium on Foundations of Computer Science (FOCS), pages 325-332, 2006

A randomized algorithm for online unit clustering

In this paper, we consider the online version of the following problem: partition a set of input points into subsets, each enclosable by a unit ball, so as to minimize the number of subsets used. In the one-dimensional case, we show that surprisingly the naive upper bound of 2 on the competitive ratio can be beaten: we present a new randomized 15/8-competitive online algorithm. We also provide some lower bounds and an extension to two dimensions.
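
As a point of reference, the naive bound of 2 in one dimension can be obtained by a fixed grid: assuming a cluster is a set of points that fits in an interval of length 1, assign each arriving point to the grid cell [i, i+1) containing it and open one cluster per nonempty cell. Any unit interval overlaps at most two cells, so at most twice the optimal number of clusters is used. The sketch below shows only this baseline, not the randomized 15/8-competitive algorithm; the class name GridUnitClustering is hypothetical.

```python
import math


class GridUnitClustering:
    """Naive online 1-d unit clustering on a fixed grid (2-competitive baseline)."""

    def __init__(self):
        # Maps a grid cell index i (the cell [i, i+1)) to its cluster id.
        self.cluster_of_cell = {}

    def insert(self, x):
        """Irrevocably assign point x to a cluster and return the cluster id."""
        cell = math.floor(x)
        if cell not in self.cluster_of_cell:
            self.cluster_of_cell[cell] = len(self.cluster_of_cell)
        return self.cluster_of_cell[cell]

    def num_clusters(self):
        return len(self.cluster_of_cell)
```

Each decision is made immediately on arrival and never revised, which is what the online model requires.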

Dynamic connectivity for axis-parallel rectangles

(with Hamid Zarrabi-Zadeh)

In this paper we give a fully dynamic data structure to maintain the connectivity of the intersection graph of n axis-parallel rectangles. The amortized update time (insertion and deletion of rectangles) is O(n^{10/11} polylog n) and the query time (deciding whether two given rectangles are connected) is O(1). It slightly improves the update time (O(n^{0.94})) of the previous method while drastically reducing the query time (near O(n^{1/3})). Our method does not use fast matrix multiplication results and supports a wider range of queries.

Necklaces, convolutions, and X+Y

(with David Bremner, Erik D. Demaine, Jeff Erickson, Ferran Hurtado, John Iacono, Stefan Langerman, Mihai Patrascu, and Perouz Taslakian)

We give subquadratic algorithms that, given two necklaces each with n beads at arbitrary positions, compute the optimal rotation of the necklaces to best align the beads. Here alignment is measured according to the l_p norm of the vector of distances between pairs of beads from opposite necklaces in the best perfect matching. We show surprisingly different results for p=1, p=2, and p=infty. For p=2, we reduce the problem to standard convolution, while for p=infty and p=1, we reduce the problem to (min,+) convolution and (median,+) convolution. Then we solve the latter two convolution problems in subquadratic time, which are interesting results in their own right. These results shed some light on the classic sorting X+Y problem, because the convolutions can be viewed as computing order statistics on the antidiagonals of the X+Y matrix. All of our algorithms run in o(n^2) time, whereas the obvious algorithms for these problems run in Theta(n^2) time.
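
For readers unfamiliar with the operation, the following sketch shows the obvious Theta(n^2) method for (min,+) convolution under one common definition, c[k] = min over i + j = k of a[i] + b[j]. It is the quadratic baseline that the subquadratic algorithms improve on, not one of those algorithms. The function name is a hypothetical choice.

```python
def min_plus_convolution(a, b):
    """Quadratic-time (min,+) convolution: c[k] = min over i + j = k of a[i] + b[j]."""
    if not a or not b:
        return []
    c = [float("inf")] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            # Candidate value for the antidiagonal i + j.
            if x + y < c[i + j]:
                c[i + j] = x + y
    return c
```

Each output entry is a minimum over one antidiagonal of the matrix of pairwise sums, which is the connection to the X+Y matrix mentioned above. For instance, min_plus_convolution([1, 5, 2], [0, 3, 1]) returns [1, 4, 2, 5, 3].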

A simple streaming algorithm for minimum enclosing balls

We analyze an extremely simple approximation algorithm for computing the minimum enclosing ball (or the 1-center) of a set of points in high dimensions. We prove that this algorithm computes a 3/2-factor approximation in any dimension using minimum space in just one pass over the data points.
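
The abstract does not spell out the update rule, so the following is a sketch under an assumption: a natural one-pass rule that keeps a single ball and, whenever a new point falls outside it, replaces it by the smallest ball enclosing both the old ball and the new point. Only a center and a radius are stored. The function name is hypothetical, and the code makes no claim to match the paper's algorithm or its analysis in detail.

```python
import math


def streaming_enclosing_ball(points):
    """One-pass enclosing ball: grow the current ball minimally to cover each new point.

    Assumed update rule for illustration; stores only a center and a radius.
    """
    it = iter(points)
    try:
        first = next(it)
    except StopIteration:
        raise ValueError("at least one point is required")
    center = [float(x) for x in first]
    radius = 0.0
    for p in it:
        dist = math.dist(center, p)
        if dist > radius:
            # The smallest ball containing the old ball and p has diameter
            # dist + radius; its center moves toward p by (dist - radius) / 2.
            shift = (dist - radius) / (2 * dist)
            center = [c + shift * (x - c) for c, x in zip(center, p)]
            radius = (dist + radius) / 2
    return center, radius
```

Because each new ball contains the previous one, every point seen so far stays enclosed. On the points (0,0), (2,0), (1,1) the result is the ball with center (1,0) and radius 1.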

In Proc. 18th Canadian Conference on Computational Geometry (CCCG), pages 139-142, 2006

A minimalist's implementation of an approximate nearest neighbor algorithm in fixed dimensions

We consider the standard problem of approximate nearest neighbor search, for a given set of n points with integer coordinates in a constant-dimensional Euclidean space. We describe a simple implementation of a randomized algorithm that guarantees O(log n) expected query time and O(n log n) preprocessing time. The entire C++ code is under 100 lines long and requires no extra space other than the input array. The algorithm can easily be made dynamic as well.

PostScript file (Manuscript, 2006)
Source code

A dynamic data structure for 3-d convex hulls and 2-d nearest neighbor queries

We present a fully dynamic randomized data structure that can answer queries about the convex hull of a set of n points in three dimensions, where insertions take O(log^3 n) expected amortized time, deletions take O(log^6 n) expected amortized time, and extreme-point queries take O(log^2 n) worst-case time. This is the first method that guarantees polylogarithmic update and query cost for arbitrary sequences of insertions and deletions, and improves the previous O(n^epsilon)-time method by Agarwal and Matousek a decade ago. As a consequence, we obtain similar results for nearest neighbor queries in two dimensions and improved results for numerous fundamental geometric problems (such as levels in three dimensions and dynamic Euclidean minimum spanning trees in the plane).

All-pairs shortest paths for unweighted undirected graphs in o(mn) time

We revisit the all-pairs-shortest-paths problem for an unweighted undirected graph with n vertices and m edges. We present new algorithms with the following running times:

O(mn / log n) if m > n log n logloglog n
O(mn loglog n / log n) if m > n loglog n
O(n^2 log^2log n / log n) if m <= n loglog n.

These represent the best time bounds known for the problem for all m << n^{1.376}. We also obtain a similar type of result for the diameter problem for unweighted directed graphs.
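
The O(mn) bound that these results improve on is what one gets from running a breadth-first search from every vertex. A minimal sketch of that baseline follows; it is not one of the new algorithms. Vertices are assumed to be numbered 0 to n-1, and the function name is hypothetical.

```python
import math
from collections import deque


def apsp_unweighted(n, edges):
    """All-pairs shortest paths in an unweighted undirected graph by n BFS runs.

    Takes O(n(n + m)) time. Unreachable pairs get distance math.inf.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        row = dist[s]
        row[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if row[v] == math.inf:
                    row[v] = row[u] + 1
                    queue.append(v)
    return dist
```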

Approximating the piercing number for unit-height rectangles

(with Abdullah-Al Mahmood)

The piercing problem seeks the minimum number of points for a set of objects such that each object contains at least one of the points. We present a polynomial-time approximation scheme (PTAS) for the piercing problem for a set of axis-parallel unit-height rectangles. We also examine the problem in a dynamic setting and show how to maintain a factor-2 approximation under insertions in logarithmic amortized time, by solving an incremental version of the maximum independent set problem for interval graphs.
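
The one-dimensional building block mentioned above has a classic static solution: for closed intervals, scanning in order of right endpoint and placing a point at the right endpoint of every interval not yet pierced gives a minimum piercing set, and the intervals that trigger a new point form a maximum independent set. The sketch below shows this static greedy only; it is neither the PTAS for rectangles nor the incremental data structure. The function name is hypothetical.

```python
def pierce_intervals(intervals):
    """Minimum set of points piercing all closed intervals (lo, hi), by the greedy scan."""
    points = []
    last = None
    for lo, hi in sorted(intervals, key=lambda iv: iv[1]):
        # The interval is not pierced by the most recent point: pierce it as far
        # right as possible so that the point also serves later intervals.
        if last is None or lo > last:
            last = hi
            points.append(hi)
    return points
```

For the intervals (0,2), (1,3), (4,5), (2,4), the function returns [2, 5].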

In Proc. 17th Canadian Conference on Computational Geometry (CCCG), pages 2-5, 2005

Approximation algorithms for maximum cliques in 3D unit-disk graphs

We study two problems for a given n-point set in 3-space: finding a largest subset with diameter at most one, and finding a subset of k points with minimum diameter. For the former problem we suggest several polynomial-time algorithms with constant approximation factors, the best of which has factor pi / arccos(1/3) < 2.553. For the latter problem we observe that there is a polynomial-time approximation scheme.

In Proc. 17th Canadian Conference on Computational Geometry (CCCG), pages 6-9, 2005

Space-efficient algorithms for Klee's measure problem

We give space-efficient geometric algorithms for two related problems. Given a set of n axis-aligned rectangles in the plane, we calculate the area covered by the union of these rectangles (Klee's measure problem) in O(n^{3/2} log n) time with O(sqrt{n}) extra space. If the input can be destroyed and there are no degenerate cases and input coordinates are all integers, we can solve Klee's measure problem in O(n log^2 n) time with O(log^2 n) extra space. Given a set of n points in the plane, we find the axis-aligned unit square that covers the maximum number of points in O(n log^3 n) time with O(log^2 n) extra space.

In Proc. 17th Canadian Conference on Computational Geometry (CCCG), pages 38-41, 2005

All-pairs shortest paths with real weights in O(n^3 / log n) time

We describe an O(n^3 / log n)-time algorithm for the all-pairs-shortest-paths problem for a real-weighted directed graph with n vertices. This slightly improves a series of previous, slightly subcubic algorithms by Fredman (1976), Takaoka (1992), Dobosiewicz (1990), Han (2004), Takaoka (2004), and Zwick (2004). The new algorithm is surprisingly simple and different from previous ones.

Multi-pass geometric algorithms

We initiate the study of exact geometric algorithms that require limited storage and make only a small number of passes over the input. Fundamental problems such as low-dimensional linear programming and convex hulls are considered.

PostScript file
Discrete and Computational Geometry, 37:79-102, 2007 (SoCG special issue)
In Proc. 21st ACM Symposium on Computational Geometry (SoCG), pages 180-189, 2005

On levels in arrangements of surfaces in three dimensions

A favorite open problem in combinatorial geometry is to determine the worst-case complexity of a level in an arrangement. Up to now, nontrivial upper bounds in three dimensions are known only for the linear cases of planes and triangles. We propose the first technique that can deal with more general surfaces in three dimensions. For example, in an arrangement of n "pseudo-planes" or "pseudo-spheres" (where each triple of surfaces has at most two common intersections), we prove that there are at most O(n^{2.997}) vertices of any given level.

PDF file (improved journal version)
Discrete & Computational Geometry, 48:1-18, 2012
In Proc. 16th ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 232-240, 2005

Finding the shortest bottleneck edge in a parametric minimum spanning tree

Given a parametric graph with n vertices and m edges where edge weights change linearly over time, we show how to find the time value at which the heaviest edge weight in the minimum spanning tree is minimized in O(n(m/n)^epsilon log n + m) expected time...

Three problems about simple polygons

We give three related algorithmic results concerning a simple polygon P:

Continuing previous efforts by Bespamyatnikh, Biedl, Bose, Czyzowicz, E. Demaine, M. Demaine, Kim, Kranakis, Lubiw, Maheshwari, Morin, Shin, Toussaint, Vigneron, and Yang, we show how to find a largest pair of disjoint congruent disks inside P in linear expected time.
As a subroutine for the above result, we show how to find the convex hull of any given subset of the vertices of P in linear worst-case time.
More generally, we show how to compute a triangulation of any given subset of the vertices or edges of P in almost linear time.

PostScript file (9/04 version)
Computational Geometry: Theory and Applications, 35:209-217, 2006
(The main open question about the third problem was solved by van Kreveld, Loeffler, and Mitchell (2008).)

Geometric optimization problems over sliding windows

(with Bashir S. Sadjad)

We study the problem of maintaining a (1+epsilon)-factor approximation of the diameter of a stream of points under the sliding window model. In one dimension, we give a simple algorithm that only needs to store O((1/epsilon) log R) points at any time, where the parameter R denotes the "spread" of the point set. This bound is optimal and improves Feigenbaum, Kannan, and Zhang's recent solution by two logarithmic factors. We then extend our one-dimensional algorithm to higher constant dimensions and, at the same time, correct an error in the previous solution. In high nonconstant dimensions, we also observe a constant-factor approximation algorithm that requires sublinear space. Related optimization problems, such as the width, are also considered in the two-dimensional case.
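
To illustrate the sliding window model in one dimension, where the diameter is simply the maximum minus the minimum of the current window, here is an exact baseline based on two monotonic deques. In the worst case it stores every point of the window, which is the cost that the O((1/epsilon) log R) bound avoids by allowing a (1+epsilon)-factor approximation; it is not the algorithm of the paper. The function name and the count-based window parameter are assumptions for this illustration.

```python
from collections import deque


def sliding_window_diameters(stream, window):
    """Exact 1-d diameter (max - min) of the last `window` points after each arrival."""
    if window <= 0:
        raise ValueError("window must be positive")
    maxq, minq = deque(), deque()  # (index, value) pairs; values are monotone in each deque
    out = []
    for i, x in enumerate(stream):
        # Drop candidates that can never again be the window maximum or minimum.
        while maxq and maxq[-1][1] <= x:
            maxq.pop()
        maxq.append((i, x))
        while minq and minq[-1][1] >= x:
            minq.pop()
        minq.append((i, x))
        # Expire points that have left the window.
        while maxq[0][0] <= i - window:
            maxq.popleft()
        while minq[0][0] <= i - window:
            minq.popleft()
        out.append(maxq[0][1] - minq[0][1])
    return out
```

For the stream 1, 5, 2, 2, 9 with a window of 3 points, the successive diameters are 0, 4, 4, 3, 7.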

PostScript file | PDF file
International Journal of Computational Geometry and Applications, 16:145-157, 2006 (ISAAC special issue)
In Proc. 15th International Symposium on Algorithms and Computation (ISAAC), Lecture Notes in Computer Science, vol. 3341, pages 246-258, 2004 (Bashir received a best student paper award)

Faster core-set constructions and data-stream algorithms in fixed dimensions

We speed up previous (1+epsilon)-factor approximation algorithms for a number of geometric optimization problems in fixed dimensions: diameter, width, minimum-radius enclosing cylinder, minimum-width annulus, minimum-volume bounding box, minimum-width cylindrical shell, etc. Linear time bounds were known before; we further improve the dependence of the "constants" in terms of epsilon.

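We next consider the data stream model and present new (1+epsilon)-factor approximation algorithms that need only constant space for all of the above problems in any fixed dimension. Previously, such a result was known only for diameter. Both sets of results are obtained using the core-set framework recently proposed by Agarwal, Har-Peled, and Varadarajan.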

Towards in-place geometric algorithms and data structures

(with Hervé Brönnimann and Eric Y. Chen)

For many geometric problems, there are efficient algorithms that surprisingly use very little extra space other than the given array holding the input. For many geometric query problems, there are efficient data structures that need no extra space at all other than an array holding a permutation of the input. In this paper, we obtain the first such space-economical solutions for a number of fundamental problems, including three-dimensional convex hulls, two-dimensional Delaunay triangulations, fixed-dimensional range queries, and fixed-dimensional nearest neighbor queries.

PDF file (2/06 version, with several improvements(!))
In Proc. 20th ACM Symposium on Computational Geometry (SoCG), pages 239-246, 2004

Space-efficient algorithms for computing the convex hull of a simple polygonal line in linear time

(with Hervé Brönnimann)

We present space-efficient algorithms for computing the convex hull of a simple polygonal line in-place, in linear time. It turns out that the problem is as hard as stable partition, i.e., if there were a truly simple solution then stable partition would also have a truly simple solution, and vice versa. Nevertheless, we present a simple self-contained solution that uses O(log n) space, and indicate how to improve it to O(1) space with the same techniques used for stable partition. If the points inside the convex hull can be discarded, then there is a truly simple solution that uses a single call to stable partition, and even that call can be spared if only extreme points are desired (and not their order). If the polygonal line is closed, then the problem admits a very simple solution which does not call for stable partitioning at all.
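
For reference, the classical linear-time method for this problem is Melkman's deque-based algorithm, sketched below. It is not the space-efficient algorithm of the paper: the deque takes O(n) extra space, which is exactly the cost the in-place versions avoid. The sketch assumes the input is a simple polygonal line whose first three points are not collinear.

```python
from collections import deque

def cross(o, a, b):
    """Positive if o -> a -> b is a left (counterclockwise) turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def melkman(poly):
    """Convex hull (counterclockwise) of a simple polyline in linear time."""
    p0, p1, p2 = poly[:3]
    # The deque holds the hull in counterclockwise order, with the most
    # recently added hull vertex stored at both ends.
    if cross(p0, p1, p2) > 0:
        d = deque([p2, p0, p1, p2])
    else:
        d = deque([p2, p1, p0, p2])
    for p in poly[3:]:
        # p lies strictly inside the current hull: nothing changes.
        if cross(d[-2], d[-1], p) > 0 and cross(d[0], d[1], p) > 0:
            continue
        # Restore convexity at the top end, then at the bottom end.
        while cross(d[-2], d[-1], p) <= 0:
            d.pop()
        d.append(p)
        while cross(d[0], d[1], p) <= 0:
            d.popleft()
        d.appendleft(p)
    return list(d)[:-1]   # drop the duplicated end vertex
```

For the polyline (0,0), (4,0), (4,3), (2,1), (0,3), the point (2,1) is discarded as interior and the other four points form the hull.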

An optimal randomized algorithm for maximum Tukey depth

We present the first optimal algorithm to compute the maximum Tukey depth (also known as location or halfspace depth) for a non-degenerate point set in the plane. The algorithm is randomized and requires O(n log n) expected time for n data points. In a higher fixed dimension d >= 3, the expected time bound is O(n^{d-1}), which is probably optimal as well. The result is obtained using an interesting variant of the author's randomized optimization technique, capable of solving "implicit" linear-programming-type problems; some other applications of this technique are briefly mentioned.

A note on maximum independent sets in rectangle intersection graphs

Finding the maximum independent set in the intersection graph of n axis-parallel rectangles is NP-hard. We re-examine two known approximation results for this problem. For the case of rectangles of unit height, Agarwal, van Kreveld, and Suri (1997) gave a (1+1/k)-factor algorithm with an O(n log n + n^{2k-1}) time bound for any integer constant k >= 1; we describe a similar algorithm running in only O(n log n + nD^{k-1}) time, where D <= n denotes the maximum number of rectangles a point can be in. For the general case, Berman, DasGupta, Muthukrishnan, and Ramaswami (2001) gave a log_k n-factor algorithm with an O(n^{k+1}) time bound for any integer constant k >= 2; we describe similar algorithms running in O(n log n + nD^{k-2}) and n^{O(k/log k)} time.

On levels in arrangements of curves, II: a simple inequality and its consequence

We give a surprisingly short proof that in any planar arrangement of n curves where each pair intersects at most a fixed number (s) of times, the k-level has subquadratic (O(n^{2-1/(2s)})) complexity. This answers one of the main open problems from the author's previous paper (FOCS'00), which provided a weaker bound for a restricted class of curves only (graphs of degree-s polynomials). When combined with existing tools (cutting curves, sampling, etc.), the new idea generates a slew of improved k-level results for most of the curve families studied earlier, including a near-O(n^{3/2}) bound for parabolas.

A space-efficient algorithm for segment intersection