Optimal Geometric Data Structures

Amirali Khosravi

Promotor: Prof.dr. M.T. de Berg (TU/e)
Technische Universiteit Eindhoven
Date: 10 January, 2012


Spatial data structures form a core ingredient of many geometric algorithms, both in theory and in practice. Many of these data structures, especially the ones used in practice, are based on partitioning the underlying space (examples are binary space partitions and decompositions of polygons) or partitioning the set of objects (examples are bounding-volume hierarchies).

The efficiency of such data structures—and, hence, of the algorithms that use them—depends on certain characteristics of the partitioning. For example the performance of many algorithms that use binary space partitions (BSPs) depends on the size of the BSP. Similarly, the performance of answering range queries using bounding-volume hierarchies (BVHs) depends on the so-called crossing number that can be associated with the partitioning on which the BVH is based. Much research has been done on the problem of computing partitionings whose characteristics are good in the worst case. In this thesis, we studied the problem from a different point of view, namely instance-optimality. In particular, we considered the following question: given a class of geometric partitioning structures—like BSPs, simplicial partitions, polygon triangulations—and a cost function—like size or crossing number—can we design an algorithm that computes a structure whose cost is optimal or close to optimal for any input instance (rather than only worst-case optimal). We studied the problem of finding optimal data structures for some of the most important spatial data structures.

We started our research by studying BSPs for line segments in the plane, where the cost function is the size of the BSP. A popular type of BSPs for line segments are the so-called auto-partitions. We proved that finding an optimal auto-partition is NP-hard. In fact, finding out if a set of input segments admits an auto-partition without any cuts is already NP-hard. We also studied the relation between two other types of BSPs, called free and restricted BSPs, and showed that the number of cuts of an optimal restricted BSP for a set of segments in R2 is at most twice the number of cuts of an optimal free BSP for that set.

Then we turned our attention to so-called rectilinear r-partitions for planar point sets, with the crossing number as cost function. A rectilinear r-partition of a point set P is a partitioning of P into r subsets, each having roughly |P|/r points. The crossing number of the partition is defined using the bounding boxes of the subsets; in particular, it is the maximum number of bounding boxes that can be intersected by any horizontal or vertical line. We performed some theoretical as well as experimental studies on rectilinear r-partitions. On the theoretical side, we proved that computing a rectilinear r-partition with optimal stabbing number for a given set of points and parameter~r is NP-hard. We also proposed an exact algorithm for finding optimal rectilinear r-partitions whose running time is polynomial when r is a constant, and a faster 2-approximation algorithm. Our last theoretical result showed that considering only partitions whose bounding boxes are disjoint is not sufficient for finding optimal rectilinear r-partitions. On the experimental side, we performed a comparison between four different heuristics for constructing rectilinear r-partitions. The so-called windmill kd-tree gave the best results.

The third structure we studied was decomposition of a simple polygon. The cost function we considered was again the stabbing number, which is defined in this case as the maximum number of regions in the decomposition that can be stabbed by a segment inside the polygon. We proposed a 3-approximation for finding an optimal rectangular decomposition of a rectilinear polygon, and an O(1)-approximation for finding an optimal Steiner triangulation of a simple polygon.

Finally, we considered another optimization problem, namely how to approximate a piecewise-linear function F with another piecewise-linear function with fewer pieces. Here one can distinguish two versions of the problem. The first one is called the min-k problem; the goal is then to approximate the function within a given error ε such that the resulting function has the minimum number of links. The second one is called the min-ε problem; here the goal is to find an approximation with at most k links (for a given k) such that the error is minimized. These problems have already been studied before. Our contribution is to consider the problem for so-calleduncertain functions, where the value of the input function F at its vertices is given as a discrete set of different values, each with an associated probability. We show how to compute an approximation that minimizes the expected error.