first promotor: dr. Kevin A.B. Verbeek (TU/e)
second promotor: prof.dr. Bettina Speckmann (TU/e)
co-promotor: dr. Wouter Meulemans (TU/e)
Eindhoven University of Technology
Date: 25 August, 2020
A large amount of data that is collected nowadays is time-varying: air traffic, stock prices and weather are all examples of every-day data that changes over time. To aid humans in their decision making, this data has to be analyzed quickly and communicated effectively. Algorithms and visualizations together play an important part in both these tasks; hence a variety of techniques is already developed and widely available. For time-varying data it is especially important to not only communicate data at one point in time, but also to show the evolution of the data over time. To preserve temporal patterns in visualizations, it is important that they are stable: small changes in the data should result in small changes in the visualization. Thus far, the lack of theoretical tools to analyze and develop stable algorithms has hindered the development of stable visualizations.
In this thesis we set out to tackle the shortage of theoretical tools by introducing a framework for analyzing the stability of algorithms. More specifically, the framework allows us to better understand the trade-offs between stability and traditional criteria for evaluating algorithms, such as solution quality and running time. For our framework we subdivide algorithms for time-varying data into stateless, state-aware and clairvoyant algorithms, which respectively have access to data at the current time step, data calculated from previous time steps or data at every time step. Depending on the type of algorithm, different levels of stability can be achieved. The framework provides three definitions for measuring stability that each address different aspects. The event stability of a problem counts the number of times the combinatorial structure of the output changes. This is closely related to the efficiency of so-called kinetic data structures. The topological and Lipschitz stability both expect the output of an algorithm to change continuously, by imposing a topology or metric on the output space of the algorithm. Lipschitz stability additionally limits how fast an output can change depending on the change in the input. Because of this constraint, Lipschitz stability comes closest to our intuitive definition of stability, namely small changes in the input should lead to small changes in the solution, and is therefore the preferred type of analysis. However, Lipschitz stability analysis is often prohibitively challenging or infeasible. The other types of stability analysis simplify the stability requirements, making the analysis significantly easier. Although these types of stability analysis do not fully capture all aspects of stability, they do offer useful insight into the interplay between problem instances, solutions, and the optimization function. These insights are invaluable for the development of stable algorithms.
We first show how to use the framework by analyzing the stability of the kinetic Euclidean minimum spanning tree problem. The input for this problem consists of a set of moving points, which should be connected by straight lines between the points, while minimizing the total line length. We show how to improve the event stability, while approximating an optimal solution. Furthermore, we give upper and lower bounds on the topological stability for different ways of introducing continuity on spanning trees, and show how to find a simple lower bound on Lipschitz stability when the output changes slowly with respect to the input.
Second, we turn to the kinetic k-center problem. Here the input again is a set of moving points, which are covered by a set of k disks or squares whose size should be minimized. For a natural way of introducing continuity, it is impossible to achieve Lipschitz stability for this problem, when k is at least 3. Hence, we analyze only the topological stability. Furthermore, we propose a clairvoyant algorithm to calculate an upper bound on the approximation ratio required for a topologically stable solution to an instance of the k-center problem.
Third, we analyze orientation-based shape descriptors on a set of moving points. As the points move, the shape descriptor should give a low-complexity description of the point set. The size of orientation-based shape descriptors is minimized by being in a certain orientation. We investigate the topological and Lipschitz stability of three such shape descriptors, first principal component, oriented bounding box, and covering strip. We start by proving that no topologically stable stateless algorithm can exist that approximates any of these three shape descriptors, and we therefore turn to state-aware algorithms. To analyze the Lipschitz stability we introduce a natural kind of state-aware algorithm, called a chasing algorithm. We prove that such an algorithm, which always moves the stable solution towards an optimal solution, approximates an optimal solution well while changing at a speed proportional to the changes in the input.
Finally, we consider an application that requires stability. Inspired by our work on shape descriptors, we develop stable methods to generate a visual summary, a visualization method that shows changes in time-varying data in a static way. The visual summaries we study use one-dimensional representations, such as linearizations, of the data at every time step. We propose stable versions of popular dimensionality-reduction techniques, such as principal components, to create 1D representations that do not only improve the summaries visually, but are also measurably of higher quality. An extensive quantitative evaluation of the existing and newly introduced methods shows that stable techniques improve on stability, without sacrificing the ability of the individual linearizations to represent the underlying data truthfully.