Robust non-parametric data approximation of pointsets via data reduction

Jason Morrison

In this paper we present a novel non-parametric method of simplifying piecewise linear curves and we apply this method as a statistical approximation of structure within sequential data in the plane. We consider the problem of minimizing the average length of sequences of consecutive input points that lie on any one side of the simplified curve. Specifically, given a sequence P of n points in the plane that determine a simple polygonal chain consisting of n-1 segments, we describe algorithms for selecting an ordered subset Q ⊂ P (including the first and last points of P) that determines a second polygonal chain to estimate P, such that the number of crossings between the two polygonal chains is maximized, and the cardinality of Q is minimized among all such maximizing subsets of P. Our algorithms have respective running times O(n2 log n) when P is monotonic and O(n2 log2 n) when P is an arbitrary simple polyline. Finally, we examine the application of our algorithms iteratively in a bootstrapping technique to define a smooth robust non-parametric approximation of the original montonic sequence.