This article is about the best way to generate random convex polygons. A convex polygon is a polygon where the straight line segment connecting any two points on the inside never crosses the boundary. Alternatively, they are polygons where you always turn the same direction when walking along the boundary. Convex polygons are nice to work with for a variety of reasons, so, especially for testing purposes, it would be nice to be able to generate lots of different convex polygons. Here, we'll do that by generating a truly random convex polygon.
Before we can talk about how to generate something, we have to know exactly what we want. In this case, we want a convex polygon - that much is clear. But what does it mean for this polygon to be random? There are uncountably many convex polygons. Should our method be able to generate all of them? Is it okay if it generates certain polygons more than others?
There are many possible answers to these questions, leading to different methods. Here, we will restrict ourselves to all convex polygons in the [0,1] × [0,1] unit square, and we would like each polygon to be equally likely.
That last restriction is key. There are many ways to generate convex polygons that result in biased distributions. For example, generating a lot of points at random inside a circle and taking their convex hull gets you a random convex polygon. But it will probably look a lot like a circle, whereas most convex polygons do not.
The easiest way to do this is by rejection sampling. We generate a random set of points; if they form a convex polygon, we return it, otherwise we try again.
This definitely satisfies our first condition that every convex polygon in the unit square can be generated. Does it satisfy our second condition too? Well, the body of the loop generates every set of points in the unit square with equal probability. The only thing that happens afterwards is checking whether the points form a convex polygon. So the point sets that pass this test will follow the same distribution as before, with each convex set being equally likely. Thus, it satisfies our second condition as well.
We found an algorithm that meets our criteria. Does this mean that we're done? Well, we left one important criterium out: the algorithm should be efficient as well. In other words, it shouldn't take too long to generate even large convex polygons. So how long does it take to generate an n-point convex polygon with rejection sampling? If we assume that we can generate a random number between 0 and 1 in constant time, the loop body takes O(n) time. But how many times does the loop execute? Since it depends on random points, the answer could range from one to infinite, so we should look at the expected number of iterations. This is simply 1/p, where p is the probability of termination. This probability is tricky to compute, but luckily someone else has done the hard work for us.
In 1995, Pavel Valtr proved that \( p = \left( \displaystyle\frac{\binom{2n - 2}{n-1}}{n!} \right)^2 \), and so the expected number of iterations is \( \displaystyle\frac{1}{p} = \left( \displaystyle\frac{n!}{\binom{2n - 2}{n-1}} \right)^2 = \Omega(n^n) \). It takes my computer about 5 milliseconds to generate a convex 10-gon with rejection sampling. This means that I would have to wait over a century for even a convex 20-gon. Ouch.
Thankfully, we can use Valtr's proof to give a faster algorithm that generates convex polygons with the same distribution as our rejection sampling method. The algorithm works as follows:
The following animation shows each of these steps in action.
If you like reading code, or want to play around with it, I also made a Java implementation of the algorithm.
Remember that I said most convex polygons do not look like circles? It turns out they look like a strange cross between a square and a circle. In particular, when we make the number of points large enough, they converge to the curve \( \{(x,y) | \sqrt{1 - |x|} + \sqrt{1 - |y|} = 1\} \), shown below on the right. The left image shows the distribution of points on convex polygons of increasing size. It starts off uniform for random triangles (any three points form a triangle), but converges to the curve as the number of points grows.
Copyright © 2017 Sander Verdonschot
The text of this article is licensed under a Creative Commons Attribution 3.0 Unported License.