COMP4804: Algorithms II
$\newcommand{\R}{\mathbb{R}}\DeclareMathOperator{\E}{\mathbf{E}}\DeclareMathOperator{\deg}{deg}$
Closest Pair and Linear Programming

This lecture looks at two randomized algorithms for two geometric problems. Both are randomized incremental algorithms: they each compute a random permutation of the input and then solve the problem incrementally, adding one input element at a time and updating the current solution after each addition.

Closest Pair

For simplicity, we will assume that all inter-point distances are unique, so there are no ties. Recall that a hash table can store a set of key/value pairs so that we can test whether any key is in the set and look up its value in $O(1)$ expected time.

We will solve the closest-pair problem with this randomized incremental algorithm:
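A minimal Python sketch of one way to implement this (the grid bookkeeping and function names are illustrative assumptions, not part of the analysis below): store the points in a hash table keyed by the cells of a grid with cells of side length $\delta$, the current closest-pair distance, so that a new point only needs to be compared against the points in the $3\times 3$ block of cells around its own cell.

    import math
    import random

    def build_grid(pts, d):
        # Hash table mapping each grid cell of side length d to the
        # points stored in that cell.
        grid = {}
        for (x, y) in pts:
            grid.setdefault((x // d, y // d), []).append((x, y))
        return grid

    def closest_pair_distance(points):
        p = list(points)
        random.shuffle(p)                  # random permutation of the input
        d = math.dist(p[0], p[1])          # closest-pair distance so far
        grid = build_grid(p[:2], d)
        for i in range(2, len(p)):
            (x, y) = p[i]
            (cx, cy) = (x // d, y // d)    # grid cell containing p[i]
            best = d
            # Any point within distance d of p[i] lies in the 3x3 block of
            # cells around p[i]'s cell, and each cell holds O(1) points
            # because the stored points are pairwise at distance >= d.
            for gx in (cx - 1, cx, cx + 1):
                for gy in (cy - 1, cy, cy + 1):
                    for q in grid.get((gx, gy), []):
                        best = min(best, math.dist(p[i], q))
            if best < d:
                # p[i] is part of the new closest pair: rebuild the grid
                # with the smaller cell size, which takes O(i) time.
                d = best
                grid = build_grid(p[:i + 1], d)
            else:
                # Otherwise, insert p[i] in O(1) expected time.
                grid.setdefault((cx, cy), []).append((x, y))
        return d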

Lemma: The probability that $p_i$ is part of the closest pair among $p_1,\ldots,p_i$ is at most $2/i$.

Proof: Remember that $p_1,\ldots,p_n$ is a random permutation of $P$, so $p_1,\ldots,p_i$ is a random permutation of $P_i=\{p_1,\ldots,p_i\}$. Exactly two elements $a$ and $b$ of $P_i$ define the closest pair, and $p_i$ is equally likely to be any of the $i$ elements of $P_i$, so the probability that $p_i\in\{a,b\}$ is $2/i$. ∎

Theorem: Given a set of $n$ points in $\R^2$, we can find the closest pair in $O(n)$ expected time.

Proof: After computing the random permutation, the rest of the algorithm examines each point, does a constant amount of work, and possibly rebuilds the hash table. Let $c>0$ be some constant and, for each $i\in\{2,\ldots,n\}$, let
\[ I_i = \begin{cases} ci & \text{if $p_i$ is part of the closest pair of $p_1,\ldots,p_i$} \\ 0 & \text{otherwise .} \end{cases} \] Exercise: Check that $\E[I_i] \le 2c$. Then, the expected amount of time spent rebuilding hash tables is \[ \E\left[\sum_{i=2}^n I_i\right] = \sum_{i=2}^n \E[I_i] \le \sum_{i=2}^n 2c = \sum_{i=2}^n O(1) = O(n) \enspace . \] The other steps of the algorithm easily run in $O(n)$ time. If you're not sure how the random permutation is generated, you can read about the Fisher–Yates Shuffle. ∎
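
The Fisher–Yates Shuffle, for reference, produces a uniformly random permutation in $O(n)$ time. A short Python sketch:

    import random

    def fisher_yates(a):
        # Shuffle the list a in place: after the loop, every permutation
        # of a is equally likely, and each step does O(1) work.
        for i in range(len(a) - 1, 0, -1):
            j = random.randint(0, i)       # uniform over {0, 1, ..., i}
            a[i], a[j] = a[j], a[i]
        return a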

The preceding algorithm easily generalizes to points in $\R^d$, though the expected running time becomes $O(C^d n)$ for some constant $C$. So the algorithm is linear in $n$ but exponential in $d$. This is pretty common for geometric problems and is sometimes called the curse of dimensionality.

Linear Programming

Our second problem is a simple kind of linear programming: given $n$ lines in $\R^2$, find the lowest point that is above all the lines. Exercise: How much better can you do than the trivial algorithm? $O(n^2)$? $O(n\log n)$?

To make life simple, we assume there are no horizontal lines and no three lines intersect in a common point. We can solve this problem again with a randomized incremental algorithm, sketched below.
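
A Python sketch, under two assumptions not spelled out above: each line is given as a pair $(a,b)$ representing $y=ax+b$, and the first two lines have slopes of opposite signs, so the optimal solution always exists and stays bounded. Those first two lines are kept fixed while the rest are shuffled.

    import random

    def intersection(l1, l2):
        # Intersection point of two non-parallel lines y = a*x + b.
        (a1, b1), (a2, b2) = l1, l2
        x = (b2 - b1) / (a1 - a2)
        return (x, a1 * x + b1)

    def lowest_point(lines):
        rest = list(lines[2:])
        random.shuffle(rest)               # shuffle l_3, ..., l_n only
        ls = list(lines[:2]) + rest
        v = intersection(ls[0], ls[1])     # optimal for the first two lines
        for i in range(2, len(ls)):
            (a, b) = ls[i]
            if v[1] >= a * v[0] + b:       # v is above l_i: nothing to do, O(1)
                continue
            # Otherwise the new optimum lies on l_i: minimize a*x + b
            # subject to a*x + b >= aj*x + bj for every earlier line,
            # i.e., intersect the x-intervals these constraints allow.
            lo, hi = float('-inf'), float('inf')
            for (aj, bj) in ls[:i]:        # O(i) time
                # (a - aj) * x >= bj - b is a lower bound on x when
                # a > aj and an upper bound on x when a < aj.
                if a > aj:
                    lo = max(lo, (bj - b) / (a - aj))
                elif a < aj:
                    hi = min(hi, (bj - b) / (a - aj))
            x = lo if a > 0 else hi        # minimize a*x + b over [lo, hi]
            v = (x, a * x + b)
        return v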

This should look familiar. All we need to show now is that the new solution is unlikely to involve $\ell_i$. Exactly two lines in $\ell_1,\ldots,\ell_i$ define the optimal solution for $\ell_1,\ldots,\ell_i$. Since the first two lines of the permutation are fixed, $\ell_i$ is uniformly chosen from a set of size $i-2$, so the probability that $\ell_i$ is one of the two lines that define the solution is at most $2/(i-2)$. The rest of the analysis is just like the analysis of the closest-pair algorithm.
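
In more detail, handling $\ell_i$ takes $O(i)$ time when $\ell_i$ changes the solution and $O(1)$ time otherwise, so the expected time for the incremental phase is \[ \sum_{i=3}^n \left( O(1) + \frac{2}{i-2}\cdot O(i) \right) = \sum_{i=3}^n O(1) = O(n) \enspace , \] since $i/(i-2)\le 3$ for all $i\ge 3$.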

Theorem: Given $n$ lines in $\R^2$, we can find the lowest point that is above all the lines in $O(n)$ expected time.

The preceding algorithm also generalizes to $\R^d$, and again, the expected running time is $O(C^dn)$ for some constant $C$. The generalization is less straightforward, though, and involves a recursion on the dimension, $d$.