Here are my original handwritten notes on this topic.
This problem is also known as Maximum Independent Set in Unit Height Rectangles
- Input: A set $S$ of $n$ axis-aligned rectangles, all of which have height 1.
- Output: A subset of $I\subseteq S$ of pairwise-disjoint rectangles having maximum cardinality.
This models a map labelling problem where we have $n$ place labels at specific locations on a map, but we don't want two labels to overlap. All the labels have the same height because they're all written in the same font.
A $\frac{1}{2}$-Approximation
Here is an algorithm that gives a 2-approximation:
- Draw a set of horizontal lines $L_1,\ldots,L_k$ such that:
- The distance between any two lines is at least 1
- Each line intersects at least one rectangle in $S$
- Each rectangle in $R$ intersects at least one line
- Let $S_i$ be the subset of $S$ that intersects $L_i$, and find a MIS, $I_i$, of $S_i$, for each $i\in\{1,\ldots,k\}$.
- Output $I_1\cup I_3\cup I_5\cup\cdots,\cup I_{2\lceil k/2\rceil-1}$ or $I_2\cup I_4\cup I_6\cup\cdots,\cup I_{2\lfloor k/2\rfloor}$, whichever is bigger.
We still don't know how to perform Step 2 efficiently, but let's ignore that for now. First, we can be sure that the algorithm outputs an independent set because each $I_i$ is an independent set and the rectangles in $I_i$ can not intersect the rectangles in $I_{i+2j}$ because the distance between $L_i$ and $L_{i+2j}$ is at least $2j\ge 2$.
Next, let's convince ourselves that this is a 2-approximation. Consider some optimal solution $I^\star$. We can partition $I^\star$ into $k$ sets $I^\star_1,\ldots,I^star_k$ where $I^\star_i=I^\star\cap S_i$, for each $i\in\{1,\ldots,k\}$. So \[ |I^\star| = \sum_{i=1}^k |I^\star_i| \le \sum_{i=1}^k |I_i| \enspace , \] where the second inequality comes from the fact that $I_i$ is a maximum independent set of $S_i$ while $I^\star_i$ is an independent set of $S_i$. But now we're done, because the solution output by our approximation algorithm has size at least \[ \frac{1}{2}\sum_{i=1}^k |I_i| \ge \frac{1}{2}|I^\star| \enspace . \] So, the only thing left to explain is how to implement Step 2, finding an MIS of $S_i$.
Finding an MIS in an Interval graph
Each set $S_i$ is a set of axis-aligned rectangles that are all stabbed by the horizontal line $L_i$. A few seconds of reflection will convince you that this means we can forget about the rectangles and just focus on their intersections with $L_i$. This means we have to solve the following simpler problem:
- Input: A set $T=\{[a_1,b_1],[a_2,b_2],\ldots,[a_n,b_n]\}$ of $n$ real intervals.
- Output: A subset of $I\subseteq S$ of pairwise-disjoint intervals having maximum cardinality.
It turns out that this can be solved optimally by a greedy algorithm:
- Sort $T$ by the right endpoint of each interval so that $b_1\le b_2\le b_3\le\cdots\le b_n$.
- $I\gets\emptyset$
- for each $i\in\{1,\ldots,n\}$ do
- if $[a_i,b_i]\in T$
- $I\gets I\cup\{[a_i,b_i]\}$.
- remove every interval in $T$ that intersects $[a_i,b_i]$
- if $[a_i,b_i]\in T$
- return $I$
We can prove that this algorithm is optimal by induction on $n$. The case $n=0$ is trivial. Let $I^\star$ be some optimal solution. Consider the smallest right endpoint in $I^\star$. Suppose this comes from interval $[a_i,b_i]$ so that $I^\star$ does not contain any of $[a_1,b_1],\ldots,[a_{i-1},b_{i-1}]$. But this means $I^\star$ does not contain any interval that intersects $[a_1,b_1]$ except possibly $[a_i,b_i]$.
....
A $(1-\eps)$-Approximation
The $\frac{1}{2}$-approximation algorithm given above can be generalized to give a $1-\epsilon$ approximation for any $\epsilon>0$. Here's the idea.
- Draw a set of horizontal lines $L_1,\ldots,L_k$ as in the previous algorithm. For each $i\in\{1,\ldots,k\}$, let $S_i$ be the subset of rectangles intersected by $L_i$. For other values of $i$, define $S_i = \emptyset$.
- For each $i\in\{1,\ldots,k\}$, find an MIS, $I_i$, of $S_i\cup S_{i+1}\cup\cdots S_{i+r}$
- For each $i\in\{1,\ldots,r\}$, let $A_j = \bigcup_{i=1}^{\lfloor k/(r+1)\rfloor} I_{j+(r+1)j}$
- Output the largest among $A_1,\ldots,A_r$.
$I_i$, of $S_i$, for each $i\in\{1,\ldots,k\}$. 3. Output $I_1\cup I_3\cup I_5\cup\cdots,\cup I_{2\lceil k/2\rceil-1}$ or $I_2\cup I_4\cup I_6\cup\cdots,\cup I_{2\lfloor k/2\rfloor}$, whichever is bigger.