Algorithms

An algorithm is a finite set of precise instructions for solving a problem.

Here is an algorithm for outputting the largest number from a list of $n$ numbers. Such a number is called a maximum.

  1. set a temporary variable to the first number
  2. compare the next number to the temporary variable
    • if it is larger, set the temporary variable to this number
  3. repeat step 2 until there are no more numbers left
  4. return the value of the temporary variable
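These steps translate directly into code. Here is a minimal Python sketch of the procedure (the name find_max is my own label, not part of the notes; it assumes a non-empty list):

			def find_max(numbers):
				# step 1: set a temporary variable to the first number
				largest = numbers[0]
				# steps 2-3: compare each remaining number to the temporary variable
				for value in numbers[1:]:
					if value > largest:
						largest = value
				# step 4: return the value of the temporary variable
				return largest

			print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 9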

Here is an algorithm for outputting the index of a number $x$ in an array of $n$ numbers, or reporting that $x$ is not present. This is called a linear search.

  1. look at the first element
    • if it is equal to $x$, then return that index
  2. look at the next element
    • if it is equal to $x$, then return that index
  3. repeat step 2 until the end of the array is reached
  4. return not found
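Here is a sketch of the same steps in Python (note that Python indexes from $0$ while the notes index arrays from $1$; returning None plays the role of "not found"):

			def linear_search(a, x):
				# steps 1-3: look at each element in turn
				for index, value in enumerate(a):
					if value == x:
						return index  # found x: return its index
				# step 4: x does not appear in the array
				return None

			print(linear_search([7, 3, 9, 4], 9))  # prints 2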

Sorting

For the sorting problem, we are given a list of elements that can be ordered (typically numbers) and wish to rearrange the list so that the elements are in non-decreasing order.

One algorithm that solves this problem is BubbleSort. It works by looking at pairs of elements in the list and "swapping" them whenever they are out of order. If this is done enough times, then the list will be in order! In pseudocode:

			BubbleSort$\left(a_1, a_2, \ldots, a_n\right)$
				for $i \gets 1$ to $n-1$
					for $j \gets 1$ to $n-i$
						if $a_j > a_{j+1}$
							swap $a_j$ and $a_{j+1}$
						end if
					end for
				end for
			

Here is how the BubbleSort algorithm works on the array $3,2,4,1,5$:

\begin{array}{c@{\hskip 0.5in}c@{\hskip 0.5in}c@{\hskip 0.5in}c} i=1 & i=2 & i=3 & i=4 \\\hline \underbrace{3,2}_\text{swap},4,1,5 & \underbrace{2,3}_\text{good},1,4,|5 & \underbrace{2,1}_\text{swap},3,|4,5 & \underbrace{1,2}_\text{good},|3,4,5 \\ 2,\underbrace{3,4}_\text{good},1,5 & 2,\underbrace{3,1}_\text{swap},4,|5 & 1,\underbrace{2,3}_\text{good},|4,5 \\ 2,3,\underbrace{4,1}_\text{swap},5 & 2,1,\underbrace{3,4}_\text{good},|5 & & \\ 2,3,1,\underbrace{4,5}_\text{good} & & & \\ \end{array}
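For concreteness, here is a minimal Python sketch of the pseudocode above (0-indexed, so $a_j$ becomes a[j-1]); it sorts the list in place:

			def bubble_sort(a):
				n = len(a)
				for i in range(1, n):          # i = 1, ..., n-1
					for j in range(0, n - i):  # look at the pair a[j], a[j+1]
						if a[j] > a[j + 1]:    # out of order, so swap
							a[j], a[j + 1] = a[j + 1], a[j]

			values = [3, 2, 4, 1, 5]
			bubble_sort(values)
			print(values)  # prints [1, 2, 3, 4, 5]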

A natural question to ask is, "How long does this take?" The answer is: it depends! (On the operating system, hardware, implementation, and many other things.)

Another algorithm for sorting is InsertionSort.

In pseudocode:

			InsertionSort$\left(a_1,a_2,\ldots,a_n\right)$
				for $j \gets 2$ to $n$
					$k \gets a_j$
					$i \gets j-1$
					while $i > 0$ and $a_i > k$
						$a_{i+1} \gets a_i$
						$i \gets i-1$
					end while
$a_{i+1} \gets k$
				end for
			

Here is how the InsertionSort algorithm works on the array $3,2,4,1,5$:

[Figure: Sorting a list using InsertionSort]
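As with BubbleSort, here is a short Python sketch of the pseudocode above (0-indexed), run on the same example list:

			def insertion_sort(a):
				for j in range(1, len(a)):      # j = 2, ..., n in the 1-indexed pseudocode
					k = a[j]                    # the element to insert
					i = j - 1
					while i >= 0 and a[i] > k:  # shift larger elements one place right
						a[i + 1] = a[i]
						i -= 1
					a[i + 1] = k                # drop the element into its place

			values = [3, 2, 4, 1, 5]
			insertion_sort(values)
			print(values)  # prints [1, 2, 3, 4, 5]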

How long does this take? Again, it depends. But how does it compare to BubbleSort (assuming the same hardware, operating system, etc.)?

Analysis of Algorithms

To determine how "long" an algorithm takes, we need to know how long operations (such as additions, comparisons, etc.) take. To do this, we define a model of computation.

There are many such models. For now, let's say that each comparison (i.e., $<, \le, >, \ge, =, \neq$) takes one time unit ("unit time"). The actual amount of time a comparison takes varies (with hardware, etc.), but assuming they all take the same amount of time is generally fair.

The number of such operations is the time complexity of an algorithm. Typically, we will be interested in worst-case time complexity: the maximum number of operations performed over all inputs of a given size.

As examples, we can count the number of comparisons each of the algorithms above uses in the worst case.

Notice that BubbleSort uses $n^2 + n - 1$ comparisons in the worst case, while InsertionSort uses $n^2 + 2n - 4$. Since $(n^2 + 2n - 4) - (n^2 + n - 1) = n - 3$, which is positive for $n > 3$, BubbleSort uses fewer comparisons in the worst case.

But are these two functions really that different? Both are dominated by an $n^2$ term: as $n$ grows, the $n^2$ part dwarfs any constant coefficient multiplying it and any linear term added to it. For large $n$, the two functions behave approximately the same.

Contrast this with finding the maximum and linear search: both use $2n - 1$ comparisons. This is much faster than either sorting algorithm, even though the leading term has coefficient $2$, because $n$ grows much more slowly than $n^2$, and it is large $n$ that we care about.

Therefore, for the analysis of algorithms, we don't care too much about the exact function (it is often too much work to find anyway!). What matters is how fast the function grows.
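To see this concretely, here is a small sketch that tabulates the comparison counts quoted above as $n$ grows; the gap between the $2n - 1$ algorithms and the $n^2$-type algorithms quickly dwarfs any difference in lower-order terms:

			# comparison counts from the analysis above, evaluated for growing n
			for n in [10, 100, 1000, 10000]:
				linear = 2 * n - 1            # finding a maximum / linear search
				bubble = n**2 + n - 1         # BubbleSort, worst case
				insertion = n**2 + 2 * n - 4  # InsertionSort, worst case
				print(n, linear, bubble, insertion)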