next up previous contents
Next: Logarithms: an intuitive definition Up: How to measure efficiency Previous: Complexity classes   Contents

How to determine the complexity of an algorithm

So how does one go about counting the number of operations an algorithm needs? We have already seen many examples. Let's consider insertion sort again.

If we have an (unsorted) array a[] of integers of length $ n$, the algorithm searches for the smallest element of the array and then swaps it with the element in the first position, a[0]. Then it seeks the smallest element in the array from position $ 1$ to position $ n-1$ and swaps it with $ a[1]$, and so on.

The sorting method would have a definition like this:

for (int i = 0; i< n; i++) {
               // find the smallest element, a[k], in a[i] to a[n-1]
               // swap a[k] with a[i]
}

Filling in just a bit more code we get:

for (int i = 0; i< n; i++) { 
  int k;
  for (int j=i+1; j<n, j++) { 
    k = i; 
    if (a[j] < a[k]) then k = j; 
  } 
  /* k now contains the index of the smallest element from
     a[i] to a[n-1]
  */
  int l = a[i];
  a[i] = a[k];
  a[k] = l;  // this does the swapping
}

So how many operations does this program execute? There is first of all an outer loop which is run through $ n$ times. Then there is the inner loop, and it depends on the counter of the outer loop how often we go through that. For each run of the inner loop, we have one comparison, but we don't always get an assignment, due to the `if' clause. If we are interested in the average case, we would assume that about half the time, the `if' clause is true. And then we get the three assignments each time we complete the outer loop.

This gives us the following number of operations carried out on average (where we ignore the fact that incrementing loop variables, etc, costs a bit of time, too):


$\displaystyle f(n)$ $\displaystyle =$ $\displaystyle \sum_{i=0}^{n-1} ((\sum_{i+1}^{n-1} 1 + \frac{1}{2}) + 3)$  
  $\displaystyle =$ $\displaystyle \sum_{i=0}^{n-1} ((n-i-1)\frac{3}{2} + 3)$  
  $\displaystyle =$ $\displaystyle \frac{3}{2} \sum_{i=0}^{n-1} (n-i-1) + 3n$  
  $\displaystyle =$ $\displaystyle \frac{3}{2}((n-1) + (n-2) + \cdots + 1) + 3n$  
  $\displaystyle =$ $\displaystyle \frac{3}{2} n(n-1) + 3n$  
  $\displaystyle =$ $\displaystyle \frac{3}{2} n^2 + \frac{3}{2} n$  

Therefore this algorithm has complexity $ O(n^2)$ which, as we know, isn't a very good complexity class, and we have seen that there are better sorting algorithms. One thing to note is that when considering sorting algorithms one often measures complexity via the number of comparisons that occur, ignoring things such as assignments, etc. Note that if we only count comparisons in the example above, we get $ f(n) = (n-1) + (n-2) + \cdots + 1 =
n(n-1)/2 = (1/2) n^2 + (1/2) n$. So the complexity class stays the same. This is typical for the (so-called `internal') sorting algorithms, and the reason why people like to restrict themselves to counting the number of comparisons involved is that it yields a good measure for comparing the various sorting algorithms. What exactly to count when considering the complexity of a particular algorithm is always a judgement call. You will have to gain more experience before you feel comfortable with making such calls yourself.

Something to be aware of when making these calculations is that it isn't a bad idea to keep track of any factors, in particular those that go with the dominating sub-term. In the above example, the factor applied to the dominating sub-term, namely $ n^2$, was $ 3/2$ (and by coincidence, this was also the factor that came with the second term, namely $ n$). It is good and well to know that an algorithm that is linear will perform better than a quadratic one provided the size of the problem is large enough, but if you know that your problem has a size of, say, at most $ 100$ then a complexity of $ (1/{10}) n^2$ might be preferable to one of $ 1000000
n$. Or if you know that your program is only ever used on fairly small samples, then using the simplest algorithm you can find might be beneficial over all--it is easier to program, and there is not a lot of time to be saved.


next up previous contents
Next: Logarithms: an intuitive definition Up: How to measure efficiency Previous: Complexity classes   Contents
Martin Escardo 2006-01-12