Data Structures and Algorithms

Chapter 1 Mathematical Background

1.4. Logarithms

The logarithm of base \(b\) for value \(y\) is the power to which \(b\) is raised to get \(y\). Normally, this is written as \(\log_b y = x\). Thus, if \(\log_b y = x\) then \(b^x = y\), and \(b^{\log_b y} = y\).

Logarithms are used frequently by programmers. Here are two typical uses.

Example 1.4.1

Many programs require an encoding for a collection of objects. What is the minimum number of bits needed to represent \(n\) distinct code values? The answer is \(\lceil \log_2 n \rceil\) bits. For example, if you have 1000 codes to store, you will require at least \(\lceil \log_2 1000 \rceil = 10\) bits to have 1000 different codes (10 bits provide 1024 distinct code values).
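
As a quick check of this formula, here is a minimal Python sketch (the function name min_bits is my own, not from the text):

    import math

    def min_bits(n):
        """Minimum number of bits needed to distinguish n distinct code values."""
        return math.ceil(math.log2(n))

    print(min_bits(1000))   # 10, since 2**10 = 1024 >= 1000
    print(min_bits(1024))   # 10, since 10 bits give exactly 1024 values
    print(min_bits(1025))   # 11

For very large \(n\), the integer expression (n - 1).bit_length() computes the same ceiling without floating-point rounding, but the version above follows the formula in the text directly.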

Example 1.4.2

Consider the binary search algorithm for finding a given value within an array sorted by value from lowest to highest. Binary search first looks at the middle element and determines if the value being searched for is in the upper half or the lower half of the array. The algorithm then continues splitting the appropriate subarray in half until the desired value is found. How many times can an array of size \(n\) be split in half until only one element remains in the final subarray? The answer is \(\lceil \log_2 n \rceil\) times.
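
A minimal Python sketch of this idea (the names, and the counter that tracks the number of halving steps, are my own additions for illustration):

    def binary_search(sorted_list, target):
        """Return (index of target or -1, number of halving steps performed)."""
        low, high = 0, len(sorted_list) - 1
        splits = 0
        while low <= high:
            mid = (low + high) // 2     # middle element of the current subarray
            splits += 1
            if sorted_list[mid] == target:
                return mid, splits
            elif sorted_list[mid] < target:
                low = mid + 1           # continue in the upper half
            else:
                high = mid - 1          # continue in the lower half
        return -1, splits

    # Searching 1000 sorted values for a key larger than all of them:
    print(binary_search(list(range(1000)), 2000))   # (-1, 10), matching ceil(log2(1000)) = 10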

In OpenDSA, nearly all logarithms used have a base of two. This is because data structures and algorithms most often divide things in half, or store codes with binary bits. Whenever you see the notation \(\log n\) in OpenDSA, either \(\log_2 n\) is meant or else the term is being used asymptotically and so the actual base does not matter. For logarithms using any base other than two, we will show the base explicitly.

Logarithms have the following properties, for any positive values of \(m\), \(n\), and \(r\), and any integers \(a\) and \(b\) greater than 1.

  1. \(\log (nm) = \log n + \log m\).
  2. \(\log (n/m) = \log n - \log m\).
  3. \(\log (n^r) = r \log n\).
  4. \(\log_a n = \log_b n / \log_b a\).

The first two properties state that the logarithm of two numbers multiplied (or divided) can be found by adding (or subtracting) the logarithms of the two numbers. [1] Property (3) is simply an extension of property (1). Property (4) tells us that, for variable \(n\) and any two integer constants \(a\) and \(b\), \(\log_a n\) and \(\log_b n\) differ by the constant factor \(\log_b a\), regardless of the value of \(n\). Most runtime analyses we use are of a type that ignores constant factors in costs. Property (4) says that such analyses need not be concerned with the base of the logarithm, because this can change the total cost only by a constant factor.
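
These properties are easy to check numerically. Here is a small Python sanity check (the sample values 5, 3, 7 and the bases 2 and 10 are arbitrary choices):

    import math

    n, m, r = 5.0, 3.0, 7.0

    # (1) log(nm) = log n + log m
    assert math.isclose(math.log2(n * m), math.log2(n) + math.log2(m))
    # (2) log(n/m) = log n - log m
    assert math.isclose(math.log2(n / m), math.log2(n) - math.log2(m))
    # (3) log(n^r) = r log n
    assert math.isclose(math.log2(n ** r), r * math.log2(n))
    # (4) log_a n = log_b n / log_b a, here with a = 10 and b = 2
    assert math.isclose(math.log(n, 10), math.log2(n) / math.log2(10))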

A useful identity to know is:

\[2^{\log n} = n\]

To give some intuition for why this is true: What does it mean to take the log (base 2) of \(n\)? If \(\log_2 n = x\), then \(x\) is the power to which you need to raise 2 to get back to \(n\). So of course, \(2^{\log n} = n\) when the base of the log is 2.
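
As a concrete instance (the value 1024 is chosen arbitrarily): \(\log_2 1024 = 10\), and raising 2 back to that power gives \(2^{10} = 1024\), so \(2^{\log_2 1024} = 1024\).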

When discussing logarithms, exponents often lead to confusion. Property (3) tells us that \(\log n^2 = 2 \log n\). How do we indicate the square of the logarithm (as opposed to the logarithm of \(n^2\))? This could be written as \((\log n)^2\), but it is traditional to use \(\log^2 n\). On the other hand, we might want to take the logarithm of the logarithm of \(n\). This is written \(\log \log n\).
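
A concrete example of the three notations, using the arbitrary value \(n = 256\) (so \(\log n = 8\)): \(\log n^2 = \log 65536 = 16 = 2 \log n\), while \(\log^2 n = 8^2 = 64\), and \(\log \log n = \log 8 = 3\).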

A special notation is used in the rare case when we need to know how many times we must take the log of a number before we reach a value \(\leq 1\). This quantity is written \(\log^* n\). For example, \(\log^* 1024 = 4\) because \(\log 1024 = 10\), \(\log 10 \approx 3.33\), \(\log 3.33 \approx 1.74\), and \(\log 1.74 < 1\), which is a total of 4 log operations.
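
A direct Python sketch of this definition (the function name log_star is my own):

    import math

    def log_star(n):
        """Number of times log2 must be applied to n before the result is <= 1."""
        count = 0
        while n > 1:
            n = math.log2(n)
            count += 1
        return count

    print(log_star(1024))        # 4:  1024 -> 10 -> ~3.3 -> ~1.7 -> ~0.8
    print(log_star(2 ** 65536))  # 5:  even astronomically large inputs give tiny results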

[1]These properties are the idea behind the slide rule. Adding two numbers can be viewed as joining two lengths together and measuring their combined length. Multiplication is not so easily done. However, if the numbers are first converted to the lengths of their logarithms, then those lengths can be added and the inverse logarithm of the resulting length gives the answer for the multiplication (this is simply logarithm property (1)). A slide rule measures the length of the logarithm for the numbers, lets you slide bars representing these lengths to add up the total length, and finally converts this total length to the correct numeric answer by taking the inverse of the logarithm for the result.
