\(\) \(%\newcommand{\CLRS}{\href{https://mitpress.mit.edu/books/introduction-algorithms}{Cormen et al}}\) \(\newcommand{\CLRS}{\href{https://mitpress.mit.edu/books/introduction-algorithms}{Introduction to Algorithms, 3rd ed.} (\href{http://libproxy.aalto.fi/login?url=http://site.ebrary.com/lib/aalto/Doc?id=10397652}{online via Aalto lib})}\) \(\newcommand{\SW}{\href{http://algs4.cs.princeton.edu/home/}{Algorithms, 4th ed.}}\) \(%\newcommand{\SW}{\href{http://algs4.cs.princeton.edu/home/}{Sedgewick and Wayne}}\) \(\) \(\newcommand{\Java}{\href{http://java.com/en/}{Java}}\) \(\newcommand{\Python}{\href{https://www.python.org/}{Python}}\) \(\newcommand{\CPP}{\href{http://www.cplusplus.com/}{C++}}\) \(\newcommand{\ST}[1]{{\Blue{\textsf{#1}}}}\) \(\newcommand{\PseudoCode}[1]{{\color{blue}\textsf{#1}}}\) \(\) \(%\newcommand{\subheading}[1]{\textbf{\large\color{aaltodgreen}#1}}\) \(\newcommand{\subheading}[1]{\large{\usebeamercolor[fg]{frametitle} #1}}\) \(\) \(\newcommand{\Blue}[1]{{\color{flagblue}#1}}\) \(\newcommand{\Red}[1]{{\color{aaltored}#1}}\) \(\newcommand{\Emph}[1]{\emph{\color{flagblue}#1}}\) \(\) \(\newcommand{\Engl}[1]{({\em engl.}\ #1)}\) \(\) \(\newcommand{\Pointer}{\raisebox{-1ex}{\huge\ding{43}}\ }\) \(\) \(\newcommand{\Set}[1]{\{#1\}}\) \(\newcommand{\Setdef}[2]{\{{#1}\mid{#2}\}}\) \(\newcommand{\PSet}[1]{\mathcal{P}(#1)}\) \(\newcommand{\Card}[1]{{\vert{#1}\vert}}\) \(\newcommand{\Tuple}[1]{(#1)}\) \(\newcommand{\Implies}{\Rightarrow}\) \(\newcommand{\Reals}{\mathbb{R}}\) \(\newcommand{\Seq}[1]{(#1)}\) \(\newcommand{\Arr}[1]{[#1]}\) \(\newcommand{\Floor}[1]{{\lfloor{#1}\rfloor}}\) \(\newcommand{\Ceil}[1]{{\lceil{#1}\rceil}}\) \(\newcommand{\Path}[1]{(#1)}\) \(\) \(%\newcommand{\Lg}{\lg}\) \(\newcommand{\Lg}{\log_2}\) \(\) \(\newcommand{\BigOh}{O}\) \(%\newcommand{\BigOh}{\mathcal{O}}\) \(\newcommand{\Oh}[1]{\BigOh(#1)}\) \(\) \(\newcommand{\todo}[1]{\Red{\textbf{TO DO: #1}}}\) \(\) \(\newcommand{\NULL}{\textsf{null}}\) \(\) 
\(\newcommand{\Insert}{\ensuremath{\textsc{insert}}}\) \(\newcommand{\Search}{\ensuremath{\textsc{search}}}\) \(\newcommand{\Delete}{\ensuremath{\textsc{delete}}}\) \(\newcommand{\Remove}{\ensuremath{\textsc{remove}}}\) \(\) \(\newcommand{\Parent}[1]{\mathop{parent}(#1)}\) \(\) \(\newcommand{\ALengthOf}[1]{{#1}.\textit{length}}\) \(\) \(\newcommand{\TRootOf}[1]{{#1}.\textit{root}}\) \(\newcommand{\TLChildOf}[1]{{#1}.\textit{leftChild}}\) \(\newcommand{\TRChildOf}[1]{{#1}.\textit{rightChild}}\) \(\newcommand{\TNode}{x}\) \(\newcommand{\TNodeI}{y}\) \(\newcommand{\TKeyOf}[1]{{#1}.\textit{key}}\) \(\) \(\newcommand{\PEnqueue}[2]{{#1}.\textsf{enqueue}(#2)}\) \(\newcommand{\PDequeue}[1]{{#1}.\textsf{dequeue}()}\) \(\) \(\newcommand{\Def}{\mathrel{:=}}\) \(\) \(\newcommand{\Eq}{\mathrel{=}}\) \(\newcommand{\Asgn}{\mathrel{\leftarrow}}\) \(%\newcommand{\Asgn}{\mathrel{:=}}\) \(\) \(%\) \(% Heaps\) \(%\) \(\newcommand{\Downheap}{\textsc{downheap}}\) \(\newcommand{\Upheap}{\textsc{upheap}}\) \(\newcommand{\Makeheap}{\textsc{makeheap}}\) \(\) \(%\) \(% Dynamic sets\) \(%\) \(\newcommand{\SInsert}[1]{\textsc{insert}(#1)}\) \(\newcommand{\SSearch}[1]{\textsc{search}(#1)}\) \(\newcommand{\SDelete}[1]{\textsc{delete}(#1)}\) \(\newcommand{\SMin}{\textsc{min}()}\) \(\newcommand{\SMax}{\textsc{max}()}\) \(\newcommand{\SPredecessor}[1]{\textsc{predecessor}(#1)}\) \(\newcommand{\SSuccessor}[1]{\textsc{successor}(#1)}\) \(\) \(%\) \(% Union-find\) \(%\) \(\newcommand{\UFMS}[1]{\textsc{make-set}(#1)}\) \(\newcommand{\UFFS}[1]{\textsc{find-set}(#1)}\) \(\newcommand{\UFCompress}[1]{\textsc{find-and-compress}(#1)}\) \(\newcommand{\UFUnion}[2]{\textsc{union}(#1,#2)}\) \(\) \(\) \(%\) \(% Graphs\) \(%\) \(\newcommand{\Verts}{V}\) \(\newcommand{\Vtx}{v}\) \(\newcommand{\VtxA}{v_1}\) \(\newcommand{\VtxB}{v_2}\) \(\newcommand{\VertsA}{V_\textup{A}}\) \(\newcommand{\VertsB}{V_\textup{B}}\) \(\newcommand{\Edges}{E}\) \(\newcommand{\Edge}{e}\) \(\newcommand{\NofV}{\Card{V}}\) 
\(\newcommand{\NofE}{\Card{E}}\) \(\newcommand{\Graph}{G}\) \(\) \(\newcommand{\SCC}{C}\) \(\newcommand{\GraphSCC}{G^\text{SCC}}\) \(\newcommand{\VertsSCC}{V^\text{SCC}}\) \(\newcommand{\EdgesSCC}{E^\text{SCC}}\) \(\) \(\newcommand{\GraphT}{G^\text{T}}\) \(%\newcommand{\VertsT}{V^\textup{T}}\) \(\newcommand{\EdgesT}{E^\text{T}}\) \(\) \(%\) \(% NP-completeness etc\) \(%\) \(\newcommand{\Poly}{\textbf{P}}\) \(\newcommand{\NP}{\textbf{NP}}\) \(\newcommand{\PSPACE}{\textbf{PSPACE}}\) \(\newcommand{\EXPTIME}{\textbf{EXPTIME}}\)

Binary search trees

In this round, we implement ordered sets and maps with binary trees that have a special ordering property. We associate keys with nodes; the key of a node \( \TNode \) is denoted by \( \TKeyOf{\TNode} \).

Definition: Binary search tree

A binary tree is a binary search tree (BST) if the following binary search tree property holds: For every node \( \TNode \) in the tree it holds that

  1. if \( \TNodeI \) is a node in the sub-tree rooted at the left child of \( \TNode \), then \( \TKeyOf{\TNodeI} \le \TKeyOf{\TNode} \)

  2. if \( \TNodeI \) is a node in the sub-tree rooted at the right child of \( \TNode \), then \( \TKeyOf{\TNodeI} \ge \TKeyOf{\TNode} \)

We assume that the keys are distinct; for a discussion of equal keys, see Problem 12.1 in Introduction to Algorithms, 3rd ed. (online via Aalto lib). For simplicity, we mainly focus on ordered sets in these notes; ordered maps are a relatively straightforward extension (each node also stores the value to which its key is mapped).
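The BST property can be checked recursively by maintaining, for each node, the open interval of keys that its sub-tree may contain. The sketch below illustrates this; it assumes a hypothetical Python `Node` class (attributes `key`, `left`, `right`, with `None` for a missing child), which is not part of these notes:

```python
class Node:
    """A binary tree node; left and right are child Nodes or None."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def is_bst(node, lo=float("-inf"), hi=float("inf")):
    """Return True iff the tree rooted at node satisfies the BST property.
    All keys in the sub-tree must lie strictly between lo and hi
    (strict bounds are fine here because we assume distinct keys)."""
    if node is None:
        return True
    if not (lo < node.key < hi):
        return False
    return (is_bst(node.left, lo, node.key) and
            is_bst(node.right, node.key, hi))
```

Passing the node's own key as the new upper (resp. lower) bound when recursing left (resp. right) is what makes the check cover the *whole* sub-tree, not just the immediate children.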

Example

The three binary trees below are BSTs.

_images/bst-ex-1.png _images/bst-ex-2.png _images/bst-ex-3.png

The binary tree below is not a BST as the node with the key \( 4 \) is in the right sub-tree of the root node having the key \( 8 \).

_images/bst-ex-bad.png

Searching keys

Searching for a key \( k \) in a BST is easy due to the binary search tree property. We traverse the nodes recursively, starting from the root, and proceed as follows:

  • ​ If the current node \( x \) has the key \( k \), we are done.

  • ​ If the key in the node \( x \) is greater than \( k \),

    • ​ traverse the left child of the node if it has one, or

    • ​ report that the key is not in the BST if the node has no left child.

  • ​ If the key in the node \( x \) is less than \( k \),

    • ​ traverse the right child of the node if it has one, or

    • ​ report that the key is not in the BST if the node has no right child.
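The steps above can be written as a short iterative procedure. This is a sketch assuming a hypothetical `Node` class with `key`, `left`, and `right` attributes (a child being `None` when missing); the example tree is reconstructed from the figures:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def search(node, k):
    """Search for the key k starting from node; returns True iff k is in the tree."""
    while node is not None:
        if k == node.key:
            return True
        # Go left when the current key is greater than k, right when it is
        # smaller; reaching a missing child means the key is not in the BST.
        node = node.left if k < node.key else node.right
    return False

# A tree consistent with the example figure: searching 4 follows 8, 3, 6, 4.
root = Node(8, Node(3, Node(2), Node(6, Node(4), Node(7))),
               Node(9, None, Node(11)))
```

For instance, `search(root, 4)` succeeds while `search(root, 5)` falls off the tree at the node with key 4 and reports the key absent.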

Example

Consider again the BST below.

_images/bst-ex-3.png

If we search for the key \( 4 \),

  1. we start from the root and note that \( 8 > 4 \),

  2. go to the left child and note that \( 3 < 4 \),

  3. go to the right child and note that \( 6 > 4 \),

  4. go to the left child and note that \( 4 = 4 \), and

  5. report that \( 4 \) is in the tree.

Example

Consider again the BST below:

_images/bst-ex-3.png

If we search for the key \( 5 \),

  1. we start from the root and note that \( 8 > 5 \),

  2. go to the left child and note that \( 3 < 5 \),

  3. go to the right child and note that \( 6 > 5 \),

  4. go to the left child and note that \( 4 < 5 \), and

  5. note that the node does not have a right child and thus report that \( 5 \) is not in the tree.

Inserting keys

To insert a new key \( k \) into a BST, first proceed as when searching for the key. If the key is already in the BST, stop. Otherwise, the last step of the search procedure attempted to traverse the non-existing left (or right, respectively) child of some node \( n \). In this case,

  • insert a new node \( n' \) with the key \( k \) into the tree, and

  • make it the left (right, respectively) child of \( n \).
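A recursive variant of this procedure is convenient: when the search falls off the tree, the recursion is exactly at the missing child, so returning a fresh node there attaches it in the right place. A sketch, again assuming a hypothetical `Node` class:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(node, k):
    """Insert the key k into the BST rooted at node and return the
    (possibly new) root; a key that is already present is left untouched."""
    if node is None:
        return Node(k)          # the search fell off the tree: new leaf here
    if k < node.key:
        node.left = insert(node.left, k)
    elif k > node.key:
        node.right = insert(node.right, k)
    return node

# Build the example tree and then insert 10: the search path is 8, 9, 11,
# and 10 becomes the left child of 11, as in the example below.
root = None
for k in [8, 3, 9, 2, 6, 11, 4, 7]:
    root = insert(root, k)
root = insert(root, 10)
```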

Example

Consider again the BST below.

_images/bst-ex-3.png

If we insert the key \( 10 \) in the tree, we

  1. start from the root and note that \( 8 < 10 \),

  2. go to the right child and note that \( 9 < 10 \),

  3. go to the right child and note that \( 11 > 10 \),

  4. note that the node does not have a left child, and

  5. add \( 10 \) as the left child of \( 11 \).

The resulting BST is shown below.

_images/bst-ex-4.png

Removing keys

Removing keys is a bit more complicated. First, find the node \( z \) having the key. Then:

  1. If the node \( z \) is a leaf, we simply remove it.

  2. If the node \( z \) has only one child \( x \), we remove the node \( z \) and substitute its only child \( x \) in its place.

  3. Otherwise, the node \( z \) has two children. In this case, we

    1. find the node \( y \) with the largest key in the left subtree of \( z \),

    2. make \( z \) have the key of \( y \), and

    3. remove the node \( y \) and substitute the left child \( x \) of \( y \) (if it has one) in its place. Note that \( y \) does not have a right child; why?
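The three cases can be sketched as follows (a hypothetical `Node` class with `key`/`left`/`right` attributes is assumed). In the two-child case, the sketch moves the largest key of the left sub-tree up into \( z \) and then removes that key from the left sub-tree, which only needs the one- or zero-child cases:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def delete(node, k):
    """Remove the key k from the BST rooted at node; returns the new root."""
    if node is None:
        return None                      # k is not in the tree
    if k < node.key:
        node.left = delete(node.left, k)
    elif k > node.key:
        node.right = delete(node.right, k)
    elif node.left is None:              # cases 1 and 2: at most one child
        return node.right
    elif node.right is None:
        return node.left
    else:                                # case 3: two children
        y = node.left
        while y.right is not None:       # largest key in the left sub-tree
            y = y.right
        node.key = y.key                 # move y's key into z's place ...
        node.left = delete(node.left, y.key)  # ... and remove the node y
    return node

# Removing 3 (two children): its key is replaced by 2, the largest key
# in its left sub-tree, and the node that held 2 is removed.
root = Node(8, Node(3, Node(2), Node(6, Node(4), Node(7))),
               Node(9, None, Node(11)))
root = delete(root, 3)
```

Note that the recursive call removing \( y \)'s key still only walks down one path, so the whole operation follows at most one root-to-leaf path twice.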

A task: study the following examples and argue to yourself why the BST property is preserved when we remove keys with the above method. That is, if the tree is a BST at the beginning, then the tree resulting from the key removal is also a BST.

Example

Let us remove the key 33 from the BST on the left-hand side below. In this case, we simply remove the node \( z \) having the key 33. The result is shown on the right-hand side.

_images/bst-remove-1.png

Example

Let us remove the key 27 from the BST on the left-hand side below. In this case, we remove the node \( z \) having the key 27 and move its only child \( x \) to be in its place.

_images/bst-remove-2.png

Example

Let us remove the key 31 from the BST on the left-hand side below.

  • ​ As the node \( z \) with the key 31 has two children, we find the node \( y \) with the largest key in the left sub-tree of \( z \).

  • ​ We then move the key 27 of \( y \) to be the key of \( z \), shown with the red arrow in the figure.

  • ​ After this, we remove the node \( y \) and move its only child \( x \) to be in its place (shown with the blue dashed arrow).

_images/bst-remove-3.png

Listing the keys in ascending order

Due to the BST property, listing the elements/keys in ascending order is easy: just traverse the tree in inorder (recall the Section Traversing trees) and output the key of each node when it is visited. Thus \( n \) elements could be sorted by (i) inserting them one-by-one in a BST and (ii) then listing them in inorder. However, this would not be the most efficient sorting algorithm available.
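This sorting idea can be sketched in a few lines (the hypothetical `Node` class and the insertion procedure are as in the earlier sketches):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(node, k):
    if node is None:
        return Node(k)
    if k < node.key:
        node.left = insert(node.left, k)
    elif k > node.key:
        node.right = insert(node.right, k)
    return node

def inorder(node, out):
    """Append the keys of the sub-tree rooted at node to out in ascending order."""
    if node is not None:
        inorder(node.left, out)     # first all smaller keys,
        out.append(node.key)        # then the node's own key,
        inorder(node.right, out)    # then all larger keys

def bst_sort(keys):
    """Sort distinct keys by BST insertion followed by an inorder traversal."""
    root = None
    for k in keys:
        root = insert(root, k)
    out = []
    inorder(root, out)
    return out
```

Note that the insertions dominate the cost: if the tree becomes unbalanced (as discussed at the end of this section), building it can take quadratic time in total.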

Example

Consider the BST below. Traversing the nodes in inorder and printing the keys gives the sequence 2,3,4,6,7,8,9,10,11 as expected.

_images/bst-ex-4.png

Finding the smallest and largest keys

The smallest key can be found by starting from the root and iteratively moving to the left child as long as one exists; the smallest key is thus in the "leftmost" node. Similarly, the largest key can be found in the "rightmost" node.
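Both traversals are one-liners in spirit (the hypothetical `Node` class is as before; the example tree mirrors the figure, with 10 as the left child of 11):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def min_key(node):
    """Smallest key in a non-empty BST: follow left children while possible."""
    while node.left is not None:
        node = node.left
    return node.key

def max_key(node):
    """Largest key in a non-empty BST: follow right children while possible."""
    while node.right is not None:
        node = node.right
    return node.key

# A tree consistent with the example figure (keys 2..11).
root = Node(8, Node(3, Node(2), Node(6, Node(4), Node(7))),
               Node(9, None, Node(11, Node(10))))
```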

Example

Consider the BST shown below. The smallest key in it is 2 and the largest one is 11.

_images/bst-ex-4.png

Finding the predecessor and successor keys

Given a key \( k \), the predecessor key in a BST is the largest key that is smaller than \( k \) or “None” if all the keys in the BST are at least as large as \( k \). It can be found by initializing a variable best to “None” and traversing the tree from the root node as far as possible:

  • ​ If the key \( k’ \) in the current node is \( k \) or larger, go to the left child.

  • ​ Otherwise, set best to \( k’ \) and go to the right child.

At the end, report best. Finding the successor key of \( k \) (the smallest key that is larger than \( k \)) can be done in a similar way.
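Both queries can be sketched as iterative loops (again assuming a hypothetical `Node` class; the successor query is the mirror image of the predecessor query, as stated above):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def predecessor(node, k):
    """Largest key strictly smaller than k, or None if there is no such key."""
    best = None
    while node is not None:
        if node.key >= k:
            node = node.left       # this key and the right sub-tree are too large
        else:
            best = node.key        # a candidate; a larger one may be to the right
            node = node.right
    return best

def successor(node, k):
    """Smallest key strictly larger than k, or None (mirror of predecessor)."""
    best = None
    while node is not None:
        if node.key <= k:
            node = node.right
        else:
            best = node.key
            node = node.left
    return best

# A tree consistent with the example figure (keys 2..11, with 10 under 11).
root = Node(8, Node(3, Node(2), Node(6, Node(4), Node(7))),
               Node(9, None, Node(11, Node(10))))
```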

Example

Consider the BST shown below.

  • ​ The predecessor key of \( 5 \) is \( 4 \).

  • ​ The predecessor key of \( 20 \) is \( 11 \).

  • ​ The predecessor key of \( 2 \) is “None”.

  • ​ The successor key of \( 4 \) is \( 6 \).

_images/bst-ex-4.png

Time-complexity analysis

The algorithms for searching and inserting keys, as well as finding predecessor and successor keys,

  • ​ only traverse one path in the tree and

  • ​ perform constant-time operations in each traversed node (assuming that key comparison can be done in constant time).

Thus their time-complexity is \( \Oh{h} \), where \( h \) is the height of the tree.

The algorithm for removing keys also traverses one path to find the removed key and then continues one path downwards to find the key that is substituted in its place. Thus its time-complexity is also \( \Oh{h} \).

Expected and worst-case behaviour

It can be shown that if one builds a BST from \( n \) random distinct keys, then the expected height of the BST is \( \Oh{\Lg n} \), see Section 12.4 in the Introduction to Algorithms, 3rd ed. (online via Aalto lib) book. However, it is also easy to insert keys in a way that makes the tree very unbalanced, so that it looks like a linked list.

Example

Inserting the keys 1, 2, 3, 4, and 5 in this order gives the BST shown below. This is essentially a linked list (the right child link corresponds to the next element link in linked lists).

_images/bst-worst.png

Thus the worst-case time complexity of the operations given earlier is \( \Theta(n) \) when the tree has \( n \) keys.
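The degenerate shape can be observed directly: inserting \( n \) keys in ascending order yields a tree of height \( n - 1 \). A small sketch, with the hypothetical `Node` class and insertion procedure as in the earlier sketches:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(node, k):
    if node is None:
        return Node(k)
    if k < node.key:
        node.left = insert(node.left, k)
    elif k > node.key:
        node.right = insert(node.right, k)
    return node

def height(node):
    """Height of a tree: -1 for the empty tree, 0 for a single node."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

# Inserting 1..5 in ascending order: every new key goes to the right,
# producing the "linked list" tree from the example above.
root = None
for k in [1, 2, 3, 4, 5]:
    root = insert(root, k)
```

Here `height(root)` is 4, i.e. \( n - 1 \) for \( n = 5 \), so searching the largest key visits all \( n \) nodes.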

In the following sections, we show how to fix this worst-case situation by balancing BSTs so that searching, inserting, and deleting keys can be done in time \( \Oh{\Lg n} \) in the case the tree has \( n \) keys in it.