52.219 Binary Trees

See source /home/s7/pat/scheme/pub/btree.scm and /home/s7/pat/scheme/pub/draw-btree.scm and Exercise 4 (later)

Binary Trees
In the implementation above, a binary tree is composed of nodes (we'll call them bnodes in scm) where a node has a label that contains some value, a left branch, and a right branch. The left of a node is either empty or a binary tree (ie an other node), and the right of a node is either empty or a binary tree (ie an other node).

We can represent it in Scheme using the record structures, such that we have (left bnode), (label bnode), (right bnode).

Alternatively would could represent using a 2-dimensional array, with n rows (one row per node) and 3 colums. The 1st column corresponds to left, 3d column corresponds to right, and 2nd column corresponds to the label. This is then a "cursor" implementation and is similar to our alist.scm representation of lists.

A third representation is the same as proposed for trees, where we have a vector v, and (vector-ref v i) is the parent of node i. This is "crisp" (compact) but makes it difficult to traverse, and get hold of descendants of a node.

Traversals
Again, three traversals, preorder, inorder, and postorder. If we have expressions in a binary tree, such as ((a + b) * (c + d)), where expressions are composed of triples, where each triple is an expression, inorder delivers algebraic, preorder gives prefix notation, and postorder postfix notation (and there are demonstrations of this in the source file, see make-bintree).

Binary Search Trees
Probably the most common use of binary trees is to represent sorted sets (of objects). We call this a binary search tree, and has the following property:

(label (left x)) < (label x) < (label (right x))
for all x. Therefore, when we insert an element e into the set represented as a binary tree we compare it with (label bnode). If e is less than (label bnode) we insert e in (left bnode), otherwise we insert e in (right bnode). Inserting e into an empty tree delivers a leaf, with e as label, left as empty, and right as empty.

The test for membership of a set then involves searching through the binary search tree, moving left/right until found or a leaf node encountered. If the tree is well balanced, we should expect worst case time to be log n (base 2) where n is number of elements in the set.

NOTE: In exercise 4 we build up binary search trees from unordered sets of integers, and examine the resulting trees to see the "height" of the tree. This gives us worst case cost in testing for membership from that tree. We then do this over many sets and get a feel for the average worst case. Hopefully this should be O(log n).

Deletion from a Binary Search Tree
This is non-trivial. Assume we wish to delete e from the set. We must first find the node X, where (label X) = e. There are then 4 scenarios to consider:

  1. X is a leaf node ... delete it
  2. (left X) is empty ... replace X with (right X)
  3. (right X) is empty ... replace X with (left X)
  4. (left X) and (right X) is NOT empty ... 
      ie. X is an "interior" node

Scenario 4 is the complication. We do as follows:

  A. Find the lowest valued element in the right branch of X, call it
     Y. NOTE: Y cannot be an interior node, otherwise it would have a
     lower valued element to its left.
  B. Replace the (label X) with (label Y)
  C. Delete Y (and this is trivial)

This will preserve the properties of the binary search tree.

There is a symmetrical way to do this. In step A above, let Y be the highest valued element in the left branch of X.

Worst Case, Best Case
In the worst case the binary search tree behaves as a linear linked list. We get this behaviour if we build the tree from a sorted set (sorted with respect to < or >, same results). To get best behaviour (ie. log n) we want the tree to be balanced, ie same number of nodes on the left as on the right of every node in the tree.

Traversals of Binary Search Tree
An inorder traversal is in order.

Relationship to Binary Search in a Sorted Vector
We can think of binary search as a functional representation of a binary tree, and a binary tree as a structural representation of binary search. The binary search requires static space put aside, whereas the binary tree can grow dynamically. BINARY TREES AS SETS ... advantages and disadvantages In a set we require the basic operations insert, delete, member, union, intersection, set difference. We could use a list implementation of a set, and that could be implemented as llist, vlist, or alist. Or as a binary tree Therefore ...

   (a) how long does it take to insert, delete, test membership
   (b) is the space cost significant
   (c) how deep will tree be for n items
       how many elements at depth 1, depth 2, ... depth d?
       So how deep would the tree have to be for n elements?
   (d) what is worst case, and what is the resulting structure?
       ie what do we mean by "log n to the base 2".
   (e) Consider the member fuction with vlist if it is ordered
       That is describe the "binary chop". What is its complexity