NOTES OF RED BLACK TREE
in advance data structure
Download Link
Notes on Red-Black Trees
The efficiency of binary search trees
- Binary search trees provide the ability to maintain collections of ordered data in a potentially efficient fashion.
- Search, insert, and remove are all O(h), where h is the height of the search tree.
- If we are fortuitous, the tree is approximately balanced --- for any node, the number of nodes in its in left subtree is roughly the same as the number of nodes in its right subtree.
- In the balanced, or nearly balanced case, the height of the tree is O(lg(n)) where n is the total number of nodes in the tree.
- In the worst (least balanced) case, the height is O(n).
Improvements and alternatives
Rather than rely on luck, we'd like to be able to guarantee that we can perform the search tree operations in logarithmic time. There are several ways this can be accomplished:
As read in Main & Savitch, we can use B-trees to store the data in a way that is wide rather than deep, and still allows for efficient manipulations. B-trees are not binary search trees --- they aren't even binary trees. In order to implement them we need to start from the ground up with a fresh data structure.
We'd like instead to be able to reuse binary search trees, but in such a way that their height remains logarithmic. The are several ways to do this:
- AVL trees --- see Programming Assignment Three
- Red-black trees --- discussed below
- Splay trees --- self-adjusting trees (possible final project material)
Since the same binary search tree can be represented by binary trees of varying height, there must be someway to transform trees that are imbalanced into ones that are balanced.
Our goal is to reuse the same algorithms for manipulating binary search trees, but in addition, when the tree is changed (i.e. with insert or remove) we apply an additional rebalancing phase. As in:
Modified-insert: 0. usual BST insert 1. rebalance the treeOf course, this will only be useful if the rebalancing is itself efficient (we'll get back to that).
Before proceeding into further details of how to keep binary search trees balanced, let's explore more thoroughly how we can build new, but similar data structures (for example, red-black trees) from existing ones (i.e. from binary search trees). This is our motivation for introducing inheritance.
Internal and external nodes
Thus far, we have considered only internal tree nodes --- nodes that contain data. We can also consider NULL pointers to be nodes without data (and without children) these are so called external leaf nodes:(2) / \ (1) (5) / \ / \ x x (3) x / \ x x (the x's are external nodes)
Red-black trees
A red-black tree is a binary search tree such that each node (internal and external) is assigned a color (either red or black). The coloring of the tree must satisfy the following red-black properties:- Every external leaf (NULL node) is considered to be black.
- If a node is red, then both its children are black.
- For a given node, the number of black nodes between it and an external leaf is the same regardless of which path is taken.
Red-black trees are well balanced
It can be proven that the height of a red-black tree is never more than 2*lg(n+1) where n is the total number of internal nodes in the tree. Thus, the height is O(lg(n)).The intuition: By property 2, no two red nodes can be parent and child. This means on any path from root to an external leaf, there are at least as many black nodes as there are red. Property three says that regardless of which path is taken, the same number of black nodes are encountered. So, in the worst-cast, there is a path that is twice as long as the shortest path. So the height of the tree is no worse than twice the height of the shortest path. That's fairly balanced.
Implementing red-black trees
template <class Item, class Key> class BST { public: // Constructors BST(); BST(const BST<Item,Key>& source); // copy constructor // Destructor ~BST(); // Constant member functions bool isEmpty() const; size_t size() const; size_t height() const; bool search(const Key& k, Item& returnVal) const; // Modification member functions void operator=(const BST& source); bool insert(const Item& v, const Key& k); bool remove(const Key& v); protected: typedef BinaryTree<pair<Item,Key> > BT; BT *root; // protected member functions Item itemOf(const pair<Item,Key>& ikp) const {return ikp.first;} Key keyOf(const pair<Item,Key>& ikp) const {return ikp.second;} bool isRightChild(BT *node) const; BT *removeNode(BT *node); BT *successor(BT *node) const; BT *_search(BT *t, const Key& k) const; BT *_insert(const pair<Item,Key>& ikp); void _remove(BT *node); };
enum color {RED, BLACK}; template < class Item, class Key > class RBT : public BST<pair<Item, color>, Key> { public: ... bool search(const Key& k, Item& returnVal) const; bool insert(const Item& v, const Key& k); bool remove(const Key& v); private: typedef BinaryTree<pair<pair<Item, color>,Key> > BT; protected: // Protected member functions Item itemOf(const pair<pair<Item,color>,Key>& ick) const; color colorOf(BT *node) const; void setColor(BT *node, color c); BT *rotateLeft(BT *node); BT *rotateRight(BT *node); void fixInsert(BT *node); void fixRemove(BT *x, BT *p); };
template <class Item, class Key> bool RBT<Item,Key>::insert(const Item& v, const Key& k) { BT *node = _insert(pair<pair<Item,color>,Key>(pair<Item,color>(v, RED), k)); if (node != NULL) fixInsert(node); return (node != NULL); }
Binary search property preserving transformations
rotateLeft(x): (x) (y) / \ / \ 'a (y) ===> (x) 'c / \ / \ 'b 'c 'a 'b
rotateRight(y): (y) (x) / \ / \ (x) 'c ===> 'a (y) / \ / \ 'a 'b 'b 'c
Restoring red-black properties after insert
New node (other than root) always colored red.- If new node also has red parent than we make changes depending on the color of the new nodes' uncle.
- After making the changes we move up the tree to until repeating the changes if necessary. In the worst-case we hit the root.
- Otherwise the properties are maintained.
Restoring red-black properties after remove
- If deleted node was black than property 3 is violated, so we make changes depending on the color of the former node's sibling.
- After making the changes we move up the tree to until repeating the changes if necessary. In the worst-case we hit the root.
- Otherwise the properties are maintained.