Tree

Unlike linear data structures such as stacks, queues, arrays, and lists, a tree is a non-linear data structure. In other words, a tree has a hierarchical structure.
Now, let’s start exploring trees in depth. But before we do that, let’s take a look at some important terminology used in tree structures:
Node : An element that makes up a tree.
Edge : A line that connects one node to another in a tree.
Root Node : The topmost node in a tree.
Terminal Node (Leaf Node) : A node that has no children.
Internal Node : A node that has at least one child; all non-leaf nodes are internal nodes. If the root node has children, it is also considered an internal node.

Binary Tree

A binary tree is a tree structure where each node has at most two child nodes. It is defined recursively such that the tree consists of a root node and two subtrees, each of which must also be a binary tree. Here, a subtree refers to a smaller tree contained within the larger tree, and the empty set (an empty tree) is considered a binary tree as well. Including the empty tree ensures that the definition remains consistent when reaching leaf nodes (nodes without children), and also allows a tree with only a single node to satisfy the binary tree conditions.
In a tree, each node’s depth is represented by a level, starting with the root node at level 0 and increasing by one as you move down each layer. The highest level in the tree, where the deepest node exists, is called the height of the tree. Binary trees are commonly classified into three types:
Perfect Binary Tree : every level is completely filled with nodes.
Complete Binary Tree : filled level by level from top to bottom and left to right; all levels are fully filled except possibly the last, which is filled from the left up to some point.
Full Binary Tree : every node has either zero or two children; no node has exactly one child.
Binary trees can also be represented using arrays. When the tree has n nodes and the root node’s index starts at 1, the parent of the node at index i is located at index floor(i/2), the left child at 2i, and the right child at 2i + 1. This indexing scheme allows for efficient traversal and implementation of binary trees using simple array operations.
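The array representation above can be sketched in Python. This is a minimal illustration; the helper names parent, left, and right are my own, not part of any standard API:

```python
# Array-backed binary tree using 1-based indexing (index 0 is unused).
# parent(i) = i // 2, left child = 2i, right child = 2i + 1.

def parent(i: int) -> int:
    return i // 2

def left(i: int) -> int:
    return 2 * i

def right(i: int) -> int:
    return 2 * i + 1

# A perfect binary tree with 7 nodes, stored level by level:
#         1
#       /   \
#      2     3
#     / \   / \
#    4   5 6   7
tree = [None, 1, 2, 3, 4, 5, 6, 7]

print(tree[left(2)], tree[right(2)])  # children of the node at index 2 -> 4 5
print(tree[parent(7)])                # parent of the node at index 7 -> 3
```

Because the index arithmetic is constant-time, moving between a parent and its children never requires pointer chasing, which is exactly why heaps use this layout.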

BST (Binary Search Tree)

A Binary Search Tree (BST) is a type of binary tree in which each node has at most two child nodes, and the nodes are arranged according to a specific set of rules that allow for efficient search, insertion, and deletion operations. The rule is as follows: for any given node, all values in its left subtree must be less than the node’s value, and all values in its right subtree must be greater than the node’s value. This condition is applied recursively to every node in the tree.
For example, if the root node has a value of 8, the value 3 can be placed on the left and 10 on the right. Then, 1 and 6 can be placed as the left and right children of 3, respectively, and so on. Because of this structure, a BST reduces unnecessary comparisons during operations, allowing average time complexity of O(log n) for searching, inserting, and deleting values. However, if the tree becomes unbalanced (for example, by inserting sorted data in order), these operations can degrade to O(n) in the worst case.
Another advantage of a BST is that an in-order traversal of the tree yields the stored values in ascending order. This property makes BSTs especially useful when handling sorted data or when quick lookup operations are needed on a dynamic dataset. To maintain efficient performance, it's important to keep the tree balanced, which is why variations like AVL Trees and Red-Black Trees, known as balanced binary search trees, are often used.
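The BST rules and the in-order property can be sketched in Python, using the example values from the text (8, 3, 10, 1, 6). This is a minimal sketch, not a production implementation; duplicates are simply ignored here:

```python
# Minimal BST sketch: insert, search, and in-order traversal.
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root  # equal values are ignored in this sketch

def search(root, value):
    while root is not None and root.value != value:
        root = root.left if value < root.value else root.right
    return root is not None

def inorder(root):
    if root is None:
        return []
    return inorder(root.left) + [root.value] + inorder(root.right)

root = None
for v in [8, 3, 10, 1, 6]:   # the example values from the text
    root = insert(root, v)

print(inorder(root))    # [1, 3, 6, 8, 10] -- ascending order
print(search(root, 6))  # True
```

Note how the search loop discards half of the remaining tree at each step, which is where the average O(log n) behavior comes from when the tree stays balanced.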

How do I choose a root node then?

Choosing the root node of a Binary Search Tree (BST) is an important decision, as it can greatly influence the structure and performance of the tree. If you are inserting values one by one without knowing the entire dataset in advance, the first value you insert typically becomes the root.
However, this can lead to an unbalanced tree, especially if the values are in sorted order, resulting in slower operations. On the other hand, if you already have all the values beforehand, you can construct a more balanced BST by first sorting the values and choosing the middle element as the root.
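The sort-then-pick-the-middle approach can be sketched as follows. This is an illustrative sketch; nodes are represented as plain (value, left, right) tuples, and the height convention (empty tree = -1, single node = 0) is an assumption of this sketch:

```python
# Build a balanced BST from sorted data by repeatedly choosing the middle
# element as the (sub)root.

def build_balanced(sorted_values):
    if not sorted_values:
        return None
    mid = len(sorted_values) // 2
    return (sorted_values[mid],
            build_balanced(sorted_values[:mid]),       # left subtree
            build_balanced(sorted_values[mid + 1:]))   # right subtree

def height(node):
    if node is None:
        return -1   # empty tree has height -1, a single node has height 0
    return 1 + max(height(node[1]), height(node[2]))

tree = build_balanced(list(range(1, 16)))  # 15 sorted values
print(tree[0])       # 8 -- the middle element becomes the root
print(height(tree))  # 3 -- minimal height for 15 nodes
```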

What if more data is added after setting the median?

Let’s suppose we initially have data ranging from 1 to 100. In this case, setting the root node to the median value, 50, would result in a generally balanced binary search tree. The left subtree would contain values from 1 to 49, while the right subtree would contain values from 51 to 100. This allows efficient searching, as only one half of the data needs to be considered depending on whether the target value is less than or greater than the root.
However, if additional data from 101 to 200 is later inserted, the situation changes. Since this new data is significantly skewed to the right, the tree becomes increasingly unbalanced. For example, if the target value is 150, which is much greater than the original median of 50, the search algorithm would need to traverse deep into the right subtree, resulting in slower performance. In contrast, if the target is less than 50, search performance would still be fast. This imbalance causes the overall efficiency of search operations to vary depending on the value being searched.
To solve this problem, it would be ideal, at least in theory, to recompute the median based on the full data set and reconstruct the tree accordingly. For instance, if the data now spans from 1 to 200, the new median would be 100. Setting this as the root would result in a much more balanced structure. However, reconstructing the entire tree each time new data is added is inefficient and computationally expensive.
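The contrast described above can be made concrete with a small sketch: inserting sorted data into a plain BST degenerates it into a chain, while rebuilding around the median restores logarithmic height. The list-based node representation here is my own shorthand:

```python
# Plain BST insert; nodes are [value, left, right] lists.
def insert(node, v):
    if node is None:
        return [v, None, None]
    if v < node[0]:
        node[1] = insert(node[1], v)
    elif v > node[0]:
        node[2] = insert(node[2], v)
    return node

def height(node):
    if node is None:
        return -1
    return 1 + max(height(node[1]), height(node[2]))

# Inserting 1..200 in sorted order degenerates the tree into a chain...
root = None
for v in range(1, 201):
    root = insert(root, v)
print(height(root))  # 199 -- effectively a linked list

# ...while rebuilding around the median keeps it balanced.
def rebuild(values):
    if not values:
        return None
    mid = len(values) // 2
    return [values[mid], rebuild(values[:mid]), rebuild(values[mid + 1:])]

print(height(rebuild(list(range(1, 201)))))  # 7
```

Running the rebuild after every insertion would cost O(n) each time, which is exactly the inefficiency that self-balancing trees avoid.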
For this reason, in practical scenarios, we typically use specialized data structures that automatically maintain balance during insertions and deletions. These are called Self-Balancing Binary Search Trees, with common examples including the AVL Tree and the Red-Black Tree. These trees apply rotation and restructuring operations automatically to maintain balance, so there's no need for the programmer to manually reset the median or rebuild the tree.
In conclusion, although the median value plays a crucial role in optimizing search performance, it is far more practical and efficient in dynamic environments to rely on self-balancing tree structures than to manually update the median every time data changes.
Tree Type      | Description                              | Advantages                          | Disadvantages
AVL Tree       | Maintains balance using node height      | Very fast search                    | Frequent rotations on insert/delete
Red-Black Tree | Maintains loose balance via color rules  | Good insert/delete performance      | Slightly slower search than AVL
Splay Tree     | Moves recently accessed node to root     | Fast access for frequent values     | Unstable structure; inconsistent per-operation performance
Treap          | Combines BST structure with heap priority | Simple probabilistic balancing     | Doesn't guarantee perfect balance
B / B+ Tree    | Multi-way tree used in DB/file systems   | Optimized for large-scale data      | Complex; less suitable for in-memory use

AVL Tree

An AVL Tree is a type of self-balancing binary search tree that automatically maintains its balance as elements are inserted or removed. It is named after its inventors, Adelson-Velsky and Landis, who introduced the structure in 1962. The core principle of an AVL tree is to ensure that for any given node, the height difference between its left and right subtrees, also known as the balance factor, is no more than 1.
Each node in an AVL tree stores its height, and whenever a node is added or deleted, the tree checks whether this balance condition is still satisfied up the path toward the root. If an imbalance is detected, the tree performs one or more rotations to restore balance. These rotations are localized structural transformations that rearrange a small portion of the tree to ensure it remains balanced.
There are four types of rotations in an AVL tree:
LL (Left-Left) Rotation : occurs when a node is inserted into the left subtree of a left child; fixed by a single right rotation.
RR (Right-Right) Rotation : occurs when a node is inserted into the right subtree of a right child; fixed by a single left rotation.
LR (Left-Right) Rotation : occurs when a node is inserted into the right subtree of a left child; fixed by a left rotation followed by a right rotation.
RL (Right-Left) Rotation : occurs when a node is inserted into the left subtree of a right child; fixed by a right rotation followed by a left rotation.
Each of these rotations restructures the affected subtree to bring the balance factor of involved nodes back within the acceptable range. These operations are performed in O(1) time, and because the number of nodes on the rebalancing path is logarithmic, the overall time complexity for insertion and deletion remains O(log n).
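The two single rotations and the balance factor can be sketched as follows. This is a minimal sketch of the rebalancing primitives, not a full AVL implementation; the height convention (empty subtree = -1) is an assumption of this sketch:

```python
# Sketch of the single rotations and balance factor used in AVL rebalancing.
# Each node caches its height so the balance factor is an O(1) lookup.

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 0   # a single node has height 0; an empty subtree is -1

def h(node):
    return node.height if node else -1

def update(node):
    node.height = 1 + max(h(node.left), h(node.right))

def balance_factor(node):
    return h(node.left) - h(node.right)

def rotate_left(x):
    """Fixes an RR imbalance: the right child becomes the subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y

def rotate_right(x):
    """Fixes an LL imbalance: the left child becomes the subtree root."""
    y = x.left
    x.left = y.right
    y.right = x
    update(x)
    update(y)
    return y

# RR example: 1 -> 2 -> 3 inserted as a right-leaning chain.
a, b, c = Node(1), Node(2), Node(3)
a.right = b
b.right = c
update(c); update(b); update(a)
print(balance_factor(a))                     # -2, violates the AVL condition
a = rotate_left(a)
print(a.value, a.left.value, a.right.value)  # 2 1 3
```

The LR and RL cases are handled by composing these two functions, which is why only the two single rotations need to be written.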
Because of this strict balancing policy, AVL trees provide very fast search performance, often better than other self-balancing trees like Red-Black Trees.
However, the trade-off is that insertions and deletions may incur slightly higher overhead due to the more frequent rebalancing operations.
In summary, an AVL tree maintains optimal search performance by enforcing a strict balancing rule on each node. This makes it ideal for applications where fast lookup is critical and updates are relatively infrequent, such as in-memory databases, caching systems, and read-heavy indexing structures.
Let's take a look at how an AVL Tree works through images and animations.
The given tree is an AVL tree consisting of a total of 21 nodes (N = 21), with a maximum height of 4. The root node is 66, and its left and right subtrees are evenly distributed, maintaining proper balance. An AVL tree is a type of self-balancing binary search tree (BST) that preserves the BST properties: left children hold smaller values than the parent, and right children hold larger values, while also ensuring that the height difference (balance factor) between any node’s left and right subtrees is no more than 1. This tree fully satisfies those conditions.

Left Subtree (Values less than 66)

The left child of the root 66 is 26, which has two children, 5 on the left and 41-2 on the right.
Node 5 further has 4-2 as its left child and 10 as its right child. Node 10, in turn, has 20 as its right child.
Node 41-2 has 34 as its left child and 53 as its right child. Node 53 further has 62 as its right child.
The labels 4-2 and 41-2 are used by the visualization tool (such as Visualgo) to distinguish multiple insertions of the same value. These suffixes are automatically added for clarity when duplicate values are inserted. In reality, the actual values are simply 4 and 41. While AVL trees typically do not allow duplicate values, such labeling is permitted in educational visualizations for experimentation or demonstration purposes. These tags are visual identifiers only and do not affect the actual node values.

Right Subtree (Values greater than 66)

The right child of the root 66 is 89, which has two children, 74 on the left and 92 on the right.
Node 74 has 67 as its left child and 80 as its right child. Node 80 has 88 as its right child.
Node 92 has 90-2 as its left child and 93 as its right child. Node 93 has 96-2 as its right child.
As with the left subtree, 90-2 and 96-2 are labeled to distinguish repeated values for visualization purposes, though the actual data values are simply 90 and 96. This right subtree also satisfies the AVL tree balance conditions, ensuring efficient operations for searching, inserting, and deleting.

Tree Summary

Item             | Description
Tree Type        | AVL Tree (Self-Balancing Binary Search Tree)
Root Node        | 66
Total Nodes (N)  | 21
Tree Height (h)  | 4
Balance Status   | All subtrees maintain height differences within ±1
Duplicate Labels | Suffixes like -2 are added by visualization tools to distinguish duplicate insertions; actual values remain unchanged (e.g., 90-2 means the second occurrence of 90)
This AVL tree maintains ideal balance, and all its nodes adhere to AVL balancing rules. As a result, operations such as search, insertion, and deletion can all be performed in O(log n) time complexity.
Now, let’s take a look at how the tree changes when we insert the value 99!
When the value 99 is inserted into the AVL tree with root node 66, it follows the path: 66 → 89 → 92 → 93 → 96-2, and is finally added as the right child of 96-2. This placement satisfies the rules of a Binary Search Tree (BST), but the AVL tree also requires that the balance of every node be maintained after any insertion.
Initially, 96-2 remains balanced since its left and right subtree heights differ by only 1. However, its parent node 93 now has no left child and a right subtree of height 2, resulting in a balance factor of -2, which violates the AVL condition. This triggers a Right-Right (RR) imbalance, and the AVL tree responds by performing a left rotation around node 93.
As a result of the rotation, 96-2 becomes the new right child of 92, while 93 becomes the left child of 96-2, and 99 remains its right child. This restructuring rebalances the right subtree of 92, resolving the imbalance caused by the insertion.
In the end, the AVL tree once again satisfies all balance conditions, maintaining its logarithmic height and ensuring that operations like search, insertion, and deletion continue to run in O(log n) time. This automatic rebalancing mechanism is a core strength of AVL trees, enabling them to keep performance consistent as data grows dynamically.
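The rotation from this walkthrough can be traced with a tiny self-contained sketch. The values come from the text above (96-2 is written simply as 96); the Node shape here is an illustrative assumption:

```python
# Tracing the left rotation around node 93 after inserting 99.
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def rotate_left(x):
    y = x.right      # the right child becomes the new subtree root
    x.right = y.left
    y.left = x
    return y

# Before: 93 -> right 96 -> right 99 (RR imbalance at 93).
subtree = Node(93, right=Node(96, right=Node(99)))
subtree = rotate_left(subtree)

# After: 96 is the subtree root, 93 its left child, 99 its right child.
print(subtree.value, subtree.left.value, subtree.right.value)  # 96 93 99
```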
Next time, we’ll take a look at some advanced trees:
Trie (Prefix Tree)
Segment Tree
Fenwick Tree (Binary Indexed Tree)