1. Overview
Data structures represent a crucial asset in computer programming, and knowing when and why to use them is very important.
This article is a brief introduction to trie (pronounced “try”) data structure, its implementation and complexity analysis.
2. Trie
A trie is a discrete data structure that’s not quite well-known or widely-mentioned in typical algorithm courses, but nevertheless an important one.
A trie (also known as a digital tree) and sometimes even radix tree or prefix tree (as they can be searched by prefixes), is an ordered tree structure, which takes advantage of the keys that it stores – usually strings.
A node’s position in the tree defines the key with which that node is associated, which makes tries different in comparison to binary search trees, in which a node stores a key that corresponds only to that node.
All descendants of a node have a common prefix of a String associated with that node, whereas the root is associated with an empty String.
Here we have a preview of TrieNode that we will be using in our implementation of the Trie:
public class TrieNode {
private HashMap<Character, TrieNode> children;
private String content;
private boolean isWord;
// ...
}
There may be cases when a trie is a binary search tree, but in general, these are different. Both binary search trees and tries are trees, but each node in binary search trees always has two children, whereas tries’ nodes, on the other hand, can have more.
In a trie, every node (except the root node) stores one character or a digit. By traversing the trie down from the root node to a particular node n, a common prefix of characters or digits can be formed which is shared by other branches of the trie as well.
By traversing up the trie from a leaf node to the root node, a String or a sequence of digits can be formed.
Here is the Trie class, which represents an implementation of the trie data structure:
public class Trie {
private TrieNode root;
//...
}
3. Common Operations
Now, let’s see how to implement basic operations.
3.1. Inserting Elements
The first operation that we’ll describe is the insertion of new nodes.
Before we start the implementation, it’s important to understand the algorithm:
- Set a current node as a root node
- Set the current letter as the first letter of the word
- If the current node has already an existing reference to the current letter (through one of the elements in the “children” field), then set current node to that referenced node. Otherwise, create a new node, set the letter equal to the current letter, and also initialize current node to this new node
- Repeat step 3 until the key is traversed
*The complexity of this operation is O(n), where n represents the key size.*
Here is the implementation of this algorithm:
public void insert(String word) {
TrieNode current = root;
for (char l: word.toCharArray()) {
current = current.getChildren().computeIfAbsent(l, c -> new TrieNode());
}
current.setEndOfWord(true);
}
Now let’s see how we can use this method to insert new elements in a trie:
private Trie createExampleTrie() {
Trie trie = new Trie();
trie.insert("Programming");
trie.insert("is");
trie.insert("a");
trie.insert("way");
trie.insert("of");
trie.insert("life");
return trie;
}
We can test that trie has already been populated with new nodes from the following test:
@Test
public void givenATrie_WhenAddingElements_ThenTrieNotEmpty() {
Trie trie = createTrie();
assertFalse(trie.isEmpty());
}
3.2. Finding Elements
Let’s now add a method to check whether a particular element is already present in a trie:
- Get children of the root
- Iterate through each character of the String
- Check whether that character is already a part of a sub-trie. If it isn’t present anywhere in the trie, then stop the search and return false
- Repeat the second and the third step until there isn’t any character left in the String. If the end of the String is reached, return true
*The complexity of this algorithm is O(n), where n represents the length of the key.*
Java implementation can look like:
public boolean find(String word) {
TrieNode current = root;
for (int i = 0; i < word.length(); i++) {
char ch = word.charAt(i);
TrieNode node = current.getChildren().get(ch);
if (node == null) {
return false;
}
current = node;
}
return current.isEndOfWord();
}
And in action:
@Test
public void givenATrie_WhenAddingElements_ThenTrieContainsThoseElements() {
Trie trie = createExampleTrie();
assertFalse(trie.containsNode("3"));
assertFalse(trie.containsNode("vida"));
assertTrue(trie.containsNode("life"));
}
3.3. Deleting an Element
Aside from inserting and finding an element, it’s obvious that we also need to be able to delete elements.
For the deletion process, we need to follow the steps:
- Check whether this element is already part of the trie
- If the element is found, then remove it from the trie
The complexity of this algorithm is O(n), where n represents the length of the key.
Let’s have a quick look at the implementation:
public void delete(String word) {
delete(root, word, 0);
}
private boolean delete(TrieNode current, String word, int index) {
if (index == word.length()) {
if (!current.isEndOfWord()) {
return false;
}
current.setEndOfWord(false);
return current.getChildren().isEmpty();
}
char ch = word.charAt(index);
TrieNode node = current.getChildren().get(ch);
if (node == null) {
return false;
}
boolean shouldDeleteCurrentNode = delete(node, word, index + 1) && !node.isEndOfWord();
if (shouldDeleteCurrentNode) {
current.getChildren().remove(ch);
return current.getChildren().isEmpty();
}
return false;
}
And in action:
@Test
void whenDeletingElements_ThenTreeDoesNotContainThoseElements() {
Trie trie = createTrie();
assertTrue(trie.containsNode("Programming"));
trie.delete("Programming");
assertFalse(trie.containsNode("Programming"));
}
4. Conclusion
In this article, we’ve seen a brief introduction to trie data structure and its most common operations and their implementation.
The full source code for the examples shown in this article can be found over on GitHub.