1. Introduction

In this article, we’ll look at the MapDB library — an embedded database engine accessed through a collection-like API.

We’ll start by exploring the core classes DB and DBMaker, which help us configure, open, and manage our databases. Then, we’ll dive into some examples of MapDB data structures that store and retrieve data.

Finally, we’ll look at some of the in-memory modes before comparing MapDB to traditional databases and Java Collections.

2. Storing Data in MapDB

First, let’s introduce the two classes that we’ll be using throughout this tutorial: DB and DBMaker. The DB class represents an open database. Its methods create and close the storage collections that hold our database records, and they handle transactional events such as commits and rollbacks.

DBMaker handles database configuration, creation, and opening. As part of the configuration, we can choose to host our database either in-memory or on our file system.

2.1. A Simple HashMap Example

To understand how this works, let’s create a new in-memory database using the DBMaker class:

DB db = DBMaker.memoryDB().make();

Once our DB object is up and running, we can use it to build an HTreeMap to work with our database records:

String welcomeMessageKey = "Welcome Message";
String welcomeMessageString = "Hello Baeldung!";

HTreeMap myMap = db.hashMap("myMap").createOrOpen();
myMap.put(welcomeMessageKey, welcomeMessageString);

HTreeMap is MapDB’s HashMap implementation. So, now that we have data in our database, we can retrieve it using the get method:

String welcomeMessageFromDB = (String) myMap.get(welcomeMessageKey);
assertEquals(welcomeMessageString, welcomeMessageFromDB);

Finally, now that we’re finished with the database, we should close it to release its resources and prevent further mutation:

db.close();

To store our data in a file, rather than in memory, all we need to do is change the way that our DB object is instantiated:

DB db = DBMaker.fileDB("file.db").make();

Our example above uses no type parameters. As a result, we’re stuck with casting our results to work with specific types. In our next example, we’ll introduce Serializers to eliminate the need for casting.

2.2. Collections

MapDB includes different collection types. To demonstrate, let’s add and retrieve some data from our database using a NavigableSet, which works as we might expect of a Java Set.

Let’s start with a simple instantiation of our DB object:

DB db = DBMaker.memoryDB().make();

Next, let’s create our NavigableSet:

NavigableSet<String> set = db
  .treeSet("mySet")
  .serializer(Serializer.STRING)
  .createOrOpen();

Here, the serializer tells MapDB to serialize and deserialize the elements of our set as String objects.

Next, let’s add some data:

set.add("Baeldung");
set.add("is awesome");

Now, let’s check that our two distinct values have been added to the database correctly:

assertEquals(2, set.size());

Finally, since this is a set, let’s add a duplicate string and verify that our database still contains only two values:

set.add("Baeldung");

assertEquals(2, set.size());
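
Because the set above is typed as a standard NavigableSet, the usual sorted-set operations should apply to it as well. The following sketch uses java.util.TreeSet as a stand-in so that it runs without MapDB; the class name NavigableSetSketch is hypothetical, and we’re assuming the MapDB-backed set orders strings the same way as String’s natural ordering:

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// JDK-only stand-in: TreeSet plays the role of the set returned by db.treeSet(...).
class NavigableSetSketch {

    static NavigableSet<String> sampleSet() {
        NavigableSet<String> set = new TreeSet<>();
        set.add("Baeldung");
        set.add("is awesome");
        set.add("Baeldung"); // duplicate, so the set is unchanged
        return set;
    }

    public static void main(String[] args) {
        NavigableSet<String> set = sampleSet();
        System.out.println(set.size());        // number of distinct elements
        System.out.println(set.first());       // smallest element
        System.out.println(set.ceiling("C"));  // smallest element >= "C"
        System.out.println(set.headSet("C"));  // all elements < "C"
    }
}
```

Only the interface calls matter here; with MapDB, we would obtain the set from the DB object instead of constructing a TreeSet.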

2.3. Transactions

Much like traditional databases, the DB class provides methods to commit and roll back the data we add to our database.

To enable this functionality, we need to initialize our DB with the transactionEnable method:

DB db = DBMaker.memoryDB().transactionEnable().make();

Next, let’s create a simple set, add some data, and commit it to the database:

NavigableSet<String> set = db
  .treeSet("mySet")
  .serializer(Serializer.STRING)
  .createOrOpen();

set.add("One");
set.add("Two");

db.commit();

assertEquals(2, set.size());

Now, let’s add a third, uncommitted string to our database:

set.add("Three");

assertEquals(3, set.size());

If we’re not happy with our data, we can roll back the uncommitted changes using DB’s rollback method:

db.rollback();

assertEquals(2, set.size());
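
To make the observable behavior of commit and rollback concrete, here is a minimal JDK-only sketch that mimics it with a snapshot copy. The class SnapshotSetSketch is hypothetical and only models the semantics shown above; it says nothing about how MapDB implements transactions internally:

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical model of commit/rollback semantics using a snapshot copy.
class SnapshotSetSketch {
    private final NavigableSet<String> working = new TreeSet<>();
    private NavigableSet<String> committed = new TreeSet<>();

    void add(String value) {
        working.add(value);
    }

    int size() {
        return working.size();
    }

    // The current working state becomes the new committed state.
    void commit() {
        committed = new TreeSet<>(working);
    }

    // Everything added since the last commit is discarded.
    void rollback() {
        working.clear();
        working.addAll(committed);
    }
}
```

Replaying the steps from this section against the sketch gives the same sizes: two after the commit, three after the uncommitted add, and two again after the rollback.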

2.4. Serializers

MapDB offers a large variety of serializers, which handle the data within the collection. The most important construction parameter is the name, which identifies the individual collection within the DB object:

HTreeMap<String, Long> map = db.hashMap("identification_name")
  .keySerializer(Serializer.STRING)
  .valueSerializer(Serializer.LONG)
  .create();

While specifying serializers is recommended, it is optional. If we skip it, MapDB falls back to a slower, generic serialization process.
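
To get a feel for why a type-specific serializer can be cheaper than a generic one, the sketch below encodes the same long value in two ways using only the JDK. It uses Java’s built-in object serialization as a stand-in for a generic serializer, which is an assumption made for illustration; MapDB’s own generic serializer may behave differently, and the class name SerializerSizeSketch is hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Compares a type-specific encoding of a long with a generic object encoding.
class SerializerSizeSketch {

    // Type-specific: a long is written as exactly 8 bytes.
    static byte[] writeLongDirectly(long value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Generic: Java serialization also writes a stream header and class metadata.
    static byte[] writeLongGenerically(long value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(Long.valueOf(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("direct:  " + writeLongDirectly(42L).length + " bytes");
        System.out.println("generic: " + writeLongGenerically(42L).length + " bytes");
    }
}
```

The direct encoding knows the type up front, so it writes only the payload, while the generic one has to describe the type inside the stream.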

3. HTreeMap

MapDB’s HTreeMap provides HashMap and HashSet collections for working with our database. HTreeMap is a segmented hash tree and does not use a fixed-size hash table. Instead, it uses an auto-expanding index tree and does not rehash all of its data as the table grows. To top it off, HTreeMap is thread-safe and supports parallel writes using multiple segments.

To begin, let’s instantiate a simple HashMap that uses String for both keys and values:

DB db = DBMaker.memoryDB().make();

HTreeMap<String, String> hTreeMap = db
  .hashMap("myTreeMap")
  .keySerializer(Serializer.STRING)
  .valueSerializer(Serializer.STRING)
  .create();

Above, we’ve defined separate serializers for the key and the value. Now that our HashMap is created, let’s add data using the put method:

hTreeMap.put("key1", "value1");
hTreeMap.put("key2", "value2");

assertEquals(2, hTreeMap.size());

Because a map holds at most one value per key, adding data using an existing key causes the previous value to be overwritten:

hTreeMap.put("key1", "value3");

assertEquals(2, hTreeMap.size());
assertEquals("value3", hTreeMap.get("key1"));

4. SortedTableMap

MapDB’s SortedTableMap stores keys in a fixed-size table and uses binary search for retrieval. It’s worth noting that once prepared, the map is read-only.

Let’s walk through the process of creating and querying a SortedTableMap. We’ll start by creating a memory-mapped volume to hold the data, as well as a sink to add data. When we first create the volume, we’ll set the read-only flag to false, ensuring we can write to it:

String VOLUME_LOCATION = "sortedTableMapVol.db";

Volume vol = MappedFileVol.FACTORY.makeVolume(VOLUME_LOCATION, false);

SortedTableMap.Sink<Integer, String> sink =
  SortedTableMap.create(
    vol,
    Serializer.INTEGER,
    Serializer.STRING)
    .createFromSink();

Next, we’ll add our data and call the create method on the sink to create our map:

for(int i = 0; i < 100; i++){
  sink.put(i, "Value " + Integer.toString(i));
}

sink.create();

Now that our map exists, we can define a read-only volume and open our map using SortedTableMap’s open method:

Volume openVol = MappedFileVol.FACTORY.makeVolume(VOLUME_LOCATION, true);

SortedTableMap<Integer, String> sortedTableMap = SortedTableMap
  .open(
    openVol,
    Serializer.INTEGER,
    Serializer.STRING);

assertEquals(100, sortedTableMap.size());

Before we move on, let’s understand how the SortedTableMap utilizes binary search in more detail.

SortedTableMap splits the storage into pages, with each page containing several nodes made up of keys and values. Within these nodes are the key-value pairs that we define in our Java code.

SortedTableMap performs three binary searches to retrieve the correct value:

  1. The first key of each page is stored on-heap in an array. The SortedTableMap performs a binary search over this array to find the correct page.
  2. Next, the first key of each node in that page can be read without decompressing the whole node. A binary search over these first keys establishes the correct node.
  3. Finally, the SortedTableMap searches over the keys within the node to find the correct value. Depending on the serializer, this step may require decompressing the keys.
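
The page, node, and key lookup described above can be pictured with plain arrays. This is a simplified, JDK-only sketch; the layout and every name in it (PagedLookupSketch, PAGES, floorIndex) are hypothetical, and it does not reflect MapDB’s actual storage format or compression:

```java
import java.util.Arrays;

// Simplified illustration of a page -> node -> key lookup using plain arrays.
class PagedLookupSketch {

    // PAGES[p][n] is one node: a sorted array of keys.
    static final int[][][] PAGES = {
        { {0, 2, 4}, {6, 8, 10} },
        { {12, 14, 16}, {18, 20, 22} }
    };

    // Index of the last element that is <= key, or -1 if every element is larger.
    static int floorIndex(int[] sortedFirstKeys, int key) {
        int pos = Arrays.binarySearch(sortedFirstKeys, key);
        return pos >= 0 ? pos : -pos - 2;
    }

    // Returns "Value <key>" if the key is stored, or null otherwise.
    static String get(int key) {
        // Step 1: binary search over the first key of every page.
        int[] pageFirstKeys = Arrays.stream(PAGES).mapToInt(page -> page[0][0]).toArray();
        int p = floorIndex(pageFirstKeys, key);
        if (p < 0) {
            return null; // smaller than the smallest stored key
        }
        // Step 2: binary search over the first key of every node in that page.
        int[] nodeFirstKeys = Arrays.stream(PAGES[p]).mapToInt(node -> node[0]).toArray();
        int n = floorIndex(nodeFirstKeys, key);
        // Step 3: binary search over the keys inside the chosen node.
        int k = Arrays.binarySearch(PAGES[p][n], key);
        return k >= 0 ? "Value " + key : null;
    }
}
```

For example, looking up 14 selects the second page, then its first node, and finally finds the key inside that node, whereas looking up 7 reaches the node holding 6, 8, and 10 and returns null.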

5. In-Memory Mode

MapDB offers three types of in-memory store. Let’s take a quick look at each mode, understand how it works, and study its benefits.

5.1. On-Heap

The on-heap mode stores objects in a simple Java Collection Map. It does not employ serialization and can be very fast for small datasets. 

However, since the data is stored on-heap, the dataset is managed by garbage collection (GC). The duration of GC rises with the size of the dataset, resulting in performance drops.

Let’s see an example specifying the on-heap mode:

DB db = DBMaker.heapDB().make();

5.2. Byte[]

The second store type is based on byte arrays. In this mode, data is serialized and stored into arrays up to 1MB in size. While technically on-heap, this method is more efficient for garbage collection.

This mode is recommended by default, and it’s the one we used in our ‘Hello Baeldung’ example:

DB db = DBMaker.memoryDB().make();

5.3. DirectByteBuffer

The final store is based on DirectByteBuffer. Direct memory, introduced in Java 1.4, allows data to be passed directly to native memory rather than the Java heap. As a result, the data will be stored completely off-heap.

We can invoke a store of this type with:

DB db = DBMaker.memoryDirectDB().make();
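
For context, the JDK facility behind this mode can be tried on its own. The sketch below writes a string into a direct buffer and reads it back; it only illustrates java.nio.ByteBuffer and is not how MapDB manages its store. The class name DirectBufferSketch is hypothetical:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Demonstrates storing bytes in native (off-heap) memory with a direct ByteBuffer.
class DirectBufferSketch {

    // Copies the text into a direct buffer and reads it back out.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocateDirect(bytes.length);
        buffer.put(bytes);
        buffer.flip(); // switch from writing to reading
        byte[] copy = new byte[buffer.remaining()];
        buffer.get(copy);
        return new String(copy, StandardCharsets.UTF_8);
    }

    // A buffer from allocateDirect reports itself as direct.
    static boolean allocatesOffHeap() {
        return ByteBuffer.allocateDirect(16).isDirect();
    }
}
```

Note that data must be serialized to bytes before it can live in such a buffer, which is why this mode, like the byte[] mode, relies on serialization.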

6. Why MapDB?

So, why use MapDB?

6.1. MapDB vs Traditional Database

MapDB offers a large array of database functionality configured with just a few lines of Java code. When we employ MapDB, we can avoid the often time-consuming setup of various services and connections needed to get our program to work.

Beyond this, MapDB allows us to access the complexity of a database with the familiarity of a Java Collection. With MapDB, we do not need SQL, and we can access records with simple get method calls.

6.2. MapDB vs Simple Java Collections

Java Collections will not persist the data of our application once it stops executing. MapDB offers a simple, flexible, pluggable service that allows us to quickly and easily persist the data in our application while maintaining the utility of Java collection types.

7. Conclusion

In this article, we’ve taken a deep dive into MapDB’s embedded database engine and collection framework.

We started by looking at the core classes DB and DBMaker to configure, open and manage our database. Then, we walked through some examples of data structures that MapDB offers to work with our records. Finally, we looked at the advantages of MapDB over a traditional database or Java Collection.

As always, the example code is available over on GitHub.