1. Overview
H2 is a very popular database solution, especially when it comes to testing. It was first introduced as an in-memory database to use in test and integration environments because it’s easy to set up and use. It has also proven reliable and fast, so it’s a very good alternative to traditional databases that are slow to configure and costly to deploy.
In time, H2 started supporting disk storage and also a server mode. This way, it became an option for long-term persistent storage and a possible database for distributed systems, since different services and/or servers can access it in server mode. As a result, developers have started thinking of H2 as another storage option, even in production environments.
In this tutorial, we’ll go through the features that make H2 an option for production storage and the limitations that still exist. Then, we’ll evaluate the cases in which H2 can be used in production and the ones in which we should avoid it.
2. H2 Features
Let’s first see some of the features of H2 that make it a fast and easy-to-use database solution:
- it has a very fast database engine
- it’s easy to configure, especially in Spring Boot applications
- it supports standard SQL and the JDBC API
- it provides security, with authentication, encryption functions, and others
- it uses transactions and two-phase commit
- it allows multiple connections and row-level locking
- it’s open source and written in Java
- it provides a web console application
On top of that, H2 supports different connection modes:
- an embedded mode, for local connections, using JDBC
- a server mode, for remote connections, using JDBC or ODBC over TCP/IP
- and a mixed mode, which combines both previous modes
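In practice, the connection mode is selected through the JDBC URL. As a rough sketch, the helper below builds the URL forms described in the H2 documentation for each mode. The class and method names are hypothetical, and the database names, host, and paths are placeholders:

```java
// Hypothetical helper: the class and method names are illustrative, not part of H2.
final class H2Urls {

    private H2Urls() {
    }

    // Embedded mode, in-memory: the database lives inside the application's JVM.
    static String embeddedInMemory(String name) {
        return "jdbc:h2:mem:" + name;
    }

    // Embedded mode, disk storage: the database is kept in files under the given path.
    static String embeddedFile(String path) {
        return "jdbc:h2:file:" + path;
    }

    // Server mode: the client talks to an already running H2 TCP server.
    static String server(String host, int port, String path) {
        return "jdbc:h2:tcp://" + host + ":" + port + "/" + path;
    }

    // Mixed mode: the first application opens the database in embedded mode and
    // H2 starts a server for the others. AUTO_SERVER needs a file-based database.
    static String mixed(String path) {
        return embeddedFile(path) + ";AUTO_SERVER=TRUE";
    }

    public static void main(String[] args) {
        System.out.println(embeddedInMemory("demo"));
        System.out.println(embeddedFile("./data/demo"));
        System.out.println(server("localhost", 9092, "~/demo"));
        System.out.println(mixed("./data/demo"));
    }
}
```

Port 9092 is the default port of the H2 TCP server. The rest of the application code stays the same regardless of the mode, since only the URL changes.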
2.1. In-Memory
H2 provides in-memory storage. In certain cases, like caching and gaming, we don’t need to persist data and can use in-memory databases. Redis is a very popular in-memory caching solution and is widely used in production environments.
So, in modern applications, where a service can use multiple databases, we can combine persistent and in-memory databases to improve performance. If we take into consideration how fast H2 is, as a Java database with the option of embedded mode, we can understand why people use it in production more and more.
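To make the caching use case more concrete, here’s a minimal sketch of an in-memory H2 database used as a key-value cache through plain JDBC. It assumes the H2 driver is on the classpath. The class name, the cache_entry table, and its columns are hypothetical names chosen for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; requires the H2 driver on the classpath at runtime.
final class InMemoryCacheDemo {

    // DB_CLOSE_DELAY=-1 keeps the in-memory database alive until the JVM exits.
    // Without it, the content is dropped when the last connection closes.
    static final String URL = "jdbc:h2:mem:cache;DB_CLOSE_DELAY=-1";

    static void put(Connection conn, String key, String value) throws SQLException {
        // H2's MERGE ... KEY inserts the row, or updates it if the key already exists.
        String sql = "MERGE INTO cache_entry (cache_key, cache_value) KEY (cache_key) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, key);
            ps.setString(2, value);
            ps.executeUpdate();
        }
    }

    static String get(Connection conn, String key) throws SQLException {
        String sql = "SELECT cache_value FROM cache_entry WHERE cache_key = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                // Returns null on a cache miss.
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "sa", "");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS cache_entry ("
                    + "cache_key VARCHAR(255) PRIMARY KEY, cache_value VARCHAR(1024))");
            put(conn, "user:1", "Alice");
            put(conn, "user:1", "Alice Smith"); // overwrites the previous value
            System.out.println(get(conn, "user:1"));
        }
    }
}
```

Everything stored this way is lost when the JVM stops, which is acceptable for a cache but not for data we need to keep.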
2.2. Disk Storage
When we talk about databases, we usually assume that data is persisted. Databases were originally designed to store data and provide durability. In the vast majority of cases, we don’t want to lose data if the database shuts down or restarts.
Over time, demand grew for H2 to support persistence too. In later versions, H2 added disk storage support, so it can be used for persisting data. In this mode, data is stored in files on the host machine. So, when the database shuts down, data is retained and is available again when it restarts.
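As a small illustration of disk storage, the sketch below writes a row, closes the connection, and then reads the data back through a new connection. It assumes the H2 driver is on the classpath. The ./data/appdb location, the visit table, and the class name are hypothetical:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; requires the H2 driver on the classpath at runtime.
final class FileStorageDemo {

    // Hypothetical location, relative to the working directory.
    static final String DB_PATH = "./data/appdb";
    static final String URL = "jdbc:h2:file:" + DB_PATH;

    // With the default MVStore engine, H2 keeps the data in "<path>.mv.db".
    static Path dataFile() {
        return Paths.get(DB_PATH + ".mv.db");
    }

    static void write() throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "sa", "");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE IF NOT EXISTS visit (note VARCHAR(100))");
            st.executeUpdate("INSERT INTO visit (note) VALUES ('stored before shutdown')");
        } // closing the last connection closes the embedded database
    }

    static int countRows() throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "sa", "");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM visit")) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        write();
        // The row count grows on every run, because the rows survive between runs.
        System.out.println("rows on disk: " + countRows());
        System.out.println("data file: " + dataFile());
    }
}
```

Running the program several times should show the row count growing, since the rows survive both the closed connection and the JVM restart.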
2.3. Embedded Mode
In embedded mode, the database and the application run in the same JVM. This is the fastest and easiest H2 mode. The application connects to the database using JDBC, but other applications or servers outside this virtual machine don’t have access to it.
In this mode, both persistent and in-memory storage are possible. Also, there’s no limit on the number of databases or connections open simultaneously.
If we have a production environment with multiple servers sharing the same database, then obviously this mode can’t be used. For small applications with only one server, the embedded mode could be an option, especially if we think about the growing usage of databases like SQLite in production in the last few years.
2.4. Server/Mixed Mode
In server mode, the application opens the database remotely, within the same or a different virtual or physical machine. We may see it also referred to as remote mode or server/client mode. One or more applications can then connect to H2 using JDBC or ODBC over TCP/IP.
This mode is slower, since we use TCP/IP, but allows more machines to connect to the same database. Same as in embedded mode, in-memory and persistence are supported and there are no limits on open databases or connections.
Mixed mode is a combination of the embedded and server modes. One application runs in embedded mode, but it also starts a server. Other applications can use this server to connect to the database remotely.
Using server or mixed mode, we can use the H2 database in production, when we need to support multiple servers connecting to the same database.
3. Why Is H2 a Proper Solution for Production?
From what we have seen so far, we can make the statement that H2 is a very fast and easy-to-use database, when used in embedded mode, with in-memory storage. Any application where the need for speed is more important than the need for durability could benefit from H2 over a traditional database.
In the other modes, it’s still a solid solution, compared to other databases, because:
- it has a very fast engine
- it’s very easy to use and configure (for Java applications, this can mean just adding a dependency and some properties)
- it supports clustering, providing durability and no single point of failure
- it’s very cheap since it’s open source
- it’s easy to learn
In general, H2 is a simple, easy solution for production when we don’t need many TPS or big data sizes. Moreover, people who have used it in production report that it’s a good fit for application internals such as caching, keeping short-lived data, and loading data that needs fast access for performance improvements (similar to what SQLite is mostly used for).
4. Why Is H2 Not a Proper Solution for Production?
There are cases when H2 might not be the best option for a production database. It has some obvious limitations by design, and there are also reports about its performance. H2 has lately been used in production environments by different, usually small, applications, and these give us real-world evidence of how it performs.
Starting with the limitations, in any mode, H2 struggles when storing and reading large objects. If the objects don’t fit in memory, the BLOB and CLOB types can be used, but this increases complexity and hurts performance.
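To give an idea of the extra complexity, here’s a rough sketch of storing and reading a BLOB through the standard JDBC streaming methods, so that the whole object doesn’t have to be held in memory at once. It assumes Java 11 or later and the H2 driver on the classpath. The document table and the class name are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; requires the H2 driver on the classpath at runtime.
final class LargeObjectDemo {

    static final String CREATE_TABLE =
            "CREATE TABLE IF NOT EXISTS document (id INT PRIMARY KEY, content BLOB)";
    static final String INSERT = "INSERT INTO document (id, content) VALUES (?, ?)";
    static final String SELECT = "SELECT content FROM document WHERE id = ?";

    // Streams the data into the BLOB column instead of passing a byte array.
    static void store(Connection conn, int id, InputStream data) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT)) {
            ps.setInt(1, id);
            ps.setBinaryStream(2, data);
            ps.executeUpdate();
        }
    }

    // Copies the BLOB to the target stream; returns the byte count, or -1 if no row matches.
    static long load(Connection conn, int id, OutputStream target)
            throws SQLException, IOException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT)) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return -1;
                }
                try (InputStream in = rs.getBinaryStream(1)) {
                    return in.transferTo(target);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:lobdemo", "sa", "");
             Statement st = conn.createStatement()) {
            st.execute(CREATE_TABLE);
            // One megabyte of zeros stands in for a real file.
            store(conn, 1, new ByteArrayInputStream(new byte[1024 * 1024]));
            long copied = load(conn, 1, OutputStream.nullOutputStream());
            System.out.println("bytes read back: " + copied);
        }
    }
}
```

Compared to a plain VARCHAR or VARBINARY column, every read and write now involves stream handling and extra resource management in the application code.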
Availability, scalability, and durability are also concerns since H2 clustering can only support up to two nodes in the cluster, at the moment. This means that for high availability services, it’s not an option to use in production. The same goes for storing critical data since durability might be compromised when having a maximum of only two servers up and running.
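For reference, a client connects to an H2 cluster by listing the servers, separated by commas, in the JDBC URL. The sketch below builds such a URL and enforces the two-node limit mentioned above. The class name, hosts, and ports are hypothetical, and the sketch assumes the cluster itself has already been set up (H2 ships a CreateCluster tool for that):

```java
import java.util.List;

// Hypothetical helper: the class and method names are illustrative, not part of H2.
final class H2ClusterUrl {

    // Limit taken from the text above: an H2 cluster has at most two nodes.
    static final int MAX_NODES = 2;

    private H2ClusterUrl() {
    }

    // Builds a URL such as jdbc:h2:tcp://host1:9101,host2:9102/~/test
    static String of(List<String> hostPorts, String path) {
        if (hostPorts.isEmpty() || hostPorts.size() > MAX_NODES) {
            throw new IllegalArgumentException(
                    "expected 1 to " + MAX_NODES + " nodes, got " + hostPorts.size());
        }
        return "jdbc:h2:tcp://" + String.join(",", hostPorts) + "/" + path;
    }

    public static void main(String[] args) {
        System.out.println(of(List.of("localhost:9101", "localhost:9102"), "~/test"));
    }
}
```

If one of the two servers fails, the client can keep working against the remaining one, but at that point there’s no redundancy left until the cluster is rebuilt.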
On the subject of durability, the H2 documentation states that the database doesn’t guarantee that all committed transactions survive a power failure.
From reports and articles by people who have used H2 in production, the general conclusion is that, at the moment, H2 isn’t reliable for real applications with real-world data sizes. Even when it can handle the size, users report bugs and sometimes loss of data. Especially in multi-threaded or multi-connection use cases, users have run into many issues, including deadlocks and poor performance as the data grows.
Some more minor limitations are that H2 doesn’t have commercial support and it has fewer features than other, traditional, databases. In the case of in-memory usage, we should consider the extra cost, since memory is more expensive than disk space.
5. Conclusion
In this article, we looked at the main features of the H2 database and focused on the modes that make it an option for production storage. Then, we discussed the strong points and also the limitations it has.
In conclusion, H2 is a good fit for production in cases where we don’t have to handle high volumes of data or high transaction rates. This is especially so when only one production server is needed and H2 can be used in embedded mode. On the other hand, we should avoid it when we need high availability or scalability, or when we face high data volumes.