1. Introduction
Java applications often use JSON as a common format for sending and receiving data. Moreover, it’s used as a serialization protocol for storing data. With smaller JSON data sizes, our applications become cheaper and faster.
In this tutorial, we’ll look at various ways of reducing the size of JSON in our Java applications.
2. Domain Model and Test Data
Let’s create a domain model for a Customer with some contact data:
public class Customer {
private long id;
private String firstName;
private String lastName;
private String street;
private String postalCode;
private String city;
private String state;
private String phoneNumber;
private String email;
Note that all fields will be mandatory, except for phoneNumber and email.
To properly test JSON data size differences, we need at least a few hundred Customer instances. They must have different data to make our tests more lifelike. The data generation web site mockaroo helps us here. We can create 1,000 JSON data records there for free, in our own format, and with authentic test data.
Let’s configure mockaroo for our domain model:
Here, some items to keep in mind:
- This is where we specified the field names
- Here we selected the data types of our fields
- 50% of the phone numbers are empty in the mock data
- 30% of the email addresses are empty, too
All the code examples below use the same data of 1,000 customers from mockaroo. We use the factory method Customer.fromMockFile() to read that file and turn it into Customer objects.
We’ll be using Jackson as our JSON processing library.
3. JSON Data Size with Jackson Default Options
Let’s write a Java object to JSON with the default Jackson options:
Customer[] customers = Customer.fromMockFile();
ObjectMapper mapper = new ObjectMapper();
byte[] feedback = mapper.writeValueAsBytes(customers);
Let’s see the mock data for the first Customer:
{
"id" : 1,
"firstName" : "Horatius",
"lastName" : "Strognell",
"street" : "4848 New Castle Point",
"postalCode" : "33432",
"city" : "Boca Raton",
"state" : "FL",
"phoneNumber" : "561-824-9105",
"email" : "[email protected]"
}
When using default Jackon options, the JSON data byte array with all 1,000 customers is 181.0 KB in size.
4. Compressing with gzip
As text data, JSON data compresses nicely. That’s why gzip is our first option to reduce the JSON data size. Moreover, it can be automatically applied in HTTP, the common protocol for sending and receiving JSON.
Let’s take the JSON produced with the default Jackson options and compress it with gzip. This results in 45.9 KB, just 25.3% of the original size. So if we can enable gzip compression through configuration, we’ll cut down the JSON data size by 75% without any change to our Java code!
If our Spring Boot application delivers the JSON data to other services or front-ends, then we’ll enable gzip compression in the Spring Boot configuration. Let’s see a typical compression configuration in YAML syntax:
server:
compression:
enabled: true
mime-types: text/html,text/plain,text/css,application/javascript,application/json
min-response-size: 1024
First, we enabled compression in general by setting enabled as true. Then, we specifically enabled JSON data compression by adding application/json to the list of mime-types. Finally, notice that we set min-response-size to 1,024 bytes long. This is because if we compress short amounts of data, we may produce bigger data than the original.
Often, proxies such as NGINX or web servers such as the Apache HTTP Server deliver the JSON data to other services or front-ends. Configuring JSON data compression in these tools is beyond the scope of this tutorial.
A previous tutorial on gzip tells us that gzip has various compression levels. Our code examples use gzip with the default Java compression level. Spring Boot, proxies, or web servers may get different compression results for the same JSON data.
If we use JSON as the serialization protocol to store data, we’ll need to compress and decompress the data ourselves.
5. Shorter Field Names in JSON
It’s a best practice to use field names that are neither too short nor too long. Let’s omit this for the sake of demonstration: We’ll use single-character field names in JSON, but we’ll not change the Java field names. This reduces the JSON data size but lowers JSON readability. Since it would also require updates to all services and front-ends, we’ll probably use these short field names only when storing data:
{
"i" : 1,
"f" : "Horatius",
"l" : "Strognell",
"s" : "4848 New Castle Point",
"p" : "33432",
"c" : "Boca Raton",
"a" : "FL",
"o" : "561-824-9105",
"e" : "[email protected]"
}
It’s easy to change the JSON field names with Jackson while leaving the Java field names intact. We’ll use the @JsonProperty annotation:
@JsonProperty("p")
private String postalCode;
Using single-character field names leads to data that is 72.5% of the original size. Moreover, using gzip will compress that to 23.8%. That’s not much smaller than the 25.3% we got from simply compressing the original data with gzip. We always need to look for a suitable cost-benefit relation. Losing readability for a small gain in size won’t be recommendable for most scenarios.
6. Serializing to an Array
Let’s see how we can further reduce the JSON data size by leaving out the field names altogether. We can achieve this by storing a customers array in our JSON. Notice that we’ll be also reducing readability. And we’ll also need to update all the services and front-ends that use our JSON data:
[ 1, "Horatius", "Strognell", "4848 New Castle Point", "33432", "Boca Raton", "FL", "561-824-9105", "[email protected]" ]
Storing the Customer as an array leads to output that’s 53.1% of the original size, and 22.0% with gzip compression. This is our best result so far. Still, 22% is not significantly smaller than the 25.3% we got from merely compressing the original data with gzip.
In order to serialize a customer as an array, we need to take full control of JSON serialization. Refer again to our Jackson tutorial for more examples.
7. Excluding null Values
Jackson and other JSON processing libraries may not handle JSON null values correctly when reading or writing JSON. For example, Jackson writes a JSON null value by default when it encounters a Java null value. That’s why it’s a good practice to remove empty fields in JSON data. This leaves the initialization of empty values to each JSON processing library and reduces the JSON data size.
In our mock data, we set 50% of the phone numbers, and 30% of the email addresses, as empty. Leaving out these null values reduces our JSON data size to 166.8kB or 92.1% of the original data size. Then, gzip compression will drop it to 24.9%.
Now, if we combine ignoring null values with the shorter field names from the previous section, then we’ll get more significant savings: 68.3% of the original size and 23.4% with gzip.
We can configure the omission of null value fields in Jackson per class or globally for all classes.
8. New Domain Class
We achieved the smallest JSON data size so far by serializing it to an array. One way of reducing that even further is a new domain model with fewer fields. But why would we do that?
Let’s imagine a front-end for our JSON data that shows all customers as a table with two columns: name and street address. Let’s write JSON data specifically for this front-end:
{
"id" : 1,
"name" : "Horatius Strognell",
"address" : "4848 New Castle Point, Boca Raton FL 33432"
}
Notice how we concatenated the name fields into name and the address fields into address. Also, we left out email and phoneNumber.
This should produce much smaller JSON data. It also saves the front-end from concatenating the Customer fields. But on the downside, this couples our back-end tightly to the front-end.
Let’s create a new domain class CustomerSlim for this front-end:
public class CustomerSlim {
private long id;
private String name;
private String address;
If we convert our test data to this new CustomerSlim domain class, we‘ll reduce it to 46.1% of the original size. That will be using the default Jackson settings. If we use gzip it goes down to 15.1%. This last result is already a significant gain over the previous best result of 22.0%.
Next, if we also use one-character field names, this gets us down to 40.7% of the original size, with gzip further reducing this to 14.7%. This result is only a small gain of over 15.1% we reached with the Jackson default settings.
No fields in CustomerSlim are optional, so leaving out empty values has no effect on the JSON data size.
Our last optimization is the serialization of an array. By serializing CustomerSlim to an array, we achieve our best result: 34.2% of the original size and 14.2% with gzip. So even without compression, we remove nearly two-thirds of the original data. And compression shrinks our JSON data to just one-seventh of the original size!
9. Conclusion
In this article, we first saw why we need to reduce JSON data sizes. Next, we learned various ways to reduce this JSON data size. Finally, we learned how to further reduce JSON data size with a domain model that’s custom to one front-end.
The complete code is available, as always, over on GitHub.