1. Overview
Tagging is a standard design pattern that allows us to categorize and filter items in our data model.
In this article, we’ll implement tagging using Spring and JPA, with Spring Data handling the repository layer. This implementation will also be useful if you use Hibernate.
This is the second article in a series on implementing tagging. To see how to implement it with Elasticsearch, go here.
2. Adding Tags
First, we’re going to explore the most straightforward implementation of tagging: a List of Strings. We can implement tags by adding a new field to our entity like this:
@Entity
public class Student {
    // ...
    @ElementCollection
    private List<String> tags = new ArrayList<>();
    // ...
}
Notice the use of the ElementCollection annotation on our new field. Since our entity is backed by a data store, we need to tell the persistence provider how to store our tags.
If we didn’t add the annotation, they’d be stored in a single blob, which would be harder to work with. This annotation creates another table called STUDENT_TAGS (i.e., the entity name followed by the field name).
This creates a One-To-Many relationship between our entity and tags! We’re implementing the simplest version of tagging here. Because of this, we’ll potentially have a lot of duplicate tags (one for each entity that has it). We’ll talk more about this concept later.
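To make the duplication concrete, here is a minimal plain-Java sketch of what a collection table holds: roughly one row per (student, tag) pair. It deliberately uses no JPA, and the class name, method names, and sample data are all hypothetical. It only models the shape of the rows, not the actual column names the provider generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a collection table; all names here are illustrative only.
final class CollectionTableSketch {

    // One row of the collection table: a student id paired with a single tag.
    static final class Row {
        final long studentId;
        final String tag;

        Row(long studentId, String tag) {
            this.studentId = studentId;
            this.tag = tag;
        }
    }

    // Flattens each student's tag list into one row per (student, tag) pair.
    static List<Row> toRows(Map<Long, List<String>> tagsByStudent) {
        List<Row> rows = new ArrayList<>();
        for (Map.Entry<Long, List<String>> entry : tagsByStudent.entrySet()) {
            for (String tag : entry.getValue()) {
                rows.add(new Row(entry.getKey(), tag));
            }
        }
        return rows;
    }

    // Counts how many different tag strings appear across all rows.
    static long distinctTagCount(List<Row> rows) {
        return rows.stream().map(row -> row.tag).distinct().count();
    }

    // Hypothetical sample data: three students, two of whom share the "honors" tag.
    static Map<Long, List<String>> sampleData() {
        Map<Long, List<String>> data = new LinkedHashMap<>();
        data.put(1L, Arrays.asList("honors", "biology"));
        data.put(2L, Arrays.asList("honors", "chemistry"));
        data.put(3L, Arrays.asList("exchange"));
        return data;
    }
}
```

With this sample data, toRows produces five rows but only four distinct tag strings, because "honors" is stored once for each student that carries it.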
3. Building Queries
Tags allow us to perform some interesting queries on our data. We can search for entities with a specific tag, filter a table scan, or even limit what results come back in a particular query. Let’s take a look at each of these cases.
3.1. Searching Tags
The tag field we added to our data model can be searched similarly to other fields on our model. We just need to keep in mind that the tags live in a separate table when building the query.
Here is how we search for an entity containing a specific tag:
@Query("SELECT s FROM Student s JOIN s.tags t WHERE t = LOWER(:tag)")
List<Student> retrieveByTag(@Param("tag") String tag);
Because the tags are stored in another table, we need to JOIN them in our query – this will return all of the Student entities with a matching tag.
First, let’s set up some test data:
Student student = new Student(0, "Larry");
student.setTags(Arrays.asList("full time", "computer science"));
studentRepository.save(student);
Student student2 = new Student(1, "Curly");
student2.setTags(Arrays.asList("part time", "rocket science"));
studentRepository.save(student2);
Student student3 = new Student(2, "Moe");
student3.setTags(Arrays.asList("full time", "philosophy"));
studentRepository.save(student3);
Student student4 = new Student(3, "Shemp");
student4.setTags(Arrays.asList("part time", "mathematics"));
studentRepository.save(student4);
Next, let’s test it and make sure it works:
// Grab only the first result
Student student2 = studentRepository.retrieveByTag("full time").get(0);
assertEquals("name incorrect", "Larry", student2.getName());
We’ll get back the first student in the repository with the full time tag. This is exactly what we wanted.
In addition, we can extend this example to show how to filter a larger dataset. Here is the example:
List<Student> students = studentRepository.retrieveByTag("full time");
assertEquals("size incorrect", 2, students.size());
With a little refactoring, we can modify the repository to take in multiple tags as a filter so we can refine our results even more.
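As a rough illustration of that refactoring, the sketch below pairs a possible JPQL string with a plain-Java, in-memory version of the same filter. The JPQL string, the class name, and the method names are assumptions rather than code from the article's repository, and the in-memory Student is only a stand-in for the entity. It shows two variants: matching any of the given tags, and matching all of them.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only; not part of the article's repository code.
final class MultiTagFilterSketch {

    // Hypothetical JPQL for a multi-tag repository method.
    // It assumes the caller lowercases the tags before binding them to :tags.
    // DISTINCT avoids returning a student once per matching tag.
    static final String ANY_TAG_JPQL =
        "SELECT DISTINCT s FROM Student s JOIN s.tags t WHERE t IN :tags";

    // Minimal stand-in for the Student entity.
    static final class Student {
        final String name;
        final List<String> tags;

        Student(String name, List<String> tags) {
            this.name = name;
            this.tags = tags;
        }
    }

    // Lowercases the requested tags, mirroring LOWER(:tag) in the single-tag query.
    private static Set<String> normalize(Collection<String> tags) {
        return tags.stream()
            .map(tag -> tag.toLowerCase(Locale.ROOT))
            .collect(Collectors.toSet());
    }

    // Keeps students that carry at least one of the requested tags.
    static List<Student> retrieveByAnyTag(List<Student> students, Collection<String> tags) {
        Set<String> wanted = normalize(tags);
        return students.stream()
            .filter(student -> student.tags.stream().anyMatch(wanted::contains))
            .collect(Collectors.toList());
    }

    // Keeps students that carry every one of the requested tags.
    static List<Student> retrieveByAllTags(List<Student> students, Collection<String> tags) {
        Set<String> wanted = normalize(tags);
        return students.stream()
            .filter(student -> student.tags.containsAll(wanted))
            .collect(Collectors.toList());
    }

    // The same four students used as test data in this section.
    static List<Student> sampleStudents() {
        return Arrays.asList(
            new Student("Larry", Arrays.asList("full time", "computer science")),
            new Student("Curly", Arrays.asList("part time", "rocket science")),
            new Student("Moe", Arrays.asList("full time", "philosophy")),
            new Student("Shemp", Arrays.asList("part time", "mathematics")));
    }
}
```

Against the sample students, asking for any of "full time" or "rocket science" keeps Larry, Curly, and Moe, while asking for all of "full time" and "philosophy" keeps only Moe.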
3.2. Filtering a Query
Another useful application of our simple tagging is applying a filter to a specific query. While the previous examples also allowed us to do filtering, they worked on all of the data in our table.
Since we also need to filter other searches, let’s look at an example:
@Query("SELECT s FROM Student s JOIN s.tags t WHERE LOWER(s.name) = LOWER(:name) AND t = LOWER(:tag)")
List<Student> retrieveByNameFilterByTag(@Param("name") String name, @Param("tag") String tag);
We can see that this query is nearly identical to the one above. A tag is nothing more than another constraint to use in our query. Note that we lowercase both sides of the name comparison, so the match doesn’t depend on how the name is capitalized.
Our usage example is also going to look familiar:
Student student2 = studentRepository.retrieveByNameFilterByTag(
  "Moe", "full time").get(0);
assertEquals("name incorrect", "Moe", student2.getName());
Consequently, we can apply the tag filter to any query on this entity. This gives the user a lot of power in the interface to find the exact data they need.
4. Advanced Tagging
Our simple tagging implementation is a great place to start. But, due to the One-To-Many relationship, we can run into some issues.
First, we’ll end up with a table full of duplicate tags. This won’t be a problem on small projects, but larger systems could end up with millions (or even billions) of duplicate entries.
Also, our Tag model isn’t very robust. What if we wanted to keep track of when the tag was initially created? In our current implementation, we have no way of doing that.
Finally, we can’t share our tags across multiple entity types. This can lead to even more duplication that can impact our system performance.
Many-To-Many relationships will solve most of our problems. Since the @ManyToMany annotation is beyond our scope here, check out this article to learn how to use it.
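To give a feel for what a richer tag model buys us, here is a minimal plain-Java sketch, with no JPA involved, of tags as shared objects that carry their own creation time. The Tag and TagRegistry names are hypothetical. In a real JPA mapping, the registry's role would be played by a tag table referenced from each tagged entity.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative data-shape sketch; all names are hypothetical.
final class SharedTagSketch {

    // A tag as a first-class object: it has a name and remembers when it was created.
    static final class Tag {
        final String name;
        final Instant createdAt;

        Tag(String name, Instant createdAt) {
            this.name = name;
            this.createdAt = createdAt;
        }
    }

    // Hands out exactly one Tag instance per (lowercased) name.
    static final class TagRegistry {
        private final Map<String, Tag> byName = new HashMap<>();

        // Returns the existing tag for this name, or creates it on first use.
        Tag getOrCreate(String name) {
            String key = name.toLowerCase(Locale.ROOT);
            return byName.computeIfAbsent(key, k -> new Tag(k, Instant.now()));
        }

        // Number of distinct tags stored, no matter how many entities use them.
        int size() {
            return byName.size();
        }
    }
}
```

Every entity that asks the registry for "full time" gets the same Tag object, so the tag is stored once and its creation time is recorded only on first use.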
5. Conclusion
Tagging is a simple and straightforward way to query data. Combined with the Java Persistence API, it gives us a powerful filtering feature that is easy to implement.
Although the simple implementation may not always be the most appropriate, we’ve highlighted the routes to take to help resolve that situation.
As always, the code used in this article can be found over on GitHub.