1. Overview

Tagging is a Design Pattern that allows us to perform advanced filtering and sorting on our data. This article is a continuation of a Simple Tagging Implementation with JPA.

Therefore, we’ll pick up where that article left off and cover advanced use cases for Tagging.

2. Endorsed Tags

Probably the best-known advanced tagging implementation is the Endorsement Tag. We can see this pattern on sites like LinkedIn.

Essentially, the tag is a combination of a string name and a numerical value. We can then use the number to represent how many times the tag has been voted for, or “endorsed”.

Here’s an example of how to create this kind of tag:

@Embeddable
public class SkillTag {
    private String name;
    private int value;

    // constructors, getters, setters
}

To use this tag, we simply add a List of them to our data object:

@ElementCollection
private List<SkillTag> skillTags = new ArrayList<>();

We mentioned in the previous article that the @ElementCollection annotation automatically creates a one-to-many mapping for us.

This is a model use-case for this relationship. Because each tag has personalized data associated with the entity it’s stored on, we can’t save space with a many-to-many storage mechanism.

Later in the article, we’ll cover an example of when many-to-many makes sense.

Because we’ve embedded the skill tag into our original entity, we can query on it just like any other attribute.

Here’s an example query looking for any student with more than a certain number of endorsements:

@Query(
  "SELECT s FROM Student s JOIN s.skillTags t WHERE t.name = LOWER(:tagName) AND t.value > :tagValue")
List<Student> retrieveByNameFilterByMinimumSkillTag(
  @Param("tagName") String tagName, @Param("tagValue") int tagValue);

Next, let’s look at an example of how to use this:

Student student = new Student(1, "Will");
SkillTag skill1 = new SkillTag("java", 5);
student.setSkillTags(Arrays.asList(skill1));
studentRepository.save(student);

Student student2 = new Student(2, "Joe");
SkillTag skill2 = new SkillTag("java", 1);
student2.setSkillTags(Arrays.asList(skill2));
studentRepository.save(student2);

List<Student> students = 
  studentRepository.retrieveByNameFilterByMinimumSkillTag("java", 3);
assertEquals("size incorrect", 1, students.size());

Now we can search either for the presence of a tag or for a minimum number of endorsements on that tag.

We can also combine this with other query parameters to build more complex queries.
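The examples above never show how an endorsement count grows. Here is a minimal sketch in plain Java, with the JPA annotations left out so it runs on its own. The Endorsements class and its endorse method are hypothetical names, not part of the article’s code. Names are stored in lower case because the repository query lowercases only the parameter, not the stored value:

```java
import java.util.List;
import java.util.Locale;

// Plain-Java stand-in for the @Embeddable SkillTag shown earlier (annotations omitted).
class SkillTag {
    private final String name;
    private int value;

    SkillTag(String name, int value) {
        this.name = name;
        this.value = value;
    }

    String getName() { return name; }
    int getValue() { return value; }
    void setValue(int value) { this.value = value; }
}

// Hypothetical helper, not part of the article's code.
final class Endorsements {

    private Endorsements() {
    }

    // Adds one endorsement to the tag called tagName, creating the tag with a
    // count of 1 if the list does not contain it yet. Returns the new count.
    static int endorse(List<SkillTag> tags, String tagName) {
        String normalized = tagName.toLowerCase(Locale.ROOT);
        for (SkillTag tag : tags) {
            if (tag.getName().equals(normalized)) {
                tag.setValue(tag.getValue() + 1);
                return tag.getValue();
            }
        }
        tags.add(new SkillTag(normalized, 1));
        return 1;
    }
}
```

The list passed in must be mutable. A list created with Arrays.asList is fixed-size, so it would need to be copied into an ArrayList first. In a real application we would also save the owning Student afterwards so the change reaches the database.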

3. Location Tags

Another popular tagging implementation is the Location Tag. We can use a Location Tag in two primary ways.

First of all, it can be used to tag a geophysical location.

Also, it can be used to tag a location in media such as a photo or video. The implementation of the model is nearly identical in all of these cases.

Here’s an example of tagging a photo:

@Embeddable
public class LocationTag {
    private String name;
    private int xPos;
    private int yPos;

    // constructors, getters, setters
}

The most noteworthy aspect of Location Tags is how difficult it is to perform a Geolocation Filter using just a database. If we need to search within geographic bounds, a better approach is loading the model into a Search Engine (like Elasticsearch) which has built-in support for geolocations.

Therefore, we should focus on filtering by the tag name for these location tags.

The query is going to look similar to our simple tagging implementation from the previous article:

@Query("SELECT s FROM Student s JOIN s.locationTags t WHERE t.name = LOWER(:tag)")
List<Student> retrieveByLocationTag(@Param("tag") String tag);

The example to use location tags will also look familiar:

Student student = new Student(0, "Steve");
student.setLocationTags(Arrays.asList(new LocationTag("here", 0, 0)));
studentRepository.save(student);

Student student2 = studentRepository.retrieveByLocationTag("here").get(0);
assertEquals("name incorrect", "Steve", student2.getName());

If Elasticsearch is out of the question and we still need to search on geographic bounds, using simple geometric shapes will make the query criteria much more readable.

Finding out whether a point lies within a circle or rectangle is straightforward, so we’ll leave it as an exercise for the reader.
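As a starting point for rectangular or circular bounds, the containment check comes down to a few comparisons. Here is a minimal sketch in plain Java. LocationBounds and its method names are hypothetical, and the coordinates are assumed to be the integer xPos and yPos values from LocationTag:

```java
// Hypothetical helper, not part of the article's code.
final class LocationBounds {

    private LocationBounds() {
    }

    // True if (x, y) lies inside or on the edge of the axis-aligned rectangle
    // spanning (minX, minY) to (maxX, maxY).
    static boolean inRectangle(int x, int y, int minX, int minY, int maxX, int maxY) {
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }

    // True if (x, y) lies inside or on the edge of the circle centred at
    // (centerX, centerY). Squared distances are compared, so no square root is
    // needed; long arithmetic keeps the squares from overflowing for typical
    // pixel-sized coordinates.
    static boolean inCircle(int x, int y, int centerX, int centerY, int radius) {
        long dx = (long) x - centerX;
        long dy = (long) y - centerY;
        return dx * dx + dy * dy <= (long) radius * radius;
    }
}
```

Assuming the same field names, the rectangle check could also be pushed into a query with a WHERE clause along the lines of t.xPos BETWEEN :minX AND :maxX AND t.yPos BETWEEN :minY AND :maxY.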

4. Key-Value Tags

Sometimes, we need to store tags that are slightly more complicated. We might want to tag an entity with a small set of keys, each of which can hold a wide variety of values.

For instance, we could tag a student with a department tag and set its value to Computer Science. Each student will have the department key, but they could all have different values associated with it.

The implementation will look similar to the Endorsed Tags above:

@Embeddable
public class KVTag {
    private String key;
    private String value;

    // constructors, getters and setters
}

We can add it to our model like this:

@ElementCollection
private List<KVTag> kvTags = new ArrayList<>();

Now we can add a new query to our repository:

@Query("SELECT s FROM Student s JOIN s.kvTags t WHERE t.key = LOWER(:key)")
List<Student> retrieveByKeyTag(@Param("key") String key);

We can also quickly add a query to search by value or by both key and value. This gives us additional flexibility in how we search our data.

Let’s test this out and verify it all works:

@Test
public void givenStudentWithKVTags_whenSave_thenGetByTagOk(){
    Student student = new Student(0, "John");
    student.setKVTags(Arrays.asList(new KVTag("department", "computer science")));
    studentRepository.save(student);

    Student student2 = new Student(1, "James");
    student2.setKVTags(Arrays.asList(new KVTag("department", "humanities")));
    studentRepository.save(student2);

    List<Student> students = studentRepository.retrieveByKeyTag("department");
 
    assertEquals("size incorrect", 2, students.size());
}

Following this pattern, we can design even more complicated nested objects and use them to tag our data if we need to.

Most use cases can be met with the advanced implementations we have talked about today, but the option is there to go as complicated as needed.

5. Reimplementing Tagging

Finally, we’re going to explore one last area of tagging. So far, we’ve seen how to use the @ElementCollection annotation to make adding tags to our model easy. While it’s simple to use, it has a pretty significant trade-off. The one-to-many implementation under the hood can lead to a lot of duplicated data in our data store.

To save space, we need to create another table that will join our Student entities to our Tag entities. Luckily, JPA will do most of the heavy lifting for us.
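To get a feel for the duplication, we can count how many tag-name strings each layout stores. This is a back-of-the-envelope sketch in plain Java, not JPA code. TagStorageEstimate and its methods are hypothetical, and the sketch assumes each student lists a given tag at most once:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical estimate helper, not part of the article's code.
final class TagStorageEstimate {

    private TagStorageEstimate() {
    }

    // Element-collection layout: one row per (student, tag) pair,
    // and every row stores the tag name again.
    static int tagNamesStoredOneToMany(Map<String, List<String>> tagsByStudent) {
        int rows = 0;
        for (List<String> tags : tagsByStudent.values()) {
            rows += tags.size();
        }
        return rows;
    }

    // Join-table layout: each distinct tag name is stored once in the tag table;
    // the join table itself holds only pairs of ids.
    static int tagNamesStoredManyToMany(Map<String, List<String>> tagsByStudent) {
        Set<String> distinct = new HashSet<>();
        for (List<String> tags : tagsByStudent.values()) {
            distinct.addAll(tags);
        }
        return distinct.size();
    }
}
```

With two students who share a "full time" tag and one extra tag between them, the first layout stores three names and the second stores two. The gap widens as more students share the same tags.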

We’re going to reimplement our Student and Tag entities to see how this is done.

5.1. Define Entities

First of all, we need to recreate our models. We’ll start with a ManyStudent model:

@Entity
public class ManyStudent {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private int id;
    private String name;

    @ManyToMany(cascade = CascadeType.ALL)
    @JoinTable(name = "manystudent_manytags",
      joinColumns = @JoinColumn(name = "manystudent_id", 
      referencedColumnName = "id"),
      inverseJoinColumns = @JoinColumn(name = "manytag_id", 
      referencedColumnName = "id"))
    private Set<ManyTag> manyTags = new HashSet<>();

    // constructors, getters and setters
}

There are a couple of things to notice here.

First, we’re generating our ID, so the table linkages are easier to manage internally.

Next, we’re using the @ManyToMany annotation to tell JPA we want a linkage between the two classes.

Finally, we use the @JoinTable annotation to set up our actual join table.

Now we can move on to our new tag model which we’ll call ManyTag:

@Entity
public class ManyTag {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private int id;
    private String name;

    @ManyToMany(mappedBy = "manyTags")
    private Set<ManyStudent> students = new HashSet<>();

    // constructors, getters, setters
}

Because we’ve already set up our join table in the student model, all we have to worry about is setting up the reference inside this model.

We use the mappedBy attribute to tell JPA that this side of the relationship reuses the join table we defined on the manyTags field of ManyStudent.
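In a bidirectional mapping like this, only the owning side (the manyTags field on ManyStudent) is written to the join table, so it’s common to keep both sides in sync with small helper methods. Here is a minimal sketch in plain Java, with the JPA annotations omitted so it runs on its own. The addTag and removeTag helpers are hypothetical and not part of the article’s code:

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java stand-ins for the two entities (JPA annotations omitted).
class ManyTag {
    private final String name;
    private final Set<ManyStudent> students = new HashSet<>();

    ManyTag(String name) {
        this.name = name;
    }

    String getName() { return name; }
    Set<ManyStudent> getStudents() { return students; }
}

class ManyStudent {
    private final String name;
    private final Set<ManyTag> manyTags = new HashSet<>();

    ManyStudent(String name) {
        this.name = name;
    }

    String getName() { return name; }
    Set<ManyTag> getManyTags() { return manyTags; }

    // Hypothetical helper: links the tag on the owning side (manyTags)
    // and on the inverse side (students) in one call.
    void addTag(ManyTag tag) {
        manyTags.add(tag);
        tag.getStudents().add(this);
    }

    // Hypothetical helper: removes the link from both sides.
    void removeTag(ManyTag tag) {
        manyTags.remove(tag);
        tag.getStudents().remove(this);
    }
}
```

Setting only the owning side, as a plain setManyTags call does, is enough for the link to be persisted. The helpers simply keep the in-memory object graph consistent until the entities are reloaded.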

5.2. Define Repositories

In addition to the models, we also need to set up two repositories: one for each entity. We’ll let Spring Data do all the heavy lifting here:

public interface ManyTagRepository extends JpaRepository<ManyTag, Integer> {
}

Since we don’t need to search on just tags currently, we can leave the repository interface empty.

Our student repository is only slightly more complicated:

public interface ManyStudentRepository extends JpaRepository<ManyStudent, Integer> {
    List<ManyStudent> findByManyTags_Name(String name);
}

Again, we’re letting Spring Data auto-generate the queries for us.

5.3. Testing

Finally, let’s see what this all looks like in a test:

@Test
public void givenStudentWithManyTags_whenSave_thenGetByTagOk() {
    ManyTag tag = new ManyTag("full time");
    manyTagRepository.save(tag);

    ManyStudent student = new ManyStudent("John");
    student.setManyTags(Collections.singleton(tag));
    manyStudentRepository.save(student);

    List<ManyStudent> students = manyStudentRepository
      .findByManyTags_Name("full time");
 
    assertEquals("size incorrect", 1, students.size());
}

The flexibility added by storing the tags in a separate searchable table far outweighs the minor amount of complexity that is added to the code.

This also allows us to reduce the total number of tags we store in the system by removing duplicate tags.

However, many-to-many isn’t optimized for cases where we want to store state information specific to the entity along with the tag.

6. Conclusion

This article picked up where the previous one left off.

First, we introduced several advanced models that are useful when designing a tagging implementation.

Then, we re-examined the implementation of tagging from the last article in the context of a many-to-many mapping.

To see working examples of what we talked about today, please check out the code on GitHub.