1. Overview

In this tutorial, we’ll take a look at a simple tagging implementation using Java and MongoDB.

For those unfamiliar with the concept, a tag is a keyword used as a “label” to group documents into different categories. This allows users to quickly navigate through similar content, which is especially useful when dealing with a large amount of data.

That being said, it’s not surprising that this technique is very commonly used in blogs. In this scenario, each post has one or more tags according to the topics covered. When users finish reading, they can follow one of the tags to view more content related to that topic.

Let’s see how we can implement this scenario.

2. Dependency

In order to query the database, we’ll have to include the MongoDB driver dependency in our pom.xml:

<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongo-java-driver</artifactId>
    <version>3.6.3</version>
</dependency>

The latest version of this dependency is available on Maven Central.

3. Data Model

Let’s start by planning out what a post document should look like.

To keep it simple, our data model will only have a title, which we’ll also use as the document id, an author, and some tags.

We’ll store the tags inside an array since a post will probably have more than just one:

{
    "_id" : "Java 8 and MongoDB",
    "author" : "Donato Rimenti",
    "tags" : ["Java", "MongoDB", "Java 8", "Stream API"]
}

We’ll also create the corresponding Java model class:

public class Post {
    private String title;
    private String author;
    private List<String> tags;

    // getters and setters
}
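To make the mapping between the model and the document explicit, here is a minimal, driver-free sketch. The class name PostSketch and the toMap helper are hypothetical and only illustrate that the title is stored under the "_id" key; a real repository would build a driver document instead of a plain Map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the model class with its accessors written out,
// plus a plain-Map view that mirrors the JSON document shown above.
class PostSketch {
    private String title;
    private String author;
    private List<String> tags = new ArrayList<>();

    public String getTitle() { return title; }
    public void setTitle(String title) { this.title = title; }

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }

    public List<String> getTags() { return tags; }
    public void setTags(List<String> tags) { this.tags = tags; }

    // The title doubles as the document id, so it is stored under "_id".
    public Map<String, Object> toMap() {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("_id", title);
        document.put("author", author);
        document.put("tags", new ArrayList<>(tags));
        return document;
    }
}
```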

4. Updating Tags

Now that we have set up the database and inserted a couple of sample posts, let’s see how we can update them.

Our repository class will include two methods to handle the addition and removal of tags, using the title to find the post to update. Each method returns a boolean to indicate whether the query actually modified a document:

public boolean addTags(String title, List<String> tags) {
    UpdateResult result = collection.updateOne(
      new BasicDBObject(DBCollection.ID_FIELD_NAME, title), 
      Updates.addEachToSet(TAGS_FIELD, tags));
    return result.getModifiedCount() == 1;
}

public boolean removeTags(String title, List<String> tags) {
    UpdateResult result = collection.updateOne(
      new BasicDBObject(DBCollection.ID_FIELD_NAME, title), 
      Updates.pullAll(TAGS_FIELD, tags));
    return result.getModifiedCount() == 1;
}

We used the addEachToSet method instead of push for the addition so that if the tags are already there, we won’t add them again.

Notice also that the plain addToSet operator wouldn’t work either, since it would add the whole list of new tags as a single nested array, which is not what we want.
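The difference between the three operators is easier to see on a plain list. The following sketch imitates their behavior in memory; it is not driver code, and the class and method names are made up for illustration. The boolean results mirror the idea of reporting whether anything was modified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory imitation of three MongoDB update operators (illustrative only).
class TagUpdateSemantics {

    // Like $push with $each: appends every value, duplicates included.
    static void pushEach(List<Object> tags, List<String> newTags) {
        tags.addAll(newTags);
    }

    // Like $addToSet with $each: appends only the values that are missing.
    // Returns true when the array actually changed.
    static boolean addEachToSet(List<Object> tags, List<String> newTags) {
        boolean modified = false;
        for (String tag : newTags) {
            if (!tags.contains(tag)) {
                tags.add(tag);
                modified = true;
            }
        }
        return modified;
    }

    // Like $addToSet without $each: the whole list is treated as ONE value,
    // so it ends up inside the array as a nested array.
    static boolean addToSet(List<Object> tags, List<String> newTags) {
        if (tags.contains(newTags)) {
            return false;
        }
        tags.add(new ArrayList<>(newTags));
        return true;
    }

    public static void main(String[] args) {
        List<Object> tags = new ArrayList<>(Arrays.asList("Java", "MongoDB"));

        addEachToSet(tags, Arrays.asList("Java", "Java 8"));
        System.out.println(tags); // [Java, MongoDB, Java 8]

        addToSet(tags, Arrays.asList("Stream API"));
        System.out.println(tags); // [Java, MongoDB, Java 8, [Stream API]]
    }
}
```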

Another way we can perform our updates is through the Mongo shell. For instance, let’s update the post JUnit 5 with Java. In particular, we want to add the tags Java and JUnit5 and remove the tags Spring and REST:

db.posts.updateOne(
    { _id : "JUnit 5 with Java" },
    { $addToSet : { "tags" : { $each : ["Java", "JUnit5"] } } }
);

db.posts.updateOne(
    { _id : "JUnit 5 with Java" },
    { $pull : { "tags" : { $in : ["Spring", "REST"] } } }
);
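To check our understanding of the two shell updates, we can replay them on an ordinary list. This is only a simulation under assumed data: the starting tags are hypothetical, and the helper names are invented for the sketch. Note that $pull combined with $in, like pullAll in the Java version, removes every element equal to one of the listed values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Replays the two shell updates on an in-memory tag list (illustrative only).
class ShellUpdateWalkthrough {

    // Imitates { $addToSet : { tags : { $each : values } } }.
    static void addToSetEach(List<String> tags, List<String> values) {
        for (String value : values) {
            if (!tags.contains(value)) {
                tags.add(value);
            }
        }
    }

    // Imitates { $pull : { tags : { $in : values } } }:
    // every element matching one of the values is removed.
    static void pullIn(List<String> tags, List<String> values) {
        tags.removeAll(values);
    }

    public static void main(String[] args) {
        // Hypothetical starting state of the post's tags
        List<String> tags = new ArrayList<>(Arrays.asList("Java", "Spring", "REST"));

        addToSetEach(tags, Arrays.asList("Java", "JUnit5"));
        pullIn(tags, Arrays.asList("Spring", "REST"));

        System.out.println(tags); // [Java, JUnit5]
    }
}
```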

5. Queries

Last but not least, let’s go through some of the most common queries we may be interested in while working with tags. For this purpose, we’ll take advantage of three query operators in particular:

  • $in – returns the documents where a field contains any value of the specified array
  • $nin – returns the documents where a field doesn’t contain any value of the specified array
  • $all – returns the documents where a field contains all the values of the specified array
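When applied to an array field such as our tags, the three operators behave roughly like the following list predicates. This is a simplified in-memory sketch with invented names, not how the database evaluates queries.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified in-memory equivalents of $in, $nin and $all for a tag array.
class TagMatchSemantics {

    // $in: the post carries at least one of the wanted tags.
    static boolean in(List<String> postTags, List<String> wanted) {
        return !Collections.disjoint(postTags, wanted);
    }

    // $nin: the post carries none of the listed tags
    // (an empty tag list therefore matches as well).
    static boolean nin(List<String> postTags, List<String> unwanted) {
        return Collections.disjoint(postTags, unwanted);
    }

    // $all: the post carries every one of the wanted tags.
    static boolean all(List<String> postTags, List<String> wanted) {
        return postTags.containsAll(wanted);
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("Java", "MongoDB", "Java 8", "Stream API");

        System.out.println(in(tags, Arrays.asList("MongoDB", "Groovy")));  // true
        System.out.println(all(tags, Arrays.asList("Java 8", "JUnit 5"))); // false
        System.out.println(nin(tags, Arrays.asList("Groovy", "Scala")));   // true
    }
}
```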

We’ll define three methods to query the posts in relation to a collection of tags passed as arguments. They will return the posts which match at least one tag, all the tags, and none of the tags, respectively. We’ll also create a mapping method to handle the conversion between a document and our model using Java 8’s Stream API:

public List<Post> postsWithAtLeastOneTag(String... tags) {
    FindIterable<Document> results = collection
      .find(Filters.in(TAGS_FIELD, tags));
    return StreamSupport.stream(results.spliterator(), false)
      .map(TagRepository::documentToPost)
      .collect(Collectors.toList());
}

public List<Post> postsWithAllTags(String... tags) {
    FindIterable<Document> results = collection
      .find(Filters.all(TAGS_FIELD, tags));
    return StreamSupport.stream(results.spliterator(), false)
      .map(TagRepository::documentToPost)
      .collect(Collectors.toList());
}

public List<Post> postsWithoutTags(String... tags) {
    FindIterable<Document> results = collection
      .find(Filters.nin(TAGS_FIELD, tags));
    return StreamSupport.stream(results.spliterator(), false)
      .map(TagRepository::documentToPost)
      .collect(Collectors.toList());
}

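@SuppressWarnings("unchecked")
private static Post documentToPost(Document document) {
    Post post = new Post();
    // The title is stored as the document id
    post.setTitle(document.getString(DBCollection.ID_FIELD_NAME));
    post.setAuthor(document.getString("author"));
    // Document.get returns Object, so the cast to List<String> is unchecked
    post.setTags((List<String>) document.get(TAGS_FIELD));
    return post;
}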
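The three query methods repeat the same conversion from an Iterable to a List. One possible way to factor it out is a small generic helper, sketched here with only the standard library. The names IterableMapping and mapToList are hypothetical; since the query results are passed to StreamSupport through their spliterator, the same pattern should apply to them.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

// Hypothetical helper that captures the Iterable-to-List mapping pattern.
class IterableMapping {

    // Streams any Iterable sequentially, maps each element, collects to a List.
    static <T, R> List<R> mapToList(Iterable<T> source,
                                    Function<? super T, ? extends R> mapper) {
        return StreamSupport.stream(source.spliterator(), false)
          .map(mapper)
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A plain list stands in for the query results here
        List<Integer> lengths = mapToList(Arrays.asList("java", "mongodb"), String::length);
        System.out.println(lengths); // [4, 7]
    }
}
```

With such a helper, each query method would reduce to a single call that passes the query result and the documentToPost method reference.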

Again, let’s also take a look at the equivalent shell queries. We’ll fetch three different sets of posts: those tagged with MongoDB or Stream API, those tagged with both Java 8 and JUnit 5, and those tagged with neither Groovy nor Scala:

db.posts.find({
    "tags" : { $in : ["MongoDB", "Stream API" ] } 
});

db.posts.find({
    "tags" : { $all : ["Java 8", "JUnit 5" ] } 
});

db.posts.find({
    "tags" : { $nin : ["Groovy", "Scala" ] } 
});

6. Conclusion

In this article, we showed how to build a tagging mechanism. Of course, we can reuse and adapt this same approach for purposes other than a blog.

If you’re interested in learning more about MongoDB, we encourage you to read this introductory article.

As always, all the code in the example is available over on the GitHub project.