1. Introduction

In this tutorial, we’re going to discuss how to use MongoDB as an infinite data stream by utilizing tailable cursors with Spring Data MongoDB.

2. Tailable Cursors

When we execute a query, the database driver opens a cursor to supply the matching documents. By default, MongoDB automatically closes the cursor once the client has read all results, which makes the result a finite data stream.

However, with a capped collection we can use a tailable cursor that remains open even after the client has consumed all initially returned data, which gives us an infinite data stream. This approach is useful for applications dealing with event streams, such as chat messages or stock updates.

The Spring Data MongoDB project helps us use reactive database capabilities, including tailable cursors.
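
To make the difference between the two cursor types concrete, here is a minimal in-memory sketch. It does not talk to MongoDB, and all names (CursorSketch, InMemoryCollection, Cursor) are hypothetical. It only models the idea that a regular cursor closes once it has caught up, while a tailable one keeps its position and can deliver documents inserted later:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; it does not use the MongoDB driver.
class CursorSketch {

    // Stands in for a capped collection: an append-only list of documents.
    static final class InMemoryCollection {
        private final List<String> docs = new ArrayList<>();

        void insert(String doc) { docs.add(doc); }
        int size() { return docs.size(); }
        String get(int index) { return docs.get(index); }
    }

    // A cursor that remembers how far it has read.
    static final class Cursor {
        private final InMemoryCollection source;
        private final boolean tailable;
        private int position = 0;
        private boolean closed = false;

        Cursor(InMemoryCollection source, boolean tailable) {
            this.source = source;
            this.tailable = tailable;
        }

        // Returns the documents added since the last read.
        // A non-tailable cursor closes itself once it has caught up.
        List<String> readAvailable() {
            List<String> batch = new ArrayList<>();
            if (closed) {
                return batch;
            }
            while (position < source.size()) {
                batch.add(source.get(position++));
            }
            if (!tailable) {
                closed = true;
            }
            return batch;
        }

        boolean isClosed() { return closed; }
    }

    public static void main(String[] args) {
        InMemoryCollection logs = new InMemoryCollection();
        logs.insert("first");
        logs.insert("second");

        Cursor finite = new Cursor(logs, false);
        Cursor tailable = new Cursor(logs, true);
        System.out.println(finite.readAvailable());   // the two initial documents
        System.out.println(tailable.readAvailable()); // the same two documents

        logs.insert("third");
        System.out.println(finite.readAvailable());   // nothing: the cursor is closed
        System.out.println(tailable.readAvailable()); // only the new document
    }
}
```

A real tailable cursor waits on the server for new data instead of being polled like this; the sketch keeps only the "remember the position and stay open" part of the behavior.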

3. Setup

To demonstrate these features, we’ll implement a simple log counter application. Let’s assume a log aggregator collects and persists all logs in one central place: our MongoDB capped collection.

Firstly, we’ll use the simple Log entity:

@Document
public class Log {
    private @Id String id;
    private String service;
    private LogLevel level;
    private String message;
}

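Secondly, we’ll store the logs in our MongoDB capped collection. Capped collections are fixed-size collections that store and return documents in insertion order. We can create one with a createCollection call. The snippet below uses the MongoDB driver’s CreateCollectionOptions, while MongoOperations.createCollection accepts the equivalent settings through Spring Data’s CollectionOptions: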

db.createCollection(COLLECTION_NAME, new CreateCollectionOptions()
  .capped(true)
  .sizeInBytes(1024)
  .maxDocuments(5));

For capped collections, we must define the sizeInBytes property. In addition, maxDocuments specifies the maximum number of documents the collection can hold. Once a limit is reached, the oldest documents are removed to make room for new ones.
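
The eviction rule is easy to picture with a small in-memory sketch. The class below is hypothetical and is not MongoDB code. It models only the maxDocuments limit and ignores the size in bytes, which a real capped collection also enforces:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical in-memory model of the maxDocuments limit; not MongoDB code.
class CappedBufferSketch {
    private final int maxDocuments;
    private final Deque<String> docs = new ArrayDeque<>();

    CappedBufferSketch(int maxDocuments) {
        if (maxDocuments <= 0) {
            throw new IllegalArgumentException("maxDocuments must be positive");
        }
        this.maxDocuments = maxDocuments;
    }

    // Appends a document, evicting the oldest one when the buffer is full.
    void insert(String doc) {
        if (docs.size() == maxDocuments) {
            docs.removeFirst();
        }
        docs.addLast(doc);
    }

    // Returns the surviving documents in insertion order.
    List<String> findAll() {
        return new ArrayList<>(docs);
    }

    public static void main(String[] args) {
        CappedBufferSketch logs = new CappedBufferSketch(5);
        for (int i = 1; i <= 7; i++) {
            logs.insert("log-" + i);
        }
        // prints [log-3, log-4, log-5, log-6, log-7]
        System.out.println(logs.findAll());
    }
}
```

With a limit of five, inserting seven documents leaves only the last five, still in insertion order.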

Thirdly, we’ll use the appropriate Spring Boot starter dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb-reactive</artifactId>
    <version>2.2.2.RELEASE</version>
</dependency>

4. Reactive Tailable Cursors

We can consume tailable cursors with both the imperative and the reactive MongoDB API. It’s highly recommended to use the reactive variant.

Let’s implement a counter of WARN level logs using the reactive approach. We can create infinite stream queries with the ReactiveMongoOperations.tail method.

A tailable cursor remains open and emits data – a Flux of entities – as new documents arrive in a capped collection and match the filter query:

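private Disposable subscription;
// Assumed field: an AtomicInteger counter, as declared for ErrorLogsCounter in section 6
private final AtomicInteger counter = new AtomicInteger();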

public WarnLogsCounter(ReactiveMongoOperations template) {
    Flux<Log> stream = template.tail(
      query(where("level").is(LogLevel.WARN)), 
      Log.class);
    subscription = stream.subscribe(logEntity -> 
      counter.incrementAndGet()
    );
}

Once a new document with the WARN log level is persisted in the collection, the subscriber (the lambda expression) increments the counter.

Finally, we should dispose of the subscription to close the stream:

public void close() {
    this.subscription.dispose();
}
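
The subscribe-then-dispose lifecycle can be tried out without a database. The following is a minimal sketch that uses only the JDK. TailSketch and its methods are hypothetical stand-ins for the tailed Flux, and the returned Runnable plays the role of Disposable.dispose():

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical stand-in for a tailed stream; it does not use Reactor or MongoDB.
class TailSketch {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    // Registers a subscriber that receives only documents matching the filter.
    // Running the returned Runnable cancels the subscription.
    Runnable tail(Predicate<String> filter, Consumer<String> onNext) {
        Consumer<String> filtered = doc -> {
            if (filter.test(doc)) {
                onNext.accept(doc);
            }
        };
        subscribers.add(filtered);
        return () -> subscribers.remove(filtered);
    }

    // Simulates a new document arriving in the collection.
    void insert(String level) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(level);
        }
    }

    public static void main(String[] args) {
        TailSketch logs = new TailSketch();
        AtomicInteger counter = new AtomicInteger();
        Runnable subscription = logs.tail("WARN"::equals, doc -> counter.incrementAndGet());

        logs.insert("WARN");
        logs.insert("INFO");
        logs.insert("WARN");
        subscription.run();   // the equivalent of dispose()
        logs.insert("WARN");  // no longer counted

        System.out.println(counter.get()); // prints 2
    }
}
```

After the subscription is cancelled, further inserts no longer reach the counter, which is the behavior we rely on in close().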

Also, please note that a tailable cursor may become dead, or invalid, if the query initially returns no match. In that case, even if newly persisted documents match the filter query, the subscriber will not receive them. This is a known limitation of MongoDB tailable cursors. We must ensure that the capped collection contains matching documents before creating a tailable cursor.

5. Tailable Cursors with a Reactive Repository

Spring Data projects offer a repository abstraction for different data stores, including the reactive versions.

MongoDB is no exception. Please check the Spring Data Reactive Repositories with MongoDB article for more details.

Moreover, MongoDB reactive repositories support infinite streams when we annotate a query method with @Tailable. We can annotate any repository method that returns Flux or another reactive type capable of emitting multiple elements:

public interface LogsRepository extends ReactiveCrudRepository<Log, String> {
    @Tailable
    Flux<Log> findByLevel(LogLevel level);
}

Let’s count INFO logs using this tailable repository method:

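private Disposable subscription;
// Assumed field: an AtomicInteger counter, as declared for ErrorLogsCounter in section 6
private final AtomicInteger counter = new AtomicInteger();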

public InfoLogsCounter(LogsRepository repository) {
    Flux<Log> stream = repository.findByLevel(LogLevel.INFO);
    this.subscription = stream.subscribe(logEntity -> 
      counter.incrementAndGet()
    );
}

As with WarnLogsCounter, we should dispose of the subscription to close the stream:

public void close() {
    this.subscription.dispose();
}

6. Tailable Cursors with a MessageListener

If we can’t use the reactive API, we can still leverage Spring’s messaging concept.

First, we need to create a MessageListenerContainer, which handles the SubscriptionRequest objects sent to it. The synchronous MongoDB driver creates a long-running, blocking task that listens for new documents in the capped collection.

Spring Data MongoDB ships with a default implementation capable of creating and executing Task instances for a TailableCursorRequest:

private String collectionName;
private MessageListenerContainer container;
private AtomicInteger counter = new AtomicInteger();

public ErrorLogsCounter(MongoTemplate mongoTemplate,
  String collectionName) {
    this.collectionName = collectionName;
    this.container = new DefaultMessageListenerContainer(mongoTemplate);

    container.start();
    TailableCursorRequest<Log> request = getTailableCursorRequest();
    container.register(request, Log.class);
}

private TailableCursorRequest<Log> getTailableCursorRequest() {
    MessageListener<Document, Log> listener = message -> 
      counter.incrementAndGet();

    return TailableCursorRequest.builder()
      .collection(collectionName)
      .filter(query(where("level").is(LogLevel.ERROR)))
      .publishTo(listener)
      .build();
}

TailableCursorRequest creates a query that matches only the ERROR level logs. Each matching document is published to the MessageListener, which increments the counter.

Note that we still need to ensure that the initial query returns some results. Otherwise, the tailable cursor will be immediately closed.

In addition, we should not forget to stop the container once we no longer need it:

public void close() {
    container.stop();
}
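
To illustrate what a long-running, blocking listener task with start and stop looks like, here is a rough JDK-only sketch. It is not the Spring Data implementation. ListenerContainerSketch, its queue, and its method names are assumptions made up for this example:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical sketch of a listener container; not the Spring Data implementation.
class ListenerContainerSketch {
    // Stands in for the capped collection that the blocking task waits on.
    private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
    // A daemon thread, so a forgotten stop() does not keep the JVM alive.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
        Thread thread = new Thread(runnable, "tail-listener");
        thread.setDaemon(true);
        return thread;
    });
    private volatile boolean running;

    // Starts the blocking task that forwards matching documents to the listener.
    void start(Predicate<String> filter, Consumer<String> listener) {
        running = true;
        executor.execute(() -> {
            while (running) {
                try {
                    String doc = incoming.poll(50, TimeUnit.MILLISECONDS);
                    if (doc != null && filter.test(doc)) {
                        listener.accept(doc);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
    }

    // Simulates a new document arriving in the collection.
    void insert(String doc) {
        incoming.add(doc);
    }

    // Stops the task; returns true if the worker thread ended within one second.
    boolean stop() {
        running = false;
        executor.shutdownNow();
        try {
            return executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

We would call start with a filter such as "ERROR"::equals and a listener that increments a counter, then call stop once the counter is no longer needed. Documents inserted after stop are never delivered, which mirrors why we must not forget container.stop().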

7. Conclusion

MongoDB capped collections with tailable cursors help us receive information from the database in a continuous way. We can run a query that will keep giving results until explicitly closed. Spring Data MongoDB offers us both the blocking and the reactive way of utilizing tailable cursors.

The source code of the complete example is available over on GitHub.

