White to Black Gradient
blog

Saga Patterns in Kalix Part 1 - Event Choreography

Andrzej Ludwikowski.
Software Architect, Lightbend.
  • 1 Sept 2023,
  • 8 minute read

One of the biggest challenges in a microservice architecture is how to handle a transaction that spans multiple systems/services. Forget about distributed transactions based on “XA transactions”, with some kind of two-phase commit implementation. This is a very limited solution and in most cases a dead end for our system architecture. Performance cost, low availability, low throughput, and high contention are some of the most common problems with distributed transactions (and certainly not all of them).

However, there is another way to solve the problem. The famous Saga pattern is already well described in many existing blog posts, publications and videos. For a quick recap, a Saga is a sequence of local transactions triggered one after the other. Usually based on published events. If a local transaction fails, then the Saga executes a series of compensating actions that undo the changes and make the data (possibly in many services) consistent again. This way, you have more flexibility in the distributed transaction implementation. You can scale parts of the Saga process independently. With asynchronous event processing your system is more fault-tolerant and resistant to cascading failures. The “only” catch here is that it is not so easy to get it right.

If you have ever implemented a Saga pattern in a distributed system you are perfectly aware of how complex and tedious this task can be. You can divide the challenges into two main categories.

  • Business challenges - a business process could be modified and simplified but at the same time, you can’t just throw away tasks that need to be accomplished to deliver something meaningful. A business-driven Saga with multiple steps and a sophisticated compensation flow will be complex. No matter what the underlying solution is.
  • Technical challenges - they will arise even for a basic business process.

I have good news! With Kalix it has never been easier to implement the Saga pattern. We can finally focus on just the business stuff. All the technical challenges are nicely solved by Kalix.

In this five part blog series, I’m going to show how to easily build an event-driven Saga choreography. Then slowly evolve the application to be more production-ready to finally switch to a second flavor of Saga implementation - orchestration. That includes the new Kalix component - Workflows.

Having said that, if this is your first adventure with Kalix, I would strongly recommend starting with the introduction and getting familiar with some of the basic concepts.

Domain

We are building a simple application to book seats for a cinema show. The two main players are Show and Wallet event-sourced domain objects. The best way to describe their functionality is to examine the domain API. With the Show you can:

  • create a Show
  • reserve a Seat
  • confirm a Seat reservation (after successful charging)
  • cancel a Seat reservation (in case of insufficient balance)
public sealed interface ShowCommand {

  record CreateShow(String title, int maxSeats) implements ShowCommand {}

  record ReserveSeat(String walletId, String reservationId, int seatNumber) implements ShowCommand {}

  record ConfirmReservationPayment(String reservationId) implements ShowCommand {}

  record CancelSeatReservation(String reservationId) implements ShowCommand {}

}

The Show will emit, accordingly, possible events:

public sealed interface ShowEvent {

  String showId();

  record ShowCreated(String showId, InitialShow initialShow) implements ShowEvent {}
    
  record SeatReserved(String showId, String walletId, String reservationId, int seatNumber, BigDecimal price) implements ShowEvent {}
    
  record SeatReservationPaid(String showId, String reservationId, int seatNumber) implements ShowEvent {}

  record SeatReservationCancelled(String showId, String reservationId, int seatNumber) implements ShowEvent {}

}

It’s time for a Wallet domain aggregate, which is much simpler. You can only:

  • create a Wallet
  • charge a Wallet.
public sealed interface WalletCommand {

  record CreateWallet(BigDecimal initialAmount) implements WalletCommand {}

  record ChargeWallet(BigDecimal amount, String reservationId) implements WalletCommand {}

}

In case of insufficient funds, a special WalletChargeRejected event is emitted. Besides that, everything should be relatively straightforward.

public sealed interface WalletEvent {

  record WalletCreated(String walletId, BigDecimal initialAmount) implements WalletEvent {}
    
  record WalletCharged(String walletId, BigDecimal amount, String reservationId) implements WalletEvent {}

  record WalletChargeRejected(String walletId, String reservationId) implements WalletEvent {}

}

The full source code - the glue between commands and events is available here and here. That’s the domain in plain Java, with some extras like pattern matching (preview enabled) and the vavr library to make it more pleasant to use.

Event Sourced Entity

It is time to expose the domain as Kalix components. That’s a very straightforward operation. You just need to wrap our domain objects with Kalix Event Source Entities and expose the endpoints to interact with them.

ShowEntity:

  • POST /cinema-show/{id}
  • PATCH /cinema-show/{id}/reserve
  • PATCH /cinema-show/{id}/cancel-reservation
  • PATCH /cinema-show/{id}/confirm-payment
  • GET /cinema-show/{id}

WalletEntity:

  • POST /wallet/{id}/create/{initialBalance}
  • PATCH /wallet/{id}/charge
  • GET /wallet/{id}

After this operation, you have accomplished a lot of technical tasks. You can interact with the domain code from the outside (and inside) world. Your data is persisted in the form of events. Each Kalix Entity is a separately scalable component, so running thousands of Shows and Wallets is not a problem. Kalix scales and distributes them across available nodes, according to the load. Not to mention that with a specific Kalix concurrency model, you don’t have to worry about data consistency in case of parallel updates. That’s a lot of burdens off your shoulders.

Side Notes
  • For some endpoints, a command model from the domain is reused, which is not recommended. It’s better not to expose the domain layer in the public API and have separate models;
  • Keeping the domain code separate from the application layer improves our testability. You can cover up to 100% of the domain code with simple unit tests and write fewer integration tests to verify if all the pieces talk with each other.

Business Saga

Now we can focus on the main part of this blog post. A seat reservation process consists of several steps, which could be expressed in a basic event storming session.

Saga Patterns in Kalix Part 1 - Events Choreography

From the user’s perspective, everything starts with a reservation request (assuming that the user has a Wallet with some balance and the Show already exists). A SeatReserved event emitted by the Show is a trigger to charge the Wallet. Once we have charged the Wallet, our process continues with either a confirmation (triggered by WalletCharged event) or a cancellation path (triggered by WalletChargeRejected event). A fairly straightforward event-driven solution.

It’s not rocket science to implement such an event flow, but if you start working on it, you will soon realize that there is a lot of technical fuss around it:

  • saving events is not a problem in most of the solutions, but effectively streaming them from the database is a much bigger challenge;
  • we can use an event bus for the streaming part, but that means another piece to manage in the production environment;
  • at-least-once delivery is required, we cannot lose any events;
  • events order is often very important;
  • design for failure should be also one of our objectives;
  • last but not least, the solution should be scalable.

If you want to build such a system from scratch, you need to deal with all of this. Over and over again. Many times, you reinvent the wheel, even though the team next to you is doing the same job. Let’s examine how easy it is to do all these tasks in Kalix.

We will use an existing abstraction for that, which is a Subscription. A Subscription is a very powerful mechanism to consume a stream of events from a given Entity type (but not only). You will often use them to create Views and build a CQRS-style application. In our case, we act on events to perform the next step in the processing pipeline.

@Subscribe.EventSourcedEntity(value = ShowEntity.class, ignoreUnknown = true)
public class ChargeForReservation extends Action {

  public Effect<String> charge(SeatReserved seatReserved) {
    String reservationId = seatReserved.reservationId();
    var chargeWallet = new ChargeWallet(seatReserved.price(), reservationId);

    var chargeCall = componentClient.forEventSourcedEntity(seatReserved.walletId()).call(WalletEntity::charge).params(chargeWallet);

    return effects().forward(chargeCall);
  }
}

That’s it. It’s all you need to create your first ShowEntity events projection (green dotted line on the diagram). Because we run a request to a single Kalix component, WalletEntity, we can use small optimization and simply forward a deferred call. The Kalix machinery will handle all the technical challenges I mentioned above, at-least-once delivery, recovery, scaling, and more.

The second projection is very similar, but you need to handle two event types from the WalletEntity this time.

@Subscribe.EventSourcedEntity(value = WalletEntity.class, ignoreUnknown = true)
public class CompleteReservation extends Action {

  public Effect<String> confirmReservation(WalletCharged walletCharged) {

    String reservationId = walletCharged.reservationId();
    String showId = "show1";

    return effects().forward(confirmReservation(showId, reservationId));
  }

  public Effect<String> cancelReservation(WalletChargeRejected walletChargeRejected) {

    String reservationId = walletChargeRejected.reservationId();
    String showId = "show1";

    return effects().forward(cancelReservation(showId, reservationId));
  }
}

A careful reader will quickly notice that the showId value is hardcoded. It’s a conscious decision because fixing this will be a topic of the next blog post from the series. The idea was to focus only on event choreography and present how easy it is to implement choreography-based Sagas with Kalix.

Summary

We have only scratched the surface of the Kalix components' functionalities. I hope that you get the impression of how much less boilerplate code is needed to deliver something basic but nontrivial. With incoming blog posts, we will unleash the full potential of the Kalix ecosystem to build more complex, resilient, and production-ready solutions.

Part 2 in this series will be about read models based on other Kalix components (Views and Value Entities), which will help us to fix the hardcoded showId. Check out part1 tag from the source code and follow the Readme file to play with the current codebase.