The Reactive Manifesto

Some interesting resources for application and system architecture:

Four traits:

  1. Responsive
  2. Resilient
  3. Elastic
  4. Message-Driven

are essential building blocks.

This is a key point, and something I’ve discovered as well – message-based systems work very well.

2. Communicate Facts

Choose immutable event streams over mutable state

Mutable state is not stable throughout time. It always represents the current/latest value and evolves through destructive in-place updates that overwrite the previous value. The essence of the problem is that mutable state treats the concepts of value and identity as the same thing. An identity can’t be allowed to evolve without changing the value it currently represents, forcing us to safeguard it with mutexes and the like.

Concurrent updates to mutable state are a notorious source of data corruption. While there exist well-established techniques and algorithms for safe handling of updates to shared mutable state, they bring two major downsides. First, these algorithms are complex and easy to get wrong, especially as code evolves; second, they require a certain level of coordination that places an upper bound on performance and scalability. Due to the destructive nature of updates to mutable state, mistakes can easily lead to corruption and loss of data that are expensive to detect and recover from.

Instead, rely on immutable state—values representing facts—which can be shared safely as local or distributed events without worrying about corrupt or inconsistent data, or guarding it with transactions or locks.
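As a minimal sketch of this idea (the `PriceChanged` name and fields are illustrative, not from the manifesto), a fact can be modeled as an immutable value object; once constructed it can be handed to any number of threads or components without locks:

```python
from dataclasses import dataclass, FrozenInstanceError

# A frozen dataclass models a fact: a value that cannot be updated in place.
@dataclass(frozen=True)
class PriceChanged:
    symbol: str
    price: float

fact = PriceChanged("ACME", 99.5)
# Any attempt at a destructive update fails loudly:
# fact.price = 100.0  raises dataclasses.FrozenInstanceError
```

Because the value can never change, sharing it requires no coordination; a "change" is expressed by publishing a new fact, never by mutating an old one.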

A fact is immutable and represents something that has already happened sometime in the past, something that cannot be changed or retracted. It is a stable value that you can reason about and trust, indefinitely. After all, we can’t change the past, even if we sometimes wish that we could. Knowledge is cumulative and grows either by receiving new facts or by deriving new facts from existing facts. Invalidation of existing knowledge is done by adding new facts to the system that refute existing facts. Facts are never deleted, only made irrelevant for current knowledge.
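A tiny sketch of this model of knowledge (the fact shapes here are made up for illustration): facts are only ever appended, current knowledge is derived by folding over them in order, and a refuting fact makes an earlier one irrelevant without deleting it:

```python
# An append-only list of facts. A "retracted" fact refutes an earlier
# "asserted" fact, but the earlier fact itself is never removed.
facts = [
    ("asserted", "alice@example.com"),
    ("asserted", "bob@example.com"),
    ("retracted", "alice@example.com"),
]

def current_knowledge(facts):
    """Derive current knowledge by folding over all facts in order."""
    known = set()
    for kind, value in facts:
        if kind == "asserted":
            known.add(value)
        else:                      # a refuting fact
            known.discard(value)
    return known
```

Note that the derived view changes while the fact list only grows, which is exactly the "never delete, only refute" property described above.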

Facts are best shared by publishing them as events through the component’s event stream, where they can be subscribed to and consumed by others—components, databases, or subsystems—serving as a medium for communication, integration, and replication. Facts stored as events in an event log, in their causal order, can represent the full history of a component’s state changes through time (e.g. using the Event Sourcing pattern) while serving reads safely from memory (called a Memory Image).
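A minimal event-sourcing sketch along these lines (the `Account` domain is an assumed example, not from the text): state changes are stored as events in an append-only log in causal order, the in-memory image is derived from them, and the full state can be rebuilt at any time by replaying the log:

```python
class Account:
    """Event-sourced toy aggregate: the log is the source of truth."""

    def __init__(self):
        self.log = []       # append-only event log, in causal order
        self.balance = 0    # memory image derived from the log

    def apply(self, event):
        # Derive the in-memory state from one event.
        kind, amount = event
        self.balance += amount if kind == "deposited" else -amount

    def record(self, event):
        # Append the fact first, then update the memory image.
        self.log.append(event)
        self.apply(event)

    @classmethod
    def replay(cls, log):
        # Rebuild the full state from history alone.
        acct = cls()
        for event in log:
            acct.record(event)
        return acct
```

Reads are served straight from `balance` in memory, while the log keeps the complete history of how that value came to be.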

Jonas: The biggest impediment to scale is shared mutable state (to be precise: contended access to shared mutable state). As soon as you have shared mutable state, you need to guard that state through a gateway of serial access, which means adding coordination and mutual exclusion, and that adds contention—services waiting in line for access to the shared state. Contention is the biggest scalability killer.

How people approach state today—they want to continue to program as if everything were still running on a single CPU, where they have full control over the ordering of instructions. That’s a nice model because it’s easy to understand, and it was true 15 years ago—but we shouldn’t lie to ourselves any longer. It’s a vastly different world now, and we need to rethink how we design and think about software. The current reality doesn’t match our beloved von Neumann architecture anymore, and hanging on to it by trying to emulate it will just make matters worse.

But unfortunately, way too often I have seen that instead of addressing the problem at its root cause and simplifying it by applying the right design and principles from the start, people keep adding layers of complexity by bringing in more tools and products in an attempt to keep their mutable state in sync. The problem is that this doesn’t scale – the more nodes (or cores) you add, the more nodes need to be part of that consistent view, which becomes more and more costly and makes the system run slower and slower.

The antidote is share-nothing designs. In a share-nothing architecture components do not share state; every component (or node) is fully self-contained, lives in isolation with its own life-cycle, and communicates by sending immutable messages. If you rely on share-nothing designs you will both minimize contention and maximize locality of reference. This means that things used often together sit together, not just conceptually but in code, which simplifies caching—fewer CPU cache-line invalidations, better prefetching, and more efficient application-level caching. It also means minimizing waiting time in the system by decreasing contention, which makes things more efficient in terms of resource utilization. Now adding more CPUs and/or nodes just helps, since you have partitioned the system and removed most bottlenecks.
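A share-nothing sketch using plain threads and queues (the partitioning scheme and function names are assumptions for illustration): each worker owns its state outright, the only communication is immutable messages through queues, so no locks around state are needed and capacity grows by adding partitions:

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Fully self-contained: local state only, communicates via messages."""
    total = 0                     # this worker's private, unshared state
    while True:
        msg = inbox.get()
        if msg is None:           # poison pill ends the worker
            break
        total += msg
    outbox.put(total)             # report back with a message, not shared state

def partitioned_sum(values, n_workers: int = 2) -> int:
    inboxes = [queue.Queue() for _ in range(n_workers)]
    outbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(ib, outbox))
               for ib in inboxes]
    for t in threads:
        t.start()
    for v in values:
        inboxes[hash(v) % n_workers].put(v)   # partition by key
    for ib in inboxes:
        ib.put(None)              # shut every partition down
    for t in threads:
        t.join()
    return sum(outbox.get() for _ in range(n_workers))
```

Because no worker ever reads another worker's `total`, there is nothing to coordinate; each partition runs at full speed in isolation and the results are combined from messages at the end.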