Learning about CRDTs has been very interesting because that is essentially what I have been coming up with in the Simple IoT project node and point data types… However, having clear rules like Associative, Commutative, and Idempotent is helpful in testing the data structures. Also, have a vocabulary and a more structured way to think about and describe these concepts is also very helpful. All this is very encouraging that SIOT is on the right path.
IoT systems may not be “large scale”, but the remote locations and unreliable/high latency networks (cellular) produce some of the same dynamics. Thus, state at the device (edge) is important, and reliably and efficiently reconciling changes is also important.
This article has some more information on the sync vs async question:
InfoQ: You made the argument that synchronous calls for state synchronization might be the better option than an asynchronous, event-driven one. Why is that?
Bourgon: This was sort of a minor point, but an important one, related to the implementation of distributed systems. It’s essentially impossible to build confidence in the safety and reliability of large-scale systems like this without being able to run the system under deterministic test, or simulation. And it’s essentially impossible to simulate a system unless each component can be modeled as a plain, deterministic state machine. It’s certainly possible to get these properties using asynchronous events as the messaging pattern, but experience has taught me that it’s significantly easier with a synchronous RPC-style approach. Similar to how CRDTs eliminate entire classes of complexity related to fault handling, synchronous RPCs eliminate entire classes of complexity related to queueing theory. They get you automatic backpressure, they let you take advantage of the queues that already exist at various layers of the operating system and network stack, and, importantly, they make it a lot easier to build deterministic components.
Will need to think about this more. With NATS, we have the option to do either async or sync operations.
Typical RDBMS systems and common distributed systems techniques are incredibly useful and productive up to a certain scale. But past that scale, the complexity required to prop up the illusion of global atomicity becomes unreliable, unproductive, and ultimately unjustifiable. While there’s an upfront cost to thinking about multiple parallel universes of state in your application, I believe biting that bullet offsets orders of magnitude more hidden complexity in the alternate, leaky abstraction. And I believe that, eventually, we’re going to realize it’s the only way to build reliable, large-scale distributed systems.
For what it’s worth, this isn’t exactly novel thinking. The natural world is full of incredibly complex systems whose behaviors are emergent from simple primitives and rules. My favorite example is probably how groups of fireflies manage to synchronize their lights — no leader election or consensus rounds involved.
I also agree CRDTs are the future, but not for any reason as specific as the ones in the article. Distributed state is so fundamentally complex that I think we actually need CRDTs (or something like them) to reason about it effectively. And certainly to build reliable systems. The abstraction of a single, global, logical truth is so nice and tidy and appealing, but it becomes so leaky that I think all successful systems for distributed state will abandon it beyond a certain scale. – Peter Bourgon