The Problem with Threads

cbrake · October 12, 2022, 2:18am

This is an interesting paper and reflects my experience. I feel Go (first appeared in 2009, three years after this paper) provides some of the tools (channels/goroutines/select) needed to tame most concurrency problems while retaining good performance. And these features are built into the language, as the author suggests. The key in Go is to synchronize access to data in one select statement using channels. Go still gives you all the classic synchronization primitives (mutexes, etc.), which are sometimes needed for performance, but they should generally be avoided when possible because that is the classic thread model. Stay tuned, more on this in the future …
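As a rough illustration of "synchronize access to data in one select statement using channels", here is a minimal sketch. The `counter` type, its channel fields, and `sumConcurrently` are hypothetical names made up for this example; they are not from the paper or from any library. The idea is that a single goroutine owns the data and every other goroutine talks to it over channels, so no mutex is needed.

```go
package main

import "fmt"

// counter is a hypothetical type for illustration: the total lives inside
// one goroutine and is never touched directly by any other goroutine.
type counter struct {
	add  chan int      // requests to add to the total
	read chan chan int // requests for the current total (reply channel)
	stop chan struct{} // closed to shut the owning goroutine down
}

func newCounter() *counter {
	c := &counter{
		add:  make(chan int),
		read: make(chan chan int),
		stop: make(chan struct{}),
	}
	go c.run()
	return c
}

// run is the only code that reads or writes total. All access is
// serialized through this one select statement.
func (c *counter) run() {
	total := 0
	for {
		select {
		case n := <-c.add:
			total += n
		case reply := <-c.read:
			reply <- total
		case <-c.stop:
			return
		}
	}
}

func (c *counter) Add(n int) { c.add <- n }

func (c *counter) Total() int {
	reply := make(chan int)
	c.read <- reply
	return <-reply
}

func (c *counter) Stop() { close(c.stop) }

// sumConcurrently starts `workers` goroutines that each add 1 to a shared
// counter `perWorker` times, waits for them, and returns the final total.
func sumConcurrently(workers, perWorker int) int {
	c := newCounter()
	defer c.Stop()

	done := make(chan struct{})
	for i := 0; i < workers; i++ {
		go func() {
			for j := 0; j < perWorker; j++ {
				c.Add(1)
			}
			done <- struct{}{}
		}()
	}
	for i := 0; i < workers; i++ {
		<-done
	}
	return c.Total()
}

func main() {
	fmt.Println(sumConcurrently(4, 1000)) // prints 4000
}
```

The order in which the workers' additions arrive is not fixed, but the final total is, because only `run` ever touches `total`. A mutex around a plain integer would likely be faster, which is the performance trade-off mentioned above.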

Quotes:

referenced by Richard Hipp: SQLite Frequently Asked Questions

“to replace the conventional metaphor (a sequence of steps) with the notion of a community of interacting entities”

Non-homogeneous code; fine-grain, complicated interactions; and pointer-based data structures make this type of program difficult to execute concurrently.

I will argue that we must (and can) build concurrent models of computation that are far more deterministic, and that we must judiciously and carefully introduce nondeterminism where needed. Nondeterminism should be explicitly added to programs, and only where needed, as it is in sequential programming. Threads take the opposite approach. They make programs absurdly nondeterministic, and rely on programming style to constrain that nondeterminism to achieve deterministic aims.

The message is clear. We should not replace established languages. We should instead build on them. However, building on them using only libraries is not satisfactory. Libraries offer little structure, no enforcement of patterns, and few composable properties.

I believe that the right answer is coordination languages. Coordination languages do introduce new syntax, but that syntax serves purposes that are orthogonal to those of established programming languages.

Concurrency models with stronger determinism than threads, such as Kahn process networks, CSP, and dataflow, have also been available for a long time. Some of these have led to programming languages, such as Occam [21] (based on CSP), and some have led to domain-specific frameworks, such as YAPI [18]. Most, however, have principally been used to build elaborate process calculi, and have not had much effect on mainstream programming. I believe this can change if these concurrency models are used to define coordination languages rather than replacement languages.

Concurrency in software is difficult. However, much of this difficulty is a consequence of the abstractions for concurrency that we have chosen to use. The dominant one in use today for general-purpose computing is threads. But non-trivial multi-threaded programs are incomprehensible to humans. It is true that the programming model can be improved through the use of design patterns, better granularity of atomicity (e.g. transactions), improved languages, and formal methods. However, these techniques merely chip away at the unnecessarily enormous nondeterminism of the threading model. The model remains intrinsically intractable.

If we expect concurrent programming to be mainstream, and if we demand reliability and predictability from programs, then we must discard threads as a programming model. Concurrent programming models can be constructed that are much more predictable and understandable than threads. They are based on a very simple principle: deterministic ends should be accomplished with deterministic means. Nondeterminism should be judiciously and carefully introduced where needed, and should be explicit in programs. This principle seems obvious, yet it is not accomplished by threads. Threads must be relegated to the engine room of computing, to be suffered only by expert technology providers.
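To make "deterministic ends should be accomplished with deterministic means" a bit more concrete, here is a small Go sketch. It is only an illustration under my own assumptions: the function names (`generate`, `square`, `collect`, `pipeline`, `merge`) are hypothetical, and Go channels are only loosely comparable to the Kahn process networks and CSP models named in the quotes. The pipeline stages each read from one channel in order, so the result is the same on every run. Nondeterminism appears only in `merge`, where it is written out explicitly as a `select`.

```go
package main

import "fmt"

// generate emits 1..n in order, then closes its output.
func generate(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i
		}
	}()
	return out
}

// square reads values in order and emits their squares in the same order.
func square(in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			out <- v * v
		}
	}()
	return out
}

// collect drains a channel into a slice.
func collect(in <-chan int) []int {
	var result []int
	for v := range in {
		result = append(result, v)
	}
	return result
}

// pipeline runs three concurrent stages, yet its output is deterministic:
// each stage has exactly one input and processes it in arrival order.
func pipeline(n int) []int {
	return collect(square(generate(n)))
}

// merge is where nondeterminism is introduced on purpose: select takes
// whichever input is ready, so the output order can differ between runs.
// The set of values forwarded is still the same every time.
func merge(a, b <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for a != nil || b != nil {
			select {
			case v, ok := <-a:
				if !ok {
					a = nil // a nil channel is never selected again
					continue
				}
				out <- v
			case v, ok := <-b:
				if !ok {
					b = nil
					continue
				}
				out <- v
			}
		}
	}()
	return out
}

func main() {
	fmt.Println(pipeline(4)) // prints [1 4 9 16]

	// The order of merged values is unspecified; only the sum is fixed.
	sum := 0
	for v := range merge(generate(3), generate(3)) {
		sum += v
	}
	fmt.Println(sum) // prints 12
}
```

Anything that depends on the order of values coming out of `merge` is nondeterministic, and that is visible in the code at exactly one place.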

khem · October 12, 2022, 4:42am

yeah

Threads must be relegated to the engine room of computing, to be suffered only by expert technology providers.

Seems to be exactly what Go decided to do.

bminer · October 13, 2022, 7:53pm

Go is a tool for developers. The best tools let you shoot yourself in the foot but only after you have jumped through a number of hoops designed to prevent you from doing so.

khem · October 14, 2022, 4:55pm

Concurrency is hard, and parallel execution is hard to follow. From a programming point of view, thinking in threads is contrary to how humans think through solutions, and thread complexity goes up exponentially with the number of threads. I can understand the execution flow of 2 threads, but with 4 it's orders of magnitude harder. So Go's decision to manage it once for everyone is a good one: it puts less cognitive load on programmers, and that's one thing less to worry about. Low-level primitives like threads are not something you want to design every time you write a new program. I think the same about memory management as well. Although, putting my embedded hat on and admitting my compiler bias, I like the Rust way of doing it: it is perhaps more optimized, but the cost of doing it is higher. GC is good, but you are pushing the compute cost to everywhere the program will run. I would rather get the same benefits at compile time, even if that makes the process a bit slower and a bit inconvenient. My philosophy is to push things to the left of the software life cycle as much as possible.