1/ Eons ago, I worked on a distributed operating system. We had distributed "in the small" and distributed "in the large." I often wonder what such a system built using containers would look like. Thread 👇
2/ In the small, processes were isolated by software, not hardware, thanks to type safety. In the large, entire systems were connected over RPC channels, much like today's distributed systems. Message passing was everywhere.
3/ Distributed in the small presented some fascinating challenges in two areas: 1) programmability and 2) resource management. Both were a constant struggle that we battled persistently, and they taught us many lessons.
4/ Imagine if every process in your operating system was effectively "container-like." Its code is packaged up and runs in isolation. All communication with other processes happens over loosely typed asynchronous RPC.
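To make "loosely typed asynchronous RPC" concrete, here's a minimal sketch of what a call to a driver process might look like; the Channel and Message types and the "read" method are invented for illustration, not the original system's API:

```typescript
// Hypothetical sketch: every inter-process interaction is an async
// message over a channel; payloads are loosely typed, so the receiver
// must validate whatever it gets.
type Message = { method: string; args: Record<string, unknown> };

interface Channel {
  call(msg: Message): Promise<unknown>; // always async, can always fail
}

async function readSector(disk: Channel, lba: number): Promise<Uint8Array> {
  // No shared memory with the driver process: just a message and a reply.
  const reply = await disk.call({ method: "read", args: { lba, count: 1 } });
  if (!(reply instanceof Uint8Array)) {
    throw new Error("driver returned an unexpected payload"); // fail fast
  }
  return reply;
}
```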
5/ Imagine further that processes are extremely fine-grained. Each device driver runs inside its own process. The browser alone has dozens of them.
6/ The system must be resilient to failure. And yet, at the same time, the system is built on a philosophy of fail-fast, to improve the safety and robustness of handling software errors. In this world, you just assume failure will happen.
7/ Applications must adopt fundamentally different design patterns when faced with these extremes.
8/ Processes need to be configured and connected to other processes. The operating system now has to handle discovery, routing, and failure, restarting processes as appropriate. Updating one process's code has to happen continuously, without rebooting the entire system.
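A rough sketch of the restart side of this, assuming a supervisor loop in the OS (the names and restartPolicy values are invented for illustration):

```typescript
// Hypothetical supervisor: the OS restarts failed processes according
// to a declared policy, rather than each process handling every
// failure mode itself. Discovery and routing re-resolve the process's
// connections each time it comes back up.
async function supervise(
  name: string,
  restartPolicy: "always" | "on-failure",
  start: () => Promise<void>, // resolves on clean exit, rejects on crash
): Promise<void> {
  for (;;) {
    try {
      await start();
      if (restartPolicy !== "always") return; // clean exit, stay down
    } catch (err) {
      console.log(`${name} failed (${err}); restarting`);
    }
  }
}
```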
9/ Processes are no longer just bits of application code. They also need some amount of configuration metadata to tell the system about their nature, their requirements, and how they connect to the world around them.
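Such configuration metadata might look something like the following sketch of a per-process manifest; every field name here is invented for illustration, though the shape deliberately resembles a container spec:

```typescript
// Hypothetical per-process manifest: packaged code plus a declaration
// of what the process needs and how it connects to the world.
interface ProcessManifest {
  name: string;
  image: string;                 // the packaged, isolated code
  resources: { cpuMillis: number; memoryMb: number };
  exposes: string[];             // RPC endpoints offered to other processes
  connectsTo: string[];          // endpoints the OS resolves at startup
  restartPolicy: "always" | "on-failure" | "never";
}

const nicDriver: ProcessManifest = {
  name: "nic-driver",
  image: "drivers/nic:1.4.2",
  resources: { cpuMillis: 250, memoryMb: 64 },
  exposes: ["net.send", "net.recv"],
  connectsTo: ["pci.bus"],
  restartPolicy: "on-failure",   // fail fast, then let the OS restart it
};
```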
10/ The sheer number of processes places an immense burden on the operating system scheduler. It now has 1,000s of things wanting to run at once and needs to place them to minimize RPC latency and maximize data locality.
11/ These processes, when scheduled, generate more work than the system can reasonably tolerate. Relentless contention will form around scarce resources unless an intelligent scheduler can mediate.
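One way to picture the scheduler's placement problem: score each candidate node by the RPC latency to the process's peers and by how much of its data is already local. The scoring function and its weights below are invented, purely a sketch:

```typescript
// Hypothetical placement scoring: prefer nodes close to our RPC peers
// (low latency) and close to our data (high locality).
interface Candidate {
  nodeId: string;
  avgRpcLatencyMs: number; // mean latency to the processes we talk to
  dataLocality: number;    // 0..1, fraction of our data already local
}

function placementScore(c: Candidate): number {
  return c.dataLocality * 10 - c.avgRpcLatencyMs; // weights are arbitrary
}

function pickNode(candidates: Candidate[]): Candidate | undefined {
  return candidates.reduce<Candidate | undefined>(
    (best, c) =>
      best === undefined || placementScore(c) > placementScore(best) ? c : best,
    undefined,
  );
}
```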
12/ In fact, we probably want workloads to exhibit "self control," and willfully participate in backpressure, to stop generating work when the system, or downstream consumers, are too busy to handle it anyway.
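A minimal sketch of what that "self control" could look like, assuming a bounded queue between producer and consumer (the BoundedQueue type is invented for illustration): push() simply doesn't resolve until there's room, so a busy consumer naturally slows everyone upstream.

```typescript
// Hypothetical bounded queue: producers await capacity, so work stops
// being generated when downstream consumers fall behind.
class BoundedQueue<T> {
  private items: T[] = [];
  private waiters: (() => void)[] = [];

  constructor(private capacity: number) {}

  async push(item: T): Promise<void> {
    while (this.items.length >= this.capacity) {
      // Backpressure: suspend this producer until a consumer makes room.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    }
    this.items.push(item);
  }

  pop(): T | undefined {
    const item = this.items.shift();
    this.waiters.shift()?.(); // wake one waiting producer, if any
    return item;
  }
}
```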
13/ Debugging such a system is daunting. If your operating system won't boot because of an eventual consistency bug in your NIC driver, and that entails tracking down some exotic asynchronous communication pattern spanning connections both in the large and in the small, good luck.
14/ Making this tractable requires system-wide causality tracing, a "log everything" philosophy (and departure from interactive debugging), programming language innovation for reasoning about asynchronous interleavings, visualizations, and interactive IDE support.
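Causality tracing, at its simplest, means every message carries the trace context of the work that caused it, and both sides of each hop log it. A sketch under invented names (not the original system's API):

```typescript
// Hypothetical causality propagation: stitch logs back into one causal
// chain by carrying a trace ID and span ID across every async hop.
interface TraceContext { traceId: string; spanId: string }

const newId = () => Math.random().toString(16).slice(2, 10);

async function tracedCall(
  caller: TraceContext,
  endpoint: string,
  send: (msg: { ctx: TraceContext; endpoint: string }) => Promise<unknown>,
): Promise<unknown> {
  // Each hop gets a fresh span that records which span caused it.
  const ctx = { traceId: caller.traceId, spanId: newId() };
  log({ event: "rpc.send", endpoint, ...ctx, parent: caller.spanId });
  try {
    return await send({ ctx, endpoint });
  } catch (err) {
    log({ event: "rpc.fail", endpoint, ...ctx }); // "log everything"
    throw err;
  }
}

function log(event: Record<string, unknown>): void {
  console.log(JSON.stringify(event)); // structured, queryable after the fact
}
```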
15/ Does all of this sound familiar? Yeah, it does to me too.
16/ When I look at the challenges we are facing with cloud applications and infrastructure today, I often go back to those days where we thought of every program being distributed "in the small" and "in the large." It inspires me.
17/ What would have happened if we went back and took those same design tenets, but built the system using containers, rather than software-isolated processes?
18/ My belief is that such a system would look a lot like the system we just happen to be organically building, one step at a time, using Docker, Kubernetes, infrastructure as code, and service meshes, ... 5 years into the future. END//
P.S. If you want to read more about this operating system, and some lessons learned, check out joeduffyblog.com/2015/11/03/blo…
P.P.S. For more recent progress on some related challenges, check out pulumi.io. Although we're focused on "design time" -- making cloud apps and infrastructure more programmable and easier to configure -- we're working our way through many similar hard problems.