There is a fallacy about how domain modelling works, and we need to talk about it. With software design, you're not just solving problems, you can reframe the problem itself. 🧵 /1
(No need to use a threading app, you can also read the whole article @rebeccawb & I wrote here: verraes.net/2021/09/design… ) /2
The fallacy about domain modelling is that we can design software by discovering all the relevant concepts in the domain, turn them into concepts in our design, add some behaviours, and voilà, we’ve solved our problem. /3
It’s a simplistic perception of how design works: a linear path from A to B:

1. understand the problem,
2. apply design,
3. end up with a solution. /4
That idea was central to early Object-Oriented Design. It was expressed in advice such as "Find objects by identifying nouns in the specifications." In hindsight, it was naive. /5
The idea has persisted in many naive interpretations of Domain-Driven Design as well. Domain language and Ubiquitous Language are often conflated. They’re not the same. /6
Domain language is what is used by people working in the domain. It’s a natural language, and therefore messy. It’s organic: concepts are introduced out of necessity, without deliberation, without agreement, without precision. /7
Terminology spreads across the organisation or fades out. Meaning shifts. People adapt old terms into new meanings, or terms acquire multiple, ambiguous meanings. It exists because it works, at least well enough for human-to-human communication. /8
A domain language (like all language) only works in the specific context it evolved in. /9
For us system designers, messy language is not good enough. We need precise language with well understood concepts, and explicit context. /10
This is what a Ubiquitous Language is: a constructed, formalised language, agreed upon by stakeholders and designers, to serve the needs of our design. /11
We need more control over this language than we have over the domain language. The Ubiquitous Language has to be deeply connected to the domain language, or there will be discord. /12
The level of formality and precision in any Ubiquitous Language depends on its environment: a meme sharing app and an oil rig control system have different needs. /13
Rebecca was invited to consult for a company that makes hardware and software for oil rigs. She was asked to help with object design and modelling, working on redesigning the control system that monitors and manages sensors and equipment on the oil rig. /14
Drilling causes a lot of friction, and “drilling mud” (a proprietary chemical substance) is used as a lubricant. It’s also used as a carrier for the rocks and debris you get from drilling, lifting it all up and out of the hole. /15
Equipment monitors the drilling mud pressure, and by changing the composition of the mud during drilling, you can control that pressure. Too much pressure is a really bad thing. /16
And then an oil rig in the gulf exploded. /17
As the news stories were coming out, the team found out that the rig was using a competitor’s equipment. Whew! The team started speculating about what could have happened, and were thinking about how something like that could happen with their own systems. /18
Was it faulty equipment, sensors, the telemetry, communications between various components, the software? /19
When in doubt, look for examples. The team ran through scenarios. What happens when a catastrophic condition occurs? How do people react? When something fails, it’s a noisy environment for the oil rig engineers: sirens blaring, alarms going off, … /20
We discovered that when a problem couldn’t be fixed immediately, the engineers, in order to concentrate, would turn off the alarms after a while. /21
When a failure is easy to fix, the control system logs reflect that the alarm went on and was turned off a few minutes later. /22
But more consequential failures, even though they take much longer to resolve, still show up in the logs as resolved within minutes. Then, when people study the logs, it looks like the failure was resolved quickly. /23
But that’s totally inaccurate. This may look like a software bug, but it’s really a flaw in the model. And we should use it as an opportunity to improve that model. /24
The initial modelling assumption is that alarms are directly connected to the emergency conditions in the world. However, the system’s perception of the world is distorted: when the engineers turn off the alarm, the system believes the emergency is over. /25
But it’s not: turning an alarm off doesn’t change the emergency condition in the world. The alarms are only indirectly connected to the emergency. If they’re indirectly connected, there’s something else in between that doesn’t exist in our model. /26
The model is an incomplete representation of a fact of the world, and that could be catastrophic. /27
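
To make the flaw concrete, here is a minimal sketch of the old model. It is not the real control system: the class `NaiveControlSystem`, its methods, and the sensor name `mud_pressure` are all hypothetical, invented for illustration. The only assumption carried over from the story is that the alarm is the system's sole notion of an emergency.

```python
from dataclasses import dataclass, field


@dataclass
class NaiveControlSystem:
    """Hypothetical old model: the alarm is the only notion of an emergency."""

    alarm_on: bool = False
    log: list[str] = field(default_factory=list)

    def sensor_exceeds_threshold(self, sensor: str) -> None:
        # A bad reading sounds the alarm directly.
        self.alarm_on = True
        self.log.append(f"ALARM ON ({sensor})")

    def turn_off_alarm(self) -> None:
        # Silencing the alarm is the only "end of emergency" event the model knows.
        self.alarm_on = False
        self.log.append("ALARM OFF")

    def believes_emergency_is_over(self) -> bool:
        return not self.alarm_on


system = NaiveControlSystem()
system.sensor_exceeds_threshold("mud_pressure")
system.turn_off_alarm()  # the engineer only wanted quiet; the pressure is still too high
print(system.log)
print(system.believes_emergency_is_over())
```

In this sketch the log ends with "ALARM OFF" and the system reports that the emergency is over, even though nothing in the world has changed.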
The team explored scenarios, specifically the weird ones, the awkward edge cases where nobody really knows how the system behaves, or even how it should behave. One such scenario is when two separate sensor measurements raise alarms at the same time. /28
The alarm sounds, an engineer turns it off, but what happens to the second alarm? Should the alarm still sound or not? Should turning off one turn off the other? If it didn’t turn off, would the engineers think the off switch didn’t work and just push it again? /29
By working through these scenarios, the team figured out there was a distinction between the alarm sounding, and the state of alertness. /30
Now, in this new model, when measurements from the sensors exceed certain thresholds or exhibit certain patterns, the system doesn’t sound the alarm directly anymore. Instead, it raises an alert condition, which is also logged. /31
It’s this alert condition that is associated with the actual problem. The new alert concept is now responsible for sounding the alarm (or not). The alarm can still be turned off, but the alert condition remains. /32
Two alert conditions with different causes can coexist without being confused by the single alarm. This model decouples the emergency from the sounding of the alarm. /33
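
A rough sketch of the new model could look like the following. Again, every name here (`AlertCondition`, `ControlSystem`, `raise_alert`, `silence_alarm`, `resolve`) is our own illustration, not the actual system. For simplicity we assume that raising an alert always sounds the alarm; the real design leaves that decision to the alert.

```python
from dataclasses import dataclass, field


@dataclass
class AlertCondition:
    """The thing that is associated with the actual problem."""

    cause: str
    resolved: bool = False


@dataclass
class ControlSystem:
    alerts: list[AlertCondition] = field(default_factory=list)
    alarm_sounding: bool = False
    log: list[str] = field(default_factory=list)

    def raise_alert(self, cause: str) -> AlertCondition:
        alert = AlertCondition(cause)
        self.alerts.append(alert)
        self.log.append(f"ALERT RAISED ({cause})")
        # Simplifying assumption: every new alert sounds the alarm.
        self.alarm_sounding = True
        return alert

    def silence_alarm(self) -> None:
        # Only the noise stops; no alert condition is touched.
        self.alarm_sounding = False
        self.log.append("ALARM SILENCED")

    def resolve(self, alert: AlertCondition) -> None:
        alert.resolved = True
        self.log.append(f"ALERT RESOLVED ({alert.cause})")

    def active_alerts(self) -> list[AlertCondition]:
        return [a for a in self.alerts if not a.resolved]


rig = ControlSystem()
pressure = rig.raise_alert("mud pressure above threshold")
telemetry = rig.raise_alert("telemetry link lost")
rig.silence_alarm()  # quiet, but both alert conditions remain
rig.resolve(telemetry)
print([a.cause for a in rig.active_alerts()])
print(rig.log)
```

Silencing the alarm and resolving an alert are now separate log entries, so the log no longer claims that a problem was fixed when the engineers only wanted quiet.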
The old model didn’t make that distinction, and therefore it couldn’t handle edge cases very well. When at last the team understood the need for separating alert conditions from the alarms, they couldn’t unsee it. /34
It’s one of those aha-moments that seem obvious in retrospect. Such distinctions are not easily unearthed. It’s what Eric Evans calls a Breakthrough. /35
There was a missing concept, and at first the team didn’t know something was missing. It wasn’t obvious, because there wasn’t a name for “alert condition” in the domain language. /36
The oil rig engineers’ job isn’t designing software or creating a precise language; they just want to be able to respond to alarms and fix problems in peace. /37
Alert conditions didn’t turn up in a specification document, or in any communication between the oil rig engineers. The concept was not used implicitly by the engineers or the software; no, the whole concept did not exist. /38
Then where did the concept come from? /39
People in the domain experienced the problem, but without explicit terminology, they couldn’t express the problem to the system designers. So it’s us, the designers, who created it. It’s an act of creative modelling. The concept is invented. /40
In our oil rig monitoring domain, it was a novel way to perceive reality. /41
Of course, in English, alert and alarm exist. They are almost synonymous. But in our Ubiquitous Language, we agreed to make them distinct. We designed our Ubiquitous Language to fit our purpose, and it’s different from the domain language. /42
After we introduced “alert conditions”, the oil rig engineers incorporated it in their language. This change in the domain is driven by the design. /43
This is a break with the linear, unidirectional understanding of moving from problem to solution through design. Instead, through design, we reframed the problem. /44
Is it a better model? How do we know that this newly invented model is in fact better (specifically, more fit for purpose)? We find realistic scenarios and test them against the alert condition model, as well as other candidate models. /45
In our case, with the new model, the logs will be more accurate, which was the original problem. /46
But in addition to helping with the original problem, a deeper model often opens new possibilities. This alert conditions model suggests several: /47
- Different measurements can be associated with the same alert.
- Alert conditions can be qualified.
- We can define alarm behaviours for simultaneous alert conditions, for example by spacing the alarms, or picking different sound patterns. /48
- Critical alerts could block less critical ones from hogging the alarm.
- Alert conditions can be lowered as the situation improves, without resolving them.
- … /49
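
Two of these possibilities, qualified alerts that can be lowered without being resolved and critical alerts taking priority on the alarm, might be sketched as below. The `Severity` levels, the `lower_to` method, and the `alert_driving_alarm` function are assumptions made up for this example; the real system may qualify alerts quite differently.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Hypothetical qualification levels, ordered from least to most severe.
    ADVISORY = 1
    WARNING = 2
    CRITICAL = 3


@dataclass
class AlertCondition:
    cause: str
    severity: Severity
    resolved: bool = False

    def lower_to(self, severity: Severity) -> None:
        # The situation improved, but the condition stays open.
        if severity < self.severity:
            self.severity = severity


def alert_driving_alarm(alerts: list[AlertCondition]) -> Optional[AlertCondition]:
    """Return the unresolved alert with the highest severity, if any."""
    active = [a for a in alerts if not a.resolved]
    return max(active, key=lambda a: a.severity, default=None)


pressure = AlertCondition("mud pressure above threshold", Severity.CRITICAL)
telemetry = AlertCondition("telemetry link degraded", Severity.WARNING)
alerts = [pressure, telemetry]

first = alert_driving_alarm(alerts)   # the critical pressure alert owns the alarm
pressure.lower_to(Severity.ADVISORY)  # pressure is improving, not yet resolved
second = alert_driving_alarm(alerts)  # now the telemetry warning takes over
print(first.cause, "->", second.cause)
```

None of this would be expressible in the old model, where there was nothing between the sensor reading and the siren to attach a severity to.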
These new options are relevant, and likely to bring value. Yet another sign we’d hit on a better model is that we had new conversations with the domain experts. A lot of failure scenarios became easier to detect and respond to. /50
We started asking, what other alert conditions could exist? What risks aren’t we mitigating yet? How should we react? /51
In a world-centric view of design, only the sensors and the alarms existed in the real world, and the old software model reflected that accurately. Therefore it was an accurate model. /52
The new model that includes alerts isn’t more “accurate” than the old one, it doesn’t come from the real world, it’s not more realistic, and it isn’t more “domain-ish”. But it is more useful. /53
Sensors and alarms are objective, compared to alert conditions. Something is an alert condition because in this environment, we believe it should be an alert condition, and that’s subjective. /54
The model works for the domain and is connected to it, but it is not purely a model of the problem domain. It addresses the problems in the contexts we envision better. The solution clarified the problem. /55
Having only a real world focus for modelling blinds us to better options and innovations. /56
These creative introductions of novel concepts into the model are rarely discussed in literature about modelling. Software design books talk about turning concepts into types and data structures, but what if the concept isn’t there yet? /57
Forming distinctions, not just abstractions, however, can help clarify a model. These distinctions create opportunities. /58
The model must be radically about its utility in solving the problem. /59
“Our measure of success lies in how clearly we invent a software reality that satisfies our application’s requirements—and not in how closely it resembles the real world.” — Object Design, Rebecca Wirfs-Brock /60
Thanks for reading! It may look like a long Twitter thread, but it's really a short blog post: verraes.net/2021/09/design… /61
