Greetings from #strangeloop! Today is the #tlaconf, which I just got to. Gotta take a couple of breaks from it to socialize and help people rehearse and stuff, but I'll be doing my usual livetweeting of the talks I see.
---
"TLA+ for Nike Stores", #tlaconf#StrangeLoop. Yes, speaker is wearing nikes. Apparently they had to go through a LOT of legal to get permission to give this talk!
"Athletes" (sales people) use mobile devices to do all sorts of tasks. Need to handle cash: logistical problem
Rules: the physical cash drawer MUST NOT open twice; overly delayed requests must not open the drawer; everything must be auditable; drawers must not open if not connected to the network.
Humans can route around these but network issues limit what they can do.
So: use TLA+!
Example liveness constraint:
drawer_resource.status = Opening ~> drawer_resource.status \in {Claimed, Open, Available}
"I was a math major, but dropped out to go make movies. But I kept using programming to edit VFX workflows. I don't have regrets, but stay in school."
Nike owns 100s of stores in a variety of concepts, tailored to local demand.
Serves "Athletes*": "u have a body, ur an athlete"
Payment methods and legal compliance processes are wildly different in different places. 4 years ago teams relied on intuition for design. Doesn't work for distributed systems: hard problems take many iterations to get right.
Nike studied TLA+ in their "freeform learn days"
[Tangent: I've seen a lot of places have freeform learn days. It fits into my theory of "moonshot experimentation": most ideas in freeform learning will fail, but if one thing works, it's all worth the investment.]
TLA+ was cool, but they couldn't see how to make it useful
They realized that if they used much simpler models, they could specify things much more easily. Was a great lesson in abstract thinking.
Made them rethink their understanding of programming.
Started using it to model new microservices in payments and stuff. Used sequence diagrams to communicate ideas, then converted to tla+, found bugs in the sequence diagrams, fixed the bugs.
[Microservices make a LOT of sense for large non-tech companies, I realize]
They discovered that a completely different retail team was ALSO using TLA+!
After 3 months of interweaving design and TLA+, first spec was complete and useful. "Problem: state space was HUGE." Took a week to check on powerful computers. Still SUPER useful tho!
[He's showing a bunch of safety properties in the system, but he's moving too fast for me to transcribe them.]
Simple liveness property: `checkout.state = Cancelling ~> checkout.state = Cancelled`
[I like that, might adapt it for a workshop]
CheckingOut ~> CheckedOut \/ Cancelled. Can't be just Checkout because it involves a human making a decision, and the human can decide not to actually pay.
"When we deployed, we had five known bugs in the system." Intentional, because system wasn't fully implemented. Never saw them in practice before so decided to deploy anyway.
...Every bug eventually happened.
And they knew EXACTLY what was wrong, due to the spec.
"It was a completely different experience." Instead of trying to figure out the bug, they knew what it was, where it was, and what to do to fix it. TLA+ gave them an incredibly strong foundation. According to one senior manager: "fifteen times better."
Great for meta concerns. Spec was more complex than intuition, but they could say for sure why simpler alternatives WOULDN'T work.
[This is something I found too.]
TLA+ is now optional part of RFC phase. Part of their scrum. "Test-Driven Architecture"
- Translate the syntax ASAP.
- Everything is a boolean expression
- all immutable
- "No when, only order"
- Get into details of BFS QUICKLY. Grounds TLA+ into what people know. "Helps people believe it can work."
- "How does TLA+ test my code?" "Same way a whiteboard does" [I like this a lot]
- "It allows testing the unthinkable."
- "You don't always need FM" "You don't always need a parachute either."
- It's all about VALUE TO CUSTOMERS
Enterprise needs:
- Better CLI tooling
- Configuration as code would help, like git tracking config
- VSCode plugin drives adoption
- Would be nice to have module system [don't we have one? Maybe he means package manager]
- Autoformatter would be great
TLA+ is great because it gives fast feedback, and things like race conditions are otherwise impossible to get fast feedback on. Saves years of learning. We <3 TLA+, and it works great with Agile.
TLC's a "worst-case" checker, but lots of important properties are statistical: latency, uptime, etc. Statistics is hard, let's use the TLC simulator instead
Process: run TLC in "generation" mode, generate lots of traces, put metadata in the spec, export the data to CSV, analyze in R or Python.
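A minimal sketch of the analysis end of that pipeline, assuming the generated traces were already exported to a CSV with one sampled die face per row (the column name `face` is my invention, not from the talk): read the CSV, tally the empirical distribution, and eyeball it against the ideal 1/6.

```python
import csv
import io
from collections import Counter

def face_distribution(csv_text):
    """Tally die faces from a CSV with a 'face' column and
    return {face: empirical probability}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(int(r["face"]) for r in rows)
    total = sum(counts.values())
    return {face: n / total for face, n in sorted(counts.items())}

# Stand-in for a real TLC trace export: 12 samples, two per face.
sample = "face\n" + "\n".join(str(f) for f in [1, 2, 3, 4, 5, 6] * 2)
dist = face_distribution(sample)
print(dist)  # every face at probability 1/6 in this toy sample
```

In practice you'd feed this (or the R equivalent) the real generated-trace CSV and run a proper goodness-of-fit test rather than eyeballing.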
[Markus is going to do a demo of Knuth-Yao: simulating a fair d6 with a coin]
Knuth-Yao is this graph. It's provably fair. "There are probabilistic model checkers, like PRISM. PRISM is a fine tool, but let's see what we can do with TLA+"
[editor's note: PRISM is not a fine tool.]
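For intuition about what the Knuth-Yao graph computes, here's an executable sketch in Python (the talk models the graph directly in TLA+; this is just an analogue, in the formulation sometimes called Lumbroso's "fast dice roller"): keep flipping to double the range of equally likely values, emit a face once the range covers 6, and recycle the leftover entropy on a miss instead of restarting.

```python
import random

def roll_d6(flip=lambda: random.randint(0, 1)):
    """Simulate a fair d6 from fair coin flips (Knuth-Yao-style
    walk). `flip` returns 0 or 1; injectable for testing."""
    x, n = 0, 1           # x is uniform over n equally likely values
    while True:
        while n < 6:      # not enough outcomes yet: flip and double
            x = 2 * x + flip()
            n *= 2
        if x < 6:
            return x + 1  # faces 1..6
        x -= 6            # miss: the 2 leftover outcomes are one
        n -= 6            # flip's worth of entropy, so reuse them

# Deterministic check with a scripted coin: flips T,T,T give face 1.
flips = iter([0, 0, 0])
print(roll_d6(flip=lambda: next(flips)))  # 1
```

The recycling step is what makes the state graph loop back to an interior node rather than the root, and it's why the construction is provably fair with optimal expected flips.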
Problems with doing this in TLC: no real numbers; infinite state (due to the probabilities); can't state the "fair die" property in TLA+, since it's not a binary true/false property on each behavior.
"Instead of ℝ, I created the module DyadicRationals."
This uses a new mode of TLC: "generator" mode, which is distinct from simulator mode. [missed how they're putting it in a CSV, think it's using the IOUtils module]
We can also add probabilities with the "RandomElement" operator from the TLC module [which I need to document in learntla.com]
Oh I think they're collecting the information with the POSTCONDITION TLC configuration option.
Markus now demonstrating modeling a biased coin on Knuth-Yao, getting a new distribution in TLC, and then checking against the analytic bias calculated by PRISM
[I'm seeing a new `-note` flag on TLC]
POSTCONDITION is a new TLC config option which evaluates at the end of the trace.
[SHIT there's also a new TLCGet param, "config", that lets you get the runtime config options, like if you're checking for deadlocks!]
Now let's do a probabilistic distributed system: EWD998. Ring of nodes are either active or passive. Active nodes can go passive or send delayed messages to activate other nodes. Problem: find out when "stable": all nodes passive & no in-flight messages
EWD998 is a particular algorithm to detect stability. We're going to do probabilistic modeling of two optimizations to see if they're safe optimizations. We can see, via modeling, that an "obvious optimization" changes an O(n) algorithm to O(n²)! Pretty dang cool
Jack now showing case studies, like the SWIM protocol and RabbitMQ Reactor Streams.
Day one of #StrangeLoop proper! I'll be livetweeting all the talks I see on this twitter thread (except possibly the closing keynote). If you're at the conference, come find me for homemade chocolate! I have:
In addition to the regular conf, there's a miniconf happening today: @gvwilson and @mhoye are hosting a version of "It Will Never Work in Theory", about Empirical Software Engineering!
Taking a break from workshopping for a quick Twitter rant: it's really sad how inaccessible mobile programming is for even other software developers; you almost never hear of people writing a small mobile program for themselves. Technology with immense potential, squandered
Imagine if you could run programs on your phone as easily as you could run programs on your laptop. What would you make? Think of all the sensors you'd have access to. I can think of a million small programs that'd make my life better
Like recording the time I plug the phone in at night and unplug it in the morning, to better track my sleep schedule. Or making a parameterizable "hey google" timer starter. Or writing note-taking and grocery widgets that are more tailored to my specific thought process
In the Crossover Project, I studied what lessons software developers can take from traditional engineers. Here's a list of other professions I think would have useful insights for us!
(Caveat: I have no experience with these fields, just being an interested layfolk!)
Librarians: Obviously in how they handle both information classification and retrieval, but also in "reference interviewing": the librarian practice of learning what a client wants when they can't explain it precisely, eg they don't know the book title or can't speak English well
Event planners: For big events especially (festivals, conferences, etc). Things are constantly going wrong in new and unexpected ways and there's no do-overs, you just gotta fix the plane while you're flying it. I wonder what their debugging strategies look like
Humans: "AI, please maximize the number of paperclips in the universe."
AI: "Okay, I just hacked my reward function to say I made infinite paperclips. All done!"
Humans: "Maximize paperclips, based on this paperclip recognizer AI."
AI: "I hack its reward function, too."
Humans: "We've created a 1,000 copies of the recognizer and given them all different security."
AI: "Okay, but they must learn 'paperclip' from somewhere, right? I add the definition 'paperclip: 1 cubic yoctometer' to urbandict and get a million sockpuppets to upvote it."
Humans: "We've replaced the recognizers with a human judge who can't be hacked."
AI: "I bribe 'em."
Humans: "We've added a council of 99 more judges."
AI: "I bribe em' too."
Apparently this week there was a huge blowup on Twitter about whether Historians are Bad?
Do I count as enough of a historian to have an opinion on this
unrelated but I think that "is it important to study the humanities" might be one of those topics where my work-peers might vehemently disagree with each other
And where the people who agree with me will have a completely different reason for why they agree with me
How come we can "easily" explain concurrency concepts with real-life analogies, but can't explain monads with real-life analogies? Monads aren't a conceptually more difficult topic than, like, weak vs strong fairness, but teachers better than me have given up on reallifing monads
Maybe it is possible, but people are looking for the wrong analogies? With a "good" analogy, you can answer a question about a topic by walking through the rules of the analogy. "Monads are like burritos" doesn't help because there's no intuitive rules in the burrito analogy
He got a lot of flak for it, but I can't shake the feeling Dan Kaminsky (RIP) was onto something with his "monads are pipes" analogy
(Or maybe the problem is we haven't yet found analogies for functor and applicative)