Latest Twitter Threads by @RealGeneKim on Thread Reader App

Sep 9, 2024 • 12 tweets • 6 min read

This is such an amazing talk from Dr. Erik Meijer (@headinthebox, famous for his work on Visual Basic, C#, LINQ, Hack), on how LLMs upended his research, and are changing coding and what developers do.

I've clipped some of my fave parts of his talk:

- His team found that the specialized models they built to do codegen, and find/fix bugs at Meta were completely outclassed by ChatGPT. They were surprised that ChatGPT could write Hack code, despite it not being used widely outside of Facebook.

(@DynamicWebPaige has talked about the phenomenon of Gemini and general-purpose frontier models outperforming and replacing older/smaller specialized models built over the years throughout Google.)

Source:

- When they had three models essentially replaced by ChatGPT 3.5: "It felt like we were digging a tunnel using pickaxes, and then someone suddenly shows up with dynamite and heavy equipment. 'Get out of the way, kid. The professionals are here.'"

Sep 1, 2024 • 5 tweets • 2 min read

Amazing and sobering WSJ interview of Dutch Admiral Bauer, NATO's top military officer, on supply chain risks.

"We've long thought that mutual economic relationships between nations would prevent war. We thought we were buying our gas from a company. Turns out that wasn't true.

"Putin told Gazprom to turn the [gas] off, despite what was in the contract... turning energy supply into an economic weapon against us"

More supply chain risks were seen during Ukraine invasion:

"The military has to prepare or the governments for war, but I think the businesses have to prepare for war as well and it is the dependencies."

- IKEA found 25% of all raw materials came from Russia, and had to change overnight
- BMW had all their cables produced in two factories in Ukraine. In first 2 weeks of invasion, no more cables

Jul 19, 2024 • 65 tweets • 17 min read

I'm in awe of the scale of the Crowdstrike / Windows BSOD issue.

Here are the most startling images I've seen this morning.

Let's start with this: at 10pm PT yesterday, famous @troyhunt notices that something odd is happening to Windows systems:

https://twitter.com/troyhunt/status/1814174010202345761?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Etweet

12-hour timelapse of American Airlines, Delta, and United flights:

https://x.com/us_stormwatch/status/1814268813879206397?s=61&t=JL9ikpkh2HXt-O32JzLrng

Mar 18, 2024 • 5 tweets • 4 min read

I was talking with my friend @nealriley about how much fun I’m having with LLMs lately, including the project I've written about before to interpret screenshots of podcasts and YouTube videos. I made the observation that working with LLMs seems to boil down to a pattern of creating ETL pipelines, over and over again.

He quickly corrected me, suggesting, "Don't you mean ELT pipelines?"

I laughed at his curious and unexpected comment, which seemed unusually pedantic. However, I found myself thinking about his comment for days afterward.

It reminded me of a quip from Rich Hickey, inventor of the Clojure programming language: "Information is simple. The only thing you can do with information is ruin it."

In my experience, what Rich said is so true, and shows that Neal was absolutely correct. What I meant to say is "ELT," not "ETL." In other words, you should Extract, Load/Store, and then Transform data, in that order (ELT). Unless you love misery, you should not Transform before you Load/Store (ETL).

Over the last many decades, I've ruined data in so many ways that, for me, the “T” in “ETL” stands for tarnish, trash, taint, etc.

I’ve accidentally converted date-time strings incorrectly (turning them all into nils). I’ve discarded data fields that I thought were irrelevant, only to wish years later that I had kept them, because they held priceless information that was impossible to reconstruct. I’ve accidentally overwritten data, deleted data, and done so many more terrible things. Thankfully, I've wiped those episodes from my memory, lest I start kicking myself again.

This is because I was Transforming the data (or Trashing it) before storing it.

One of the things I've learned from the Clojure and cloud community is the value of just storing data in its original form, often in storage buckets, deferring any transformation of the data to afterward, often at runtime (i.e., Extract, Load/Store, then Transform).

I almost made this mistake last week: I mentioned how I use @revai and @deepgram to create podcast audio transcripts. At first, I converted the audio transcripts to match the format used by the python-transcript-api program. That way, I could use the same code to render the transcript, regardless of what the transcript source was.

This is when I once again learned that the only thing I could do to the data is to ruin it.

So instead, I saved the entirety of the transcript in its original format, including the full API return payload. That way, I’m preserving optionality, enabling myself to use that data in new ways in the future. By discarding the data, I destroy that optionality.

Today, I saw one such potential option: I realized that transcript diarization could be helpful (this is when the transcription service detects that multiple speakers are talking, and ideally labels them). The YouTube-transcription-api doesn’t support speaker labels, and I would have discarded that data to match it.
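As a rough illustration of the store-first approach described above, here is a minimal Python sketch. The payload shape (a `segments` list with optional `speaker` fields), the function names, and the directory layout are all hypothetical; they are not the actual Rev.ai, Deepgram, or transcript-library formats. The point is only that the full response is written untouched, and rendering happens later against the stored original.

```python
import json
from pathlib import Path


def store_raw(bucket: Path, source: str, episode_id: str, payload: dict) -> Path:
    """Load/Store step: write the provider's full response without changing it."""
    path = bucket / source / f"{episode_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(payload))
    return path


def render_lines(path: Path) -> list[str]:
    """Transform step, run at read time against the stored original.

    Assumes a hypothetical payload with a "segments" list; speaker labels
    are used when present and skipped when absent.
    """
    payload = json.loads(path.read_text())
    lines = []
    for seg in payload["segments"]:
        speaker = seg.get("speaker")
        prefix = f"Speaker {speaker}: " if speaker is not None else ""
        lines.append(prefix + seg["text"])
    return lines
```

Because the renderer reads from the untouched file, a field that was ignored at first (such as speaker labels) can be picked up later by changing only the transform, with no need to call the transcription service again.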

My lessons:

- Store your data in as close to its original form as possible
- Defer data transformation to after you load/store it

Citations and resources:

Citation for Rich Hickey quote: "Information is simple. The only thing you can do with information is ruin it" - Rich Hickey (56:09)

A fun dialogue I had learning about ETL vs ELT with ChatGPT:

I'm finding using XTDB to be fantastic for these ELT pipelines: it's amazing to have an append-only database, where if you trash your data, you can just rewind the event log.
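To make the "rewind" idea concrete, here is a toy append-only store in Python. This is not XTDB's API or data model; the class and method names are invented for illustration. It only shows the general property: writes never overwrite earlier entries, so a bad write can be read around by looking at an earlier point in the log.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AppendOnlyStore:
    """Toy append-only store (hypothetical): every write adds a log entry."""

    log: list = field(default_factory=list)

    def put(self, key: str, value: Any) -> int:
        """Append a new version and return its position in the log."""
        self.log.append((key, value))
        return len(self.log) - 1

    def get(self, key: str, as_of: Optional[int] = None) -> Any:
        """Return the latest value for key at or before log position as_of."""
        end = len(self.log) if as_of is None else as_of + 1
        for k, v in reversed(self.log[:end]):
            if k == key:
                return v
        return None


store = AppendOnlyStore()
good = store.put("episode-1", {"published": "2024-03-18"})
store.put("episode-1", {"published": None})  # a botched transformation
recovered = store.get("episode-1", as_of=good)  # the earlier version is still there
```

The botched write is still in the log, but so is everything before it, which is what makes the mistake cheap to undo.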

The benefit of ELT vs. ETL shows up in "Wiring the Winning Organization," co-authored with Dr. @StevenJSpear. Problems are much easier to solve when you can undo them: if you can't undo, you can't iterate and experiment, which also means you can't learn.

When experiments are highly consequential and thus expensive, you are very much in the Danger Zone.

By merely shifting Transformation to after Loading/Storing, we make problems dramatically easier to solve, enabling Winning Zone characteristics.

cc'ing @fistsOfReason, because I'm proud of using the word "optionality," @refset because I love @xtdb_com, @enigma2a .infoq.com/presentations/…
chat.openai.com/share/aca1afd1…
v1-docs.xtdb.com/guides/quickst…

Jun 23, 2022 • 33 tweets • 19 min read

One of the biggest learnings for me was the importance of architecture to get great outcomes, both in DevOps and in any engineered system.

I was trying to think of a great example of modularity, and started marveling at the USB interface. What's amazing about USB?

The USB cable doesn't care what it's connected to — it connects computers to peripherals (e.g., printers, keyboards, mice, network adapters).

It's remarkable looking at the incredible explosion of innovation on both sides of that interface.

Jun 19, 2021 • 6 tweets • 5 min read

A genuine question:

Can someone come up with an example NOT RELATED to software that has this property:

As time progresses, it becomes more and more difficult to add things, even when only one person is responsible for the entire system.

(And then explain why this happens?) In a call with @mik_kersten and @stevenjspear, I was amazed that I couldn't verbalize why this happens.

Ultimately, I'm hoping that this will inform why this phenomenon observed by @johncutlefish happens:

https://twitter.com/johncutlefish/status/1046169469268111361

Jun 12, 2021 • 33 tweets • 8 min read

Taking notes from this great short video on USAF Colonel John Boyd, his Energy Maneuverability theory, OODA loops, and more.

I read the awesome book "Boyd: The Fighter Pilot Who Changed the Art of War" decades ago, but...

amazon.com/Boyd-Fighter-P…

...but I'm trying to refresh my memory on his teachings, and view them through the lens of how they inform how we create and exploit optionality, maybe against an adversary, but also on how one creates and operates systems.

(From Boyd to Steve Blank/Eric Ries & Dr. Carliss Baldwin)

Jun 11, 2021 • 10 tweets • 5 min read

Eroom's Law:

Since the 1950s, the # of drugs that make it to market for every $1B spent on pharmaceutical R&D has been halving every 9 yrs! (The opposite of Moore's Law)

@StevenJSpear & I speculate that a major cause is the # of disciplines across which integrated problem solving must occur!

https://twitter.com/spfeiffr/status/1403398884504870917

We speculate this aspect of Eroom's Law is what causes:

- the before vs. after state in Team of Teams
- waterfall -> DevOps

I think we even saw this happen in the COVID vaccinations (those that get 100% into people's arms vs 30%)

...and so many more domains!

Jun 10, 2021 • 6 tweets • 4 min read

In this episode, Dr @StevenJSpear and I learn about how human creativity was unleashed to enable vaccinating 8K people/day, up from 2K/day in Jan, through relentless improvement.

And we explore how the lessons learned may inform how we can improve the overall healthcare system!

https://twitter.com/ITRevBooks/status/1402958478847643649

@StevenJSpear The things I learned from Trent Green, visiting the mass vaccination site here in Portland, or this interview were…

…some of the most hopeful & inspiring I’ve heard in years.

He describes how we can win this race to vaccinate everyone on the planet, in the shortest possible time.

Feb 20, 2021 • 54 tweets • 20 min read

Many of you have seen the famous Westrum Organizational Typology model, so prominently featured in State of DevOps Research, Accelerate, DevOps Handbook, etc.

This model was created by Dr. Ron Westrum, a widely-cited sociologist who studied the impact of culture on safety.

Thanks to Dr. @nicolefv, I was able to interview him for an upcoming episode of the Idealcast! 🤯

It was a very heady experience, and while preparing to interview him, I was startled to discover how much work he's done in healthcare, aviation, spaceflight, but also innovation.

Feb 19, 2021 • 4 tweets • 3 min read

I loved this podcast: @mik_kersten interviews Dr. @gail_murphy, discussing their decades-long research on dev productivity.

Favorite phrase: “better frame of dev productivity (and knowledge work) is on how we make decisions.”

How utterly wonderful!!

overcast.fm/+XEReQPegs

@mik_kersten @gail_murphy It was a startling thing to hear, as I was talking with @girba, and he mentioned the same thing.

What are the best papers that describe the nature of decision making, and how they might inform great decision making (frequent, fast feedback, high levels of exploration, safe?)

vs…

Dec 22, 2018 • 9 tweets • 6 min read

Wow, a super interesting question! Wasn’t as easy to answer quickly as I thought it would be!

TL;DR: My notes almost always go into Trello first, where I triage and organize them. I actually wrote a (Clojure) app to help manage these cards. Then all into @ScrivenerApp.. 1/N

https://twitter.com/ericnormand/status/1076227712061186048

@ScrivenerApp Almost all my notes start as Trello cards first: I use Zapier to put all starred tweets in there, I send myself emails that get turned into cards.

Each book often has one board, with lists for each broad category. I wrote an app to enable moving cards w/1 keystroke, like vi.

2/N
