Dan Abramov

@dan_abramov

We need to talk about benchmarks in the JavaScript community. A typical benchmark used to compare libraries is 95% the thing being tested, 5% app code. The app code is rarely more than a dozen functions in a couple of files.

The actual apps many of us are writing are the direct opposite of that. There’s a shitload of app code (hopefully with code splitting!), and then there’s the 5% which is taken up by the library code.

Here’s a few obvious problems that come out of that:

- A huge 25% difference in performance might actually be more like 3% (oops)

- A fixed cost difference of 5ms looks a lot more impressive against 5ms of app code than against 500ms of app code (oh no)
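To make the arithmetic behind these two bullets concrete, here is a minimal sketch. The function names and the sample shares are hypothetical, chosen only to mirror the 95%/5% split described above; real apps will have different ratios.

```typescript
// Hypothetical helper: how much slower the whole run gets when only the
// library's slice of the total time gets slower.
// libraryShare: fraction of total time spent in library code (0..1).
// librarySlowdown: relative slowdown of the library itself (0.25 = 25%).
function overallSlowdown(libraryShare: number, librarySlowdown: number): number {
  return libraryShare * librarySlowdown;
}

// Hypothetical helper: what fraction of the total run a fixed cost takes
// when it sits next to a given amount of app code.
function fixedCostShare(fixedCostMs: number, appCodeMs: number): number {
  return fixedCostMs / (fixedCostMs + appCodeMs);
}

// In a benchmark that is 95% library code, a 25% gap stays close to 25%.
const inBenchmark = overallSlowdown(0.95, 0.25);

// In an app where the library is 5% of the work, the same gap shrinks
// to a low single-digit percentage of the total.
const inApp = overallSlowdown(0.05, 0.25);

// The same 5ms of fixed cost is half the run next to 5ms of app code,
// and about 1% of the run next to 500ms of app code.
const nextToTinyApp = fixedCostShare(5, 5);
const nextToRealApp = fixedCostShare(5, 500);

console.log({ inBenchmark, inApp, nextToTinyApp, nextToRealApp });
```

The exact percentages depend entirely on how much of the total time the library really accounts for, which is the number a typical benchmark hides.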

Here’s a few less obvious problems:

- JIT warms up all of the app code very fast. This is not representative of how apps actually run, where init code is cold.

- Worse, to reduce variance in benchmarks, we usually “warm them up” (by running them a few times before measuring) 🤦‍♂️
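One way to keep the cold number visible instead of warming it away is to report the first run separately from the later ones. This is only a sketch under assumptions: `measureColdAndWarm` and its result shape are made up for illustration, and the "cold" figure is only meaningful if the task has not already run earlier in the same process.

```typescript
// A clock returning milliseconds; injectable so the sketch is easy to test.
type Clock = () => number;

interface ColdWarmResult {
  coldMs: number;       // duration of the very first call
  warmMedianMs: number; // median duration of the calls after it
}

// Hypothetical helper: times the first call on its own, then times
// `warmRuns` further calls and reports their median.
function measureColdAndWarm(
  task: () => void,
  warmRuns: number,
  now: Clock = () => performance.now(),
): ColdWarmResult {
  const coldStart = now();
  task();
  const coldMs = now() - coldStart;

  const samples: number[] = [];
  for (let i = 0; i < warmRuns; i++) {
    const start = now();
    task();
    samples.push(now() - start);
  }

  // Median of the warm samples (NaN if no warm runs were requested).
  samples.sort((a, b) => a - b);
  const mid = Math.floor(samples.length / 2);
  let warmMedianMs = NaN;
  if (samples.length > 0) {
    warmMedianMs =
      samples.length % 2 === 1
        ? samples[mid]
        : (samples[mid - 1] + samples[mid]) / 2;
  }

  return { coldMs, warmMedianMs };
}
```

Reporting both numbers side by side shows how much of a result comes from the warmed-up state, which is closer to what init code in a real app experiences.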

Imagine a “mounting 1000 rows” benchmark. What it’s usually measuring is library overhead. What it doesn’t say is that in a more practical example, that overhead is minimal compared to the costs of running user code (can’t optimize that away), computing styles, and layout.

I think at some level we’ve all come to embrace that benchmarks are flawed but we can’t help making them. If only to be able to compare the pure overhead (whether or not it’s relevant). But I wonder if there’s a better way to create benchmarks we haven’t discovered yet.

Here’s my wishlist for a benchmarking tool:

- Can generate files from a template. A benchmark should have hundreds of files with app code, just like a real app.

- Can insert predictable repetitive “filler” code into the generated files with a certain size and runtime cost.

- Can produce hierarchies. Not one component rendered recursively (apps don’t do that) but hundreds of component files, each component rendering other components. (Of course, generation needs to be framework-agnostic.)

- Uses markup with realistic styles to show layout cost.

- Can express common idioms in the generated code (mostly, conditions and loops in rendering). Shows whether or not adding them inflates the code output disproportionately.

- Focuses on “replace a subview” perf (most interactions in my experience) rather than “update 1000 rows”.
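The first three wishlist items (files from a template, filler code, component hierarchies) could start from something like the sketch below. Everything here is hypothetical: the option names, the `.txt` paths, and the line-based pseudo-source are placeholders that a per-framework adapter would have to translate into real component files. It is not an existing tool.

```typescript
interface GeneratorOptions {
  depth: number;       // number of levels in the component tree
  fanOut: number;      // children rendered by each non-leaf component
  fillerLines: number; // predictable filler lines per generated file
}

interface GeneratedFile {
  name: string;
  path: string;
  children: string[]; // names of the components this one renders
  source: string;     // framework-agnostic pseudo-source
}

// Hypothetical template: one header line, then filler, then one line
// per child. A real adapter would emit framework-specific code instead.
function renderTemplate(name: string, children: string[], fillerLines: number): string {
  const lines: string[] = [`component ${name}`];
  for (let i = 0; i < fillerLines; i++) {
    lines.push(`filler ${i}`);
  }
  for (const child of children) {
    lines.push(`render ${child}`);
  }
  return lines.join("\n");
}

// Builds a tree of distinct components (no recursion of one component
// into itself) and returns one generated file per component.
function generateComponentFiles(options: GeneratorOptions): GeneratedFile[] {
  const files: GeneratedFile[] = [];
  let counter = 0;

  function build(level: number): string {
    const name = `Component${counter++}`;
    const children: string[] = [];
    if (level < options.depth) {
      for (let i = 0; i < options.fanOut; i++) {
        children.push(build(level + 1));
      }
    }
    files.push({
      name,
      path: `components/${name}.txt`,
      children,
      source: renderTemplate(name, children, options.fillerLines),
    });
    return name;
  }

  build(1);
  return files; // children come before parents; the root is last
}

const files = generateComponentFiles({ depth: 3, fanOut: 2, fillerLines: 4 });
console.log(files.map((f) => `${f.path} -> [${f.children.join(", ")}]`).join("\n"));
```

Filler here only controls file size. Giving it a known runtime cost, realistic styles, and conditions or loops in rendering would need more than this sketch covers.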

A tool like this still won’t be perfect. There’s plenty of tradeoffs it might fail to cover. For example, when do modules get initialized? If initialization is lazy (during first import use), a lot of that work moves to rendering. Some frameworks are able to spread that cost.

What happens when an app is waiting on the network, and the CPU is not busy? Some libraries can start pre-rendering an optional UI element like a popup you’ll likely click, without blocking. But a benchmark rarely deals with IO. Perhaps it should be able to simulate some of it?

Either way, this kind of benchmark tool is long overdue in the JavaScript community. Maybe it will also be flawed, but I hope we’ll learn from it. Your next project?
