- A huge 25% difference in performance might actually be more like 3% (oops)
- A fixed cost difference of 5ms looks a lot more impressive against 5ms of app code than against 500ms of app code (oh no)
- JIT warms up all of the app code very fast. This is not representative of how apps actually run, where init code is cold.
- Worse, to reduce variance in benchmarks, we usually “warm them up” (by running them a few times before measuring) 🤦‍♂️
- Can generate files from a template. A benchmark should have hundreds of files with app code, just like a real app.
- Can insert predictable repetitive “filler” code into the generated files with a certain size and runtime cost.
- Uses markup with realistic styles to show layout cost.
- Focuses on “replace a subview” perf (most interactions in my experience) rather than “update 1000 rows”.
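To make the "generate files from a template" and "filler code" points concrete, here is a minimal sketch in Python of what such a generator might look like. Everything in it is an assumption for illustration: the `MODULE_TEMPLATE` shape, the names `make_filler` and `generate_files`, and the choice of a small loop as the unit of filler are all hypothetical, not part of any existing benchmark.

```python
from string import Template

# Hypothetical template for one generated app module (an assumption,
# not taken from any real benchmark). $filler stands in for app code.
MODULE_TEMPLATE = Template(
    "// $name (generated)\n"
    "export function $name() {\n"
    "  let acc = 0;\n"
    "$filler"
    "  return acc;\n"
    "}\n"
)


def make_filler(lines: int, loop_iterations: int) -> str:
    """Return `lines` repetitive statements.

    `lines` controls the size of the file; `loop_iterations` controls
    roughly how much work each statement does at runtime.
    """
    stmt = "  for (let i = 0; i < {n}; i++) acc += i % {k};\n"
    return "".join(stmt.format(n=loop_iterations, k=j + 2) for j in range(lines))


def generate_files(count: int, filler_lines: int, loop_iterations: int) -> dict[str, str]:
    """Build `count` modules in memory, keyed by file name."""
    files = {}
    for index in range(count):
        name = f"module{index}"
        files[f"{name}.js"] = MODULE_TEMPLATE.substitute(
            name=name,
            filler=make_filler(filler_lines, loop_iterations),
        )
    return files


if __name__ == "__main__":
    # Hundreds of files, each with a predictable amount of filler.
    generated = generate_files(count=300, filler_lines=20, loop_iterations=100)
    print(len(generated), "files generated")
    print(generated["module0.js"])
```

Because the output is deterministic, two runs with the same parameters produce identical files, so any difference between measurements comes from the thing under test rather than from the generated app code.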