You can probably optimize your @Java programs to run 2-10x faster, by focusing on the bottlenecks.

Those bottlenecks are not where you think they are.

Use free tools such as async-profiler, VisualVM, JFR and vmstat to find them.

Let me show you how. 🧵
Create a main() Java application that runs the bulk of your code on a seriously sized dataset for at least a minute.
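A minimal sketch of such a main(), assuming your real work can be wrapped in a single call. The names ProfilingHarness, runFor and placeholderWorkload are made up for illustration; swap the placeholder for a call into your own code and dataset.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical harness: repeats a workload until a minimum wall-clock time has passed. */
final class ProfilingHarness {

    /** Written on every run so the JIT cannot discard the workload's result. */
    static volatile long sink;

    /** Runs the workload at least once, repeating until minDuration has elapsed. Returns the run count. */
    static long runFor(Duration minDuration, LongSupplier workload) {
        long deadline = System.nanoTime() + minDuration.toNanos();
        long runs = 0;
        do {
            sink = workload.getAsLong();
            runs++;
        } while (System.nanoTime() - deadline < 0); // overflow-safe comparison
        return runs;
    }

    /** Placeholder only: replace with the bulk of your real code on a realistic dataset. */
    static long placeholderWorkload() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += Integer.toString(i).hashCode();
        }
        return sum;
    }

    public static void main(String[] args) {
        long runs = runFor(Duration.ofMinutes(1), ProfilingHarness::placeholderWorkload);
        System.out.println("Completed " + runs + " runs");
    }
}
```

Looping until a deadline (instead of a fixed iteration count) keeps the process alive long enough for the profilers to attach and collect a meaningful number of samples.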

Run async-profiler and Java Flight Recorder (JFR) on that, either from the command line (free) or through @intellijidea Ultimate (paid).
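A third option the thread doesn't cover: JFR can also be started from inside the program through the jdk.jfr API (part of OpenJDK since Java 11). This is only a sketch under that assumption; the class name JfrRecorder and the output file name are hypothetical, and "profile" is one of the two predefined JFR configurations shipped with the JDK.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

/** Hypothetical helper: wraps a workload in a JFR recording and dumps it to a file. */
final class JfrRecorder {

    static Path record(Runnable workload, Path output) {
        // "profile" collects more detail than "default", at a somewhat higher overhead.
        try (Recording recording = new Recording(Configuration.getConfiguration("profile"))) {
            recording.start();
            workload.run();
            recording.stop();
            recording.dump(output); // write the recorded events to the given file
            return output;
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("Could not record JFR data", e);
        }
    }

    public static void main(String[] args) {
        // Stand-in workload; call your real code here instead.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000_000; i++) {
                sb.append(i % 10);
            }
            System.out.println("Built " + sb.length() + " chars");
        };
        System.out.println("Wrote " + record(workload, Path.of("recording.jfr")));
    }
}
```

The resulting .jfr file can be opened in Java Mission Control like any other recording.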
1) Async-profiler outputs a flamegraph to show the relative CPU time taken by each method (including methods it calls).

Slow methods stick out like a sore thumb, even if they loop (unlike in other visualizations).

This graph made one of our use cases 10x faster.
Async-profiler doesn't suffer from Safepoint Bias.

This means it's more honest than:
- VisualVM's sampler, which rounds measurements to the nearest safepoint
- VisualVM's instrumentation profiler, which inflates short, frequently called methods with instrumentation overhead
2) VisualVM is useful too, to quickly monitor GC activity, detect potential memory leaks etc.

Download and run it, then start your main(), double-click your process in the VisualVM list and switch to the Monitor tab.
3) Java Flight Recorder (JFR) shows memory allocations and lock contention.
To get memory allocation data in @intellijidea you need to explicitly change its settings, at the cost of some performance overhead.

(Thanks to Jeremy Whiting for teaching me at @DevoxxUK)
The JFR data is stored in your home folder by default. Download and run Java Mission Control (JMC) to open these files and visualize the data recorded by Java Flight Recorder.
Then switch to the TLAB Allocations screen.
JFR records memory allocations by sampling: it logs an event when a thread needs a new TLAB or allocates outside one. Every memory allocation costs CPU time for creation and CPU time for Garbage Collection.

This method allocated 7GB per minute in some configurations. Now it provides the same functionality with a fraction of that, significantly faster.
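To make the allocation cost concrete, here is a toy illustration (not the method from the screenshot, which isn't included in this thread). Both hypothetical methods compute the same sum of squares, but the first one boxes every value into a Long and stores it in a growing list, while the second keeps everything in primitives. Depending on the JIT's escape analysis the real-world gap varies, so measure with JFR rather than trusting intuition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: same result, very different allocation behavior. */
final class AllocationDemo {

    /** Allocation-heavy: autoboxes each value and repeatedly resizes the backing array. */
    static long sumBoxed(int n) {
        List<Long> values = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            values.add((long) i * i); // autoboxing: typically a new Long object per element
        }
        long sum = 0;
        for (Long value : values) {
            sum += value; // unboxing
        }
        return sum;
    }

    /** Allocation-light: no intermediate collection, no boxing. */
    static long sumPrimitive(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        System.out.println("boxed:     " + sumBoxed(n));
        System.out.println("primitive: " + sumPrimitive(n));
    }
}
```

Run both variants under JFR and compare them in the TLAB Allocations screen to see the difference in allocated bytes.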
4) vmstat
The tools above work on the JVM level. The real truth is at the OS level, as explained by Kirk Pepperdine (@kcpeppe) at @Jfokus. Especially to detect Noisy Neighbors (other containers/VMs with which you share a bare metal machine).

Run vmstat on the Linux command line (for example, vmstat 1 prints a new sample every second).
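One way to watch for noisy neighbors is the st (steal) column of vmstat, which reports CPU time taken away from a virtual machine. The sketch below is an assumption-laden helper, not something from the talk: the class name VmstatSteal is hypothetical, and it locates the column by its header name because the exact column layout can differ between vmstat versions.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper: extracts the CPU steal percentage from vmstat output. */
final class VmstatSteal {

    /** Returns the "st" value of the last sample row, or -1 if no such column or row exists. */
    static int lastStealPercent(List<String> vmstatLines) {
        int stIndex = -1;
        int steal = -1;
        for (String line : vmstatLines) {
            String[] cols = line.trim().split("\\s+");
            int headerIndex = Arrays.asList(cols).indexOf("st");
            if (headerIndex >= 0) {
                stIndex = headerIndex; // column header row: remember where "st" sits
                continue;
            }
            if (stIndex >= 0 && stIndex < cols.length && cols[stIndex].matches("\\d+")) {
                steal = Integer.parseInt(cols[stIndex]); // data row: keep the latest sample
            }
        }
        return steal;
    }

    public static void main(String[] args) {
        // Usage: vmstat 1 5 | java VmstatSteal.java
        List<String> lines = new BufferedReader(new InputStreamReader(System.in))
                .lines()
                .collect(Collectors.toList());
        System.out.println("Last steal %: " + lastStealPercent(lines));
    }
}
```

The first sample row vmstat prints is an average since boot, which is why the sketch keeps the last row rather than the first. A steal value that stays well above zero suggests the hypervisor is giving your CPU time to someone else.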
I hope you found this thread useful.
Anything important that I missed?
Coincidentally, the mailman just now delivered the "Optimizing Java" book by Ben Evans (@kittylyst), so I'll probably learn a bunch more about performance tuning over the next few weeks.
