DATABASE THREAD ALERT: So @memsql ran some database benchmarks and threw some shade at @CockroachDB. As super serious professionals with a serious company name, I imagine the Roach is going to be measured in their response. As an UNSHACKLED database enthusiast I am... not.
>> BEGIN RANT;
Let's focus on TPC-C to start. First, they have some bar graphs with loooooong bars for MemSQL, and short stubby bars for CockroachDB. Wow, such performance. Very impress.
The problem is that TPC-C has a strict limit on how many transactions per minute you're allowed to run (the "tpmC") per 'scalefactor'. The goal is to prevent you from running 10000 transactions on a mere 10 megs of data (Look ma, no L2 cache misses!). MemSQL does not obey this limit.
The benchmark goal is to force you to 'scale out' your data if you want to show bigger numbers. Roughly, it's 12.8 tpmC per warehouse (which is about 80 megs of unreplicated state). Want to show 100,000 tpmC? Gotta have a working set of 600 gigabytes of data (times replication!).
MemSQL, instead, shows 800,000 tpmC on scale factor 10,000. This is not allowed. At SF 10,000, you're limited to ~128,000 tpmC. The point is this: if you want to show more tpmC, do it at higher data scalefactors.
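To make the arithmetic concrete, here's a back-of-the-envelope sketch in Python. It uses only the rough numbers from this thread (about 12.8 tpmC and about 80 megs per warehouse) and assumes "scale factor" means warehouse count, as the numbers above imply. The function names are made up for illustration, and the TPC-C spec remains the authority on the exact limits.

```python
import math
from fractions import Fraction

# Rough figures quoted in this thread, NOT exact values from the TPC-C spec.
TPMC_PER_WAREHOUSE = Fraction("12.8")  # approximate tpmC cap per warehouse
MB_PER_WAREHOUSE = 80                  # approximate unreplicated data per warehouse


def max_tpmc(warehouses: int) -> int:
    """Largest tpmC you could report at a given warehouse count."""
    return math.floor(warehouses * TPMC_PER_WAREHOUSE)


def min_warehouses(target_tpmc: int) -> int:
    """Fewest warehouses needed before a target tpmC is within the cap."""
    return math.ceil(target_tpmc / TPMC_PER_WAREHOUSE)


def min_data_gb(target_tpmc: int, replication: int = 1) -> float:
    """Approximate working set in GB (1 GB = 1000 MB) for a target tpmC."""
    return min_warehouses(target_tpmc) * MB_PER_WAREHOUSE * replication / 1000


if __name__ == "__main__":
    print(max_tpmc(10_000))        # cap at scale factor 10,000
    print(min_warehouses(800_000)) # warehouses needed to claim 800,000 tpmC
    print(min_data_gb(100_000))    # data needed for 100,000 tpmC, unreplicated
```

Under these figures, scale factor 10,000 tops out at 128,000 tpmC, and claiming 800,000 tpmC would take at least 62,500 warehouses.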
But MemSQL says they didn't cheat, so what gives?
OK, so forget the ridiculously long bars that are completely off-spec. Instead, let's talk about why you put things in a database in the first place. Probably 'cause you like the data and you'd like to keep it around. So how does this little detail look to you?
@MemSQL made actually keeping the data around purely optional. I don't know about your experience with computers, but if you make the fsync part optional, this tends to end pretty badly. I dunno how else to put this, but this seems... pretty Mongo.
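If "the fsync part" is unfamiliar, here's a minimal sketch of the difference it makes. This is a generic Python illustration of a durable append, not MemSQL's or CockroachDB's actual code; `durable_append` is a hypothetical helper name. Without the `os.fsync` call, the bytes may still be sitting in the OS page cache when the machine loses power.

```python
import os


def durable_append(path: str, record: bytes) -> None:
    """Append a record and return only after asking the OS to flush it to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(record)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # The step that is "optional" in a non-durable configuration:
        # request that the file's data be flushed to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A database that acknowledges a commit before this flush completes is trading durability for speed, which is a legitimate choice only if it's disclosed next to the benchmark number.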
"But Arjun, they use synchronous replication! Isn't that sufficient?" I can imagine the retort. No. Because neither machine is required to write it down, and they don't replicate across datacenters (MemSQL only supports async cross-DC replication[1]).
[1] docs.memsql.com/operational-ma…
So you're necessarily in a single AZ deployment (MemSQL is incapable of HA across clusters), and nobody is writing anything to durable storage. I hope this data isn't important to you. Probably not, right? I mean, you're only transacting with it thousands of times a second...
Enough dunking on their TPC-C. On to TPC-H. This is the legacy data warehouse benchmark. It really tests your ability to do SQL query optimization, and you really need a columnar in-memory representation to squeeze the juice out of these queries. MemSQL does decently here, kudos!
But TPC-H is a bit of a legacy benchmark, known to be deceptively easy to optimize, because the data is simply not realistic enough[1]. TPC-DS is much better! And on that benchmark, MemSQL is beaten by ~2x perf by an unnamed 'Cloud DW' product.

[1] vldb.org/pvldb/vol9/p20…
But wait... what's the Y axis on their graph? "Sum of Query Runtimes"? According to MemSQL, they only report a 'power run' number.
Let's see what the TPC-DS spec has to say about the proper way to compare performance...
Hm, sure seems to be a lot more complicated! Let's head over to the 'The Making of TPC-DS' paper[1] to see what they have to say about power runs... Hmm.

[1] dl.acm.org/citation.cfm?i…
Listen, I get that benchmarking is really, really, hard. Which is why it's best approached with humility, and serious methodological rigor.

But if you're gonna throw cheap shade, you best not miss.
>> COMMIT RANT
UPDATE: Kudos to MemSQL for removing the comparison. Unlike the original snarky post and my ultra-snarky response, they've handled the follow-up with grace and humility and we should applaud them for it!