v Profile picture
Dec 29 27 tweets 8 min read Read on X
Collection of insane and fun facts about SQLite. Let's go!

SQLite is the most deployed and most used database. There are over one trillion (1000000000000 or a million million) SQLite databases in active use.

It is maintained by three people. They don't allow outside contributions.
SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild. It's everywhere! Image
It is also probably one of the top five most deployed software modules! Image
Hwaci is the company behind SQLite - hwaci.comImage
SQLite originated from a US warship. Dr. Richard Hipp (DRH) was building software for the USS Oscar Austin, a Navy destroyer. The existing software would just stop working whenever the server went down (this was in the 2000s). For a battleship, this was unacceptable.

So DRH asked the question: what if the database just worked without any server? This was an innovative idea back then.
SQLite is not open source in the strict legal sense, as "open source" has a specific legal definition and requires licenses approved by the Open Source Initiative (OSI).

Instead, SQLite is in the public domain, which means it has even fewer restrictions than any open source license.Image
They don't allow outside contributions. You *cannot* just send a pull request and hope the patch will be accepted. Image
Open Source, Not Open Contribution

Contributing to SQLite is invite-only (I don't have a source). Only after you are invited and have signed an affidavit dedicating your contribution to the public domain can you submit patches. Image
So, how do they cook?

There are over 600 lines of test code for every line of code in SQLite. Tests cover 100% of branches (and 100% MC/DC) in the library. The test suite is extremely diverse, including fuzz tests, boundary value tests, regression tests, and tests that simulate operating system crashes, power losses, I/O errors, and out-of-memory errors.
Interestingly, some SQLite tests are proprietary. The test suite called TH3 (Test Harness 3), which achieves 100% branch coverage of the code, is paid.

I don't know any other project which has made code free, but test suites are paid.
It's an interesting business model. Along with the paid test suite, they generate revenue through paid support, maintenance services, and commercial extensions.
SQLite does not have a Code of Conduct (CoC), rather Code of Ethics derived from "instruments of good works" from chapter 4 of The Rule of St. Benedict Image
Image
SQLite is so fast, they compete with `fopen`. For some use cases, you can use SQLite instead of a filesystem, that can be 35% faster. Image
Check this out, SQLite vs Redis (guess which is faster?)

But, unlike most databases, SQLite has a single writer model. You cannot have more than one concurrent writer.

This was also changed recently in 2010 by adding WAL mode. Before that, you could have either readers or a writer, but never together.
There are other things which are very common in other databases but not in SQLite:

- The default is rollback journal mode, which restricts you to have either multiple readers or a single writer
- Foreign Keys are disabled; they are opt-in
- Types are disabled; they are opt-in!
- Many of the ALTER commands you expect in other databases don't work. For example, you cannot rename a column
I hate that types are disabled. Not only that, it does not throw any error if you give some random type.

`CREATE TABLE t(value TIMMYSTAMP)`

There is no `TIMMYSTAMP` type, but SQLite accepts this happily.
SQLite takes backward compatibility very seriously

> All releases of SQLite version 3 can read and write database files created by the very first SQLite 3 release (version 3.0.0) going back to 2004-06-18. This is “backwards compatibility”. The developers promise to maintain backwards compatibility of the database file format for all future releases of SQLite 3.
But they take backward compatibility so seriously that even if they have shipped a bug, they won't fix it

SQLite's author Dr. Richard Hipp (DRH) did not find existing version control systems suitable. So he wrote his own called Fossil. Fossil is powered by SQLite, of course.

This reminds me of how Linus wrote Git.

DRH also wrote his own parser generator called Lemon.
DRH wrote the B-Tree based on the algorithm in the book TAOCP by Donald Knuth, coding it on a plane while traveling (super based)
SQLite is pronounced as "Ess-Cue-El-Lite". There is no official guideline though.
Here are the sources:

1. Most Deployed - sqlite.org/mostdeployed.h…

2. Open Source vs Public Domain opensource.org/blog/public-do…

3. Public Domain and Contributions sqlite.org/copyright.html

4. Testing - sqlite.org/testing.html

5. TH3 - sqlite.org/th3.html and sqlite.org/prosupport.html

6. Code of Ethics - sqlite.org/codeofethics.h…

7. Faster than FS - sqlite.org/fasterthanfs.h…
That's it for today! If I missed any, let me know.

Happy holidays and Happy New Year! 🎄 ☃️
here is the first tweet for you to retweet

This thread is doing numbers. I don’t have Soundcloud, but you can follow me on WhatsApp / Telegram for more serious database internals stuff

WhatsApp - whatsapp.com/channel/0029Va…

Telegram - t.me/databases_v

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with v

v Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @iavins

Dec 27
Did you know that if SQLite performs better, then the lifetime of your mobile increases?

But how?

Notes on my paper: SQL Statement Logging for Making SQLite Truly Lite SQL Statement Logging for Making SQLite Truly Lite
This paper appeared in VLDB, 2017 and I am reposting my old notes. I will also explain some necessary database internals.

tl;dr is instead of physical logging, they propose using logical logging to reduce write amplification

vldb.org/pvldb/vol11/p5…
First, let's understand the problem. The authors claim that by adopting a simple design, SQLite took a less complicated transaction mechanism, which causes excessive write amplification. SQLite has two kinds of transaction modes: rollback and WAL journal.

You may read here for detailed info on transactions and atomics:

- How SQLite Scales Read Concurrency - fly.io/blog/sqlite-in…
- Atomic Commit In SQLite - sqlite.org/atomiccommit.h…
Read 20 tweets
Dec 21
Some notes on how SQLite implements MVCC using WAL and provides Snapshot Isolation

Let's dive in SQLite DB and WAL mode
First, let's understand the transaction model of SQLite, which is quite different (and more limited) than other databases.

What SQLite offers:
- Concurrent transactions
- Multiple readers AND a writer simultaneously
- But no multiple concurrent writers
In other words, readers don't block the writer, and the writer doesn't block readers.

Fun fact: SQLite added WAL mode only in 2010. Until then, you could have either a single writer or multiple readers, but never both!

Now that we understand these semantics, let's discuss the implementation.
Read 11 tweets
Dec 15
Here's a fascinating story of how researchers used Bloom filters cleverly to make SQLite 10x faster.

I'll also explain some database internals and how databases implement joins.

Let's dive in! SQLite: Past, Present, and Future paper cover - https://www.vldb.org/pvldb/vol15/p3535-gaffney.pdf
I have posted about this paper earlier and also about filesystem-blob benchmarks. But let's discuss again for our new friends.

The results of this research have been applied to SQLite already and were released in v3.38.0.

This is going to be a long thread ☕️

Link to the paper: vldb.org/pvldb/vol15/p3…
Some technical details: SQLite is a B-tree on disk, row-based storage.

It internally uses a VM called VDBE to execute the queries. It is cross-platform, single-threaded, and runs almost everywhere.
Read 23 tweets
Nov 1
I watched @penberg 's talk "Patterns of Low Latency" at the P99 conference.

This talk is dense with practical insights and to the point. I was wondering why watch a talk on latency, but Pekka's background as a Linux Kernel contributor and his work on OSv and ScyllaDB made it worth hearing what he got to say.

Here are my notes.
The highest ROI tips from the talk:

- Disable Nagle's algorithm (stops TCP from buffering small packets, reduces cold start)
- Eliminate dynamic memory allocation (use static/pool allocation instead)

Here is the link to the talk:
What the talk covers?

• Why is latency important?
• Measuring latency
• Reducing latency
• Hiding latency
• Tuning the system
Read 16 tweets
Oct 29
if you discover a bug after shipping, would you leave it to maintain backward compatibility? how far would you go?

SQLite allows NULL values in primary key columns. This was an oversight by the developers, but since it was already shipped, they kept it to maintain backward compatibility. If you want standard behaviour, you need to explicitly add a NOT NULL constraint to the primary key column. In contrast, PostgreSQL automatically adds NOT NULL constraints to primary keys.

The bug lives on happily.NOT NULL is enforced on every column of the PRIMARY KEY in a WITHOUT ROWID table. This is in accordance with the SQL standard. Each column of a PRIMARY KEY is supposed to be individually NOT NULL. However, NOT NULL was not enforced on PRIMARY KEY columns by early versions of SQLite due to a bug. By the time that this bug was discovered, so many SQLite databases were already in circulation that the decision was made not to fix this bug for fear of breaking compatibility. So, ordinary rowid tables in SQLite violate the SQL standard and allow NULL values in PRIMARY KEY fields. But WITHOUT ROWI...
don't forget that there were applications "depended" on this bug

sqlite.org/quirks.html#pr…PRIMARY KEYs Can Sometimes Contain NULLs  A PRIMARY KEY in an SQLite table is usually just a UNIQUE constraint. Due to an historical oversight, the column values of a PRIMARY KEY are allowed to be NULL. This is a bug, but by the time the problem was discovered there where so many databases in circulation that depended on the bug that the decision was made to support the buggy behavior moving forward. You can work around this problem by adding a NOT NULL constraint on each column of the PRIMARY KEY.
Read 4 tweets
Jun 15
SQLite is cracked SQLite is cracked  Fun fact: SQLite is the most deployed and most used database. There are over one trillion (1000000000000 or a million million) SQLite databases in active use.  It is maintained by three people. They don’t allow outside contributions.  It is everywhere  SQLite is likely used more than all other database engines combined. Billions and billions of copies of SQLite exist in the wild. SQLite is in:  Every Mobile (Android, iOS, Windows) device Every Mac or Windows10 machine Every Web browser Every instance of Skype, iTunes, and Dropbox client PHP and Python Most TV, set-top cab...
I am building a community of cracked developers who are interested in database internals. Anon, are you in?

Telegram -

WhatsApp - t.me/databases_v
whatsapp.com/channel/0029Va…
Read 6 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Don't want to be a Premium member but still want to support us?

Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal

Or Donate anonymously using crypto!

Ethereum

0xfe58350B80634f60Fa6Dc149a72b4DFbc17D341E copy

Bitcoin

3ATGMxNzCUFzxpMCHL5sWSt4DVtS8UqXpi copy

Thank you for your support!

Follow Us!

:(