🧵 Today I decided to tackle a performance issue in our #MySQL Database.
We use timestamp columns for storing states (is paid, is invoiced, is shipped etc.) and clients would like to be able to filter by these states (see screenshot)
Thread below👇🏻
Despite having indexes on all 4 of our timestamp columns, this query can become incredibly slow as data grows. Screenshot from first tweet is an example on 1 million records.
In order to optimize this, we can add a indexed boolean state column that concats all the timestamps.
Notice the leading "1", this is simply to ensure no leading zeroes are stripped. Using a string column to combat this would result in worse performance, as querying integers is much faster.
This change means we can now query by the state column instead of the timestamp column.
What I am seeing is reducing query time by almost 90% (this will vary a lot depending on context and data)
—
What if I dont want to query by all the columns? Maybe ignore the shipped_at column?
This is where things get slightly tricky:
With a normal query, you just remove the "AND shipped_at IS NOT NULL". Done.
For this, you'd need to change your query to use "WHERE IN" and include both variations.
It would be fairly straightforward to add a scope to your #Laravel model to simplify usage.
Note: This mostly affects COUNT(*) queries (which are called if you use default Laravel Pagination, or any page based pagination really) The main select query will be same-is performance for both cases and wont gain same benefits.
🚨HOWEVER🚨
If you use LIMIT, the new state-based query will be faster again.
SELECT * ... LIMIT:
Timestamp based query: ~60ms🐢
State based query: ~20ms🚀
• • •
Missing some Tweet in this thread? You can try to
force a refresh