Ian Butler
Jun 5, 2022 · 9 tweets · 2 min read
Continuing my trend of writing about search and search engines, one question I've had is why aren't we making better use of the databases people are already storing their data in? #tweet100 #programming #search 🧵
Every search engine I know of builds its own index and requires you to take all of your data from your existing database and move it into the search engine. 1/
Most search engines are based on inverted indexes, which allow for very efficient word lookup within indexed documents, but many databases have long since included either the ability to create an inverted index or similar indexes that are good enough for most use cases… 2/
…such as trigram indexes. Plus there is a whole bunch of reranking and processing that occurs afterwards, which in my experience can contribute more query latency than those lookups. 3/
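
To make the inverted index idea concrete, here is a minimal sketch in Python. It assumes a toy whitespace tokenizer and AND semantics for multi-word queries; the function names and sample documents are made up for illustration, and real engines add analyzers, scoring, and compressed posting lists on top.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do far more."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical sample data, only for illustration.
docs = {
    1: "red running shoes",
    2: "blue running jacket",
    3: "red wool jacket",
}
index = build_inverted_index(docs)
print(search(index, "red jacket"))  # {3}
```

The lookup itself is a handful of set intersections, which is part of why the reranking and processing that follow can end up dominating query latency.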
Looking at general-purpose relational and NoSQL databases, we can see that MongoDB's Atlas cloud offering includes an inverted index and builds in text analyzers similar to Elasticsearch's. 4/
They're not as mature as Elastic's, but they do allow for more-than-good-enough text search in many use cases. Postgres provides trigram indexes (via the pg_trgm extension) and GIN (Generalized Inverted Index) indexes, and MySQL's InnoDB also has an inverted index for full-text search. 5/
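
For a feel of what trigram matching buys you, here is a rough Python sketch of trigram similarity. It is written in the spirit of what a trigram index supports, not as a copy of any database's exact algorithm: the padding scheme and the Jaccard-style score are simplifying assumptions, and the function names are hypothetical.

```python
def trigrams(text):
    """Set of 3-character substrings of the lowercased, space-padded text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

# Hypothetical catalog terms; rank them against a misspelled query.
candidates = ["jacket", "shoes", "socks"]
query = "jackte"
best = max(candidates, key=lambda word: similarity(query, word))
print(best)  # jacket
```

Because the score is based on overlapping character chunks instead of whole words, a typo like "jackte" still lands on "jacket", which is the kind of fuzzy matching that is good enough for a lot of product search.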
While I've yet to sit down and do a full timing comparison, my suspicion is that Lucene will be a bit faster any time it needs to hit the inverted index, by virtue of being highly optimized specifically for that use case, but the question we need to ask as engineers is by how… 6/
…large a margin, and how much of a difference it makes in the business domain we're building for. 7/
How important is that minimal matching difference if we can build an easier-to-use search engine that leverages existing DB technology and builds the result processing and analysis pipelines on top? 8/
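
As a sketch of what "processing on top of the existing database" could look like, here is a two-stage example in Python using the standard library's sqlite3 module. Everything specific is an assumption for illustration: the `products` table, the plain `LIKE` filter standing in for whatever text index your database offers, and the toy reranking rule.

```python
import sqlite3

def fetch_candidates(conn, term):
    """Stage 1: let the existing database narrow the set (plain LIKE here)."""
    rows = conn.execute(
        "SELECT id, title FROM products WHERE title LIKE ?",
        (f"%{term}%",),
    )
    return rows.fetchall()

def rerank(candidates, term):
    """Stage 2: app-side ordering; titles starting with the term first, then shorter titles."""
    term = term.lower()

    def key(row):
        _id, title = row
        return (not title.lower().startswith(term), len(title))

    return sorted(candidates, key=key)

# Hypothetical in-memory table standing in for an existing product database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO products (id, title) VALUES (?, ?)",
    [(1, "Trail running shoes"), (2, "Running jacket"), (3, "Wool socks")],
)
results = rerank(fetch_candidates(conn, "running"), "running")
print(results)  # [(2, 'Running jacket'), (1, 'Trail running shoes')]
```

The data never leaves the database it already lives in; only the candidate rows come back to the application, where the ranking and analysis logic can evolve independently.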

