Moosieus' Things n' Stuff

submitted by
Style Pass
2024-10-09 23:00:15

I'd like to start by not-so-briefly outlining the existing approaches to text search and their use-cases, primarily around Postgres.

They all have certain advantages when it comes to handling partial matches, misspellings, and phonetically similar names, but aren't suitable for searching longer text passages.
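To make that trade-off concrete, here is a minimal sketch of one technique of this kind: trigram similarity, the idea behind Postgres' pg_trgm extension. Choosing trigrams as the example is an assumption on my part, and the `trigrams` and `similarity` helpers below are hypothetical names that only approximate what pg_trgm does, so treat the scores as illustrative rather than as what Postgres would return.

```python
import re


def trigrams(text: str) -> set[str]:
    """Collect the 3-character windows of each word in the text.

    Each word is lowercased and padded with two leading spaces and one
    trailing space, roughly as pg_trgm does (a simplification).
    """
    grams: set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# A misspelled name still shares most of its trigrams with the original.
name_score = similarity("John Smith", "Jon Smith")

# A single word compared against a whole passage scores poorly, even
# though the passage contains that exact word.
passage = (
    "Postgres ships with a full text search feature that tokenizes "
    "documents into lexemes and ranks them against a query"
)
passage_score = similarity("postgres", passage)
```

The misspelled name scores well because only a couple of trigrams differ. The passage scores poorly even though it contains the search word, because every other word in the passage adds trigrams that dilute the overlap. That dilution is why this style of matching suits names and short strings rather than longer text.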

Postgres' Full Text Search is best characterized by its convenience-to-capability ratio. It accommodates basic document retrieval without the need to synchronize or Extract-Transform-Load (ETL) your data to another service, but lags dedicated search engines in features and in the quality of its result ranking.
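As a rough mental model of what "basic document retrieval" means here, the sketch below mimics the shape of a full text match in plain Python: documents and queries are reduced to normalized terms, and a document matches when it contains every query term. In Postgres the real work is done by `to_tsvector`, `plainto_tsquery`, and the `@@` operator, which also apply stemming and language-specific dictionaries. The `to_terms` and `matches` helpers and the tiny stop-word list are assumptions made for illustration, not Postgres behavior.

```python
import re

# Tiny illustrative stop-word list; Postgres uses per-language dictionaries.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


def to_terms(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, drop stop words (no stemming)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def matches(document: str, query: str) -> bool:
    """True when every query term appears in the document (AND semantics)."""
    query_terms = to_terms(query)
    return bool(query_terms) and query_terms <= to_terms(document)


docs = [
    "Postgres is a relational database",
    "Tantivy is a search library written in Rust",
    "Full text search in Postgres needs no extra service",
]

# Only the third document contains both "postgres" and "search".
hits = [d for d in docs if matches(d, "postgres search")]
```

Note what the sketch leaves out: there is no notion of which matching document is the better one. Postgres does offer `ts_rank` for ordering, but relevance ranking is where the built-in search is weakest.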

Incumbent offerings in the search engine space include Apache Solr, Elasticsearch, and more recently OpenSearch. All of these are built on top of Apache Lucene and the JVM. A newer wave of search engines written in Rust has also come about. Two prominent examples are Meilisearch (focused on simplicity) and Quickwit (built on Tantivy, focused on logs and object storage backing).

ParadeDB is a set of extensions that add pretty amazing search and analytics features to Postgres. In particular, ParadeDB embeds Tantivy inside Postgres as an extension built with pgrx, a framework for writing Postgres extensions in Rust.