Supercharge vector search with ColBERT rerank in PostgreSQL

Traditional vector search methods typically employ sentence embeddings to locate similar content. However, generating a sentence embedding by pooling token embeddings can sacrifice fine-grained detail present at the token level. ColBERT avoids this by representing text as token-level multi-vectors rather than a single aggregated vector. This approach, contextual late interaction at the token level, lets ColBERT retain more nuanced information and improve search accuracy compared to methods that rely solely on sentence embeddings.

ColBERT encodes each document and query into a list of token vectors and computes the MaxSim score between them at query time.
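To make the late-interaction scoring concrete, here is a minimal MaxSim sketch in Python with NumPy. The function name and tensor shapes are illustrative assumptions, not taken from the post: each query token vector is matched against its best document token, and the per-token maxima are summed.

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """ColBERT-style MaxSim (illustrative sketch).

    query_vecs: (n_query_tokens, dim) with L2-normalized rows
    doc_vecs:   (n_doc_tokens, dim)   with L2-normalized rows
    """
    # With normalized rows, these dot products are cosine similarities.
    sim = query_vecs @ doc_vecs.T        # (n_query_tokens, n_doc_tokens)
    # Best-matching document token for each query token, summed over the query.
    return float(sim.max(axis=1).sum())
```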

Token-level late interaction demands more compute and storage than single-vector search: a document stored as hundreds of token vectors occupies hundreds of times the space of one pooled vector, and scoring compares every query token against every document token. This makes ColBERT search over large datasets challenging, especially when low latency matters.

One possible solution is to combine sentence-level vector search with a token-level late-interaction rerank: an approximate vector index cheaply narrows the corpus to a small candidate set, and multi-vector MaxSim scoring then reranks those candidates with higher quality, as sketched below.
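Below is a hedged sketch of that two-stage pipeline in Python. The `documents` table and its columns, the `load_token_vecs` helper, and the use of psycopg with pgvector's `<=>` cosine-distance operator are all assumptions for illustration; the post does not prescribe this schema. It reuses `maxsim_score` from the sketch above.

```python
import numpy as np
import psycopg  # assumption: psycopg 3 driver, pgvector extension installed

def rerank_search(conn, query_vec, query_token_vecs, top_k=100, final_k=10):
    """Two-stage retrieval: approximate sentence-vector search, then an
    exact ColBERT MaxSim rerank over the small candidate set."""
    # Stage 1: fast approximate search over pooled sentence embeddings.
    # Hypothetical schema: documents(id, content, embedding vector(...)).
    vec_literal = str(np.asarray(query_vec, dtype=float).tolist())  # '[0.1, ...]'
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, content FROM documents "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (vec_literal, top_k),
        )
        candidates = cur.fetchall()

    # Stage 2: token-level MaxSim rerank, paid only for the top_k candidates.
    # load_token_vecs(doc_id) is a hypothetical helper returning that
    # document's stored token vectors as an ndarray.
    scored = sorted(
        ((maxsim_score(query_token_vecs, load_token_vecs(doc_id)), doc_id, content)
         for doc_id, content in candidates),
        key=lambda t: t[0],
        reverse=True,
    )
    return scored[:final_k]
```

Tuning `top_k` trades recall for latency: a larger candidate set gives the reranker more chances to recover documents the pooled embedding missed, at the cost of more MaxSim computations.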
