Operationalizing Vector Databases on Postgres

Submitted by
Style Pass
2024-04-16 13:00:08

Why do we need vector databases? The proliferation of embeddings immediately created the need to efficiently store, index, and search these arrays of floats. However, these steps are only a small piece of the overall technology stack required to make use of embeddings. Transforming source data into embeddings, and serving the transformer models that do so, is often left to the application developer. If that developer is part of a large organization, they might have a machine learning or data engineering team to help them. Either way, generating embeddings is not a one-time task but a lifecycle that needs to be maintained. Every search request must itself be transformed into an embedding, and inevitably new source data is generated or existing data is updated, requiring embeddings to be recomputed.
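To make that lifecycle concrete, here is a minimal in-memory sketch in Python. It is not how any particular vector database works: `toy_embed` is a hypothetical stand-in for a real transformer model (it only hashes words into buckets), and `TinyVectorStore`, `upsert`, and `search` are names invented for this illustration. The point is only that an embedding is computed on every write and again on every search request.

```python
import hashlib
import math

DIM = 64  # toy dimensionality, chosen only for this illustration


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a transformer model.

    Hashes each word into one of DIM buckets and L2-normalizes the counts.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class TinyVectorStore:
    """Illustrative store that keeps source text and its embedding together."""

    def __init__(self) -> None:
        self.rows: dict[str, tuple[str, list[float]]] = {}

    def upsert(self, doc_id: str, text: str) -> None:
        # New or updated source data always triggers a re-compute.
        self.rows[doc_id] = (text, toy_embed(text))

    def search(self, query: str, k: int = 1) -> list[str]:
        # The query must be embedded on every request.
        query_vec = toy_embed(query)
        # Non-empty texts yield unit vectors, so the dot product is the
        # cosine similarity.
        ranked = sorted(
            self.rows.items(),
            key=lambda item: sum(a * b for a, b in zip(query_vec, item[1][1])),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:k]]


store = TinyVectorStore()
store.upsert("doc-1", "postgres vector index")
store.upsert("doc-1", "postgres vector index with updated text")  # re-embedded
nearest = store.search("vector index", k=1)
```

In a real deployment the model call inside `toy_embed` is the expensive part, which is why where and when it runs matters as much as the storage and indexing.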

Traditionally, machine learning projects have two distinct phases: training and inference. In training, a model is generated from a historical dataset. The data fed into model training are called features, and they typically undergo transformations first.
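As a small illustration of a feature transformation, the sketch below uses standardization, which is an assumption picked for the example and not something the text prescribes. The helper names `fit_scaler` and `transform` are hypothetical. The parameters are learned once from historical data during training and then reused unchanged at inference.

```python
from statistics import mean, pstdev


def fit_scaler(values: list[float]) -> tuple[float, float]:
    """Training phase: learn the mean and standard deviation of a feature."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero for constant features
    return mu, sigma


def transform(value: float, scaler: tuple[float, float]) -> float:
    """Apply the transformation learned during training to a single value."""
    mu, sigma = scaler
    return (value - mu) / sigma


# Training: fit on the historical dataset and transform it.
historical = [1.0, 2.0, 3.0]
scaler = fit_scaler(historical)
train_features = [transform(v, scaler) for v in historical]

# Inference: a new value is transformed with the stored parameters,
# not with statistics re-computed from the new data.
new_feature = transform(2.5, scaler)
```

Keeping the fitted parameters alongside the model is what lets training and inference see features on the same scale.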
