Using Vectors without a Vector Database

submitted by
Style Pass
2024-04-25 11:30:04

“You Do Not Need a Vector Database” is the provocative title of a recent blog post (with code) by Dr. Yucheng Low, co-founder of XetHub. In this post, I’ll explain what a vector database is, why Dr. Low says you don’t need one, and provide some context for Dr. Low’s answer.

To make sense of Dr. Low’s article, it’s important to understand why vectors of numbers have become such an important tool for search systems and why that popularity can translate into unnecessary expense for organizations that deploy vector-based systems. 

The word “vector” in computer programming means “a sequence of numbers.” For example, this vector represents the location of the White House in Washington, DC:

Both of these vectors represent a point in some mathematical space. The first vector represents the location of the White House in a 2-dimensional space where the first dimension is the latitude and the second dimension is the longitude. The second vector represents the price, the range gained from a 15-minute charge, the total range, and the number of motors of an electric vehicle (in this case, a Tesla Model 3).
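To make the idea concrete, here is a minimal Python sketch of two such vectors as plain lists. The White House coordinates are approximate decimal degrees. The electric-vehicle numbers are placeholders I made up for illustration, not actual Tesla Model 3 figures, and the names `white_house` and `example_ev` are hypothetical.

```python
# Approximate latitude and longitude of the White House, in decimal degrees.
white_house = [38.8977, -77.0365]

# Hypothetical electric-vehicle vector. The values are illustrative placeholders,
# NOT real specifications. Element order:
#   price (USD), range gained from a 15-minute charge (miles),
#   total range (miles), number of motors
example_ev = [40000.0, 150.0, 300.0, 1.0]

def dimensions(vector):
    """Return the number of dimensions, which is just the vector's length."""
    return len(vector)

print(dimensions(white_house))  # 2
print(dimensions(example_ev))   # 4
```

The meaning of each position is a convention that lives outside the vector itself: nothing in the list says that the first number is a latitude or a price.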

For the first few decades of text processing, one common way to represent the contents of a document was with a vector where each number represented the number of times a specific word appeared in the text: the first element in the vector might count the word “the” (the most common word), the second “be” (the second most common word), and so on.
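As a rough sketch of that word-count representation, the Python below counts how often each word of a small fixed vocabulary appears in a document. The function name `count_vector`, the five-word vocabulary, and the sample sentence are my own assumptions for illustration. The sketch also matches exact word forms only, so “be” does not count “is” or “was”.

```python
import re
from collections import Counter

def count_vector(text, vocabulary):
    """Return one count per vocabulary word, in vocabulary order."""
    # Lowercase the text and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Counter returns 0 for words that never appear in the text.
    return [counts[word] for word in vocabulary]

# Assumed vocabulary: a handful of very common English words.
vocabulary = ["the", "be", "to", "of", "and"]
document = "The cat sat on the mat, and the dog wanted to be on the mat too."

vector = count_vector(document, vocabulary)
print(vector)  # [4, 1, 1, 0, 1]
```

A real vocabulary would have one position for every distinct word being tracked, so these vectors are typically very long and mostly zeros.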
