AI Explained · Infra

What is a vector database?

A vector database stores embeddings and finds the closest ones to a query, fast, at scale. It's the storage-and-search layer that makes semantic search and RAG practical.

What it’s for

Once you turn content into embeddings, you need somewhere to keep those vectors and a way to ask “which stored items are most similar to this one?” A vector database is built for exactly that: store millions of vectors, and given a query vector, return its nearest neighbours in milliseconds. It’s the engine under semantic search, recommendations, and RAG.

Why a regular database isn’t enough

A normal database is great at exact matches and ranges, such as WHERE price < 50. But embeddings aren’t looked up by equality; you want the closest vectors by meaning, across a huge, high-dimensional space. Comparing a query against every stored vector (exact search) is accurate, but the work grows linearly with the number of vectors, so it becomes slow at scale. Vector databases solve this with specialized indexes that make similarity search fast.
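To make the cost of exact search concrete, here is a minimal sketch in plain Python. The function names and the tiny three-dimensional vectors are illustrative assumptions; real embeddings typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def exact_search(query, vectors, k=3):
    """Score the query against every stored vector, then keep the best k."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Hypothetical toy data standing in for stored embeddings.
vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
nearest = exact_search([1.0, 0.05, 0.0], vectors, k=2)
```

Every query touches every vector, so doubling the collection roughly doubles the work per query. That linear scan is what an index is meant to avoid.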

How it works

  • Store each item as { id, vector, metadata }: the embedding plus fields you can filter on (author, date, type).
  • Index the vectors with an ANN (Approximate Nearest Neighbour) algorithm, commonly HNSW (a navigable graph) or IVF (an inverted file index that groups vectors into clusters). ANN trades a tiny bit of accuracy for a huge speed-up, which is what makes million-scale search feel instant.
  • Query by embedding your input the same way, then asking the index for the top-k nearest by cosine similarity (or dot product / Euclidean distance).
  • Filter with metadata (for example, “closest chunks from these docs, since January”) to combine semantic and structured constraints.
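The steps above can be sketched as a toy in-memory store. Everything here is hypothetical: `Record`, `ToyVectorStore`, `upsert`, and `query` are illustrative names, not the API of any real product, and the store scans linearly where a real one would use an ANN index.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Record:
    id: str
    vector: list
    metadata: dict = field(default_factory=dict)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    """Hypothetical in-memory store; a linear scan stands in for the index."""

    def __init__(self):
        self.records = []

    def upsert(self, record):
        # Replace any existing record with the same id, then add the new one.
        self.records = [r for r in self.records if r.id != record.id]
        self.records.append(record)

    def query(self, vector, k=3, where=None):
        # Apply the metadata predicate first, then rank survivors by similarity.
        candidates = [r for r in self.records if where is None or where(r.metadata)]
        candidates.sort(key=lambda r: cosine(vector, r.vector), reverse=True)
        return [r.id for r in candidates[:k]]

store = ToyVectorStore()
store.upsert(Record("a", [1.0, 0.0], {"doc": "handbook", "year": 2024}))
store.upsert(Record("b", [0.9, 0.1], {"doc": "handbook", "year": 2023}))
store.upsert(Record("c", [0.0, 1.0], {"doc": "blog", "year": 2024}))

unfiltered = store.query([1.0, 0.0], k=2)
filtered = store.query(
    [1.0, 0.0],
    k=2,
    where=lambda m: m["doc"] == "handbook" and m["year"] >= 2024,
)
```

This sketch filters before ranking. Real systems differ in whether they filter before, during, or after the index lookup, and that choice affects both speed and how many results come back.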

Where it fits in RAG

In a RAG pipeline, the vector database is the retrieval step. You index your chunks once; at query time you embed the question, pull the top matching chunks from the vector store, and hand them to the model. Better recall here often matters more than a bigger model: if the right chunk isn’t retrieved, the model never sees it.
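A rough sketch of that retrieval step follows. `toy_embed` is a bag-of-words stand-in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names; the sample chunks are invented for illustration.

```python
import math
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a sparse bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, index, k=2):
    """Embed the question the same way as the chunks, then take the top k."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, chunks):
    """Assemble the retrieved chunks into context for the model."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]
index = [(c, toy_embed(c)) for c in chunks]  # index once, up front

question = "How many days do refunds take?"
top = retrieve(question, index, k=1)
prompt = build_prompt(question, top)  # this string is what the model would see
```

The model only ever sees `prompt`, so whatever `retrieve` leaves out is invisible to it.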

Your options

There are dedicated vector databases and extensions that add vector search to a database you already run (for example, the pgvector extension for PostgreSQL, or vector features in Redis and Elasticsearch). For a small project, an extension on your existing database is often enough; at large scale or with heavy filtering, a purpose-built vector database earns its keep. Evaluate current options against your scale, latency, and filtering needs rather than a remembered shortlist.

FAQ

Do I actually need a vector database?

For a handful of vectors, no: you can compare them in memory. Once you have thousands to millions of chunks and need fast filtered search, a vector index is what keeps it usable.

What does “approximate” nearest neighbour mean?

Exact search checks every vector and is slow at scale. ANN indexes return nearly the same neighbours far faster, with recall high enough that the occasional miss rarely matters.
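The trade-off can be seen in a toy IVF-style index. This is a simplified sketch under stated assumptions: the centroids are fixed by hand (a real IVF index would typically learn them, for example with k-means), the data is random 2-D points, and `ToyIVF` and `nprobe` are illustrative names.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def exact_top_k(query, vectors, k):
    """Ground truth: rank every vector by distance (ties broken by index)."""
    return sorted(range(len(vectors)), key=lambda i: (sq_dist(query, vectors[i]), i))[:k]

class ToyIVF:
    def __init__(self, vectors, centroids):
        self.vectors = vectors
        self.centroids = centroids
        self.lists = [[] for _ in centroids]
        # Assign each vector to the list of its nearest centroid.
        for i, v in enumerate(vectors):
            nearest = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            self.lists[nearest].append(i)

    def search(self, query, k, nprobe):
        # Scan only the nprobe clusters whose centroids are closest to the query.
        order = sorted(range(len(self.centroids)), key=lambda c: sq_dist(query, self.centroids[c]))
        candidates = [i for c in order[:nprobe] for i in self.lists[c]]
        return sorted(candidates, key=lambda i: (sq_dist(query, self.vectors[i]), i))[:k]

rng = random.Random(0)  # fixed seed so the run is repeatable
vectors = [[rng.random(), rng.random()] for _ in range(1000)]
centroids = [[x, y] for x in (0.25, 0.75) for y in (0.25, 0.75)]  # hand-picked
index = ToyIVF(vectors, centroids)

query = [0.5, 0.5]  # deliberately on the boundary between clusters
truth = exact_top_k(query, vectors, k=10)
approx = index.search(query, k=10, nprobe=1)
recall = len(set(truth) & set(approx)) / len(truth)
full = index.search(query, k=10, nprobe=len(centroids))
```

With `nprobe=1` only about a quarter of the vectors are scanned, and neighbours that fall in other clusters are missed. Probing every cluster recovers the exact result at the full cost. Tuning between those two extremes is the accuracy-for-speed dial that ANN indexes expose.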

Is hybrid search worth it?

Often, yes. Combining keyword (exact-term) and vector (semantic) search catches both the precise matches and the meaning-based ones, which neither does alone.
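One common way to merge the two result lists is reciprocal rank fusion (RRF). It is shown here as an illustrative choice, not as the method any particular database uses. The document ids are made up, and the constant 60 is a conventional default rather than a requirement.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: each list adds 1 / (k + rank) to an id's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a keyword search and a vector search.
keyword_hits = ["doc_sku", "doc_both", "doc_other"]
vector_hits = ["doc_meaning", "doc_both", "doc_sku"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Items that appear in both lists rise to the top, while items found by only one method still make the final ranking. RRF uses only rank positions, so the keyword and vector scores never need to be put on a common scale.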

Related

It stores embeddings and powers RAG. More in AI Explained.