AI Explained · Concept

What are embeddings?

An embedding turns a piece of content into a list of numbers that captures its meaning, so that content with similar meaning sits close together. That one idea powers search, RAG, and recommendations.

The core idea

An embedding is a vector — a fixed-length list of numbers — that represents the meaning of a piece of content. An embedding model reads your text (or image, or audio) and outputs something like 768 or 1,536 numbers. The numbers themselves aren’t meant to be read; what matters is position: content with similar meaning produces vectors that land close together, and unrelated content lands far apart. “Dog” and “puppy” are neighbors; “dog” and “quarterly tax form” are nowhere near each other. Meaning becomes geometry.
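
To make "meaning becomes geometry" concrete, here is a minimal sketch. The four-number vectors are made up by hand for illustration; a real embedding model would output hundreds or thousands of learned values, and the name toy_vectors is an assumption, not part of any library.

```python
import math

# Hand-made 4-number "embeddings" for illustration only. A real model
# would produce hundreds or thousands of learned values per item.
toy_vectors = {
    "dog": [0.90, 0.80, 0.10, 0.00],
    "puppy": [0.85, 0.90, 0.15, 0.05],
    "quarterly tax form": [0.05, 0.00, 0.90, 0.95],
}


def distance(a: str, b: str) -> float:
    """Straight-line (Euclidean) distance between two toy vectors."""
    return math.dist(toy_vectors[a], toy_vectors[b])


# Similar meanings should give a small distance, unrelated ones a large one.
print(f"dog vs puppy: {distance('dog', 'puppy'):.3f}")
print(f"dog vs quarterly tax form: {distance('dog', 'quarterly tax form'):.3f}")
```

With these invented numbers, "dog" lands much closer to "puppy" than to "quarterly tax form", which is the property a trained model is meant to produce on real content.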

Why that’s powerful

Keyword search matches characters: search “car” and you miss “automobile,” “vehicle,” and “sedan.” Embeddings match meaning, so a search for “car” can surface all of them. That’s semantic search. The same trick underlies a lot of modern AI plumbing:

  • Semantic search — find by what you meant, not the exact words.
  • RAG — retrieve the document chunks most relevant to a question and feed them to the model.
  • Recommendations — surface items similar to what someone liked.
  • Clustering & classification — group similar content, or label it.
  • Deduplication — spot near-identical text even when the wording differs.
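
The difference between keyword and semantic search can be sketched in a few lines. Everything here is an illustrative assumption: the three-number vectors are chosen by hand so that vehicle words point the same way, QUERY_VECTOR stands in for what a model might return for "car", and the 0.9 threshold is arbitrary.

```python
import math

# Toy 3-number vectors chosen by hand; they are assumptions for
# illustration, not the output of any real embedding model.
DOCS = {
    "automobile": [0.90, 0.10, 0.00],
    "sedan": [0.80, 0.20, 0.10],
    "vehicle": [0.85, 0.15, 0.05],
    "banana bread recipe": [0.00, 0.10, 0.95],
}
QUERY_TEXT = "car"
QUERY_VECTOR = [0.88, 0.12, 0.02]  # hypothetical embedding of "car"


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def keyword_search(query, docs):
    """Match on characters: keep texts that literally contain the query."""
    return [text for text in docs if query in text]


def semantic_search(query_vector, docs, threshold=0.9):
    """Match on meaning: keep texts whose vector points the same way."""
    return [
        text for text, vec in docs.items()
        if cosine(query_vector, vec) >= threshold
    ]


print("keyword:", keyword_search(QUERY_TEXT, DOCS))
print("semantic:", semantic_search(QUERY_VECTOR, DOCS))
```

The keyword search comes back empty because none of the texts contain the characters "car", while the vector comparison keeps the three vehicle terms and drops the recipe.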

How similarity is measured

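Once content is vectors, “similar” becomes a distance you can compute. The most common measure is cosine similarity: the cosine of the angle between two vectors, where 1.0 means they point in the same direction, 0 means they are unrelated (at right angles), and -1.0 means they point in opposite directions. Dot product and Euclidean distance are also used; for vectors normalized to length 1, dot product and cosine similarity give the same value. To find the best matches for a query, you embed the query the same way and look for its nearest neighbors in the vector space.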
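
The three measures are short enough to write out by hand. This is a plain-Python sketch for reading, not a tuned implementation; in practice a numerical library would usually do this work, and the function names are chosen here for illustration.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length of a vector."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ranges from -1.0 to 1.0."""
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a, b):
    """Straight-line distance; 0.0 means the two points coincide."""
    return math.dist(a, b)


def normalize(a):
    """Scale a vector to length 1 without changing its direction."""
    n = norm(a)
    return [x / n for x in a]


same_direction = cosine_similarity([1, 0], [3, 0])
right_angle = cosine_similarity([1, 0], [0, 2])
opposite = cosine_similarity([1, 0], [-2, 0])
print(same_direction, right_angle, opposite)

# For unit-length vectors, the dot product and cosine similarity agree.
u, v = normalize([3, 4]), normalize([4, 3])
print(math.isclose(dot(u, v), cosine_similarity([3, 4], [4, 3])))
```

Note that cosine similarity ignores vector length, so [1, 0] and [3, 0] count as identical in direction even though their Euclidean distance is not zero.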

Where the vectors live

For a handful of items you can compare vectors in memory. At scale, with thousands or millions of chunks, you store them in a vector database built for fast approximate nearest-neighbor search, which gives up a little exactness in exchange for much faster lookups. Indexing once and querying many times is exactly the pattern RAG uses.
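
For the small, in-memory case, the index-once, query-many pattern might look like the sketch below. It is a brute-force, exact search that scores every stored vector, so it is not the approximate nearest-neighbor indexing a vector database provides. The class name InMemoryIndex and the item ids are hypothetical, and the vectors are assumed to come from the same embedding model.

```python
import heapq
import math


class InMemoryIndex:
    """Brute-force exact nearest-neighbor search over unit-length vectors."""

    def __init__(self, dimensions):
        self.dimensions = dimensions
        self._items = []  # list of (item_id, unit_vector)

    def _unit(self, vector):
        """Validate the length and scale the vector to length 1."""
        if len(vector) != self.dimensions:
            raise ValueError(
                f"expected {self.dimensions} numbers, got {len(vector)}"
            )
        length = math.sqrt(sum(x * x for x in vector))
        if length == 0:
            raise ValueError("cannot normalize a zero vector")
        return [x / length for x in vector]

    def add(self, item_id, vector):
        """Index step: normalize once and store."""
        self._items.append((item_id, self._unit(vector)))

    def search(self, query_vector, k=3):
        """Query step: return the k best (item_id, cosine score) pairs."""
        q = self._unit(query_vector)
        scored = (
            (sum(a * b for a, b in zip(q, vec)), item_id)
            for item_id, vec in self._items
        )
        return [(item_id, score) for score, item_id in heapq.nlargest(k, scored)]


index = InMemoryIndex(dimensions=3)
index.add("dog-care", [0.90, 0.80, 0.10])        # toy vectors, made up
index.add("puppy-training", [0.85, 0.90, 0.15])
index.add("tax-form", [0.05, 0.00, 0.90])
print(index.search([0.90, 0.85, 0.10], k=2))
```

Because every vector is normalized when it is added, each query only needs a dot product per stored item. That cost grows linearly with the number of items, which is the point at which a dedicated vector database starts to pay off.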

Practical things that trip people up

  • Use the same model for indexing and querying. Vectors from different embedding models aren’t comparable — they live in different spaces.
  • Dimensions are a tradeoff. More dimensions can capture more nuance but cost more to store and search; pick what your use case needs.
  • Chunking matters. Embedding a whole document blurs its meaning into one vector; embedding well-sized passages keeps retrieval sharp.
  • Embeddings aren’t the model’s “knowledge.” They’re a representation of a piece of content for comparison, not a store of facts the model reasons over.
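
The chunking point can be sketched with a simple word-based splitter. This is one possible approach under stated assumptions: the chunk_size and overlap defaults are arbitrary, and real pipelines often split on sentences, headings, or token counts instead of whitespace-separated words.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word-based chunks; neighbors share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text, so the
        # final words are not repeated in an extra trailing chunk.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(12))
for chunk in chunk_words(sample, chunk_size=5, overlap=2):
    print(chunk)
```

Each chunk would then be embedded on its own, so a query can match the one passage that answers it instead of a blurred average of the whole document. The overlap keeps a sentence that straddles a boundary from being cut off from its context.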

FAQ

What is a “dimension” in an embedding?

Each number in the vector is one dimension. A 1,536-dimensional embedding is a point in 1,536-dimensional space — impossible to picture, but the math of “how close are these two points” works exactly like it does in 2D or 3D.

Can I compare embeddings from two different models?

No. Each model defines its own space, so a vector from model A is meaningless next to one from model B. Always embed everything you’ll compare with the same model.
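
One way to guard against mixing models by accident is to store the model name next to each vector and refuse to compare across models. This is a hypothetical sketch: TaggedEmbedding, safe_cosine, and the labels "model-a" and "model-b" are invented for illustration and do not come from any particular library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedEmbedding:
    model: str     # hypothetical model label, e.g. "model-a"
    vector: tuple  # the embedding values


def safe_cosine(a: TaggedEmbedding, b: TaggedEmbedding) -> float:
    """Cosine similarity that refuses to compare across models."""
    if a.model != b.model:
        raise ValueError(
            f"cannot compare vectors from {a.model!r} and {b.model!r}"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("vectors have different lengths")
    dot = sum(x * y for x, y in zip(a.vector, b.vector))
    return dot / (math.hypot(*a.vector) * math.hypot(*b.vector))


doc = TaggedEmbedding("model-a", (0.9, 0.8, 0.1))
query = TaggedEmbedding("model-a", (0.85, 0.9, 0.15))
print(f"{safe_cosine(doc, query):.3f}")
```

The check matters because two models can output vectors of the same length, in which case the arithmetic would run without error and return a number that means nothing.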

Are embeddings only for text?

No — images, audio, and code can all be embedded, and multimodal models can put text and images in the same space, so you can search images with a text query.

Related

Embeddings power RAG and fill the context window. More in AI Explained.