INTERACTIVE COURSES

Searching by meaning: vector databases and retrieval

An accessible university-level course on semantic search. We start from the core limitation of keyword search, which ignores meaning, and rebuild search around meaning: represent a text as a vector, measure proximity, index millions of vectors without comparing them all, make the index durable, combine it with lexical search, and finally feed a language model. From the geometry of embeddings to RAG.

  1. 00
    Foreword
    Why search by meaning, what this course covers, and how to read it.
    8 min
  2. 01
    Embeddings and the geometry of similarity
    Meaning as a position in space, and three ways to measure how close two meanings are.
    26 min
  3. 02
    Exact search and the curse of dimensionality
    Comparing every vector gives the perfect answer. Here is its price, and the trap that high dimensionality sets for our intuition.
    28 min
  4. 03
    HNSW: navigating a proximity graph
    What if finding the nearest neighbor became a short stroll of a few hops, instead of a scan across millions of vectors?
    30 min
  5. 04
    The landscape of ANN indexes
    Four index families and three assets you can never have all at once: how do you choose among recall, speed, and memory?
    28 min
  6. 05
    Testing the approximate: the differential oracle
    An index can pass every test, two reviews, and still return bad results. How do you catch an algorithm that lies?
    26 min