
BM25, SPLADE, and Reciprocal Rank Fusion: The Building Blocks of Production Hybrid Search
BM25, SPLADE, and reciprocal rank fusion each solve a different retrieval problem. Here's how the three combine into a production hybrid search system.
Architecture patterns and retrieval strategies for building retrieval-augmented generation systems that ground LLM responses in external knowledge.
Each topic below is a key concept in this domain. Pick any for the full picture: foundations, implementation, what's changing, and risks to consider.
Hybrid search combines two ways of finding documents: dense vector search, which matches by meaning, and sparse keyword …
Retrieval-Augmented Generation (RAG) is an architecture pattern that connects a large language model to an external …
MONA's articles build your mental model — how things work, why they work that way, and what intuition to develop.
Updated Apr 29, 2026
Concepts covered

Hybrid search fuses BM25 keyword retrieval with dense vector search using reciprocal rank fusion. Why two ranked lists beat either alone in RAG pipelines.

A typical RAG pipeline runs five components: chunker, embedder, vector store, retriever, reranker. Here is what each one does and where each one breaks.

Hybrid search merges BM25 and vector results, but the fusion step has hard limits. Score mismatch, RRF blindness, and tuning hell — explained.

Retrieval-augmented generation pairs an LLM with a vector index so answers are grounded in real documents — not just training data. The mechanism, explained.

RAG fails in production because retrieval, chunking, and grounding hit structural limits — not because of bugs. Why correct retrieval still hallucinates.
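
Several of the summaries above refer to reciprocal rank fusion as the step that merges BM25 and dense results. As a rough illustration only, here is a minimal sketch of that fusion step. It assumes each retriever returns an ordered list of document IDs and uses the commonly cited smoothing constant k = 60. The function name, the document IDs, and the sample rankings are hypothetical and do not come from the articles listed here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list where it appears,
    with rank starting at 1. Scores are summed across lists.
    k = 60 is a conventional default, assumed here rather than tuned.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword retriever and a dense retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Only rank positions enter the calculation, so the raw BM25 and cosine scores never need to share a scale. The same property means the fusion cannot tell a near-tie from a wide score gap within one list.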