Weaviate
Weaviate is an open-source vector database, written in Go, that stores embeddings alongside their original objects and supports hybrid search out of the box, blending dense vector similarity with BM25 keyword scoring through a tunable alpha parameter. That combination makes it a popular choice for retrieval-augmented generation, semantic search, and AI agent memory.
What It Is
Weaviate is an open-source vector database that stores objects alongside their embeddings, indexes them for fast similarity search, and lets you query vectors and keywords through the same API. For teams building retrieval-augmented generation (RAG) pipelines or AI agents, this is the layer that decides which chunk of context the model actually sees when answering a question.
The database was designed around hybrid search from the start. Most search systems force a choice: dense vector retrieval (good at meaning, weak on exact terms like product codes or error messages) or BM25 keyword retrieval (good at exact terms, weak on paraphrasing). Weaviate runs both at once and fuses the two rankings into one ordered list. The alpha parameter on each query controls the balance — closer to 1 means trust embeddings, closer to 0 means trust keywords.
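In practice that knob is a single query argument. Here is a minimal sketch using the v4 Python client, assuming a local instance and an existing collection named Docs (both placeholders):

```python
import weaviate
from weaviate.classes.query import HybridFusion

# Assumes a local instance and an existing "Docs" collection (placeholders).
client = weaviate.connect_to_local()
docs = client.collections.get("Docs")

# alpha=0.5 weights the two rankings equally;
# 1.0 is pure vector search, 0.0 pure BM25.
response = docs.query.hybrid(
    query="timeout error ABC-274",
    alpha=0.5,
    fusion_type=HybridFusion.RELATIVE_SCORE,
    limit=5,
)

for obj in response.objects:
    print(obj.properties)

client.close()
```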
Internally, data is organised into collections, each defined by a schema. Every collection carries a vector index (HNSW by default) and an inverted index for BM25, so you don’t run two stores in parallel. According to Weaviate Docs, the system supports two fusion methods: rankedFusion, which uses a reciprocal-rank-style formula of 1/(rank + 60), and relativeScoreFusion, which min-max normalises the raw scores before weighting them; the docs note that relativeScoreFusion has been the default since version 1.24.
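To make the two formulas concrete, here is a standalone, simplified sketch of each fusion step. This is an illustration of the maths, not Weaviate's actual implementation, and it ignores edge cases such as ties:

```python
def ranked_fusion(vec_rank: int, bm25_rank: int, alpha: float = 0.5, k: int = 60) -> float:
    """Reciprocal-rank style: each list contributes 1/(rank + k), blended by alpha."""
    return alpha * (1 / (vec_rank + k)) + (1 - alpha) * (1 / (bm25_rank + k))

def relative_score_fusion(vec: dict, bm25: dict, alpha: float = 0.5) -> dict:
    """Min-max normalise each raw score set to [0, 1], then blend by alpha.
    Inputs map doc_id -> raw score from each search; missing docs score 0."""
    def norm(scores: dict) -> dict:
        lo, hi = min(scores.values()), max(scores.values())
        return {d: (s - lo) / (hi - lo) if hi > lo else 1.0 for d, s in scores.items()}
    v, b = norm(vec), norm(bm25)
    return {d: alpha * v.get(d, 0.0) + (1 - alpha) * b.get(d, 0.0) for d in set(v) | set(b)}
```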
On top of the core database, Weaviate has been adding features aimed at AI agents — multi-tenancy, generative-search modules that wire LLMs into the query path, and integrations with coding assistants. According to GlobeNewswire, the team launched Agent Skills in February 2026, an open-source repository of templates that let assistants like Claude Code, Cursor, and Copilot work against a Weaviate instance through a shared protocol.
How It’s Used in Practice
The mainstream encounter is through a RAG pipeline. A team builds a chatbot that answers questions over internal docs: support tickets, product manuals, contracts. The docs are split into chunks, run through an embedding model, and stored in Weaviate. When a user asks a question, the query gets embedded, Weaviate runs a hybrid search that combines vector similarity with keyword matching, and the top results are passed into the LLM as context. The hybrid step is what stops the chatbot from missing tickets that mention a literal SKU like “ABC-274” or the exact error code in a stack trace — pure vector search drops those.
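A compressed sketch of that pipeline with the v4 Python client. The embed() helper and the SupportDocs collection are stand-ins for whatever embedding model and schema you actually use:

```python
import weaviate
from weaviate.classes.config import Configure, DataType, Property

def embed(text: str) -> list[float]:
    """Hypothetical stand-in: replace with a real embedding model call."""
    return [float(len(text) % 7)] * 384  # placeholder vector, not meaningful

client = weaviate.connect_to_local()

# One collection holds both the raw chunks and their vectors.
client.collections.create(
    "SupportDocs",
    properties=[Property(name="text", data_type=DataType.TEXT)],
    vectorizer_config=Configure.Vectorizer.none(),  # we supply vectors ourselves
)
docs = client.collections.get("SupportDocs")

# Ingest: chunk, embed, insert.
for chunk in ["Reset steps for error ABC-274 ...", "Warranty terms ..."]:
    docs.data.insert(properties={"text": chunk}, vector=embed(chunk))

# Query: hybrid search selects the context the LLM will see.
question = "How do I fix error ABC-274?"
hits = docs.query.hybrid(query=question, vector=embed(question), alpha=0.5, limit=4)
context = "\n\n".join(o.properties["text"] for o in hits.objects)
# ... pass `context` plus `question` to the LLM of your choice.

client.close()
```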
A second use is agent memory. Long-running AI agents need to recall past actions, retrieved documents, and user preferences across sessions; rather than replaying everything into the prompt, the agent writes observations into a collection and pulls the relevant ones back by hybrid search each turn, with multi-tenancy keeping one user’s memory out of another’s. According to GlobeNewswire, the Agent Skills release in 2026 makes this kind of integration the default deployment shape rather than a side experiment.
Pro Tip: Start with relativeScoreFusion and an alpha of 0.5, then look at queries where users complain. If exact-term searches are dropping out, push alpha down to around 0.3. If users ask conceptual questions and get nothing relevant, push it up to 0.7. Tune per query type, not globally.
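One lightweight way to tune per query type is a routing rule in front of the query call. The heuristic below is hypothetical, not anything Weaviate ships:

```python
import re

# Hypothetical heuristic: queries containing identifier-like tokens
# (SKUs, hex codes) lean on BM25; everything else leans on embeddings.
ID_PATTERN = re.compile(r"\b[A-Z]{2,}-\d+\b|\b0x[0-9a-fA-F]+\b")

def choose_alpha(query: str) -> float:
    return 0.3 if ID_PATTERN.search(query) else 0.7

# response = docs.query.hybrid(query=q, alpha=choose_alpha(q), limit=5)
```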
When to Use / When Not
| Scenario | Verdict |
|---|---|
| Building a RAG chatbot over mixed text and structured metadata | ✅ Use |
| Storing only numeric vectors with no need for keyword search | ❌ Avoid |
| Running multi-tenant agent memory where each customer needs isolation | ✅ Use |
| Replacing a transactional database for billing or order management | ❌ Avoid |
| Hybrid search over technical docs with exact identifiers and error codes | ✅ Use |
| Tiny prototype with under a thousand records and no plan to scale | ❌ Avoid |
Common Misconception
Myth: Weaviate is just another vector database, so any of them will work.
Reality: Most vector databases bolt keyword search on later or treat it as a separate index. Weaviate was built around hybrid search and exposes the fusion algorithm and alpha parameter as first-class query knobs. If your retrieval problem mixes semantic meaning with exact-term matching — which describes most real RAG workloads — that design choice changes the failure modes you’ll hit in production.
One Sentence to Remember
Weaviate’s value isn’t that it stores vectors — most databases can do that now — but that it treats hybrid retrieval as the default, giving you a single tunable knob to balance meaning against exact terms in production RAG and agent workloads.
FAQ
Q: Is Weaviate free to use? A: The Weaviate core is open-source under the BSD-3-Clause license and runs locally with no cost. According to Weaviate Pricing, the managed Weaviate Cloud offers a time-limited free Sandbox plus tiered paid plans.
Q: How is Weaviate different from Pinecone or Qdrant? A: All three store vectors, but Weaviate’s hybrid search is native rather than added on, with two fusion algorithms exposed in the query API. Pinecone is fully managed only; Qdrant has hybrid features but a different architecture.
Q: Does Weaviate support SPLADE or other learned sparse models? A: According to Weaviate Docs, BM25 is the built-in sparse algorithm. SPLADE-style learned sparse retrievers are not natively supported, but you can store SPLADE token weights in custom properties and index them yourself.
Sources
- Weaviate Docs: Hybrid search, concepts & API. Official documentation for Weaviate’s hybrid search, fusion algorithms, and alpha parameter behaviour.
- Weaviate GitHub: weaviate/weaviate releases. Source code, BSD-3-Clause license, and current version history of the open-source core.
Expert Takes
The principle here is that lexical search and dense retrieval surface different signals — exact tokens versus learned semantics. Fusing the two rankings produces results that survive queries neither method handles alone. Weaviate exposes that combination as one tunable parameter rather than two stitched-together systems. Not a clever optimisation. A direct response to how queries actually behave in production.
Treat the alpha parameter as a contract between query intent and ranking behaviour. Set it high and you trust embeddings; set it low and you trust keywords. The default leans toward dense search, which is fine until a user types an exact product code and gets nothing back. The fix is to expose alpha in your retrieval spec, document the chosen value, and route per-query overrides through the same configuration layer the rest of your stack reads from.
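One hypothetical shape for that contract, with alpha as a documented, overridable field instead of a constant buried in query code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievalSpec:
    """Single source of truth for ranking behaviour; hypothetical shape."""
    alpha: float = 0.5          # 1.0 = trust embeddings, 0.0 = trust keywords
    limit: int = 5
    fusion: str = "relativeScoreFusion"

DEFAULT_SPEC = RetrievalSpec()
OVERRIDES = {"sku_lookup": RetrievalSpec(alpha=0.3)}  # per-query-type overrides

def spec_for(query_type: str) -> RetrievalSpec:
    return OVERRIDES.get(query_type, DEFAULT_SPEC)
```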
Vector databases used to compete on raw nearest-neighbour speed. That race is over. The new fight is whether your store can run hybrid queries, agent memory, and tenant isolation without bolting on three other services. Weaviate moved early, kept the open-source core, and built a managed tier for teams that don’t want to operate clusters themselves. You’re either picking a database that fits agent-era workloads, or you’re picking infrastructure that won’t survive your next architecture review.
Hybrid search hides a lot of decisions. The fusion algorithm, the alpha value, the tokenizer, the embedding model — every one of them shapes which documents a user is allowed to see and which ones quietly disappear. Who reviews those defaults? Whose answer gets ranked last because the keyword score collapsed? When the database becomes the primary memory of an AI agent, the boring infrastructure choices become the most consequential ones in the system.