FAISS vs. ScaNN vs. USearch on ANN-Benchmarks: The Similarity Search Library Race in 2026

TL;DR
- The shift: The similarity search library race has split into two competitions — raw throughput on single nodes versus billion-scale search on constrained infrastructure.
- Why it matters: Your choice of ANN library now determines not just speed, but what scale you can reach without re-architecting.
- What’s next: GPU-accelerated FAISS and disk-based DiskANN are pulling away at opposite ends of the scale spectrum.
Six months ago, picking an ANN library was a benchmarking exercise. Recall at 0.99, queries per second, done. That era is over. The libraries that power similarity search just forked into two separate races — and the winners in each lane are building moats the others can’t cross.
The Race That Split in Two
Thesis: The ANN library competition is no longer a single race — it’s two, and optimizing for the wrong one is a strategic dead end.
For years, every ANN library competed on one playing field: recall versus throughput on glove-100-angular. One leaderboard, everything directly comparable.
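That single-leaderboard framing reduces to one number: recall@k against exact ground truth. Here is a minimal numpy sketch of how that metric is computed — the dataset sizes, dimensions, and the random 90% subsample standing in for an approximate index are all illustrative assumptions, not ANN-Benchmarks code:

```python
import numpy as np

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 100)).astype("float32")  # "database" vectors (assumed)
xq = rng.standard_normal((20, 100)).astype("float32")    # query vectors (assumed)
k = 10

def top_k_angular(queries, base, k):
    # Angular similarity = cosine similarity = dot product of unit-norm vectors
    qn = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    bn = base / np.linalg.norm(base, axis=1, keepdims=True)
    return np.argsort(-(qn @ bn.T), axis=1)[:, :k]

truth = top_k_angular(xq, xb, k)                # exact ground truth via brute force
# Stand-in for an approximate index: search only a random 90% of the base set
keep = np.where(rng.random(len(xb)) < 0.9)[0]
approx = keep[top_k_angular(xq, xb[keep], k)]   # map results back to original ids

# recall@k: fraction of true neighbors the approximate search recovered
recall_at_k = np.mean([len(set(t) & set(a)) / k for t, a in zip(truth, approx)])
print(f"recall@{k} = {recall_at_k:.2f}")
```

ANN-Benchmarks plots exactly this recall figure against queries per second; every library picks a point on that trade-off curve via its index parameters.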
That framing broke.
GPU acceleration matured from experiment to production default. FAISS integrated NVIDIA’s cuVS, delivering 12.3x faster index builds and 8.1x lower search latency at 95% recall (Meta Engineering). It’s a lane change.
Disk-based ANN search crossed the viability threshold simultaneously. DiskANN indexes over one billion vectors at 95% recall with sub-10ms latency — using 90% less memory than in-memory HNSW (Microsoft Research).
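The “90% less memory” claim is easy to sanity-check with back-of-the-envelope arithmetic. A sketch, assuming 100-dimensional float32 vectors and roughly 32 graph neighbors per node — both illustrative assumptions, not Microsoft’s published configuration:

```python
n, d = 1_000_000_000, 100          # one billion vectors, 100 dims (assumed)
vector_bytes = n * d * 4           # float32 payload
graph_bytes = n * 32 * 4           # ~32 neighbor ids per node, 4 bytes each (assumed)

in_memory_gb = (vector_bytes + graph_bytes) / 1e9   # all-in-RAM HNSW footprint
disk_resident_gb = in_memory_gb * 0.10              # keep ~10% hot, rest on NVMe

print(f"in-RAM HNSW: ~{in_memory_gb:.0f} GB, disk-resident: ~{disk_resident_gb:.0f} GB")
```

Even under these rough assumptions, the in-memory design needs a machine with hundreds of gigabytes of RAM, while the disk-resident design fits on commodity hardware with a fast NVMe drive — which is the whole strategic point.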
One race rewards raw speed. The other rewards scale under constraint. They share a name. They no longer share a strategy.
Three Libraries, Three Divergent Bets
As of the last full run in April 2025, Glass, HNSW via nmslib, Milvus Knowhere, and QSG-NGT dominate the ANN-Benchmarks Pareto frontier on glove-100-angular (ANN-Benchmarks). Those rankings may have shifted since. But the most consequential moves happened off that leaderboard.
FAISS pushed GPU-first. Version 1.14.1 shipped March 2026, building on cuVS. Meta stopped competing on CPU benchmarks.
USearch went opposite — radical simplicity. Roughly 3,000 lines of code versus FAISS’s 84,000, claiming 10x faster HNSW search (USearch GitHub). Self-reported, not independently verified at scale. But ClickHouse, DuckDB, ScyllaDB, and YugabyteDB embedded it anyway.
DiskANN bet on the memory wall: what happens when your embedding index doesn’t fit in RAM? Microsoft shipped it as a public preview inside SQL Server 2025 (Microsoft Blog) — Vamana graph algorithm, NVMe-backed. Not a research project. A database feature.
Compatibility note:
- DiskANN: An active Rust rewrite is underway; the C++ codebase on the cpp_main branch is no longer the primary development branch. Pin to v0.49.1 for stability.
ScaNN decoupled from TensorFlow in v1.4.0 — a necessary survival move. Last release: August 2025. In a race where FAISS ships monthly, eight months of silence is a signal.
Who Gains Ground
Teams with GPU infrastructure. FAISS-cuVS is a force multiplier for organizations with NVIDIA hardware. The throughput ceiling moved dramatically upward.
Microsoft’s database customers. DiskANN in SQL Server 2025 collapses an entire architectural layer — billion-scale vector search without a separate stack.
Lightweight-stack builders. USearch embedded in databases rather than deployed standalone — vector search as a feature, not a platform.
Spotify showed where this leads. The company moved from Annoy to its own HNSW-based Voyager — 10x faster search, 4x less memory, hundreds of millions of daily queries powering Discover Weekly (Spotify Engineering). Scale forces specialization.
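The dot-product matching behind a system like Voyager reduces, conceptually, to a top-k scan over item embeddings. A minimal numpy sketch — the catalog size, dimensionality, and normalization choice are assumptions, and Voyager’s actual HNSW traversal avoids this exhaustive scan entirely:

```python
import numpy as np

rng = np.random.default_rng(7)
items = rng.standard_normal((50_000, 64)).astype("float32")  # track embeddings (assumed)
items /= np.linalg.norm(items, axis=1, keepdims=True)        # unit norm: dot == cosine

def recommend(user_vec, catalog, k=5):
    # Exhaustive dot-product scan; an HNSW index answers the same
    # query in sublinear time, which is what makes billions of daily
    # lookups affordable.
    scores = catalog @ (user_vec / np.linalg.norm(user_vec))
    top = np.argpartition(-scores, k)[:k]        # unordered top-k candidates
    return top[np.argsort(-scores[top])]         # top-k item ids, best first

user = rng.standard_normal(64).astype("float32")
print(recommend(user, items))
```

The brute-force version is exact but linear in catalog size; HNSW trades a sliver of recall for orders-of-magnitude less work per query — the same trade-off every library in this article is built around.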
Who Gets Left Behind
One-size-fits-all library strategies. If your team picked an ANN library three years ago and hasn’t reassessed, you’re optimizing for a race that already ended.
CPU-only architectures at scale. The GPU acceleration gap compounds with every FAISS release. Teams without GPU access are falling further behind.
Libraries with stalled momentum. ScaNN’s eight-month release gap raises questions about Google’s investment in the standalone library. The code works. The trajectory is unclear.
The ANN-Benchmarks leaderboard itself — last full run April 2025. Decisions based solely on those plots are built on stale data.
What Happens Next
Base case (most likely): The two-lane split deepens. FAISS dominates GPU search, DiskANN owns billion-scale, USearch takes the embedded niche. ScaNN holds existing users without expanding. Signal to watch: ANN-Benchmarks adds a GPU track or billion-scale dataset. Timeline: 6-12 months.
Bull case: A unified framework abstracts across GPU, CPU, and disk search — switch backends without re-indexing. DiskANN’s Rust rewrite becomes the foundation. Signal: DiskANN Rust hits stable release with GPU support. Timeline: 12-18 months.
Bear case: Fragmentation accelerates. Migration costs become prohibitive. Teams lock into whichever bet they made first. Signal: Cloud providers ship proprietary ANN implementations bypassing open-source libraries. Timeline: 18-24 months.
Frequently Asked Questions
Q: How does Spotify use approximate nearest neighbor algorithms to power music and podcast recommendations? A: Spotify built Voyager, a custom HNSW-based library replacing its earlier Annoy system. Voyager processes hundreds of millions of daily queries with 10x faster search and 4x less memory, driving features like Discover Weekly through real-time dot-product similarity matching.
Q: Which similarity search libraries lead ANN-Benchmarks in recall and throughput in 2026? A: As of the April 2025 full run, Glass, HNSW via nmslib, Milvus Knowhere, and QSG-NGT lead the Pareto frontier on glove-100-angular. FAISS’s GPU gains and DiskANN’s billion-scale results fall outside this benchmark’s current scope.
Q: How are GPU-accelerated FAISS and disk-based DiskANN reshaping billion-scale vector search in 2026? A: FAISS’s cuVS integration delivers over 12x faster GPU index builds. DiskANN indexes more than a billion vectors at high recall with sub-10ms latency using far less memory than in-memory HNSW. Together, they’re splitting the market into speed-first and scale-first lanes.
The Bottom Line
The similarity search algorithms market isn’t consolidating — it’s diverging. GPU-first, disk-first, and embedded-first are three distinct strategies with three distinct winners. Pick the lane that matches your infrastructure reality, not last year’s benchmark chart. You’re either choosing deliberately or you’re drifting into a dead end.
AI-assisted content, human-reviewed. Images AI-generated.