FAISS vs. ScaNN vs. USearch on ANN-Benchmarks: The Similarity Search Library Race in 2026

TL;DR
- The shift: The similarity search library race has split into two competitions — raw throughput on single nodes versus billion-scale search on constrained infrastructure.
- Why it matters: Your choice of ANN library now determines not just speed, but what scale you can reach without re-architecting.
- What’s next: GPU-accelerated FAISS and disk-based DiskANN are pulling away at opposite ends of the scale spectrum.
Six months ago, picking an ANN library was a benchmarking exercise. Recall at 0.99, queries per second, done. That era is over. The libraries that power similarity search just forked into two separate races — and the winners in each lane are building moats the others can’t cross.
The Race That Split in Two
Thesis: The ANN library competition is no longer a single race — it’s two, and optimizing for the wrong one is a strategic dead end.
For years, every ANN library competed on one playing field: recall versus throughput on glove-100-angular. One leaderboard, everything directly comparable.
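That single-leaderboard framing reduces to one number: recall@k against exact ground truth. Here is a minimal numpy sketch of how that metric is computed — the dataset sizes, dimensions, and the random 90% subsample standing in for an approximate index are all illustrative assumptions, not ANN-Benchmarks code:

```python
import numpy as np

rng = np.random.default_rng(0)
xb = rng.standard_normal((1000, 100)).astype("float32")  # "database" vectors (assumed)
xq = rng.standard_normal((20, 100)).astype("float32")    # query vectors (assumed)
k = 10

def top_k_angular(queries, base, k):
    # Angular similarity = cosine similarity = dot product of unit-norm vectors
    qn = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    bn = base / np.linalg.norm(base, axis=1, keepdims=True)
    return np.argsort(-(qn @ bn.T), axis=1)[:, :k]

truth = top_k_angular(xq, xb, k)                # exact ground truth via brute force
# Stand-in for an approximate index: search only a random 90% of the base set
keep = np.where(rng.random(len(xb)) < 0.9)[0]
approx = keep[top_k_angular(xq, xb[keep], k)]   # map results back to original ids

# recall@k: fraction of true neighbors the approximate search recovered
recall_at_k = np.mean([len(set(t) & set(a)) / k for t, a in zip(truth, approx)])
print(f"recall@{k} = {recall_at_k:.2f}")
```

ANN-Benchmarks plots exactly this recall figure against queries per second; every library picks a point on that trade-off curve via its index parameters.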
That framing broke.
GPU acceleration matured from experiment to production default. FAISS integrated NVIDIA’s cuVS, delivering 12.3x faster index builds and 8.1x lower search latency at 95% recall (Meta Engineering). It’s a lane change.
Disk-based ANN search crossed the viability threshold simultaneously. DiskANN indexes over one billion vectors at 95% recall with sub-10ms latency — using 90% less memory than in-memory HNSW (Microsoft Research).
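The “90% less memory” claim is easy to sanity-check with back-of-the-envelope arithmetic. A sketch, assuming 100-dimensional float32 vectors and roughly 32 graph neighbors per node — both illustrative assumptions, not Microsoft’s published configuration:

```python
n, d = 1_000_000_000, 100          # one billion vectors, 100 dims (assumed)
vector_bytes = n * d * 4           # float32 payload
graph_bytes = n * 32 * 4           # ~32 neighbor ids per node, 4 bytes each (assumed)

in_memory_gb = (vector_bytes + graph_bytes) / 1e9   # all-in-RAM HNSW footprint
disk_resident_gb = in_memory_gb * 0.10              # keep ~10% hot, rest on NVMe

print(f"in-RAM HNSW: ~{in_memory_gb:.0f} GB, disk-resident: ~{disk_resident_gb:.0f} GB")
```

Even under these rough assumptions, the in-memory design needs a machine with hundreds of gigabytes of RAM, while the disk-resident design fits on commodity hardware with a fast NVMe drive — which is the whole strategic point.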
One race rewards raw speed. The other rewards scale under constraint. They share a name. They no longer share a strategy.
Three Libraries, Three Divergent Bets
As of the last full run in April 2025, Glass, HNSW via nmslib, Milvus Knowhere, and QSG-NGT dominate the ANN-Benchmarks Pareto frontier on glove-100-angular (ANN-Benchmarks). Those rankings may have shifted since. But the most consequential moves happened off that leaderboard.
FAISS pushed GPU-first. Version 1.14.1 shipped March 2026, building on cuVS. Meta stopped competing on CPU benchmarks.
USearch went opposite — radical simplicity. Roughly 3,000 lines of code versus FAISS’s 84,000, claiming 10x faster HNSW search (USearch GitHub). Self-reported, not independently verified at scale. But ClickHouse, DuckDB, ScyllaDB, and YugabyteDB embedded it anyway.
DiskANN bet on the memory wall: what happens when your embedding index doesn’t fit in RAM? Microsoft shipped it as a public preview inside SQL Server 2025 (Microsoft Blog) — Vamana graph algorithm, NVMe-backed. Not a research project. A database feature.
Compatibility note:
- DiskANN: An active Rust rewrite is underway; the C++ codebase on the cpp_main branch is no longer the primary development branch. Pin to v0.49.1 for stability.
ScaNN decoupled from TensorFlow in v1.4.0 — a necessary survival move. Last release: August 2025. In a race where FAISS ships monthly, eight months of silence is a signal.
Who Gains Ground
Teams with GPU infrastructure. FAISS-cuVS is a force multiplier for organizations with NVIDIA hardware. The throughput ceiling moved dramatically upward.
Microsoft’s database customers. DiskANN in SQL Server 2025 collapses an entire architectural layer — billion-scale vector search without a separate stack.
Lightweight-stack builders. USearch embedded in databases rather than deployed standalone — vector search as a feature, not a platform.
Spotify showed where this leads. The company moved from Annoy to its own HNSW-based Voyager — 10x faster search, 4x less memory, hundreds of millions of daily queries powering Discover Weekly (Spotify Engineering). Scale forces specialization.
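The dot-product matching behind a system like Voyager reduces, conceptually, to a top-k scan over item embeddings. A minimal numpy sketch — the catalog size, dimensionality, and normalization choice are assumptions, and Voyager’s actual HNSW traversal avoids this exhaustive scan entirely:

```python
import numpy as np

rng = np.random.default_rng(7)
items = rng.standard_normal((50_000, 64)).astype("float32")  # track embeddings (assumed)
items /= np.linalg.norm(items, axis=1, keepdims=True)        # unit norm: dot == cosine

def recommend(user_vec, catalog, k=5):
    # Exhaustive dot-product scan; an HNSW index answers the same
    # query in sublinear time, which is what makes billions of daily
    # lookups affordable.
    scores = catalog @ (user_vec / np.linalg.norm(user_vec))
    top = np.argpartition(-scores, k)[:k]        # unordered top-k candidates
    return top[np.argsort(-scores[top])]         # top-k item ids, best first

user = rng.standard_normal(64).astype("float32")
print(recommend(user, items))
```

The brute-force version is exact but linear in catalog size; HNSW trades a sliver of recall for orders-of-magnitude less work per query — the same trade-off every library in this article is built around.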
Who Gets Left Behind
One-size-fits-all library strategies. If your team picked an ANN library three years ago and hasn’t reassessed, you’re optimizing for a race that already ended.
CPU-only architectures at scale. The GPU acceleration gap compounds with every FAISS release. Teams without GPU access are falling further behind.
Libraries with stalled momentum. ScaNN’s eight-month release gap raises questions about Google’s investment in the standalone library. The code works. The trajectory is unclear.
The ANN-Benchmarks leaderboard itself — last full run April 2025. Decisions based solely on those plots are built on stale data.
What Happens Next
Base case (most likely): The two-lane split deepens. FAISS dominates GPU search, DiskANN owns billion-scale, USearch takes the embedded niche. ScaNN holds existing users without expanding. Signal to watch: ANN-Benchmarks adds a GPU track or billion-scale dataset. Timeline: 6-12 months.
Bull case: A unified framework abstracts across GPU, CPU, and disk search — switch backends without re-indexing. DiskANN’s Rust rewrite becomes the foundation. Signal: DiskANN Rust hits stable release with GPU support. Timeline: 12-18 months.
Bear case: Fragmentation accelerates. Migration costs become prohibitive. Teams lock into whichever bet they made first. Signal: Cloud providers ship proprietary ANN implementations bypassing open-source libraries. Timeline: 18-24 months.
Frequently Asked Questions
Q: How does Spotify use approximate nearest neighbor algorithms to power music and podcast recommendations? A: Spotify built Voyager, a custom HNSW-based library replacing its earlier Annoy system. Voyager processes hundreds of millions of daily queries with 10x faster search and 4x less memory, driving features like Discover Weekly through real-time dot-product similarity matching.
Q: Which similarity search libraries lead ANN-Benchmarks in recall and throughput in 2026? A: As of the April 2025 full run, Glass, HNSW via nmslib, Milvus Knowhere, and QSG-NGT lead the Pareto frontier on glove-100-angular. FAISS’s GPU gains and DiskANN’s billion-scale results fall outside this benchmark’s current scope.
Q: How are GPU-accelerated FAISS and disk-based DiskANN reshaping billion-scale vector search in 2026? A: FAISS’s cuVS integration delivers over 12x faster GPU index builds. DiskANN indexes more than a billion vectors at high recall with sub-10ms latency using far less memory than in-memory HNSW. Together, they’re splitting the market into speed-first and scale-first lanes.
The Bottom Line
The similarity search algorithms market isn’t consolidating — it’s diverging. GPU-first, disk-first, and embedded-first are three distinct strategies with three distinct winners. Pick the lane that matches your infrastructure reality, not last year’s benchmark chart. You’re either choosing deliberately or you’re drifting into a dead end.
AI-assisted content, human-reviewed. Images AI-generated.