Microsoft GraphRAG

Also known as: GraphRAG, Microsoft GraphRAG library, graph-based RAG

Microsoft GraphRAG is an open-source, modular, graph-based RAG system that uses LLM extraction to build a knowledge graph from source documents, runs hierarchical Leiden community detection, and pre-computes community summaries to support both global query-focused summarization and local entity-anchored search. In plain terms: it builds a knowledge graph from your documents, then uses community detection and pre-computed summaries to answer broad questions traditional RAG cannot.

What It Is

Plain RAG was built for one kind of question: “find me the passage that answers this.” It splits documents into chunks, embeds them, retrieves the closest few, and stitches them into a prompt. That works when the answer lives in a single passage. It fails when the question is broader — “what are the main themes across this dataset?” or “how do these ideas connect across the whole corpus?” — because no single chunk holds the answer. Microsoft GraphRAG was designed to handle exactly that class of question.
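The chunk-and-retrieve loop described above can be sketched in a few lines. The scoring function here is a deliberately crude word-overlap stand-in for a real embedding model, just to make the pattern runnable; the chunks and query are invented examples.

```python
def tokens(text):
    # Normalize a string into a set of lowercase words, punctuation stripped.
    return {w.strip(".,?!").lower() for w in text.split()}

def score(query, chunk):
    # Toy relevance: word overlap stands in for embedding similarity.
    return len(tokens(query) & tokens(chunk))

def retrieve(query, chunks, k=2):
    # Classic RAG retrieval: rank every chunk against the query, keep the top k.
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

chunks = [
    "The merger closed in March after regulatory approval.",
    "Quarterly revenue grew eight percent year over year.",
    "Main themes span the whole corpus, not any single passage.",
]
best = retrieve("When did the merger close?", chunks, k=1)
print(best[0])  # the merger chunk wins on overlap
```

A pointed question like this retrieves well; a corpus-wide question like “what are the main themes?” cannot be answered by any top-k list of chunks, which is the gap GraphRAG targets.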

Instead of treating documents as isolated chunks, GraphRAG runs an LLM over every chunk to extract entities and relationships, building a knowledge graph where nodes are concepts (people, companies, ideas, events) and edges describe how they relate. The graph is then partitioned without any further LLM calls: according to Microsoft GraphRAG concepts, the Leiden algorithm, a hierarchical, recursive method, groups densely connected nodes into communities at multiple resolutions, so you get coarse top-level themes and fine-grained sub-clusters inside the same structure.
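The extract-then-cluster step can be sketched with a toy graph. GraphRAG itself runs hierarchical Leiden; the sketch below substitutes networkx's Louvain method, a close relative of Leiden, as a stand-in, and the triples are hypothetical LLM extractions, not real GraphRAG output.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Hypothetical (entity, relation, entity) triples an LLM might extract.
triples = [
    ("Acme Corp", "acquired", "BetaSoft"),
    ("Acme Corp", "led_by", "J. Rivera"),
    ("BetaSoft", "founded_by", "J. Rivera"),
    ("Gamma Labs", "partnered_with", "Delta AI"),
    ("Gamma Labs", "published", "Graph Paper"),
    ("Delta AI", "cited", "Graph Paper"),
]

# Build the knowledge graph: nodes are entities, edges carry the relation.
G = nx.Graph()
for head, relation, tail in triples:
    G.add_edge(head, tail, relation=relation)

# GraphRAG uses hierarchical Leiden; Louvain is a close relative and serves
# as a stand-in here. The seed fixes the partition for reproducibility.
communities = louvain_communities(G, seed=42)
for i, members in enumerate(communities):
    print(i, sorted(members))
```

On this toy graph the two densely connected clusters (the Acme/BetaSoft circle and the Gamma/Delta circle) come back as separate communities, each of which GraphRAG would then summarize.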

Once the graph and communities exist, the system pre-computes a written summary for each community. At query time, two retrieval modes become possible. Local search starts from a specific entity and walks outward through its neighbours — useful for “tell me about X.” Global search reads the community summaries themselves and asks the model to combine them — useful for “what are the patterns across everything?” The graph and summaries are built once during indexing; queries are cheap, but indexing is not. According to Microsoft’s GraphRAG repository, the current release adds an --estimate-cost CLI flag so teams can preview indexing spend before committing.
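The two retrieval modes can be made concrete with a schematic sketch. The graph, summaries, and entity names below are invented, and joining summary strings stands in for the real system's step of handing them to an LLM to combine.

```python
from collections import deque

# Toy index: adjacency from the extracted graph, plus pre-computed
# community summaries (in the real system these are LLM-written).
graph = {
    "Acme Corp": ["BetaSoft", "J. Rivera"],
    "BetaSoft": ["Acme Corp", "J. Rivera"],
    "J. Rivera": ["Acme Corp", "BetaSoft"],
    "Gamma Labs": ["Delta AI"],
    "Delta AI": ["Gamma Labs"],
}
summaries = {
    "c0": "A corporate cluster around Acme Corp's acquisition of BetaSoft.",
    "c1": "A research partnership between Gamma Labs and Delta AI.",
}

def local_search(entity, hops=2):
    # "Tell me about X": walk outward from one entity, collecting its
    # neighbourhood as retrieval context.
    seen, queue = {entity}, deque([(entity, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == hops:
            continue
        for nb in graph.get(node, []):
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, depth + 1))
    return seen

def global_search():
    # "Patterns across everything": read every community summary and hand
    # them all to the model to combine (string joining stands in for that).
    return " ".join(summaries.values())

print(local_search("Acme Corp"))
print(global_search())
```

Note that local search never touches the Gamma Labs cluster, while global search reads every summary regardless of which entity the question mentions.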

How It’s Used in Practice

Most teams reach for GraphRAG when they need to query a fixed body of documents that is too large for any single prompt — a year of internal reports, a corpus of regulatory filings, a research literature collection — and where the questions span the whole set rather than asking about one passage. The workflow is consistent: clone the repository, point the indexer at a folder of source files, and let it run. Indexing pulls every chunk through the LLM for entity extraction and then summarizes every detected community, so the upfront bill scales with corpus size and chunk strategy, not with how many people will eventually query the result.

Pro Tip: Run --estimate-cost on a small sample before you index the whole corpus. According to Microsoft’s GraphRAG repository, that flag exists precisely because production indexing is the expensive step — not querying. If the estimate looks painful, drop entity types you do not actually search on. Each unused entity class adds tokens you never recover.
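A back-of-envelope version of that cost preview is easy to sketch. Every number below — chunk counts, token figures, and the per-1K-token price — is hypothetical and purely illustrative, not any provider's real rate or GraphRAG's actual estimator.

```python
def estimate_index_cost(num_chunks, tokens_per_chunk, num_communities,
                        tokens_per_summary, usd_per_1k_tokens):
    # Two LLM-heavy phases: entity extraction over every chunk, then one
    # summary per detected community. Prompt and output tokens are folded
    # into the per-chunk and per-summary figures for simplicity.
    extraction = num_chunks * tokens_per_chunk
    summarization = num_communities * tokens_per_summary
    return (extraction + summarization) / 1000 * usd_per_1k_tokens

# Hypothetical corpus: 5,000 chunks at ~1,200 tokens each through the
# extraction prompt; ~400 communities at ~2,000 tokens per summary.
cost = estimate_index_cost(5_000, 1_200, 400, 2_000, usd_per_1k_tokens=0.01)
print(f"${cost:,.2f}")  # $68.00 under these illustrative numbers
```

The shape of the formula is the point: extraction dominates, it scales linearly with chunk count, and trimming unused entity types shrinks tokens_per_chunk directly.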

When to Use / When Not

Use GraphRAG for:
- Multi-hop questions across hundreds of documents
- Synthesizing themes from a research corpus
- A fixed reference dataset with budget for one-time indexing

Avoid it for:
- A quick chatbot over a single PDF
- Real-time queries over a corpus that changes hourly
- Tight unit-economics on per-document Q&A

Common Misconception

Myth: Microsoft GraphRAG is a hosted service you can call from an API. Reality: GraphRAG is an MIT-licensed Python library you install and run yourself. According to Microsoft’s GraphRAG repository, there is no managed endpoint — indexing happens on your infrastructure with your LLM credentials, which is why per-corpus cost varies so widely between teams.

One Sentence to Remember

Pick GraphRAG when the question is shaped like “what are the patterns across everything,” and budget for the indexing pass before you fall in love with the demo — the query side is cheap, the build side is where the cost lives.

FAQ

Q: How is Microsoft GraphRAG different from regular RAG? A: Standard RAG retrieves text chunks based on similarity. GraphRAG first extracts a knowledge graph and pre-summarizes connected communities, so it can answer dataset-wide questions where no single chunk holds the answer.

Q: Is Microsoft GraphRAG free to use? A: The code is MIT-licensed and free. Running it is not — every chunk is processed by an LLM during indexing, so the real cost is whatever your model provider charges to extract entities and summarize communities.

Q: Should I use GraphRAG or LazyGraphRAG? A: GraphRAG pays cost upfront during indexing; LazyGraphRAG defers most of that work to query time. According to Microsoft Research, the lazy variant cuts indexing dramatically when you do not need every community pre-summarized.


Expert Takes

Traditional RAG retrieves chunks; GraphRAG retrieves structure. The Leiden algorithm partitions the entity graph into communities at multiple resolutions, and the system pre-computes a summary for each. When you ask a global question, the model reads community summaries instead of trying to reconstruct the whole corpus from scattered passages. The graph isn’t decoration — it is the index. The cost of building it is the cost of converting prose into queryable topology.

Treat GraphRAG as a build step, not a runtime. The indexing pass is where context gets shaped: which entity types you extract, how chunks are split, which prompt drives community summarization. Specify those choices in a config file the same way you would pin a build target. If your indexing config is vague, your retrieval will be noisy regardless of how clever the graph is. Garbage in, graph out.
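That advice can be made concrete with a config sketch. The field names below illustrate the kinds of knobs involved — chunking, entity types, summary budget — and do not reproduce GraphRAG's actual settings.yaml schema; consult the repository's configuration docs for the real one.

```yaml
# Illustrative indexing config; field names are assumptions, not
# GraphRAG's exact schema.
chunks:
  size: 1200        # tokens per chunk fed to the extraction prompt
  overlap: 100
entity_extraction:
  entity_types: [organization, person, event]  # drop types you never query
community_reports:
  max_length: 2000  # token budget per community summary
```

Pinning these values in version control makes an indexing run reproducible, which is exactly the "build target" discipline the paragraph above argues for.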

GraphRAG is the reference architecture every vendor now positions against. The pattern is clear: Microsoft published the heavyweight version, the market responded with lighter siblings, and the original team shipped its own cheaper variant within a year. That is the rhythm of an emerging category — one expensive proof of concept legitimizes the idea, then the field races to make it affordable. Picking GraphRAG today is picking the benchmark, not always the production answer.

A question worth sitting with: when an indexing pass rewrites every document into LLM-summarized communities, whose interpretation are you actually searching? The graph is built by a model that has its own training, its own omissions, its own quiet edits. You ask a question of your corpus and get back an answer about what the model thought your corpus was. The cost is not just compute. It is interpretive distance from the source you thought you indexed.