Implementing Causal Graph RAG to Enhance Multi-Hop Reasoning and Root-Cause Analysis

A new approach to Retrieval-Augmented Generation (RAG) builds a directed causal graph from source documents instead of relying on traditional text chunking, with a reported +0.33 improvement over flat RAG on multi-hop queries when paired with Claude Haiku.

Moving Beyond Flat RAG Architectures

Traditional RAG systems typically rely on "flat" indexing, where documents are split into discrete chunks and retrieved via vector similarity. While effective for simple fact retrieval, this method often fails on complex queries, specifically those asking "what ultimately caused X?" In such cases, the cause and the effect frequently sit in separate chunks with no shared vocabulary. The chunk describing the cause scores poorly against the query, is never retrieved, and the LLM has nothing from which to link the two.
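The retrieval gap can be illustrated with a toy example. The sketch below is not from the report: it uses bag-of-words cosine similarity as a simple stand-in for a dense embedding model, and the two chunks and the query are made-up sentences written for this illustration. The names `rank_chunks`, `CHUNKS`, and `QUERY` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a crude stand-in for an embedding vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_chunks(query, chunks):
    """Return (chunk_id, score) pairs sorted by similarity to the query."""
    q = bag_of_words(query)
    scored = [(cid, cosine(q, bag_of_words(text))) for cid, text in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Illustrative sentences only (assumption): one chunk states the effect,
# the other states a cause, and they share no vocabulary with each other.
CHUNKS = {
    "effect": "the reactor explosion followed a sudden power surge",
    "cause": "control rods with graphite tips displaced coolant during insertion",
}
QUERY = "what caused the reactor explosion"

if __name__ == "__main__":
    for chunk_id, score in rank_chunks(QUERY, CHUNKS):
        print(f"{chunk_id}: {score:.2f}")
```

In this toy setup the "effect" chunk ranks first because it repeats words from the query, while the "cause" chunk shares no tokens with the query and scores zero. Real embedding models are more forgiving than word overlap, but the same failure mode applies whenever cause and effect are described in unrelated terms.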

The Causal Graph Approach

To address this limitation, the system constructs a directed causal graph from the source documents rather than splitting them into standard chunks. Each causal relationship is stored as an edge in the graph, so a chain of causality can be traversed step by step. When a query requires root-cause analysis, the system follows these edges to retrieve the full causal chain, capturing logically linked information even when the underlying passages are not close in embedding space.
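A minimal sketch of the traversal idea follows. The report does not describe how the graph is built or queried, so everything here is an assumption: edges are given as hand-written (cause, effect) pairs, the walk is a plain breadth-first search over reversed edges, and the names `trace_causes` and `EDGES` are hypothetical. The example edges are illustrative, not taken from the benchmark.

```python
from collections import defaultdict, deque


def build_parent_index(edges):
    """Map each effect to the set of its direct causes."""
    parents = defaultdict(set)
    for cause, effect in edges:
        parents[effect].add(cause)
    return parents


def trace_causes(edges, effect):
    """Walk cause -> effect edges backwards from `effect`.

    Returns (chain, roots): every upstream cause in breadth-first order,
    and the subset of those causes that have no recorded cause themselves.
    """
    parents = build_parent_index(edges)
    seen = {effect}  # guards against cycles in the graph
    chain = []
    queue = deque([effect])
    while queue:
        node = queue.popleft()
        for cause in sorted(parents.get(node, ())):  # sorted for stable output
            if cause not in seen:
                seen.add(cause)
                chain.append(cause)
                queue.append(cause)
    roots = [node for node in chain if not parents.get(node)]
    return chain, roots


# Illustrative edges only (assumption), written as (cause, effect).
EDGES = [
    ("lax lending standards", "rising mortgage defaults"),
    ("falling house prices", "rising mortgage defaults"),
    ("rising mortgage defaults", "losses on mortgage-backed securities"),
    ("losses on mortgage-backed securities", "bank liquidity shortfall"),
]

if __name__ == "__main__":
    chain, roots = trace_causes(EDGES, "bank liquidity shortfall")
    print("causal chain:", chain)
    print("root causes:", roots)
```

The point of the design is that retrieval follows explicit edges rather than similarity scores, so an upstream cause is reached even if its wording has nothing in common with the query. In a full system, the text attached to each visited node would presumably be passed to the LLM as context.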

Benchmark Results and Evaluation

The system was evaluated using a benchmark of 54 questions across two complex domains: the subprime mortgage crisis and the Chernobyl disaster. The evaluation focused on three distinct question categories:

  • Fact Lookups: Simple retrieval of specific data points.
  • Multi-hop Queries: Questions requiring the synthesis of information from multiple sources.
  • Root-cause Analysis: Questions identifying the primary drivers of an event.

With Claude Haiku handling the generation phase, the Causal Graph RAG reportedly scored +0.33 higher on multi-hop queries than a standard flat RAG implementation.

Note: The provided data is based on a community report; specific metric definitions (e.g., the exact nature of the +0.33 increase) and the full methodology for graph construction were not detailed in the source material.
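Since the metric is not defined, the following is only a sketch of how a per-category difference of this kind could be computed. It assumes each question receives one numeric score per system and that the reported figure is a difference of category means; neither assumption is confirmed by the source. The function name `per_category_delta` and the records in `PLACEHOLDER_RECORDS` are hypothetical placeholders, not benchmark data.

```python
from collections import defaultdict
from statistics import mean


def per_category_delta(records):
    """Mean(graph score) minus mean(flat score) for each question category.

    `records` is an iterable of (category, flat_score, graph_score) tuples,
    one per benchmark question.
    """
    flat = defaultdict(list)
    graph = defaultdict(list)
    for category, flat_score, graph_score in records:
        flat[category].append(flat_score)
        graph[category].append(graph_score)
    return {cat: mean(graph[cat]) - mean(flat[cat]) for cat in flat}


# Placeholder scores for illustration only (assumption); NOT the reported results.
PLACEHOLDER_RECORDS = [
    ("fact_lookup", 1.0, 1.0),
    ("fact_lookup", 1.0, 1.0),
    ("multi_hop", 0.0, 1.0),
    ("multi_hop", 1.0, 1.0),
    ("root_cause", 0.0, 0.0),
    ("root_cause", 0.0, 1.0),
]

if __name__ == "__main__":
    for category, delta in per_category_delta(PLACEHOLDER_RECORDS).items():
        print(f"{category}: {delta:+.2f}")
```

Reporting the difference per category rather than as one overall number matters here, because a gain concentrated in multi-hop or root-cause questions could be diluted by fact lookups where both systems perform alike.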
