The same underlying knowledge expressed two ways: as flat text chunks retrieved by embedding similarity, and as a typed subgraph retrieved by hash lookup and traversal. Compare what an agent actually receives in context — and why structure outperforms prose for multi-hop reasoning.
| Dimension | RAG Chunk Retrieval | Subgraph Retrieval |
|---|---|---|
| Retrieval mechanism | Approximate nearest-neighbor search in embedding space (cosine similarity over dense vectors) | Hash lookup + graph traversal from seed node (BFS / DFS over an adjacency list) |
| Context representation | Flat unstructured prose — sentences, paragraphs, and surrounding noise all arrive together | Typed nodes + directed labeled edges — only the relational structure, no prose filler |
| Typical token cost | 300–800 tokens per query (chunks contain much text irrelevant to the specific relation needed) | 30–80 tokens per query (only the nodes and edges within the retrieved subgraph) |
| Relation fidelity | Relations are implied by prose — the agent must infer "requires", "enables", "opposes" from natural language | Relations are explicit typed edge labels — unambiguous, machine-readable, and verifiable against the graph |
| Multi-hop reasoning | Requires multiple retrieval rounds or long-context reasoning across stacked chunks; intermediate hops may be missing from the retrieved set entirely | Inherent — graph traversal follows edges across hops by construction; a k-hop neighborhood is a single query |
| Hallucination risk | Higher — agent fills gaps between implied relations with parametric prior knowledge, which may not match the source | Lower — every edge is either in the graph or it isn't; no interpolation needed, no relations invented |
| Update cost | Re-embed changed chunks; similarity index must be rebuilt or incrementally updated | Edit the node or edge; existing hashes remain stable; no re-indexing of unrelated content |
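The traversal side of the table can be sketched concretely. Below is a minimal k-hop subgraph retrieval over an adjacency list, using BFS as the table describes. The graph, its node names, and its edge labels are all invented for illustration — the point is the shape of what the agent receives: a short list of typed triples rather than paragraphs of prose.

```python
from collections import deque

# Toy typed graph: node id -> list of (edge_label, target_id).
# All names here are illustrative, not from any real knowledge base.
GRAPH = {
    "oauth2": [("requires", "client_id"), ("enables", "sso")],
    "sso": [("requires", "identity_provider")],
    "client_id": [],
    "identity_provider": [("opposes", "local_accounts")],
    "local_accounts": [],
}

def k_hop_subgraph(seed, k):
    """Collect every typed, directed edge reachable within k hops of seed (BFS)."""
    edges, seen, frontier = [], {seed}, deque([(seed, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond the k-hop boundary
        for label, target in GRAPH.get(node, []):
            edges.append((node, label, target))
            if target not in seen:
                seen.add(target)
                frontier.append((target, depth + 1))
    return edges

# The agent's entire context for this query is these triples --
# tens of tokens, and every relation is explicit:
for src, label, dst in k_hop_subgraph("oauth2", 2):
    print(f"{src} -[{label}]-> {dst}")
```

Note that the multi-hop chain (`oauth2` → `sso` → `identity_provider`) falls out of the traversal by construction; no second retrieval round is needed to cover the second hop.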
A hash is not just an opaque ID. It is a dereferenceable pointer into a graph where every edge is labeled, directed, and typed. Compare what the agent actually sees in context for the same conceptual question — and what cognitive work each form demands.
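One way to make that pointer concrete is content addressing, sketched below under assumed conventions (SHA-256 over canonical JSON, truncated hashes, an in-memory store — all illustrative choices, not a prescribed scheme). The hash dereferences to a typed node, edges are directed triples between hashes, and editing one node leaves the hashes of unrelated nodes untouched — the stability property the update-cost row relies on.

```python
import hashlib
import json

def node_hash(content: dict) -> str:
    """Content-addressed ID: hash of the node's canonicalized JSON."""
    canonical = json.dumps(content, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

store = {}

def put(content: dict) -> str:
    """Insert a node and return its hash -- a dereferenceable pointer."""
    h = node_hash(content)
    store[h] = content
    return h

a = put({"type": "Concept", "name": "oauth2"})
b = put({"type": "Concept", "name": "sso"})
edges = [(a, "enables", b)]  # directed, labeled edge between two hashes

# Dereference: a hash in the agent's context resolves to a typed node.
src, label, dst = edges[0]
print(store[src]["name"], f"-[{label}]->", store[dst]["name"])
# prints: oauth2 -[enables]-> sso

# Editing one node yields a new hash for it, but leaves every
# unrelated hash stable -- no re-indexing of the rest of the graph.
b2 = put({"type": "Concept", "name": "sso", "alias": "single sign-on"})
assert a == node_hash({"type": "Concept", "name": "oauth2"})
```

Contrast this with a similarity index, where re-embedding changed chunks can force a rebuild that touches content the edit never mentioned.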