Practical GraphRAG: Making LLMs smarter with Knowledge Graphs — Michael, Jesus, and Stephen, Neo4j

Introduction & The Case for GraphRAG 00:00

  • GraphRAG aims to make LLMs smarter by adding knowledge graphs into the Retrieval-Augmented Generation (RAG) pipeline.
  • Traditional LLMs often lack enterprise domain knowledge, struggle with verification/explanation, are prone to hallucinations, and present ethical/data bias concerns.
  • Vector databases used in standard RAG systems provide limited, sometimes irrelevant results and lack explainability/scalability for robust enterprise solutions.
  • Knowledge graphs bring accurate, contextual, and explainable answers by providing structured data.

Advantages & Adoption of GraphRAG 03:06

  • Knowledge graphs improve context and explanation by structuring nodes, relationships, and properties, allowing for richer metadata connections.
  • Papers from Microsoft Research and others report that GraphRAG yields better results and lower token costs than standard RAG methods.
  • Industry studies, including an early data.world study, show a 3x improvement in LLM response accuracy using graph-based retrieval.
  • Gartner identifies GraphRAG as an emerging trend that revitalizes AI applications by grounding answers in facts and reducing hallucinations.
  • Large organizations report production benefits: LinkedIn, for example, noted a 28.6% reduction in customer support issue resolution time after adopting knowledge graphs.

Building Knowledge Graphs for GraphRAG 07:02

  • Constructing a knowledge graph from unstructured data involves:
    • Structuring the data into a lexical graph (documents, chunks, and the relationships between them).
    • Entity extraction using LLMs based on a graph schema to identify entities and relationships.
    • Enriching the graph via algorithms like PageRank or community summarization.
  • The higher up-front data engineering effort pays off in higher-quality structured data and richer retrieval results for queries.
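
The three construction steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the speakers' implementation: the relationship names (PART_OF, NEXT_CHUNK, MENTIONS), the fixed-size chunking, and the sample text are hypothetical, and a simple vocabulary match stands in for the schema-guided LLM extraction described in the talk.

```python
from collections import defaultdict


def build_lexical_graph(doc_id, text, chunk_size=30):
    """Split a document into fixed-size chunks and link them (hypothetical schema)."""
    nodes = {doc_id: {"label": "Document"}}
    edges = []  # (source, relationship type, target)
    previous = None
    for n, start in enumerate(range(0, len(text), chunk_size)):
        chunk_id = f"{doc_id}-chunk-{n}"
        nodes[chunk_id] = {"label": "Chunk", "text": text[start:start + chunk_size]}
        edges.append((chunk_id, "PART_OF", doc_id))
        if previous is not None:
            edges.append((previous, "NEXT_CHUNK", chunk_id))
        previous = chunk_id
    return nodes, edges


def extract_entities(chunk_text, vocabulary):
    """Stand-in for LLM extraction: case-insensitive match against a fixed vocabulary."""
    lowered = chunk_text.lower()
    return [name for name in vocabulary if name.lower() in lowered]


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank; dangling nodes spread their rank evenly."""
    outgoing = defaultdict(list)
    for source, _, target in edges:
        outgoing[source].append(target)
    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        updated = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            targets = outgoing.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    updated[target] += share
            else:
                share = damping * rank[node] / count
                for other in nodes:
                    updated[other] += share
        rank = updated
    return rank


# Step 1: lexical graph. Step 2: entity extraction. Step 3: enrichment.
text = "Acme acquired Globex in 2020. Globex builds widgets. Acme sells them."
nodes, edges = build_lexical_graph("doc1", text)
for chunk_id, props in list(nodes.items()):
    if props["label"] == "Chunk":
        for name in extract_entities(props["text"], ["Acme", "Globex"]):
            nodes.setdefault(name, {"label": "Entity"})
            edges.append((chunk_id, "MENTIONS", name))
ranks = pagerank(nodes, edges)
for node in nodes:
    nodes[node]["pagerank"] = ranks[node]  # store the score as a node property
```

Storing the score as a node property mirrors the enrichment idea: later retrieval steps can rank or filter on it without recomputing anything.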

Graph Patterns, Models & Contextual Retrieval 08:35

  • Patterns for structuring and querying graphs are collected and publicly shared on graphrag.com.
  • Lexical graphs capture document structure; domain graphs capture relationships between entities and concepts.
  • Graphs facilitate complex relationships, like linking document elements by parent/child or semantic similarity.
  • Entity extraction now leverages LLMs' multilingual capabilities and large context windows for recognition/matching tasks.
  • Knowledge graphs can be enriched with existing structured data (e.g., CRM entities linked with call transcripts).
  • Graph algorithms support further enrichment, such as topic clustering across multiple documents.
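
As a rough sketch of two of the ideas above, the snippet below links chunks by semantic similarity and then groups them into topic clusters that span documents. The chunk IDs, the toy three-dimensional embeddings, the SIMILAR relationship name, and the 0.9 threshold are all assumptions made for illustration. Connected components are used here as a simple stand-in for the clustering algorithms a graph library would provide.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similar_edges(embeddings, threshold=0.9):
    """Create a SIMILAR edge for every chunk pair at or above the threshold."""
    ids = sorted(embeddings)
    return [
        (a, "SIMILAR", b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if cosine(embeddings[a], embeddings[b]) >= threshold
    ]


def clusters(ids, edges):
    """Connected components via union-find over the given edges."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, _, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=lambda group: sorted(group))


# Toy embeddings for chunks from two different documents (hypothetical values).
embeddings = {
    "docA-chunk-0": [1.0, 0.1, 0.0],
    "docB-chunk-3": [0.9, 0.2, 0.0],
    "docB-chunk-7": [0.0, 0.1, 1.0],
}
similarity_edges = similar_edges(embeddings)
topic_clusters = clusters(embeddings, similarity_edges)
print(topic_clusters)
```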

Retrieval with Graph-Based Approaches 12:56

  • Retrieval in GraphRAG is more advanced than simple vector search; it includes index search (vector, full text, spatial) followed by relationship-based context expansion.
  • User and external context determine what information and relationships are retrieved, making responses more tailored.
  • Modern LLMs can process graph patterns, enabling richer context provision (node-relationship-node patterns).
  • Graph algorithms further enhance retrieval with features like clustering and link prediction.
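
The two-stage retrieval described above (index search, then relationship-based expansion) might look like the following in miniature. Everything here is illustrative: the in-memory chunks and edges, the two-dimensional embeddings, and the function names are assumptions, and a production system would use a database's vector index and graph traversal instead. The expansion step follows outgoing relationships only and renders them as node-relationship-node patterns that can be passed to an LLM as context.

```python
import math

# Hypothetical in-memory graph: chunks with toy embeddings plus typed edges.
chunks = {
    "c1": {"text": "Acme acquired Globex.", "embedding": [0.9, 0.1]},
    "c2": {"text": "Globex builds widgets.", "embedding": [0.2, 0.8]},
}
edges = [
    ("c1", "MENTIONS", "Acme"),
    ("c1", "MENTIONS", "Globex"),
    ("c2", "MENTIONS", "Globex"),
    ("Acme", "ACQUIRED", "Globex"),
]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(query_embedding, k=1):
    """Stage 1: return the ids of the k chunks most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda cid: cosine(query_embedding, chunks[cid]["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def expand(seeds, hops=2):
    """Stage 2: follow outgoing relationships from the seeds for a few hops."""
    seen = set(seeds)
    frontier = set(seeds)
    patterns = []
    for _ in range(hops):
        next_frontier = set()
        for source, rel, target in edges:
            if source in frontier:
                pattern = f"({source})-[:{rel}]->({target})"
                if pattern not in patterns:
                    patterns.append(pattern)
                if target not in seen:
                    seen.add(target)
                    next_frontier.add(target)
        frontier = next_frontier
    return patterns


hits = vector_search([1.0, 0.0], k=1)
context = expand(hits, hops=2)
print("\n".join(context))
```

The second hop is what surfaces the ACQUIRED relationship between the two entities, which a similarity search over chunk text alone would not return as a structured fact.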

Practical Examples & Tooling 14:33

  • Tools exist for extracting knowledge graphs from unstructured sources such as PDFs, YouTube transcripts, and Wikipedia articles.
  • The demonstration showed a tool that lets users upload varied content, build a knowledge graph from the extracted entities and relationships, and define extraction schemas for improved results.
  • Users can trace back responses to their sources and the graph entities involved, supporting explainability and evaluation.
  • An "agentic" approach breaks questions down into subtasks, each handled by domain-specific retrievers or Cypher queries, creating more nuanced responses.
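
A toy version of the agentic pattern is sketched below, under heavy assumptions. The tool names, the sample data, and the Cypher strings are hypothetical and are never executed; each tool's `run` function returns canned data where a real retriever would send its query to the database. Keyword rules stand in for the LLM that would normally decompose the question.

```python
# Hypothetical sample data standing in for query results from a graph database.
CUSTOMERS = {"Acme": {"tier": "gold"}}
TICKETS = {"Acme": ["T-1 login failure", "T-2 slow export"]}

# Each domain-specific retriever pairs an illustrative Cypher query with a
# runner. The Cypher text is for documentation only in this sketch.
TOOLS = {
    "customer_profile": {
        "cypher": "MATCH (c:Customer {name: $name}) RETURN c.tier AS tier",
        "run": lambda name: CUSTOMERS.get(name, {}),
    },
    "open_tickets": {
        "cypher": (
            "MATCH (:Customer {name: $name})-[:RAISED]->"
            "(t:Ticket {status: 'open'}) RETURN t.title AS title"
        ),
        "run": lambda name: TICKETS.get(name, []),
    },
}


def plan(question):
    """Stand-in for an LLM planner: keyword rules choose the subtasks."""
    lowered = question.lower()
    customer = next((n for n in CUSTOMERS if n.lower() in lowered), None)
    if customer is None:
        return []
    subtasks = []
    if "tier" in lowered:
        subtasks.append(("customer_profile", customer))
    if "ticket" in lowered:
        subtasks.append(("open_tickets", customer))
    return subtasks


def answer(question):
    """Run every planned subtask and collect the results per tool."""
    return {tool: TOOLS[tool]["run"](arg) for tool, arg in plan(question)}


result = answer("What tier is Acme, and which tickets are open?")
print(result)
```

Keeping each retriever narrow and named makes it possible to trace which subtask contributed which part of the final response.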

Developer Resources & Wrap-Up 18:34

  • The Neo4j GraphRAG Python package consolidates graph construction and retrieval into one pipeline, with visualization capabilities.
  • All tools, patterns, and examples are available as open source via graphrag.com, and contributions are welcome.
  • Attendees were invited to visit the Neo4j booth for further discussion and demonstrations.