Why Your Agent’s Brain Needs a Playbook: Practical Wins from Using Ontologies - Jesús Barrasa, Neo4j

Introduction to Graph RAG and Ontologies 00:30

  • Jesús Barrasa, Field CTO for AI at Neo4j, discusses combining knowledge graphs with large language models (LLMs) in AI applications.
  • The session focuses on using ontologies to achieve practical wins in graph RAG architecture.

Understanding Graph RAG and its Benefits 01:08

  • Graph RAG replaces the vector database of a traditional RAG architecture with a knowledge graph stored in a graph database.
  • The AI application receives a user prompt, retrieves relevant information from a curated knowledge base (the graph), and passes it to the LLM for a more grounded response.
  • Knowledge graphs offer richer retrieval strategies than vector search alone, including contextualization based on data connections and the ability to generate structured queries.
  • The richer context is credited with improving the quality, completeness, relevance, precision, and faithfulness of results.
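
The flow described above (prompt in, graph retrieval, grounded prompt out) can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the in-memory GRAPH, the node names, and the name-matching lookup are all assumptions standing in for a real graph database and a real vector or structured search.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    text: str
    neighbors: list = field(default_factory=list)  # names of connected nodes


# Hypothetical toy knowledge graph, keyed by node name.
GRAPH = {
    "Acme": Node("Acme", "Acme is a stock corporation.", ["Board Agreement 7"]),
    "Board Agreement 7": Node("Board Agreement 7", "Board Agreement 7 governs Acme."),
    "Globex": Node("Globex", "Globex is a privately held company."),
}


def retrieve(question: str, graph: dict) -> list:
    """Find nodes named in the question, then add one hop of connected context."""
    q = question.lower()
    context = []
    for node in graph.values():
        if node.name.lower() in q:  # stand-in for vector or structured search
            context.append(node.text)
            context.extend(graph[n].text for n in node.neighbors)
    return list(dict.fromkeys(context))  # drop duplicates, keep order


def build_prompt(question: str, context: list) -> str:
    """Assemble the grounded prompt that would be passed to the LLM."""
    facts = "\n".join(f"- {c}" for c in context)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {question}"


question = "What kind of company is Acme?"
prompt = build_prompt(question, retrieve(question, GRAPH))
print(prompt)
```

The one-hop expansion is where the graph adds something a flat chunk store would not: the board agreement is pulled in because it is connected to the matched node, not because it resembles the question.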

Building Knowledge Graphs 02:45

  • Neo4j implements the property graph model: nodes (entities such as persons, objects, or documents) connected by directed relationships, where both nodes and relationships can carry key-value properties.
  • When building graphs from unstructured data, a "lexical graph," which describes the source documents themselves, is often added to enrich the representation.
  • Documents are partitioned into chunks, organized either as a simple sequential list or as a richer tree-like structure; entity extraction then links the chunks to the domain graph.
  • Node properties can include vectors, allowing for integrated vector storage and indexing within the graph platform.
  • Graph creation pipelines differ for structured and unstructured data, but both require a target schema.
  • For structured data, tabular representations are mapped to a defined graph model.
  • For unstructured data, documents are processed, split into chunks, embedded, and subjected to entity extraction, requiring the injection of a domain-specific schema (e.g., types of entities and relationships to extract).
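
The unstructured pipeline above can be sketched as follows. This is a toy under stated assumptions: SCHEMA, the relationship names PART_OF, NEXT, and MENTIONS, and the substring matcher are hypothetical stand-ins (a real pipeline would use an LLM or NER model for extraction), and the embedding step is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    props: dict = field(default_factory=dict)  # key-value properties


@dataclass
class Rel:
    start: Node
    type: str
    end: Node


# Hypothetical target schema injected into extraction: entity label -> known names.
SCHEMA = {"Company": ["Acme", "Globex"], "Person": ["Ada"]}


def split_into_chunks(text: str, size: int) -> list:
    """Split text into sequential chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_graph(title: str, text: str, chunk_size: int = 6):
    """Build a lexical graph (document, chunks) linked to a domain graph (entities)."""
    doc = Node("Document", {"title": title})
    nodes, rels = [doc], []
    entities = {}  # name -> Node, so each entity is created only once
    previous = None
    for i, chunk_text in enumerate(split_into_chunks(text, chunk_size)):
        chunk = Node("Chunk", {"index": i, "text": chunk_text})
        nodes.append(chunk)
        rels.append(Rel(chunk, "PART_OF", doc))
        if previous is not None:
            rels.append(Rel(previous, "NEXT", chunk))  # sequential chunk list
        previous = chunk
        for label, names in SCHEMA.items():
            for name in names:
                if name in chunk_text:  # naive substring match as the extractor
                    if name not in entities:
                        entities[name] = Node(label, {"name": name})
                        nodes.append(entities[name])
                    rels.append(Rel(chunk, "MENTIONS", entities[name]))
    return nodes, rels


nodes, rels = build_graph(
    "memo", "Ada founded Acme in a garage. Later Globex tried to acquire Acme outright."
)
```

Only entity types named in the schema ever reach the graph, which is why both pipelines need a target schema before they can run.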

The Role of Ontologies in Schema Management 06:49

  • Ontologies provide a general way to represent schemas: a shared, formal, implementation-agnostic description of a domain.
  • They define classes (e.g., "privately held company," "stock corporation"), subclass relationships, and connections between classes (e.g., "stock corporation is governed by a board agreement").
  • Ontologies are ideal for driving knowledge graph creation for both structured and unstructured data pipelines.
  • A model-driven approach is fundamental to building better graphs and pays off over the long term, because a good description of the graph improves both text-to-query generation (natural language to structured queries) and contextualized vector search.
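
As a rough sketch of what such an ontology holds, the structure below stores subclass links and class-to-class relationships and flattens them into a schema that a graph-building pipeline could target. The Ontology class, the parent class Company, and the HAS_SUBSIDIARY relationship are assumptions added for illustration; only the two company types and the board agreement link come from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    subclass_of: dict = field(default_factory=dict)  # class -> parent class
    relations: list = field(default_factory=list)    # (domain, name, range) triples

    def ancestors(self, cls: str) -> list:
        """Walk subclass links upward (assumes the hierarchy has no cycles)."""
        chain = []
        while cls in self.subclass_of:
            cls = self.subclass_of[cls]
            chain.append(cls)
        return chain

    def relations_for(self, cls: str) -> list:
        """Relationships declared on the class itself or inherited from ancestors."""
        applicable = {cls, *self.ancestors(cls)}
        return [(name, rng) for domain, name, rng in self.relations if domain in applicable]

    def extraction_schema(self) -> dict:
        """Flatten the ontology into entity and relationship types for a pipeline."""
        classes = set(self.subclass_of) | set(self.subclass_of.values())
        for domain, _, rng in self.relations:
            classes |= {domain, rng}
        names = {name for _, name, _ in self.relations}
        return {"entities": sorted(classes), "relationships": sorted(names)}


onto = Ontology(
    subclass_of={"PrivatelyHeldCompany": "Company", "StockCorporation": "Company"},
    relations=[
        ("Company", "HAS_SUBSIDIARY", "Company"),  # hypothetical
        ("StockCorporation", "IS_GOVERNED_BY", "BoardAgreement"),
    ],
)
print(onto.relations_for("StockCorporation"))
print(onto.extraction_schema())
```

Because the description is plain data, the same object can feed a table-to-graph mapping, an entity-extraction prompt, or a query generator without being tied to any one of them.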

Dynamic Retrieval Strategies with Ontologies 09:02

  • Graph retrieval starts with a vector search over text chunks embedded in node properties, then dereferences the matches back to the graph and contextualizes them by navigating to connected data.
  • A code example shows a retriever performing a vector search on movie plots and then using a "retrieval query" to explore the graph based on the search results.
  • A limitation of this approach is the need to hardcode retrieval logic, which makes it rigid.
  • To overcome this, the ontology itself can be persisted within the graph, alongside the data.
  • Retrieval queries can then consult the ontology at run time to decide which relationships to navigate for contextualization (e.g., following "actor in" relationships for a movie, but not "producer").
  • This allows ontologies to drive the behavior of retrievers dynamically, meaning changes to the ontology (a data artifact) can alter retriever behavior on the fly.
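
The idea of an ontology-driven retriever can be illustrated with a small in-memory stand-in for the graph. This is a sketch, not the code shown in the talk: the edge list, the CONTEXTUALIZED_BY relationship, and all node names are hypothetical, and a real implementation would express the same lookup as a graph query.

```python
# Ontology and data share one edge list: (start, relationship, end).
# All names below are hypothetical.
EDGES = [
    # Ontology layer: relationships to follow when contextualizing a Movie.
    ("Movie", "CONTEXTUALIZED_BY", "ACTED_IN"),
    # Data layer.
    ("Person A", "ACTED_IN", "Movie X"),
    ("Person B", "PRODUCED", "Movie X"),
    ("Person C", "DIRECTED", "Movie X"),
]
NODE_TYPE = {"Movie X": "Movie"}


def allowed_relationships(node_type: str, edges: list) -> set:
    """Ask the ontology which relationship types are relevant for this node type."""
    return {
        end for start, rel, end in edges
        if start == node_type and rel == "CONTEXTUALIZED_BY"
    }


def contextualize(node: str, edges: list) -> list:
    """Return incoming edges of `node` whose type the ontology marks as relevant."""
    allowed = allowed_relationships(NODE_TYPE[node], edges)
    return [(s, r, e) for s, r, e in edges if e == node and r in allowed]


before = contextualize("Movie X", EDGES)

# Changing the ontology (data, not code) changes what the retriever returns.
EDGES.append(("Movie", "CONTEXTUALIZED_BY", "DIRECTED"))
after = contextualize("Movie X", EDGES)

print(before)
print(after)
```

The retriever function never changes between the two calls; only the ontology edge does, which is the contrast with a hardcoded retrieval query.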

Key Takeaways 13:25

  • Ontologies serve as an implementation-agnostic data model for knowledge graph creation.
  • Storing ontologies within the graph can enable dynamic behavior in retrievers.