Memory Masterclass: Make Your AI Agents Remember What They Do! — Mark Bain, AIUS

Introduction and Workshop Overview 00:00

  • Mark Bain introduces himself and the AI Engineer conference.
  • The workshop agenda includes a power talk on AI memory, four live demos, a new solution called the "GraphRAG Chat Arena," and a Q&A session.
  • Attendees are encouraged to join a Slack channel and set up materials for the workshop.

Mark Bain's AI Memory Philosophy 02:07

  • Mark's background includes math olympiads and a deep understanding of math and physics.
  • He recalls a 2014 conversation with OpenAI co-founders Wojciech Zaremba and Ilya Sutskever, in which Ilya could not explain how future AI systems would communicate.
  • Mark's research over the last two years led him to define AI memory as any data in any format, including code, algorithms, hardware, and their causal changes.
  • He observed parallels in how biological, cosmological, and quantum systems use memory.
  • He argues that the same principles govern LLMs, neuroscience, and mathematics, linking concepts like entropy, curvature, and attention.
  • He proposes an equation, memory * compute = (imaginary unit circle)^2, and takes it to mean that perfect symmetries would prevent existence, so asymmetries are necessary for us to exist.
  • In his framing, LLM weights provide structure, compute transforms data into weights, and biases are tiny shifts that adapt the model to reality.
  • On this view, scaling models aims for the "disappearance of biases" by having equal amounts of memory and compute.
  • He conceptualizes the universe as a temporal network database with a graph structure, where relationships preserve causality.
  • Preserving causal links in memory systems (such as GraphRAG) helps solve hallucinations and optimize hypothesis generation and testing.
  • He sees hallucinations as necessary when there is insufficient memory or compute for the problem's combinatorial space.
  • He calls agentic systems the "next big thing" because they follow the network database principle, though they require graph databases to preserve causality.
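
The idea of memory as a temporal graph whose relationships preserve causality can be illustrated with a small sketch. Nothing below was shown in the talk: the `CausalGraph` class, its method names, and the sample events are hypothetical, chosen only to show timestamped events with cause links and a walk back to a root cause.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    """A timestamped memory item with an optional link to its cause."""
    name: str
    timestamp: int
    caused_by: Optional[str] = None


@dataclass
class CausalGraph:
    events: dict = field(default_factory=dict)

    def add(self, name, timestamp, caused_by=None):
        # Names are unique so that cause links can never form a cycle.
        if name in self.events:
            raise ValueError(f"event {name!r} already exists")
        # A cause must already be stored and must not be later than its effect.
        if caused_by is not None:
            cause = self.events.get(caused_by)
            if cause is None or cause.timestamp > timestamp:
                raise ValueError(f"invalid cause for {name!r}: {caused_by!r}")
        self.events[name] = Event(name, timestamp, caused_by)

    def causal_chain(self, name):
        """Return the chain from the given event back to its root cause."""
        chain = []
        current = self.events.get(name)
        while current is not None:
            chain.append(current.name)
            current = self.events.get(current.caused_by) if current.caused_by else None
        return chain


graph = CausalGraph()
graph.add("deploy", 1)
graph.add("error_spike", 2, caused_by="deploy")
graph.add("rollback", 3, caused_by="error_spike")
chain = graph.causal_chain("rollback")
print(chain)  # ['rollback', 'error_spike', 'deploy']
```

Rejecting links to unknown or later events is one simple way to keep stored relationships consistent with time order; a production graph database would enforce this differently.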

Vasilije Markovic (Cognee) Demo: Semantic Memory 16:27

  • Vasilije Markovic, from Cognee, has a background in business, big data engineering, and clinical psychology.
  • He demonstrated Cognee, a memory tool built on graph and vector databases.
  • The demo staged a "Mexican standoff" between two developers, with a crew of agents analyzing their GitHub repositories.
  • Cognify ingests GitHub data, builds a semantic graph, enriches it, and lets agents search the graph and make decisions.
  • Cognee is a modular framework that supports over 30 data sources, custom graph building from various data types, and memory association layers inspired by cognitive science.
  • The graph is stateful and temporal, continuously updated by agents who can write and add data.
  • The demo concluded with a final report recommending one developer (who had a PhD in graphs) based on their contributions and benchmarks.
  • Cognee is open source.
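
The ingest, build, enrich, and search flow described above can be sketched in plain Python. This is a toy illustration, not the tool's actual API: the function names, the triple format, and the sample commit records are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical commit records standing in for ingested GitHub data.
COMMITS = [
    {"author": "dev_a", "repo": "graph-lib", "topic": "graphs"},
    {"author": "dev_a", "repo": "graph-lib", "topic": "benchmarks"},
    {"author": "dev_b", "repo": "web-app", "topic": "frontend"},
]


def build_graph(commits):
    """Turn raw records into (subject, relation, object) triples."""
    triples = set()
    for c in commits:
        triples.add((c["author"], "contributes_to", c["repo"]))
        triples.add((c["repo"], "covers", c["topic"]))
    return triples


def enrich(triples):
    """Derive author-to-topic edges by joining the two relation types."""
    topics_by_repo = defaultdict(set)
    for s, r, o in triples:
        if r == "covers":
            topics_by_repo[s].add(o)
    derived = {
        (s, "works_on", topic)
        for s, r, o in triples if r == "contributes_to"
        for topic in topics_by_repo[o]
    }
    return triples | derived


def search(triples, subject, relation):
    """Return the sorted objects linked to a subject by a relation."""
    return sorted(o for s, r, o in triples if s == subject and r == relation)


graph = enrich(build_graph(COMMITS))
print(search(graph, "dev_a", "works_on"))  # ['benchmarks', 'graphs']
```

An agent comparing the two developers could call `search` for each author and base its recommendation on the derived `works_on` edges rather than on raw commit text.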

Alex Gilmore (Neo4j) Demo: MCP Server 22:20

  • Alex Gilmore, an AI architect at Neo4j, presented their Memory MCP (Model Context Protocol) server.
  • The setup requires a Neo4j database (cloud or local) and Claude Desktop, configured to connect to Neo4j.
  • The MCP server exposes memory tools to the client, and a system prompt ensures that memories are recalled and logged properly.
  • The demo showed a conversation about starting an agentic AI memory company, where the server recalls relevant memories and creates new entities and relationships in a knowledge graph after each interaction.
  • Entities have a name, type, and observations (facts), while relationships identify how entities relate, providing rich context.
  • The knowledge graph can be visualized in Neo4j Browser, showing nodes (e.g., Neo4j, MCP, LangGraph) and their associated observations.
  • This knowledge graph, created from a single conversation, can be reused in additional conversations and with other clients (e.g., Cursor IDE, Windsurf), serving as a powerful memory layer for applications.
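
The entity and relationship model described above (a name, a type, and a list of observations per entity, plus typed relationships between entities) can be sketched as an in-memory structure. This is a minimal illustration under stated assumptions: the `MemoryGraph` class, its methods, and the sample facts are hypothetical and do not reproduce the server's tools or its Neo4j storage.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    type: str
    observations: list = field(default_factory=list)


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    relation_type: str


class MemoryGraph:
    def __init__(self):
        self.entities = {}
        self.relations = set()

    def add_observation(self, name, type_, fact):
        # Create the entity on first mention, then append new facts only.
        entity = self.entities.setdefault(name, Entity(name, type_))
        if fact not in entity.observations:
            entity.observations.append(fact)

    def relate(self, source, target, relation_type):
        # Only connect entities that are already in the graph.
        if source not in self.entities or target not in self.entities:
            raise KeyError("both entities must exist before relating them")
        self.relations.add(Relation(source, target, relation_type))

    def recall(self, name):
        """Return an entity's facts plus its outgoing relationships."""
        entity = self.entities[name]
        links = sorted(
            (r.relation_type, r.target) for r in self.relations if r.source == name
        )
        return {
            "type": entity.type,
            "observations": list(entity.observations),
            "links": links,
        }


memory = MemoryGraph()
memory.add_observation("Neo4j", "database", "Stores the knowledge graph")
memory.add_observation("MCP", "protocol", "Connects clients to tools")
memory.relate("MCP", "Neo4j", "WRITES_TO")
print(memory.recall("MCP"))
```

Because the graph lives outside any single conversation, a second client could call `recall` on the same store and see the same observations, which is the reuse property the demo highlighted.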

Daniel Chalef (Graphiti & Zep) Demo: Temporal Graphs & Domain-Aware Memory 27:37

  • Daniel Chalef, co-founder of Graphiti and Zep, emphasized that there is no one-size-fits-all memory solution and that memory must be modeled after the business domain.
  • He explained that Zep, an open-source temporal graph framework, lets developers build custom entities and edges for specific business objects.
  • Current LLMs struggle with relevance, often pulling arbitrary facts that lead to inaccurate responses or hallucinations because semantic similarity does not equate to business relevance.
  • Many frameworks simply dump facts into vector databases, making it hard to differentiate what should be returned.
  • Zep's solution provides "domain-aware memory" by enabling developers to define explicit business objects (e.g., financial goals, debts, income sources) using schemas like Pydantic or Zod.
  • These schemas define fields and business rules, and they let developers build tools with which agents retrieve specific information filtered by node type.
  • A demo of a finance coach application showed Zep parsing messages and capturing relevant financial data as structured entities in a knowledge graph, which can be visualized in the Zep front end.
  • Zep registers these objects to build an ontology in the graph, allowing for precise retrieval of information.
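
The notion of domain-aware memory, where explicit business objects carry fields and rules and retrieval is filtered by node type, can be sketched as follows. The talk mentions Pydantic and Zod; this example uses standard-library dataclasses as a stand-in so that it runs without extra dependencies. The class names, fields, validation rules, and sample values are all assumptions for illustration and are not the product's API.

```python
from dataclasses import dataclass


@dataclass
class FinancialGoal:
    name: str
    target_amount: float

    def __post_init__(self):
        # Example business rule: a goal needs a positive target.
        if self.target_amount <= 0:
            raise ValueError("target_amount must be positive")


@dataclass
class Debt:
    creditor: str
    balance: float
    interest_rate: float  # annual rate as a fraction, e.g. 0.19

    def __post_init__(self):
        if self.balance < 0 or self.interest_rate < 0:
            raise ValueError("balance and interest_rate must be non-negative")


class DomainMemory:
    """Holds typed nodes and retrieves them by node type."""

    def __init__(self):
        self._nodes = []

    def add(self, node):
        self._nodes.append(node)

    def by_type(self, node_type):
        return [n for n in self._nodes if isinstance(n, node_type)]


memory = DomainMemory()
memory.add(FinancialGoal("Emergency fund", 10_000))
memory.add(Debt("Credit card", 2_500, 0.19))
memory.add(Debt("Car loan", 8_000, 0.06))

# A tool that only needs debts never sees unrelated facts.
high_interest = [d.creditor for d in memory.by_type(Debt) if d.interest_rate > 0.10]
print(high_interest)  # ['Credit card']
```

Filtering by type before ranking is what separates this from dumping every fact into one vector index: the agent asks for debts and gets debts, regardless of which other facts happen to be semantically similar.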

Mark Bain's Agentic Firewall Use Case & GraphRAG Chat Arena 35:41

  • Mark's past experience in cybersecurity involved navigating many terminals and proprietary shells, highlighting the need for LLMs to translate languages and create "human language shells."
  • These shells would benefit from episodic/temporal memory that tracks user behavior and code execution and spans users, machines, and sessions, giving better security context.
  • He identified a niche for "agentic firewalls" that would process agent code execution in terminals.
  • He demonstrated a shell interface where users can type ordinary shell commands or natural language requests (e.g., pwd, show me running Docker containers, show if Apache is running, show the command we did three commands ago).
  • This temporal logging and episodic memory, he believes, will be crucial for enterprise-grade agents.
  • He introduced the GraphRAG Chat Arena, a "web arena for memory" prototype designed for simulating and evaluating the evolution of agentic graph memory.
  • The arena incorporates different memory solutions (Mem0, Graphiti, Cognee, Neo4j) and allows switching between agents (e.g., Neo4j agent, Cypher graph agent) to test how graphs are created and retrieved.
  • The goal is to provide a simulation environment for evolving agentic graph memory, as traditional benchmarks are insufficient.
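
The episodic shell memory described above, including a request such as "show the command we did three commands ago," can be sketched as a session-scoped command log. This is a minimal illustration, not the demonstrated system: the `ShellMemory` class, its fields, and the sample commands are hypothetical, and the step that translates natural language into shell commands is left out.

```python
import time
from dataclasses import dataclass


@dataclass
class Episode:
    user: str
    session: str
    command: str
    timestamp: float


class ShellMemory:
    def __init__(self):
        self._log = []

    def record(self, user, session, command, timestamp=None):
        # Every executed command becomes a timestamped episode.
        ts = time.time() if timestamp is None else timestamp
        self._log.append(Episode(user, session, command, ts))

    def commands_ago(self, n, session):
        """Return the command issued n commands ago in a session (1 = most recent)."""
        history = [e.command for e in self._log if e.session == session]
        if n < 1 or n > len(history):
            raise IndexError(f"no command {n} steps back in session {session!r}")
        return history[-n]

    def by_user(self, user):
        """Return all commands a user ran, across sessions, in order."""
        return [e.command for e in self._log if e.user == user]


shell = ShellMemory()
for cmd in ["pwd", "docker ps", "systemctl status apache2", "ls"]:
    shell.record("alice", "session-1", cmd)
shell.record("alice", "session-2", "whoami")

print(shell.commands_ago(3, "session-1"))  # docker ps
```

Keeping the user and session on every episode is what lets the same log answer both a per-session question ("three commands ago") and a cross-session security question ("what has this user run anywhere").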

Q&A Session 45:30

  • How to decide what is "bad memory" over time?
    • Mark suggests that noisy memory with few relationships is likely to be incorrect: the less connected a node is, the greater its potential for error.
    • Vasilije adds that modeling data with Pydantic, adding weights to edges and nodes (e.g., temporal weighting), and applying custom logic help track how data evolves and stays relevant.
    • Mark concludes that missing causal links are a good indicator of fuzziness.
  • How to embed security or privacy (e.g., corporate top-secret data, personal data sharing restrictions)?
    • Mark emphasizes understanding context and intentions, and applying the correct ontologies to the enterprise cybersecurity stack to provide guardrails.
    • Alex adds that Neo4j supports role-based access control (RBAC) to restrict user access to data.
    • Vasilije suggests isolating graphs per user or keeping them physically separate.
  • Re-explanation of the equation relating gravity, entropy, memory, and compute.
    • Mark briefly explains: memory * compute = (imaginary unit circle)^2, which suggests perfect symmetry, but asymmetries are needed for existence.
    • He links curvature to attention and gravity, and diffusion to heat and entropy.
    • He argues that Perelman's proof of the Poincaré conjecture (for 3D spheres) suggests that attention, diffusion models, and VAEs are likely the only enduring LLM architectures, as they smooth and preserve necessary asymmetries, allowing for "biases" that reflect reality.
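
The first Q&A answer combines two signals for spotting "bad memory": weak connectivity and temporal weighting on edges. One possible way to combine them is sketched below. The exponential half-life decay, the 30-day default, the score threshold, and the sample edges are all assumptions chosen for illustration; the panel did not specify a formula.

```python
def temporal_weight(age_days, half_life_days=30.0):
    """Exponential decay: the weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def suspect_nodes(edges, now_day, min_score=0.5):
    """Flag nodes whose decayed edge weights sum to less than min_score.

    edges is a list of (source, target, created_day) tuples. A node with
    few or only old relationships ends up with a low score.
    """
    scores = {}
    for source, target, created_day in edges:
        w = temporal_weight(now_day - created_day)
        scores[source] = scores.get(source, 0.0) + w
        scores[target] = scores.get(target, 0.0) + w
    return sorted(node for node, score in scores.items() if score < min_score)


# Hypothetical edges: (source, target, day the relationship was recorded).
edges = [
    ("user", "goal", 100),
    ("user", "debt", 95),
    ("goal", "plan", 98),
    ("old_note", "user", 0),
]
flagged = suspect_nodes(edges, now_day=100)
print(flagged)  # ['old_note']
```

Here `old_note` is flagged because its only relationship is 100 days old, while `user` keeps a high score from its recent links. Nodes with no edges at all never appear in the edge list, so a real system would need to treat them as suspect separately.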