Spotlight on Databricks: Driving data intelligence with AI

Introduction and Databricks Overview 00:12

  • The speaker, Craig, leads product management at Databricks and has prior experience at Google (Vertex AI) and AWS (SageMaker).
  • Databricks is a leading cross-cloud data platform with tens of thousands of customers and billions in revenue.
  • The company is known for popular open-source tools such as Spark, MLflow, and Delta.

Data Challenges in Large Enterprises 01:12

  • Large enterprises often deal with highly fragmented data due to multiple acquisitions and different systems across various clouds.
  • Data is scattered among numerous warehouses, making integration and access difficult.
  • Expertise within organizations is often siloed, complicating cross-platform data utilization, especially for AI initiatives.

Enabling AI with Databricks and Mosaic AI 02:59

  • Databricks focuses on managing data and delivering AI capabilities, particularly with Mosaic AI.
  • Emphasis is placed on addressing "data intelligence" as distinct from "general intelligence," focusing on connecting AI systems to an enterprise's complex data estate.

Case Study: FactSet's Query Language and AI Transformation 03:46

  • FactSet, a financial services company, had a proprietary query language (FQL) that limited customer access.
  • The initial GenAI solution, English-to-FQL translation, achieved 59% accuracy with 15-second latency (latency here correlates with cost).
  • Decomposing the task into an agent-based, multi-step process improved accuracy to 85% and cut latency to 6 seconds; accuracy later reached "the 9s."
  • The case highlights the importance of breaking complex prompts into manageable tasks for performance tuning.
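
The decomposition idea can be illustrated with a minimal Python sketch. It is not FactSet's or Databricks' implementation: the step names, the prompt templates, and the `fake_llm` stand-in are all hypothetical, and the summary does not say which steps the real system uses. The sketch only shows the general shape, in which one large prompt becomes a chain of small steps whose intermediate outputs are recorded and can be measured and tuned one at a time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in type for an LLM call: prompt text in, response text out.
LLM = Callable[[str], str]


@dataclass
class Step:
    name: str
    prompt_template: str  # must contain the placeholder "{input}"


def run_pipeline(steps: List[Step], question: str, llm: LLM) -> Dict[str, str]:
    """Run the steps in order, feeding each step's output into the next.

    Returns a trace mapping step name to that step's output, so every
    intermediate result can be inspected and evaluated separately.
    """
    trace: Dict[str, str] = {}
    current = question
    for step in steps:
        current = llm(step.prompt_template.format(input=current))
        trace[step.name] = current
    return trace


def fake_llm(prompt: str) -> str:
    # Placeholder only: a real system would call a model endpoint here.
    return f"<{prompt}>"


if __name__ == "__main__":
    # Hypothetical step breakdown, chosen purely for illustration.
    steps = [
        Step("extract_entities", "List the entities in: {input}"),
        Step("choose_fields", "Pick data fields for: {input}"),
        Step("draft_query", "Write the query for: {input}"),
    ]
    for name, output in run_pipeline(steps, "ACME revenue last quarter", fake_llm).items():
        print(name, "->", output)
```

Because each step has its own prompt and its own recorded output, a weak step can be identified and tuned without touching the others.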

Building Robust, High-Risk AI Systems 07:58

  • Databricks prioritizes enabling high-value use cases with financial or reputational risk.
  • Successful enterprise AI systems require:
    • Governance: Controlling access to data, models, tools, and queries at a granular level.
    • Evaluation: Quantifying and improving model/system accuracy objectively.
  • Many organizations try to build deterministic systems from inherently probabilistic AI components, which requires careful management.

Governance and Deterministic AI Systems 09:11

  • Databricks governs not just data but also models and tool access.
  • Agents are treated as principals (identities that are granted permissions), and all of their actions are tightly controlled; further capabilities are forthcoming.
  • Incorporating vector stores or feature stores for reasoning over data is standard.
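
As a rough illustration of treating an agent as a principal, the following Python sketch applies a default-deny permission check before any action runs. It is not Databricks' governance API: `Principal`, `AccessPolicy`, `guarded_call`, and the resource names are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple


@dataclass(frozen=True)
class Principal:
    name: str
    kind: str  # "user" or "agent"


@dataclass
class AccessPolicy:
    # Explicit (principal name, resource, action) grants; anything else is denied.
    grants: Set[Tuple[str, str, str]] = field(default_factory=set)

    def grant(self, principal: Principal, resource: str, action: str) -> None:
        self.grants.add((principal.name, resource, action))

    def is_allowed(self, principal: Principal, resource: str, action: str) -> bool:
        return (principal.name, resource, action) in self.grants


def guarded_call(policy: AccessPolicy, principal: Principal, resource: str,
                 action: str, fn: Callable[[], str]) -> str:
    """Run fn only if the principal holds a grant for this resource and action."""
    if not policy.is_allowed(principal, resource, action):
        raise PermissionError(f"{principal.name} may not {action} {resource}")
    return fn()


if __name__ == "__main__":
    policy = AccessPolicy()
    agent = Principal("report-agent", "agent")  # hypothetical agent identity
    policy.grant(agent, "sales_table", "read")
    print(guarded_call(policy, agent, "sales_table", "read", lambda: "rows returned"))
```

The same check applies whether the principal is a person or an agent, which is the point of giving agents their own identity rather than letting them inherit broad access.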

Tool Calling and Integration with Claude 10:57

  • Tool calling allows LLMs to select among various tools or pathways, supporting quasi-deterministic outcomes.
  • The integration of Claude (frontier LLM by Anthropic) into Databricks has improved the accuracy of tool selection.
  • Claude is natively available across all major clouds (Azure, AWS, GCP) within Databricks, supporting advanced agent use cases.
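
The tool-calling pattern can be sketched in Python as follows. This is an assumption-laden illustration, not the Claude or Databricks API: the tool names, the `fake_model_select` keyword rule standing in for the model's choice, and the `dispatch` helper are all hypothetical. The point it shows is that the model only picks among registered tools, while the tools themselves are ordinary deterministic functions.

```python
from typing import Callable, Dict


def lookup_price(argument: str) -> str:
    # Deterministic tool: a real version would query a governed data source.
    return f"price lookup for: {argument}"


def search_filings(argument: str) -> str:
    return f"filing search for: {argument}"


# Registry of the only tools the model is allowed to select.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_price": lookup_price,
    "search_filings": search_filings,
}


def fake_model_select(question: str) -> dict:
    # Hypothetical stand-in for the LLM's tool choice (a simple keyword rule).
    name = "lookup_price" if "price" in question.lower() else "search_filings"
    return {"tool": name, "argument": question}


def dispatch(call: dict, tools: Dict[str, Callable[[str], str]]) -> str:
    """Validate the requested tool against the registry, then run it."""
    name = call.get("tool")
    if name not in tools:
        raise ValueError(f"model requested unknown tool: {name!r}")
    return tools[name](call["argument"])


if __name__ == "__main__":
    print(dispatch(fake_model_select("What is the price of ACME?"), TOOLS))
```

Rejecting any tool name outside the registry keeps the probabilistic part of the system confined to the selection step, so selection accuracy becomes the main quantity to measure.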

Enterprise Adoption and Value Creation 13:47

  • Highly governed industries (banks, hospitals) are now able to use generative AI after implementing robust controls.
  • Databricks and Claude together enable customers to unlock high-value use cases, moving AI from experimental to operational stages.

Evaluation Platform for AI Quality 16:01

  • Databricks provides an evaluation (eval) platform involving golden data sets and LLM-based judges for assessing system performance.
  • The platform includes simplified UIs for subject matter experts to provide feedback and corrections.
  • Much of the evaluation tooling is open source via MLflow, though some custom judges remain proprietary.
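
A golden-data-set evaluation with a pluggable judge might look like the Python sketch below. It deliberately avoids the MLflow API and is not Databricks' eval platform: `GoldenExample`, `evaluate`, and `exact_match_judge` are hypothetical names, and the exact-match judge is a trivial stand-in for an LLM-based judge.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GoldenExample:
    question: str
    expected: str


# A judge takes (question, expected, actual) and returns pass or fail.
Judge = Callable[[str, str, str], bool]


def exact_match_judge(question: str, expected: str, actual: str) -> bool:
    # Stand-in for an LLM-based judge: case-insensitive exact match.
    return expected.strip().lower() == actual.strip().lower()


def evaluate(system: Callable[[str], str], golden: List[GoldenExample],
             judge: Judge) -> float:
    """Return the fraction of golden examples the system answers acceptably."""
    if not golden:
        raise ValueError("golden data set is empty")
    passed = sum(
        1 for ex in golden if judge(ex.question, ex.expected, system(ex.question))
    )
    return passed / len(golden)


if __name__ == "__main__":
    golden = [
        GoldenExample("capital of France?", "Paris"),
        GoldenExample("2 + 2?", "4"),
    ]
    answers = {"capital of France?": "paris", "2 + 2?": "5"}
    score = evaluate(lambda q: answers[q], golden, exact_match_judge)
    print(f"accuracy: {score:.0%}")
```

Keeping the judge as a swappable function means subject matter experts' corrections can extend the golden set without changing the scoring loop.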

Practical Examples and Customer Success 17:53

  • Databricks uses Claude to automate answering extensive analyst questionnaires (e.g., from Gartner and Forrester), reducing the work from hundreds of employee hours to simple editing.
  • The team iterated from open-source models to other non-Anthropic models before settling on Claude, which produced shippable, high-quality results.
  • Block (formerly Square) uses Databricks and Claude to power "Goose," an open-source agentic developer environment, with reported weekly user adoption increases of 40-50% and savings of 8-10 hours per week.

Best Practices and Final Recommendations 21:49

  • Enterprises are encouraged to identify AI use cases, define success metrics, and contact Databricks or Anthropic for tailored support.
  • Building composable agentic systems allows for greater control and tuning in high-risk environments.
  • The approach involves decomposing problems and building deterministic behaviors atop probabilistic LLMs.
  • Databricks aims to tightly integrate AI and data layers for next-level productivity gains.
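
One common way to get deterministic behavior on top of a probabilistic component is to validate every output and retry or fall back when validation fails. The Python sketch below illustrates that general pattern only; the talk summary does not say this is the specific mechanism Databricks uses, and `call_with_validation` and `flaky_generate` are hypothetical names.

```python
from typing import Callable, Optional


def call_with_validation(
    generate: Callable[[int], str],
    validate: Callable[[str], bool],
    max_attempts: int = 3,
    fallback: Optional[str] = None,
) -> Optional[str]:
    """Return the first candidate that passes validation, else the fallback."""
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if validate(candidate):
            return candidate
    return fallback


def flaky_generate(attempt: int) -> str:
    # Hypothetical stand-in for a model call: malformed output on the
    # first two attempts, a well-formed answer from the third onward.
    return "42" if attempt >= 2 else "about forty-two"


if __name__ == "__main__":
    # Succeeds on the third attempt.
    print(call_with_validation(flaky_generate, str.isdigit))
    # Runs out of attempts and returns the explicit fallback instead.
    print(call_with_validation(flaky_generate, str.isdigit,
                               max_attempts=2, fallback="ESCALATE"))
```

The caller always receives either a validated value or an explicit fallback, so downstream code never has to handle malformed model output.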

Audience Questions and Closing Thoughts 22:58

  • The "safe score" produced by LLM judges acts as a simple guardrail; it is not an adversarial (red-teaming) metric.
  • Differentiation from competitors is rooted in deep integration between data and AI rather than point solutions.
  • The speaker encourages the use of composable, agent-based architectures for fine-grained control and error mitigation.
  • Final thanks and invitation for further discussion.