Ollama Gets a Turbo Update

Ollama’s Birthday and New App Launch 00:00

  • Ollama celebrated its second birthday at the ICML conference in Vancouver, where it hosted a booth and a party.
  • The main announcement was a new desktop app that goes beyond the existing menu-bar interface in usability and features.
  • The new app’s interface lets users browse and chat with multiple models directly.
  • First-time use of any model triggers an automatic download within the app (a script-level equivalent is sketched below).
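As a rough script-level parallel to the app’s first-use behavior, the official `ollama` Python package (pip install ollama) can pull a model on demand and then chat with it. The model tag below is illustrative; any tag from the Ollama library works the same way.

    # Minimal sketch: mirror the app's first-use behavior by pulling the
    # model, then chatting with it. "gemma3" is an illustrative tag.
    import ollama

    model = "gemma3"

    ollama.pull(model)  # downloads the model if it is not already on disk

    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print(response["message"]["content"])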

Key Features and User Experience 01:10

  • The interface works much like ChatGPT, supporting chat-style interactions and showing a “thinking” status while responses are generated.
  • Supports retrieval-augmented generation (RAG): users can chat with PDFs, images, and other files as source context.
  • Users can drag and drop multiple files into a model’s context, such as slides or books, though context limits may apply.
  • The app provides settings for adjusting model context size and managing file locations (see the sketch after this list).
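The app handles PDF and image parsing itself, and the video does not detail how. As a hedged approximation for plain-text sources, a script can inline a file’s contents into the prompt and enlarge the context window via the standard `num_ctx` option. The filename and context size here are made up for illustration.

    # Hedged approximation of "chat with a file": inline plain text as
    # context. The app does this (plus PDF/image parsing) through its GUI;
    # this is only a script-level analogue. File name and num_ctx value
    # are illustrative.
    import ollama
    from pathlib import Path

    notes = Path("slides.txt").read_text()  # hypothetical exported notes

    response = ollama.chat(
        model="gemma3",
        messages=[{
            "role": "user",
            "content": f"Using these notes as context:\n\n{notes}\n\nSummarize the key points.",
        }],
        options={"num_ctx": 8192},  # raise the context window for long documents
    )
    print(response["message"]["content"])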

Introduction of Turbo Mode (Cloud Models) 02:58

  • Previously, Ollama only ran smaller models locally; “turbo mode” adds access to larger models hosted in the cloud.
  • Turbo mode enables fast interaction with models like Kimi K2 directly from the app, with no personal GPU or API setup required (a speculative client-side sketch follows this list).
  • To the presenter’s knowledge, conversations using turbo mode are not stored in the cloud.
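The video does not show turbo mode’s wire protocol. If the hosted service speaks the same API as a local Ollama server, pointing the client at ollama.com with an account credential might be enough; the host URL, auth header, and model tag below are all assumptions, not confirmed details.

    # Speculative sketch: assumes turbo mode exposes the normal Ollama chat
    # API at a hosted endpoint. Host URL, auth header, and model tag are
    # assumptions for illustration only.
    from ollama import Client

    client = Client(
        host="https://ollama.com",                           # assumed hosted endpoint
        headers={"Authorization": "Bearer <YOUR_API_KEY>"},  # placeholder credential
    )

    response = client.chat(
        model="kimi-k2",  # hypothetical tag for the hosted Kimi K2 model
        messages=[{"role": "user", "content": "Hello from turbo mode"}],
    )
    print(response["message"]["content"])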

Account Requirements and Credits 04:16

  • Turbo mode requires creating an ollama.com account.
  • A free plan grants 10,000 credits (interpreted as tokens) per 7 days.
  • Users can upgrade to a Pro plan, though pricing details were not finalized in this pre-release version.
  • The founders intend turbo mode not as a major source of revenue but as a way to fill gaps for users who need larger models.

Model Options and Future Outlook 05:07

  • Users have access to the Kimi K2 model and the Qwen 3 mixture-of-experts models (both large and small variants).
  • The app allows importing custom models to work with images and PDFs.
  • Ollama’s app now runs on its own engine, not just llama.cpp.
  • The new app aims to appeal to users preferring a graphical interface over the command line.
  • The presenter anticipates that future updates will bring additional features.
  • The release offers a quick way to test both local and large open models before integrating with other APIs or systems; a short sketch of swapping between model tags follows.
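Because local and hosted models share one chat interface, “testing before integrating” can amount to swapping a model tag in a loop. The tags below are illustrative, not a confirmed list of what is offered.

    # Sketch: trying several models is just a tag swap against one chat
    # call. Tags are illustrative; substitute whatever is installed or
    # available to your account.
    import ollama

    prompt = [{"role": "user", "content": "Suggest one test case for a RAG pipeline."}]

    for model in ("qwen3", "qwen3:32b"):  # small vs. larger variant (illustrative)
        reply = ollama.chat(model=model, messages=prompt)
        print(f"--- {model} ---")
        print(reply["message"]["content"])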