Ollama Gets a New App

Introduction and Event Overview 00:00

  • The speaker attended the ICML conference in Vancouver, where Ollama had a booth and celebrated its second anniversary.
  • Ollama used the occasion to preview the launch of a new app, moving beyond its previous menu-bar-based interface.

New Ollama App Features 00:35

  • The new Ollama app allows direct access to various models from within the application interface.
  • Models automatically download when first used if not already present.
  • The interface operates like a ChatGPT-style chat, with a visible “thinking” status during processing.
  • Users can interact with PDFs, images, and other files by dragging them into the app, enabling context-based queries.
  • The system works with multiple files, though there may be file number or size limits.
  • Settings allow customization such as context size and specifying model storage locations.
  • All chat histories are accessible within the app.
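The features above map onto Ollama's local REST API, which the app sits on top of. As a minimal sketch, the function below builds a request for the `/api/chat` endpoint, prepending a file's extracted text as context (roughly what happens when you drag a document into the app) and setting the context size via `options.num_ctx`, as in the app's settings. The endpoint and fields are Ollama's documented API; the prompt template and default context size here are illustrative choices.

```python
import json

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint

def build_chat_request(model, question, context_text=None, num_ctx=8192):
    """Build a payload for Ollama's /api/chat endpoint.

    If context_text is given (e.g. text extracted from a dropped PDF),
    it is prepended to the question so the model can answer
    context-based queries about the file.
    """
    content = question
    if context_text:
        content = (
            f"Use this document as context:\n\n{context_text}\n\n"
            f"Question: {question}"
        )
    return {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "options": {"num_ctx": num_ctx},  # context window size, as in app settings
        "stream": False,
    }

# To actually send it, a running Ollama server is required:
#   import urllib.request
#   req = urllib.request.Request(
#       OLLAMA_URL,
#       json.dumps(build_chat_request("llama3.2", "Summarize this.")).encode(),
#       {"Content-Type": "application/json"},
#   )
#   print(json.loads(urllib.request.urlopen(req).read())["message"]["content"])
```

The model tag (`llama3.2`) is only an example; any model pulled locally works, and Ollama downloads it on first use if it is not already present.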

Turbo Mode and Cloud Models 02:52

  • Ollama’s new app introduces a “turbo mode,” enabling use of cloud-hosted models (not limited to local resources).
  • Turbo mode allows fast interaction with larger models, such as Kimi K2, directly from the cloud.
  • This feature provides access to big models without requiring users to set up GPUs or API endpoints.
  • Conversations are kept in the user’s chat system and, according to the speaker, are not stored in the cloud.
  • Using turbo mode requires an Ollama.com account.
  • The free plan includes 10,000 credits per rolling 7-day period, with an option to upgrade to Pro.
  • There is no indication that the turbo feature is intended as a major revenue source; the primary aim is to provide users with access to bigger models.
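A cloud-backed call would presumably reuse the same chat-payload shape with account authentication added. The sketch below is speculative: the host URL and bearer-token header are assumptions, not documented turbo-mode details, and only the message/payload structure mirrors Ollama's standard chat API.

```python
def build_turbo_request(model, prompt, api_key):
    """Sketch of a cloud ("turbo") chat call.

    ASSUMPTIONS: the endpoint URL and Bearer-token auth scheme are
    guesses for illustration; consult Ollama's documentation for the
    real cloud API details. Only the payload shape follows the
    standard local /api/chat format.
    """
    url = "https://ollama.com/api/chat"  # hypothetical cloud endpoint
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumed auth scheme
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    return url, headers, payload
```

The api_key would come from the required Ollama.com account; per the speaker, conversations stay in the local chat system rather than being stored in the cloud.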

Supported Models and Customization 05:02

  • Available models include Kimi K2 and Qwen 3 Mixture-of-Experts, in both a large (235B) and a smaller (30B total, 3B active) version.
  • Users can add their own models, including those suitable for images or PDFs.
  • The app is positioned as particularly useful for people who prefer not to use command-line interfaces.
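For image-capable models, Ollama's chat API accepts base64-encoded images in a message's `images` field, which is the API-level equivalent of dragging a picture into the app. A minimal sketch (the `llava` model tag in the usage note is just an example of a vision-capable model):

```python
import base64

def build_image_request(model, question, image_path):
    """Build a chat payload that attaches an image to the message.

    Ollama's /api/chat accepts a list of base64-encoded images on a
    message; vision-capable models can then answer questions about them.
    """
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": question,
            "images": [encoded],  # base64 image data, per the Ollama API
        }],
        "stream": False,
    }

# Usage (with a running server and a vision model pulled):
#   payload = build_image_request("llava", "What is in this image?", "photo.png")
```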

Technical Details and Release Context 05:32

  • The new app is based on Ollama’s own engine, not exclusively on llama.cpp.
  • Further updates and new features are anticipated.
  • The updated interface enables rapid testing of both local and cloud-based large open models, helping users evaluate models before committing to API use.

Conclusion 06:07

  • The speaker encourages viewers to try out the new Ollama app and provides standard video sign-off.