Introduction and Event Overview 00:00
- The speaker attended the ICML conference in Vancouver, where Ollama had a booth and celebrated its second anniversary.
- Ollama used the occasion to preview-launch a new app, moving beyond its previous menu-bar-based interface.
New Ollama App Features 00:35
- The new Ollama app allows direct access to various models from within the application interface.
- Models automatically download when first used if not already present.
- The interface operates like a ChatGPT-style chat, with a visible “thinking” status during processing.
- Users can interact with PDFs, images, and other files by dragging them into the app, enabling context-based queries.
- The system works with multiple files, though there may be file number or size limits.
- Settings allow customization such as context size and specifying model storage locations.
- All chat histories are accessible within the app.
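The desktop app fronts the same local Ollama server that the CLI uses (default `http://localhost:11434`). As a rough sketch of the equivalent API call, the snippet below builds a `/api/chat` request whose `num_ctx` option mirrors the app's context-size setting; the model tag and prompt are illustrative assumptions, not taken from the video.

```python
import json
from urllib import request

# Default local endpoint of the Ollama server the app sits on top of.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Build a /api/chat payload; num_ctx mirrors the app's context-size setting."""
    return {
        "model": model,  # pulled automatically on first use if not present
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": num_ctx},
        "stream": False,
    }

def send(payload: dict) -> dict:
    """POST to the local server (requires a running `ollama serve`)."""
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# The model storage location can be redirected before starting the server,
# matching the app's storage-location setting, e.g.:
#   OLLAMA_MODELS=/mnt/big-disk/ollama-models ollama serve

payload = build_chat_request("qwen3:30b", "Summarize the attached notes", num_ctx=16384)
```

`send(payload)` would only work against a live local server, so it is left uncalled here.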
Turbo Mode and Cloud Models 02:52
- Ollama’s new app introduces a “turbo mode,” enabling use of cloud-hosted models (not limited to local resources).
- Turbo mode allows fast interaction with larger models, such as Kimi K2, served directly from the cloud.
- This feature provides access to big models without requiring users to set up GPUs or API endpoints.
- Conversations are kept in the user’s chat system and, according to the speaker, are not stored in the cloud.
- Using turbo mode requires an Ollama.com account.
- The free plan includes 10,000 credits per rolling 7-day period, with an option to upgrade to Pro.
- There is no indication that the turbo feature is intended as a major revenue source; the primary aim is to provide users with access to bigger models.
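One way to picture the turbo workflow is as a routing decision between a cloud-hosted model (metered by the weekly credit allowance) and a local fallback. The `-cloud` tag suffix and the credit bookkeeping below are illustrative assumptions for this sketch, not the documented Ollama interface.

```python
# Free plan described in the video: 10,000 credits per rolling 7-day period.
WEEKLY_FREE_CREDITS = 10_000

def pick_model(estimated_cost: int, credits_left: int) -> str:
    """Use a cloud-hosted model while credits last, else fall back locally.

    Both model tags are hypothetical examples, not confirmed names.
    """
    if credits_left >= estimated_cost:
        return "kimi-k2-cloud"  # large cloud-hosted model (hypothetical tag)
    return "qwen3:30b"          # smaller local mixture-of-experts fallback

choice_full = pick_model(estimated_cost=500, credits_left=WEEKLY_FREE_CREDITS)
choice_empty = pick_model(estimated_cost=500, credits_left=100)
```

The point of the sketch is only that, per the video, the same chat flow reaches both local and cloud models, with an Ollama.com account gating the cloud side.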
Supported Models and Customization 05:02
- Available models include Kimi K2 and Qwen 3 Mixture-of-Experts, in both large (235B) and smaller (30B total, 3B active) versions.
- Users can add their own models, including those suitable for images or PDFs.
- The app is positioned as particularly useful for people who prefer not to use command-line interfaces.
Technical Details and Release Context 05:32
- The new app is built on Ollama’s own engine rather than relying exclusively on llama.cpp.
- Further updates and new features are anticipated.
- The updated interface enables rapid testing of both local and cloud-based large open models, helping users evaluate models before committing to API use.
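The "evaluate before committing" workflow the speaker describes can be sketched as running one prompt against several model tags through the same endpoint and comparing wall-clock latency. The model tags and the injected `chat` callable are assumptions for illustration; in practice `chat` would POST to the local `/api/chat` endpoint.

```python
import time
from typing import Callable

def compare_models(models: list[str], prompt: str,
                   chat: Callable[[str, str], str]) -> dict[str, float]:
    """Send the same prompt to each model tag and record elapsed seconds."""
    results: dict[str, float] = {}
    for model in models:
        start = time.perf_counter()
        chat(model, prompt)  # in real use: POST /api/chat with this model tag
        results[model] = time.perf_counter() - start
    return results

# Usage with a stand-in chat function (no server required):
timings = compare_models(["qwen3:30b", "kimi-k2"], "hello",
                         chat=lambda m, p: f"[{m}] reply")
```

Because local and cloud models share one interface, the same harness covers both sides of the comparison.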
Conclusion 06:07
- The speaker encourages viewers to try out the new Ollama app and provides standard video sign-off.