Introduction and Event Overview 00:00
- The speaker attended the ICML conference in Vancouver, where Ollama had a booth and celebrated its second anniversary.
- Ollama used the occasion to preview-launch a new app, moving beyond its previous menu-bar-based interface.
New Ollama App Features 00:35
- The new Ollama app allows direct access to various models from within the application interface.
- Models automatically download when first used if not already present.
- The interface operates like a ChatGPT-style chat, with a visible “thinking” status during processing.
- Users can interact with PDFs, images, and other files by dragging them into the app, enabling context-based queries.
- The system works with multiple files, though there may be file number or size limits.
- Settings allow customization such as context size and specifying model storage locations.
- All chat histories are accessible within the app.
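The desktop app fronts the same local Ollama server that the CLI uses (default `http://localhost:11434`). As a rough sketch of the equivalent API call, the snippet below builds a `/api/chat` request whose `num_ctx` option mirrors the app's context-size setting; the model tag and prompt are illustrative assumptions, not taken from the video.

```python
import json
from urllib import request

# Default local endpoint of the Ollama server the app sits on top of.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str, num_ctx: int = 8192) -> dict:
    """Build a /api/chat payload; num_ctx mirrors the app's context-size setting."""
    return {
        "model": model,  # pulled automatically on first use if not present
        "messages": [{"role": "user", "content": prompt}],
        "options": {"num_ctx": num_ctx},
        "stream": False,
    }

def send(payload: dict) -> dict:
    """POST to the local server (requires a running `ollama serve`)."""
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# The model storage location can be redirected before starting the server,
# matching the app's storage-location setting, e.g.:
#   OLLAMA_MODELS=/mnt/big-disk/ollama-models ollama serve

payload = build_chat_request("qwen3:30b", "Summarize the attached notes", num_ctx=16384)
```

`send(payload)` would only work against a live local server, so it is left uncalled here.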
Turbo Mode and Cloud Models 02:52
- Ollama’s new app introduces a “turbo mode,” enabling use of cloud-hosted models (not limited to local resources).
- Turbo mode allows fast interaction with larger models, such as Kimi K2, served directly from the cloud.
- This feature provides access to big models without requiring users to set up GPUs or API endpoints.
- Conversations are kept in the user’s chat system and, according to the speaker, are not stored in the cloud.
- Using turbo mode requires an Ollama.com account.
- The free plan includes 10,000 credits per rolling 7-day period, with an option to upgrade to Pro.
- There is no indication that the turbo feature is intended as a major revenue source; the primary aim is to provide users with access to bigger models.
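One way to picture the turbo workflow is as a routing decision between a cloud-hosted model (metered by the weekly credit allowance) and a local fallback. The `-cloud` tag suffix and the credit bookkeeping below are illustrative assumptions for this sketch, not the documented Ollama interface.

```python
# Free plan described in the video: 10,000 credits per rolling 7-day period.
WEEKLY_FREE_CREDITS = 10_000

def pick_model(estimated_cost: int, credits_left: int) -> str:
    """Use a cloud-hosted model while credits last, else fall back locally.

    Both model tags are hypothetical examples, not confirmed names.
    """
    if credits_left >= estimated_cost:
        return "kimi-k2-cloud"  # large cloud-hosted model (hypothetical tag)
    return "qwen3:30b"          # smaller local mixture-of-experts fallback

choice_full = pick_model(estimated_cost=500, credits_left=WEEKLY_FREE_CREDITS)
choice_empty = pick_model(estimated_cost=500, credits_left=100)
```

The point of the sketch is only that, per the video, the same chat flow reaches both local and cloud models, with an Ollama.com account gating the cloud side.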
Supported Models and Customization 05:02
- Available models include Kimi K2 and Qwen 3 Mixture-of-Experts, in both large (235B) and smaller (30B total, 3B active) versions.
- Users can add their own models, including those suitable for images or PDFs.
- The app is positioned as particularly useful for people who prefer not to use command-line interfaces.
Technical Details and Release Context 05:32
- The new app is built on Ollama’s own engine rather than relying exclusively on llama.cpp.
- Further updates and new features are anticipated.
- The updated interface enables rapid testing of both local and cloud-based large open models, helping users evaluate models before committing to API use.
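The "evaluate before committing" workflow the speaker describes can be sketched as running one prompt against several model tags through the same endpoint and comparing wall-clock latency. The model tags and the injected `chat` callable are assumptions for illustration; in practice `chat` would POST to the local `/api/chat` endpoint.

```python
import time
from typing import Callable

def compare_models(models: list[str], prompt: str,
                   chat: Callable[[str, str], str]) -> dict[str, float]:
    """Send the same prompt to each model tag and record elapsed seconds."""
    results: dict[str, float] = {}
    for model in models:
        start = time.perf_counter()
        chat(model, prompt)  # in real use: POST /api/chat with this model tag
        results[model] = time.perf_counter() - start
    return results

# Usage with a stand-in chat function (no server required):
timings = compare_models(["qwen3:30b", "kimi-k2"], "hello",
                         chat=lambda m, p: f"[{m}] reply")
```

Because local and cloud models share one interface, the same harness covers both sides of the comparison.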
Conclusion 06:07
- The speaker encourages viewers to try out the new Ollama app and provides standard video sign-off.