Ollama Gets a Turbo Update
Ollama’s Birthday and New App Launch 00:00
- Ollama celebrated its second birthday at the ICML conference in Vancouver, including hosting a booth and a party.
- The main announcement was a new app providing enhanced usability and features beyond the existing menu bar interface.
- The new app interface lets users browse and run multiple models directly.
- First-time use of any model triggers an automatic download within the app.
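The app's automatic download mirrors what the Ollama CLI (`ollama pull`) and local HTTP API already do. A minimal sketch of the pull request against a local server, assuming the default `localhost:11434` endpoint; the model name is a placeholder and the request is only constructed here, not sent:

```python
import json

# Ollama's local server listens on port 11434 by default (assumption:
# a default install). POST /api/pull downloads a model if missing.
OLLAMA_URL = "http://localhost:11434"

def build_pull_request(model: str) -> dict:
    """Build the JSON body for POST /api/pull."""
    return {"model": model, "stream": False}

body = build_pull_request("gemma3")  # placeholder model name
# Sending it would look like:
#   requests.post(f"{OLLAMA_URL}/api/pull", json=body)
print(json.dumps(body))
```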
Key Features and User Experience 01:10
- The interface functions similarly to ChatGPT, supporting chat-like interactions and displaying “thinking” statuses for responses.
- Supports retrieval-augmented generation (RAG) by allowing users to chat with PDFs, images, and other files as source context.
- Users can drag and drop multiple files for model context, such as slides or books, though there may be limits.
- The app provides settings for adjusting model context size and managing file locations.
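These GUI features map onto Ollama's existing HTTP chat API: attached images are sent as base64 strings in the message, and the context-size setting corresponds to the `num_ctx` option. A minimal sketch, assuming a default local server; the model name and image bytes are placeholders, and the request is constructed but not sent:

```python
import base64
import json

def build_chat_request(model: str, prompt: str, image_bytes: bytes,
                       num_ctx: int = 8192) -> dict:
    """Build a POST /api/chat body with an image attachment and a
    custom context window (num_ctx), per the Ollama API schema."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": prompt,
            # Images are passed as base64-encoded strings.
            "images": [base64.b64encode(image_bytes).decode("ascii")],
        }],
        "options": {"num_ctx": num_ctx},
        "stream": False,
    }

# Placeholder model and fake image bytes, for illustration only.
body = build_chat_request("llava", "What is on this slide?", b"\x89PNG...")
print(json.dumps(body)[:80])
```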
Introduction of Turbo Mode (Cloud Models) 02:58
- Previously, Ollama only ran models locally; the new “turbo mode” adds access to larger models hosted in the cloud.
- Turbo mode allows fast interaction with models like Kimi K2 directly from the app without needing a personal GPU or API setup.
- Conversations using turbo mode are not stored in the cloud to the presenter’s knowledge.
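Per the announcement, turbo mode serves the same chat API from Ollama-hosted GPUs rather than the local machine. The endpoint and auth scheme below are assumptions (details were pre-release): a bearer API key tied to an ollama.com account is a plausible shape, not a confirmed one.

```python
import json

CLOUD_URL = "https://ollama.com"   # assumed cloud endpoint, not confirmed
API_KEY = "YOUR_OLLAMA_API_KEY"    # placeholder credential

def build_cloud_headers(api_key: str) -> dict:
    """Headers for the assumed cloud variant of the chat API; the
    request body itself would be the same shape as the local API."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_cloud_headers(API_KEY)
print(json.dumps(headers))
```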
Account Requirements and Credits 04:16
- Turbo mode requires creating an ollama.com account.
- A free plan grants 10,000 credits (interpreted by the presenter as tokens) every 7 days.
- Users can upgrade to a Pro plan, though pricing details were not finalized in this pre-release version.
- The founders do not intend turbo mode to be a major source of revenue but rather to fill gaps for users needing larger models.
Model Options and Future Outlook 05:07
- Users have access to the Kimi K2 model and Qwen 3 mixture-of-experts models (both large and small variants).
- The app allows importing custom models to work with images and PDFs.
- Ollama’s app now runs on its own engine, not just llama.cpp.
- The new app aims to appeal to users preferring a graphical interface over the command line.
- Anticipated future updates may bring additional features.
- The release offers a quick way to test local and large open models before integrating with other APIs or systems.
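On the point of integrating with other APIs: Ollama's local server also exposes an OpenAI-compatible route under `/v1`, so tools written against the OpenAI chat-completions schema can be pointed at a local or open model by swapping the base URL. A minimal sketch (model name is a placeholder; the request is constructed, not sent):

```python
import json

# Ollama serves an OpenAI-compatible API at /v1 on the local server,
# e.g. POST {BASE_URL}/chat/completions (assumes a default install).
BASE_URL = "http://localhost:11434/v1"

def build_openai_style_request(model: str, prompt: str) -> dict:
    """Build a chat-completions body in the OpenAI schema."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_openai_style_request("qwen3", "Summarize this document")
print(json.dumps(body))
```

This makes it easy to trial a local model first, then switch an existing OpenAI-based integration over with a one-line base-URL change.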