yego.me
💡 Stop wasting time. Read Youtube instead of watch. Download Chrome Extension

Cracking the Enigma of Ollama Templates


5m read
·Mar 21, 2025

In olama, the template and parameters and maybe the system prompt and license are all part of the model. This is one of the strengths of the platform and it's why it's the easiest to use for beginners and advanced users alike. Every other tool that runs AI models leaves it up to the user to determine the right template to use. The tool can guess, but you don't have to do that.

For many new models to find that the first guess isn't always the best, but when you import any supported Hugging Face model, either using the manual conversion method or the newer Hugging Face convenience method, you often need to come up with a valid template on your own. This can be a real paino for users who see a complicated template on any other model and not know where to start for the one they're importing. So, let's take a look at how templates work and how to build a new one. This is the first of the more advanced topics in the olama course and will be added to a new playlist, but it's still part of the same olama course.

This is a course I've been building out based on the most common questions on the olama Discord server. I've been in the Discord starting at the beginning since I was part of the founding olama team. I'm now focused on building out my YouTube channel and helping people learn how to use olama more effectively. This is the stuff I love to do after being a trainer at Open Text and then starting the training and documentation and Community teams at Data Dog.

So, if you're new to olama or just want a deeper understanding of how templates work, this is the perfect place to start. There are a lot of really complicated templates out there, so we're going to start out with one of the first templates that happens to also be the simplest. At the beginning of ama's Journey, Llama 2 from Meta had just been released in the docs; we see the model expects the prompt to be in this format.

The reason it expects the format is that it's the way the model was trained, so we see the whole prompt is wrapped in an ins block and then the system prompt is in the Sy block, and then after that is the user prompt. In the template used in olama, this shows up with the same structure, replacing system prompt with do system in double curly brackets and prompt also in double curly brackets. This is our introduction to the specific template language that olama uses: Go templates. All instructions in Go templates are wrapped in double curly brackets, and a word with a period at the beginning indicates that this is a variable that'll be injected into the template.

If you run the Llama server in debug mode, we can see the template working. Try asking the usual questions like, "Why is the sky blue?" and in the debug output we see this. Let's set the system prompt to be, "You are an AI that explains everything at a third grade level." Now, when we ask, "What is a black hole?" we can see this in the debug output. Let's go to a model that's a bit newer and it uses a slightly more complicated template; this is Orca Mini.

Different models often use different prompt formats, and the older models often use very different formats. What's interesting about this one is that the hashash system: will only appear if the prompt, if the system prompt is defined; otherwise, the prompt will just be three hashes, then user: then the prompt, then three hashes and response: to get this. We see the if statement that says if the system variable is defined, then show the system block, and that's ended with the end instruction. Notice also the dash at the beginning of the instruction; the dash says to trim white space from before that instruction.

Okay, let's fast forward a bit more in time to the first mistal. This template is much more complicated, but you know what? It isn't complicated. You know where I'm going with this, right? Liking the video and subscribing to the channel goes a long way to support my work here, and it's so not complicated. This template is, but it uses a lot of the same basic principles as the previous templates.

One aspect that makes it difficult to understand is that spacing isn't important. So, let's tweak the template here to look like this. What's new here is the range instruction; range is basically a "for each" instruction. We see it at the top with messages, for each message in the array. First, check if the message is a user role, and then if there are any tools defined. If tools is defined and we have at least one message, then list the tools; and if system is defined and we have at least one message, then show that.

Then it checks if the message is an assistant role, meaning the previous output from a model. If the message is content, then output the content, but if it's a tool call, then display each in a special format. Then finally check if the message has a tool role and output that. Then, going back to the top: if messages is not defined, then we have a simple system prompt and user prompt, so output that format. I don't love this template because there are a few things I find a bit clunky about it.

If we Zoom forward to some of the latest models like Small LM2, I feel like the template is easier to follow and understand. It's more streamlined and uses fewer nested blocks. I think that's a good direction for templates to go in, making them simpler and more intuitive for users. When you import a new model from Hugging Face that doesn't have a template, start simple like the first one we saw in this video—just work with the system and user prompts.

Create the model and test if it works; if it does, start working with the range of messages. When you get that working, add tools to the template, assuming tools are supported by the model. In many cas, the template may be the same as one of the pre-existing templates in ama. Look at the recent models and see if there's a similar structure; if you don't find an exact match, maybe you can adapt one of the existing templates or at least find inspiration in them.

You might think it would be great if all models could just agree on a single template to use, but that would also ensure there's no progress in making better templates. So, it can be a challenge to create new ones, but hopefully now you understand how to read a template and understand what it's doing. The next step is to just practice building them. There is a doc that goes into detail on different aspects of templates and lists all the variables available; you can find that here.

And that's everything you need to know to start building out templates, or at least being able to read them. And if you've watched all the videos in the course, you know a lot about using ama; if not, check out the rest of the course and let me know what you think. Thanks so much for being here. Goodbye.

More Articles

View All
The Most Disturbing Reality TV Show of All Time
[Applause] What else? Yeah, let’s see. What else would you walk around naked with 17 million people watching, including your friends and family? Get locked away for 15 months and have zero contact with the outside world, and have to choose between starvat…
Stealth Wealth (Explained)
They say to live happy, live hidden. Something you’re not yet aware of is happening in the markets, and the implication it has will most likely impact you. Rich people are changing their behaviors to accommodate the current moment in time, and the average…
Measuring lengths in different units
So I have the same green rectangle up here and down here, and what I want to do is measure its width. But we’re going to measure its width in two different ways. Up here, we’re going to measure its width in terms of how many of these paper clips wide the …
Linear vs. exponential growth: from data (example 2) | High School Math | Khan Academy
The temperature of a glass of warm water after it’s put in a freezer is represented by the following table. So we have time in minutes and then we have the corresponding temperature at different times in minutes. Which model for C of T, the temperature of…
Ian Somerhalder Goes on a Sub Adventure | Years of Living Dangerously
[Music] I’m aboard this amazing research and filming ship called the Aluia. It’s equipped with two deep diving submersibles. There’s one behind me, the Triton, and behind that is the Deep Rover, a two-man submarine. Both subs are rated for 1000 meters. We…
Basic Site Navigation on Khan Academy
In this video, we will browse through Khan Academy together. We will start by logging into the platform and reviewing some of the key navigation features together. To get started, go to khanacademy.org and click “Teachers” in the center of the screen. If …