yego.me
💡 Stop wasting time. Read Youtube instead of watch. Download Chrome Extension

Cracking the Enigma of Ollama Templates


5m read
·Mar 21, 2025

In olama, the template and parameters and maybe the system prompt and license are all part of the model. This is one of the strengths of the platform and it's why it's the easiest to use for beginners and advanced users alike. Every other tool that runs AI models leaves it up to the user to determine the right template to use. The tool can guess, but you don't have to do that.

For many new models to find that the first guess isn't always the best, but when you import any supported Hugging Face model, either using the manual conversion method or the newer Hugging Face convenience method, you often need to come up with a valid template on your own. This can be a real paino for users who see a complicated template on any other model and not know where to start for the one they're importing. So, let's take a look at how templates work and how to build a new one. This is the first of the more advanced topics in the olama course and will be added to a new playlist, but it's still part of the same olama course.

This is a course I've been building out based on the most common questions on the olama Discord server. I've been in the Discord starting at the beginning since I was part of the founding olama team. I'm now focused on building out my YouTube channel and helping people learn how to use olama more effectively. This is the stuff I love to do after being a trainer at Open Text and then starting the training and documentation and Community teams at Data Dog.

So, if you're new to olama or just want a deeper understanding of how templates work, this is the perfect place to start. There are a lot of really complicated templates out there, so we're going to start out with one of the first templates that happens to also be the simplest. At the beginning of ama's Journey, Llama 2 from Meta had just been released in the docs; we see the model expects the prompt to be in this format.

The reason it expects the format is that it's the way the model was trained, so we see the whole prompt is wrapped in an ins block and then the system prompt is in the Sy block, and then after that is the user prompt. In the template used in olama, this shows up with the same structure, replacing system prompt with do system in double curly brackets and prompt also in double curly brackets. This is our introduction to the specific template language that olama uses: Go templates. All instructions in Go templates are wrapped in double curly brackets, and a word with a period at the beginning indicates that this is a variable that'll be injected into the template.

If you run the Llama server in debug mode, we can see the template working. Try asking the usual questions like, "Why is the sky blue?" and in the debug output we see this. Let's set the system prompt to be, "You are an AI that explains everything at a third grade level." Now, when we ask, "What is a black hole?" we can see this in the debug output. Let's go to a model that's a bit newer and it uses a slightly more complicated template; this is Orca Mini.

Different models often use different prompt formats, and the older models often use very different formats. What's interesting about this one is that the hashash system: will only appear if the prompt, if the system prompt is defined; otherwise, the prompt will just be three hashes, then user: then the prompt, then three hashes and response: to get this. We see the if statement that says if the system variable is defined, then show the system block, and that's ended with the end instruction. Notice also the dash at the beginning of the instruction; the dash says to trim white space from before that instruction.

Okay, let's fast forward a bit more in time to the first mistal. This template is much more complicated, but you know what? It isn't complicated. You know where I'm going with this, right? Liking the video and subscribing to the channel goes a long way to support my work here, and it's so not complicated. This template is, but it uses a lot of the same basic principles as the previous templates.

One aspect that makes it difficult to understand is that spacing isn't important. So, let's tweak the template here to look like this. What's new here is the range instruction; range is basically a "for each" instruction. We see it at the top with messages, for each message in the array. First, check if the message is a user role, and then if there are any tools defined. If tools is defined and we have at least one message, then list the tools; and if system is defined and we have at least one message, then show that.

Then it checks if the message is an assistant role, meaning the previous output from a model. If the message is content, then output the content, but if it's a tool call, then display each in a special format. Then finally check if the message has a tool role and output that. Then, going back to the top: if messages is not defined, then we have a simple system prompt and user prompt, so output that format. I don't love this template because there are a few things I find a bit clunky about it.

If we Zoom forward to some of the latest models like Small LM2, I feel like the template is easier to follow and understand. It's more streamlined and uses fewer nested blocks. I think that's a good direction for templates to go in, making them simpler and more intuitive for users. When you import a new model from Hugging Face that doesn't have a template, start simple like the first one we saw in this video—just work with the system and user prompts.

Create the model and test if it works; if it does, start working with the range of messages. When you get that working, add tools to the template, assuming tools are supported by the model. In many cas, the template may be the same as one of the pre-existing templates in ama. Look at the recent models and see if there's a similar structure; if you don't find an exact match, maybe you can adapt one of the existing templates or at least find inspiration in them.

You might think it would be great if all models could just agree on a single template to use, but that would also ensure there's no progress in making better templates. So, it can be a challenge to create new ones, but hopefully now you understand how to read a template and understand what it's doing. The next step is to just practice building them. There is a doc that goes into detail on different aspects of templates and lists all the variables available; you can find that here.

And that's everything you need to know to start building out templates, or at least being able to read them. And if you've watched all the videos in the course, you know a lot about using ama; if not, check out the rest of the course and let me know what you think. Thanks so much for being here. Goodbye.

More Articles

View All
Hypothesis test for difference in proportions example | AP Statistics | Khan Academy
We are told that researchers suspect that myopia, or nearsightedness, is becoming more common over time. A study from the year 2000 showed 132 cases of myopia in 400 randomly selected people. A separate study from 2015 showed 228 cases in 600 randomly sel…
Why Geeks are Sexy: The Wing Girls
Hey Vsauce! I’ve got something special for you today. I’m sure you’ve heard of a wingman before, but have you ever heard of a wing girl? Well, guess what? There’s two of them right now! They met with Ben and Mark in LA like a few weeks ago, and I said, “H…
These Men Love Extraordinarily Dull Things | Short Film Showcase
We formed the Dolan’s Club a while back. We got tired of reading and hearing so much about people always trying to get a fancier car, a bigger house, uh, travel to more exotic places, and come home and tell everybody they go to Las Vegas and come back sai…
London dispersion forces introduction | States of matter | High school chemistry | Khan Academy
What we’re going to do in this video is start talking about forces that exist between even neutral atoms or neutral molecules. The first of these intermolecular forces we will talk about are London dispersion forces. So, it sounds very fancy, but it’s act…
Envy Can Be Useful, or It Can Eat You Alive
Do you want to tell us about some of the jobs that you had as a youth and the specific job that kicked off your fanatical obsession with creating wealth? This gets a little personal, and I don’t want to do the humble brag thing. There was some thread goin…
Mapping a Mayan Crypt | Lost Cities with Albert Lin
I’m deep inside an ancient pyramid on the trail of a mysterious Maya dynasty called the Snake Kings. I’m so far into the heart of the pyramid my radio doesn’t work. Within these twisting tunnels, it’s impossible to know just how deep I am. But if my team …