If a model is not available in the official Ollama library, or if you want to use a specific GGUF file from Hugging Face, you can download it manually and then import it into Ollama.
Step 1: Download the GGUF file from Hugging Face
First, find the target model’s GGUF file on Hugging Face. You will usually see multiple quantized versions, such as:
- Q4_K_M
- Q5_K_M
- Q8_0
Which version you choose depends on your VRAM, RAM, and your tradeoff between speed and quality. After downloading, place the .gguf file in a fixed directory so you can reference it from the Modelfile.
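To get a feel for the tradeoff, you can roughly estimate on-disk (and loaded) size from the parameter count and the quantization level. The bits-per-weight figures below are approximate assumptions, not exact values from llama.cpp; real files also carry metadata and keep some tensors at higher precision.

```python
# Rough file-size estimate for common GGUF quantization levels.
# The bits-per-weight values are approximate assumptions; actual
# files vary because some tensors stay at higher precision.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}

def estimate_gib(n_params_billion: float, quant: str) -> float:
    """Estimate on-disk size in GiB for a model with the given
    parameter count (in billions) at the given quantization level."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    total_bytes = n_params_billion * 1e9 * bits / 8
    return total_bytes / (1024 ** 3)

for q in APPROX_BITS_PER_WEIGHT:
    print(f"8B model at {q}: ~{estimate_gib(8, q):.1f} GiB")
```

An 8B model comes out to roughly 4.5 GiB at Q4_K_M versus roughly 8 GiB at Q8_0, which is usually the difference between fitting in 8 GB of VRAM or not.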
Step 2: Write the Modelfile
Create a Modelfile in the same directory as the model file. The most basic version looks like this:
```
FROM ./model.gguf
```
If the filename is different, replace it with the actual filename, for example:
```
FROM ./Llama-3.2-3B-Instruct-Q4_K_M.gguf
```
If your goal is just to get it running, this single FROM line is usually enough.
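If you want more control, the Modelfile also accepts directives such as PARAMETER and SYSTEM. A slightly fuller sketch (the filename, parameter values, and system prompt here are illustrative, not required):

```
FROM ./model.gguf

# Sampling and context-window settings
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# Default system prompt baked into the model
SYSTEM """You are a helpful assistant."""
```

For a chat model you may also need a TEMPLATE directive matching the model's expected chat format; the inspection trick shown later is the easiest way to see what a working template looks like.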
Step 3: Import it into Ollama
Then run:
```
ollama create myModelName -f Modelfile
```
- myModelName is the local model name you want to use inside Ollama
- -f Modelfile tells Ollama to create the model from that file
Once the creation succeeds, the GGUF file becomes a local model that you can call directly.
Step 4: Run the model
After creation, run:
```
ollama run myModelName
```
From that point on, it works much like a model pulled with ollama pull.
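Besides the interactive CLI, a locally created model can be called programmatically: by default Ollama serves an HTTP API on port 11434. A minimal sketch using only the Python standard library, assuming the model name myModelName from the steps above:

```python
import json
import urllib.error
import urllib.request

def generate(prompt: str, model: str = "myModelName",
             host: str = "http://localhost:11434"):
    """Send a one-shot prompt to a local Ollama server via its
    /api/generate endpoint. Returns the response text, or None
    if the server is not reachable."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # request a single JSON object, not a stream
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return json.load(resp)["response"]
    except (urllib.error.URLError, OSError):
        return None  # server not running or model not found at this host

if __name__ == "__main__":
    print(generate("Why is the sky blue?"))
```

If the server is not running, the function simply returns None instead of raising, which keeps the sketch safe to run anywhere.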
How to inspect an existing model’s Modelfile
If you are not sure how to write a Modelfile, you can inspect the configuration of an existing model directly:
```
ollama show llama3.2 --modelfile
```
This command prints the Modelfile for llama3.2, which is useful as a reference for:
- How FROM should be written
- How the template and system prompt are structured
- How parameters are declared
When this approach makes sense
This manual Hugging Face import flow is useful when:
- The model you want is not available in Ollama’s official library
- You want a specific quantized variant
- You have already downloaded the GGUF file manually
- You want finer control over how the model is packaged
If Ollama already provides an official version, using pull is usually simpler. But when you need a specific quantization or a custom wrapper, GGUF + Modelfile gives you more flexibility.
Common notes
- The path after FROM must match the actual location of the .gguf file.
- If the filename contains spaces or special characters, it is better to rename it first.
- Different GGUF quantization levels can greatly affect memory use and speed, so a successful import does not guarantee smooth runtime performance.
- If the model is a chat model, you may still need to adjust the prompt template later for better results.
Conclusion
Downloading a GGUF file from Hugging Face and importing it into Ollama is not complicated. Prepare the model file, write a minimal Modelfile, then run ollama create, and you can bring a third-party GGUF model into your Ollama workflow.