
🍋 Lemonade Server Models

This document lists the models we recommend for use with Lemonade Server.

Click on any model to learn more details about it, such as the Lemonade Recipe used to load the model.

Model Management GUI

Lemonade Server offers a model management GUI to help you see which models are available, install new models, and delete models. You can access this GUI by starting Lemonade Server, opening http://localhost:8000 in your web browser, and clicking the Model Management tab.

Supported Models

🔥 Hot Models

Qwen3-4B-Instruct-2507-GGUF
lemonade-server pull Qwen3-4B-Instruct-2507-GGUF
Checkpoint: unsloth/Qwen3-4B-Instruct-2507-GGUF
GGUF Variant: Qwen3-4B-Instruct-2507-Q4_K_M.gguf
Recipe: llamacpp
Labels: hot
Size (GB): 2.5
Qwen3-Coder-30B-A3B-Instruct-GGUF
lemonade-server pull Qwen3-Coder-30B-A3B-Instruct-GGUF
Checkpoint: unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF
GGUF Variant: Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
Recipe: llamacpp
Labels: coding, tool-calling, hot
Size (GB): 18.6
Gemma-3-4b-it-GGUF
lemonade-server pull Gemma-3-4b-it-GGUF
Checkpoint: ggml-org/gemma-3-4b-it-GGUF
GGUF Variant: Q4_K_M
Mmproj: mmproj-model-f16.gguf
Recipe: llamacpp
Labels: hot, vision
Size (GB): 3.61
gpt-oss-120b-mxfp-GGUF
lemonade-server pull gpt-oss-120b-mxfp-GGUF
Checkpoint: ggml-org/gpt-oss-120b-GGUF
GGUF Variant: *
Recipe: llamacpp
Labels: hot, reasoning, tool-calling
Size (GB): 63.3
gpt-oss-20b-mxfp4-GGUF
lemonade-server pull gpt-oss-20b-mxfp4-GGUF
Checkpoint: ggml-org/gpt-oss-20b-GGUF
Recipe: llamacpp
Labels: hot, reasoning, tool-calling
Size (GB): 12.1
Gemma-3-4b-it-FLM
lemonade-server pull Gemma-3-4b-it-FLM
Checkpoint: gemma3:4b
Recipe: flm
Labels: hot, vision
Size (GB): 5.26
Qwen3-4B-Instruct-2507-FLM
lemonade-server pull Qwen3-4B-Instruct-2507-FLM
Checkpoint: qwen3-it:4b
Recipe: flm
Labels: hot
Size (GB): 3.07

GGUF

Qwen3-0.6B-GGUF
lemonade-server pull Qwen3-0.6B-GGUF
Checkpoint: unsloth/Qwen3-0.6B-GGUF
GGUF Variant: Q4_0
Recipe: llamacpp
Labels: reasoning
Size (GB): 0.38
Qwen3-1.7B-GGUF
lemonade-server pull Qwen3-1.7B-GGUF
Checkpoint: unsloth/Qwen3-1.7B-GGUF
GGUF Variant: Q4_0
Recipe: llamacpp
Labels: reasoning
Size (GB): 1.06
Qwen3-4B-GGUF
lemonade-server pull Qwen3-4B-GGUF
Checkpoint: unsloth/Qwen3-4B-GGUF
GGUF Variant: Q4_0
Recipe: llamacpp
Labels: reasoning
Size (GB): 2.38
Qwen3-8B-GGUF
lemonade-server pull Qwen3-8B-GGUF
Checkpoint: unsloth/Qwen3-8B-GGUF
GGUF Variant: Q4_1
Recipe: llamacpp
Labels: reasoning
Size (GB): 5.25
DeepSeek-Qwen3-8B-GGUF
lemonade-server pull DeepSeek-Qwen3-8B-GGUF
Checkpoint: unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF
GGUF Variant: Q4_1
Recipe: llamacpp
Labels: reasoning
Size (GB): 5.25
Qwen3-14B-GGUF
lemonade-server pull Qwen3-14B-GGUF
Checkpoint: unsloth/Qwen3-14B-GGUF
GGUF Variant: Q4_0
Recipe: llamacpp
Labels: reasoning
Size (GB): 8.54
Qwen3-4B-Instruct-2507-GGUF
lemonade-server pull Qwen3-4B-Instruct-2507-GGUF
Checkpoint: unsloth/Qwen3-4B-Instruct-2507-GGUF
GGUF Variant: Qwen3-4B-Instruct-2507-Q4_K_M.gguf
Recipe: llamacpp
Labels: hot
Size (GB): 2.5
Qwen3-30B-A3B-GGUF
lemonade-server pull Qwen3-30B-A3B-GGUF
Checkpoint: unsloth/Qwen3-30B-A3B-GGUF
GGUF Variant: Q4_0
Recipe: llamacpp
Labels: reasoning
Size (GB): 17.4
Qwen3-30B-A3B-Instruct-2507-GGUF
lemonade-server pull Qwen3-30B-A3B-Instruct-2507-GGUF
Checkpoint: unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF
GGUF Variant: Qwen3-30B-A3B-Instruct-2507-Q4_0.gguf
Recipe: llamacpp
Size (GB): 17.4
Qwen3-Coder-30B-A3B-Instruct-GGUF
lemonade-server pull Qwen3-Coder-30B-A3B-Instruct-GGUF
Checkpoint: unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF
GGUF Variant: Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf
Recipe: llamacpp
Labels: coding, tool-calling, hot
Size (GB): 18.6
Gemma-3-4b-it-GGUF
lemonade-server pull Gemma-3-4b-it-GGUF
Checkpoint: ggml-org/gemma-3-4b-it-GGUF
GGUF Variant: Q4_K_M
Mmproj: mmproj-model-f16.gguf
Recipe: llamacpp
Labels: hot, vision
Size (GB): 3.61
Qwen2.5-VL-7B-Instruct-GGUF
lemonade-server pull Qwen2.5-VL-7B-Instruct-GGUF
Checkpoint: ggml-org/Qwen2.5-VL-7B-Instruct-GGUF
GGUF Variant: Q4_K_M
Mmproj: mmproj-Qwen2.5-VL-7B-Instruct-f16.gguf
Recipe: llamacpp
Labels: vision
Size (GB): 4.68
Llama-4-Scout-17B-16E-Instruct-GGUF
lemonade-server pull Llama-4-Scout-17B-16E-Instruct-GGUF
Checkpoint: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
GGUF Variant: Q4_K_S
Mmproj: mmproj-F16.gguf
Recipe: llamacpp
Labels: vision
Size (GB): 61.5
nomic-embed-text-v1-GGUF
lemonade-server pull nomic-embed-text-v1-GGUF
Checkpoint: nomic-ai/nomic-embed-text-v1-GGUF
GGUF Variant: Q4_K_S
Recipe: llamacpp
Labels: embeddings
Size (GB): 0.0781
nomic-embed-text-v2-moe-GGUF
lemonade-server pull nomic-embed-text-v2-moe-GGUF
Checkpoint: nomic-ai/nomic-embed-text-v2-moe-GGUF
GGUF Variant: Q8_0
Recipe: llamacpp
Labels: embeddings
Size (GB): 0.51
bge-reranker-v2-m3-GGUF
lemonade-server pull bge-reranker-v2-m3-GGUF
Checkpoint: pqnet/bge-reranker-v2-m3-Q8_0-GGUF
Recipe: llamacpp
Labels: reranking
Size (GB): 0.53
Devstral-Small-2507-GGUF
lemonade-server pull Devstral-Small-2507-GGUF
Checkpoint: mistralai/Devstral-Small-2507_gguf
GGUF Variant: Q4_K_M
Recipe: llamacpp
Labels: coding, tool-calling
Size (GB): 14.3
Qwen2.5-Coder-32B-Instruct-GGUF
lemonade-server pull Qwen2.5-Coder-32B-Instruct-GGUF
Checkpoint: Qwen/Qwen2.5-Coder-32B-Instruct-GGUF
GGUF Variant: Q4_K_M
Recipe: llamacpp
Labels: coding
Size (GB): 19.85
gpt-oss-120b-mxfp-GGUF
lemonade-server pull gpt-oss-120b-mxfp-GGUF
Checkpoint: ggml-org/gpt-oss-120b-GGUF
GGUF Variant: *
Recipe: llamacpp
Labels: hot, reasoning, tool-calling
Size (GB): 63.3
gpt-oss-20b-mxfp4-GGUF
lemonade-server pull gpt-oss-20b-mxfp4-GGUF
Checkpoint: ggml-org/gpt-oss-20b-GGUF
Recipe: llamacpp
Labels: hot, reasoning, tool-calling
Size (GB): 12.1
GLM-4.5-Air-UD-Q4K-XL-GGUF
lemonade-server pull GLM-4.5-Air-UD-Q4K-XL-GGUF
Checkpoint: unsloth/GLM-4.5-Air-GGUF
GGUF Variant: UD-Q4_K_XL
Recipe: llamacpp
Labels: reasoning
Size (GB): 73.1

Ryzen AI Hybrid (NPU+GPU)

Llama-3.2-1B-Instruct-Hybrid
lemonade-server pull Llama-3.2-1B-Instruct-Hybrid
Checkpoint: amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 1.75
Llama-3.2-3B-Instruct-Hybrid
lemonade-server pull Llama-3.2-3B-Instruct-Hybrid
Checkpoint: amd/Llama-3.2-3B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 3.97
Phi-3-Mini-Instruct-Hybrid
lemonade-server pull Phi-3-Mini-Instruct-Hybrid
Checkpoint: amd/Phi-3-mini-4k-instruct-awq-g128-int4-asym-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 3.89
Qwen-1.5-7B-Chat-Hybrid
lemonade-server pull Qwen-1.5-7B-Chat-Hybrid
Checkpoint: amd/Qwen1.5-7B-Chat-awq-g128-int4-asym-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 8.22
Qwen-2.5-7B-Instruct-Hybrid
lemonade-server pull Qwen-2.5-7B-Instruct-Hybrid
Checkpoint: amd/Qwen2.5-7B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 8.42
Qwen-2.5-3B-Instruct-Hybrid
lemonade-server pull Qwen-2.5-3B-Instruct-Hybrid
Checkpoint: amd/Qwen2.5-3B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 3.84
Qwen-2.5-1.5B-Instruct-Hybrid
lemonade-server pull Qwen-2.5-1.5B-Instruct-Hybrid
Checkpoint: amd/Qwen2.5-1.5B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 2.08
DeepSeek-R1-Distill-Llama-8B-Hybrid
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-Hybrid
Checkpoint: amd/DeepSeek-R1-Distill-Llama-8B-awq-asym-uint4-g128-lmhead-onnx-hybrid
Recipe: oga-hybrid
Labels: reasoning
Size (GB): 8.45
Mistral-7B-v0.3-Instruct-Hybrid
lemonade-server pull Mistral-7B-v0.3-Instruct-Hybrid
Checkpoint: amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-fp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 7.31
Llama-3.1-8B-Instruct-Hybrid
lemonade-server pull Llama-3.1-8B-Instruct-Hybrid
Checkpoint: amd/Llama-3.1-8B-Instruct-awq-asym-uint4-g128-lmhead-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 8.47
Llama-xLAM-2-8b-fc-r-Hybrid
lemonade-server pull Llama-xLAM-2-8b-fc-r-Hybrid
Checkpoint: amd/Llama-xLAM-2-8b-fc-r-awq-g128-int4-asym-bfp16-onnx-hybrid
Recipe: oga-hybrid
Size (GB): 8.47

Ryzen AI NPU

Qwen-2.5-7B-Instruct-NPU
lemonade-server pull Qwen-2.5-7B-Instruct-NPU
Checkpoint: amd/Qwen2.5-7B-Instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix
Recipe: oga-npu
Size (GB): 10.14
Qwen-2.5-1.5B-Instruct-NPU
lemonade-server pull Qwen-2.5-1.5B-Instruct-NPU
Checkpoint: amd/Qwen2.5-1.5B-Instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix
Recipe: oga-npu
Size (GB): 2.89
DeepSeek-R1-Distill-Llama-8B-NPU
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-NPU
Checkpoint: amd/DeepSeek-R1-Distill-Llama-8B-awq-g128-int4-asym-bf16-onnx-ryzen-strix
Recipe: oga-npu
Size (GB): 10.63
Mistral-7B-v0.3-Instruct-NPU
lemonade-server pull Mistral-7B-v0.3-Instruct-NPU
Checkpoint: amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-bf16-onnx-ryzen-strix
Recipe: oga-npu
Size (GB): 11.75
Phi-3.5-Mini-Instruct-NPU
lemonade-server pull Phi-3.5-Mini-Instruct-NPU
Checkpoint: amd/Phi-3.5-mini-instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix
Recipe: oga-npu
Size (GB): 4.18

FastFlowLM (NPU)

Gemma-3-4b-it-FLM
lemonade-server pull Gemma-3-4b-it-FLM
Checkpoint: gemma3:4b
Recipe: flm
Labels: hot, vision
Size (GB): 5.26
Qwen3-4B-Instruct-2507-FLM
lemonade-server pull Qwen3-4B-Instruct-2507-FLM
Checkpoint: qwen3-it:4b
Recipe: flm
Labels: hot
Size (GB): 3.07
Qwen3-8b-FLM
lemonade-server pull Qwen3-8b-FLM
Checkpoint: qwen3:8b
Recipe: flm
Labels: reasoning
Size (GB): 5.57
Llama-3.2-1B-FLM
lemonade-server pull Llama-3.2-1B-FLM
Checkpoint: llama3.2:1b
Recipe: flm
Size (GB): 1.21
Llama-3.2-3B-FLM
lemonade-server pull Llama-3.2-3B-FLM
Checkpoint: llama3.2:3b
Recipe: flm
Size (GB): 2.62
Llama-3.1-8B-FLM
lemonade-server pull Llama-3.1-8B-FLM
Checkpoint: llama3.1:8b
Recipe: flm
Size (GB): 5.36
gpt-oss-20b-FLM
lemonade-server pull gpt-oss-20b-FLM
Checkpoint: gpt-oss:20b
Recipe: flm
Size (GB): 13.4

CPU

Qwen2.5-0.5B-Instruct-CPU
lemonade-server pull Qwen2.5-0.5B-Instruct-CPU
Checkpoint: amd/Qwen2.5-0.5B-Instruct-quantized_int4-float16-cpu-onnx
Recipe: oga-cpu
Size (GB): 0.77
Phi-3-Mini-Instruct-CPU
lemonade-server pull Phi-3-Mini-Instruct-CPU
Checkpoint: amd/Phi-3-mini-4k-instruct_int4_float16_onnx_cpu
Recipe: oga-cpu
Size (GB): 2.23
Qwen-1.5-7B-Chat-CPU
lemonade-server pull Qwen-1.5-7B-Chat-CPU
Checkpoint: amd/Qwen1.5-7B-Chat_uint4_asym_g128_float16_onnx_cpu
Recipe: oga-cpu
Size (GB): 5.89
DeepSeek-R1-Distill-Llama-8B-CPU
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-CPU
Checkpoint: amd/DeepSeek-R1-Distill-Llama-8B-awq-asym-uint4-g128-lmhead-onnx-cpu
Recipe: oga-cpu
Labels: reasoning
Size (GB): 5.78
DeepSeek-R1-Distill-Qwen-7B-CPU
lemonade-server pull DeepSeek-R1-Distill-Qwen-7B-CPU
Checkpoint: amd/DeepSeek-R1-Distill-Llama-8B-awq-asym-uint4-g128-lmhead-onnx-cpu
Recipe: oga-cpu
Labels: reasoning
Size (GB): 5.78

Naming Convention

Each Lemonade name combines the model name from the base checkpoint with the backend where the model will run. So, if the base checkpoint is meta-llama/Llama-3.2-1B-Instruct, and it has been optimized to run on Hybrid, the resulting name is Llama-3.2-1B-Instruct-Hybrid.
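
As a rough illustration of this convention, the sketch below joins the model part of a checkpoint ID with a backend suffix. The function name lemonade_name is hypothetical and not part of Lemonade Server. Some names in the list above deviate slightly from the rule (for example, Qwen-2.5-7B-Instruct-Hybrid is built from a Qwen2.5-7B-Instruct checkpoint), so treat the published list as authoritative rather than this helper.

```python
def lemonade_name(base_checkpoint: str, backend: str) -> str:
    """Hypothetical helper: combine a checkpoint's model name with a backend suffix."""
    # Drop the organization prefix (everything before the final "/"), if present.
    model_name = base_checkpoint.rsplit("/", 1)[-1]
    return f"{model_name}-{backend}"


print(lemonade_name("meta-llama/Llama-3.2-1B-Instruct", "Hybrid"))
# Llama-3.2-1B-Instruct-Hybrid
```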

Model Storage and Management

Lemonade Server relies on Hugging Face Hub to manage downloading and storing models on your system. By default, Hugging Face Hub downloads models to C:\Users\YOUR_USERNAME\.cache\huggingface\hub.

For example, the Lemonade Server Llama-3.2-1B-Instruct-Hybrid model will end up at C:\Users\YOUR_USERNAME\.cache\huggingface\hub\models--amd--Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid. If you want to uninstall that model, simply delete that folder.

You can change the directory for Hugging Face Hub by setting the HF_HOME or HF_HUB_CACHE environment variables.
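
To locate a model's folder from a script, the following minimal sketch reproduces the folder layout shown in the example above (models--ORG--NAME inside the hub directory). It assumes that HF_HUB_CACHE, when set, takes precedence over HF_HOME, and that the hub directory otherwise sits at HF_HOME/hub. The function names hub_cache_dir and model_folder are hypothetical and not part of Lemonade Server or Hugging Face Hub.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None) -> Path:
    """Resolve the Hugging Face Hub cache directory (sketch under stated assumptions)."""
    env = os.environ if env is None else env
    # Assumption: an explicit HF_HUB_CACHE wins over HF_HOME.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    # Otherwise fall back to HF_HOME, or the per-user default location.
    hf_home = env.get("HF_HOME") or Path.home() / ".cache" / "huggingface"
    return Path(hf_home) / "hub"


def model_folder(checkpoint: str, env=None) -> Path:
    """Map a checkpoint such as 'amd/Some-Model' to its folder in the cache."""
    return hub_cache_dir(env) / ("models--" + checkpoint.replace("/", "--"))


# Prints the expected folder for the Llama-3.2-1B-Instruct-Hybrid checkpoint.
print(model_folder("amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid"))
```

Passing a dictionary as env, for example model_folder(checkpoint, env={"HF_HOME": "D:\\hf"}), lets you preview where models would land before changing your real environment variables.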

Installing Additional Models

Once you've installed Lemonade Server, you can install any model on this list using the pull command in the lemonade-server CLI.

Example:

lemonade-server pull Qwen2.5-0.5B-Instruct-CPU

Note: lemonade-server is a utility that is added to your PATH when you install Lemonade Server with the GUI installer. If you are using Lemonade Server from a Python environment, use the lemonade-server-dev pull command instead.
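
If you script model installation, a small helper can choose between the two commands mentioned in the note. This is only a sketch: pull_command is a hypothetical name, and preferring lemonade-server over lemonade-server-dev when both are on your PATH is an assumption, not documented behavior.

```python
import shutil


def pull_command(model: str, which=shutil.which) -> list[str]:
    """Build the pull command using whichever Lemonade CLI is found on PATH."""
    # Assumption: prefer the installer-provided CLI, then the Python-environment one.
    for cli in ("lemonade-server", "lemonade-server-dev"):
        if which(cli):
            return [cli, "pull", model]
    raise FileNotFoundError("neither lemonade-server nor lemonade-server-dev is on PATH")


# Demonstration with a stand-in lookup that only "finds" lemonade-server-dev.
fake_which = lambda name: "/usr/bin/" + name if name == "lemonade-server-dev" else None
print(pull_command("Qwen2.5-0.5B-Instruct-CPU", which=fake_which))
# ['lemonade-server-dev', 'pull', 'Qwen2.5-0.5B-Instruct-CPU']
```

In a real script, you would pass the returned list to subprocess.run(..., check=True) to start the download.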