# 🍋 Lemonade Server Models

This document lists the models we recommend for use with Lemonade Server. Click on any model to see details about it, such as the Lemonade recipe used to load the model.

Contents:

- Model Management GUI
- Supported Models
- Naming Convention
- Model Storage and Management
- Installing Additional Models
## Model Management GUI

Lemonade Server offers a model management GUI that lets you see which models are available, install new models, and delete models. To access it, start Lemonade Server, open http://localhost:8000 in your web browser, and click the Model Management tab.
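If you prefer to script what the GUI shows, the installed-model list is also available over HTTP. This is a hedged sketch: it assumes Lemonade Server exposes an OpenAI-compatible `GET /api/v1/models` endpoint on the same port as the GUI; verify the path against your server version.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address the GUI uses

def models_endpoint(base_url: str) -> str:
    # Assumed OpenAI-compatible path; confirm for your Lemonade Server version.
    return base_url.rstrip("/") + "/api/v1/models"

def list_models(base_url: str = BASE_URL) -> list:
    """Return the ids of the models the server reports as available."""
    with urllib.request.urlopen(models_endpoint(base_url)) as resp:
        return [m["id"] for m in json.load(resp)["data"]]

# With the server running:
# print(list_models())
```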
## Supported Models

### 🔥 Hot Models
#### Qwen3-30B-A3B-Instruct-2507-GGUF

```bash
lemonade-server pull Qwen3-30B-A3B-Instruct-2507-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF |
GGUF Variant | Qwen3-30B-A3B-Instruct-2507-Q4_0.gguf |
Recipe | llamacpp |
Labels | hot |
#### Qwen3-Coder-30B-A3B-Instruct-GGUF

```bash
lemonade-server pull Qwen3-Coder-30B-A3B-Instruct-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF |
GGUF Variant | Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf |
Recipe | llamacpp |
Labels | coding, hot |
#### gpt-oss-120b-GGUF

```bash
lemonade-server pull gpt-oss-120b-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/gpt-oss-120b-GGUF |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | hot, reasoning |
#### gpt-oss-20b-GGUF

```bash
lemonade-server pull gpt-oss-20b-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/gpt-oss-20b-GGUF |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | hot, reasoning |
#### GLM-4.5-Air-UD-Q4K-XL-GGUF

```bash
lemonade-server pull GLM-4.5-Air-UD-Q4K-XL-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/GLM-4.5-Air-GGUF |
GGUF Variant | UD-Q4_K_XL |
Recipe | llamacpp |
Labels | reasoning, hot |
### GGUF

#### Qwen3-0.6B-GGUF

```bash
lemonade-server pull Qwen3-0.6B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-0.6B-GGUF |
GGUF Variant | Q4_0 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-1.7B-GGUF

```bash
lemonade-server pull Qwen3-1.7B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-1.7B-GGUF |
GGUF Variant | Q4_0 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-4B-GGUF

```bash
lemonade-server pull Qwen3-4B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-4B-GGUF |
GGUF Variant | Q4_0 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-8B-GGUF

```bash
lemonade-server pull Qwen3-8B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-8B-GGUF |
GGUF Variant | Q4_1 |
Recipe | llamacpp |
Labels | reasoning |
#### DeepSeek-Qwen3-8B-GGUF

```bash
lemonade-server pull DeepSeek-Qwen3-8B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF |
GGUF Variant | Q4_1 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-14B-GGUF

```bash
lemonade-server pull Qwen3-14B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-14B-GGUF |
GGUF Variant | Q4_0 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-30B-A3B-GGUF

```bash
lemonade-server pull Qwen3-30B-A3B-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-30B-A3B-GGUF |
GGUF Variant | Q4_0 |
Recipe | llamacpp |
Labels | reasoning |
#### Qwen3-30B-A3B-Instruct-2507-GGUF

```bash
lemonade-server pull Qwen3-30B-A3B-Instruct-2507-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF |
GGUF Variant | Qwen3-30B-A3B-Instruct-2507-Q4_0.gguf |
Recipe | llamacpp |
Labels | hot |
#### Qwen3-Coder-30B-A3B-Instruct-GGUF

```bash
lemonade-server pull Qwen3-Coder-30B-A3B-Instruct-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF |
GGUF Variant | Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf |
Recipe | llamacpp |
Labels | coding, hot |
#### Gemma-3-4b-it-GGUF

```bash
lemonade-server pull Gemma-3-4b-it-GGUF
```

Key | Value |
---|---|
Checkpoint | ggml-org/gemma-3-4b-it-GGUF |
GGUF Variant | Q4_K_M |
Mmproj | mmproj-model-f16.gguf |
Recipe | llamacpp |
Labels | vision |
#### Qwen2.5-VL-7B-Instruct-GGUF

```bash
lemonade-server pull Qwen2.5-VL-7B-Instruct-GGUF
```

Key | Value |
---|---|
Checkpoint | ggml-org/Qwen2.5-VL-7B-Instruct-GGUF |
GGUF Variant | Q4_K_M |
Mmproj | mmproj-Qwen2.5-VL-7B-Instruct-f16.gguf |
Recipe | llamacpp |
Labels | vision |
#### Llama-4-Scout-17B-16E-Instruct-GGUF

```bash
lemonade-server pull Llama-4-Scout-17B-16E-Instruct-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF |
GGUF Variant | Q4_K_S |
Mmproj | mmproj-F16.gguf |
Recipe | llamacpp |
Labels | vision |
#### nomic-embed-text-v1-GGUF

```bash
lemonade-server pull nomic-embed-text-v1-GGUF
```

Key | Value |
---|---|
Checkpoint | nomic-ai/nomic-embed-text-v1-GGUF |
GGUF Variant | Q4_K_S |
Recipe | llamacpp |
Labels | embeddings |
#### nomic-embed-text-v2-moe-GGUF

```bash
lemonade-server pull nomic-embed-text-v2-moe-GGUF
```

Key | Value |
---|---|
Checkpoint | nomic-ai/nomic-embed-text-v2-moe-GGUF |
GGUF Variant | Q8_0 |
Recipe | llamacpp |
Labels | embeddings |
#### bge-reranker-v2-m3-GGUF

```bash
lemonade-server pull bge-reranker-v2-m3-GGUF
```

Key | Value |
---|---|
Checkpoint | pqnet/bge-reranker-v2-m3-Q8_0-GGUF |
Recipe | llamacpp |
Labels | reranking |
#### Devstral-Small-2507-GGUF

```bash
lemonade-server pull Devstral-Small-2507-GGUF
```

Key | Value |
---|---|
Checkpoint | mistralai/Devstral-Small-2507_gguf |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | coding |
#### Qwen2.5-Coder-32B-Instruct-GGUF

```bash
lemonade-server pull Qwen2.5-Coder-32B-Instruct-GGUF
```

Key | Value |
---|---|
Checkpoint | Qwen/Qwen2.5-Coder-32B-Instruct-GGUF |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | coding |
#### gpt-oss-120b-GGUF

```bash
lemonade-server pull gpt-oss-120b-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/gpt-oss-120b-GGUF |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | hot, reasoning |
#### gpt-oss-20b-GGUF

```bash
lemonade-server pull gpt-oss-20b-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/gpt-oss-20b-GGUF |
GGUF Variant | Q4_K_M |
Recipe | llamacpp |
Labels | hot, reasoning |
#### GLM-4.5-Air-UD-Q4K-XL-GGUF

```bash
lemonade-server pull GLM-4.5-Air-UD-Q4K-XL-GGUF
```

Key | Value |
---|---|
Checkpoint | unsloth/GLM-4.5-Air-GGUF |
GGUF Variant | UD-Q4_K_XL |
Recipe | llamacpp |
Labels | reasoning, hot |
### Hybrid

#### Llama-3.2-1B-Instruct-Hybrid

```bash
lemonade-server pull Llama-3.2-1B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Llama-3.2-1B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Llama-3.2-3B-Instruct-Hybrid

```bash
lemonade-server pull Llama-3.2-3B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Llama-3.2-3B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Phi-3-Mini-Instruct-Hybrid

```bash
lemonade-server pull Phi-3-Mini-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Phi-3-mini-4k-instruct-awq-g128-int4-asym-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Qwen-1.5-7B-Chat-Hybrid

```bash
lemonade-server pull Qwen-1.5-7B-Chat-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Qwen1.5-7B-Chat-awq-g128-int4-asym-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Qwen-2.5-7B-Instruct-Hybrid

```bash
lemonade-server pull Qwen-2.5-7B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-7B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Qwen-2.5-3B-Instruct-Hybrid

```bash
lemonade-server pull Qwen-2.5-3B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-3B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Qwen-2.5-1.5B-Instruct-Hybrid

```bash
lemonade-server pull Qwen-2.5-1.5B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-1.5B-Instruct-awq-uint4-asym-g128-lmhead-g32-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### DeepSeek-R1-Distill-Llama-8B-Hybrid

```bash
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/DeepSeek-R1-Distill-Llama-8B-awq-asym-uint4-g128-lmhead-onnx-hybrid |
Recipe | oga-hybrid |
Labels | reasoning |
#### Mistral-7B-v0.3-Instruct-Hybrid

```bash
lemonade-server pull Mistral-7B-v0.3-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-fp16-onnx-hybrid |
Recipe | oga-hybrid |
#### Llama-3.1-8B-Instruct-Hybrid

```bash
lemonade-server pull Llama-3.1-8B-Instruct-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Llama-3.1-8B-Instruct-awq-asym-uint4-g128-lmhead-onnx-hybrid |
Recipe | oga-hybrid |
#### Llama-xLAM-2-8b-fc-r-Hybrid

```bash
lemonade-server pull Llama-xLAM-2-8b-fc-r-Hybrid
```

Key | Value |
---|---|
Checkpoint | amd/Llama-xLAM-2-8b-fc-r-awq-g128-int4-asym-bfp16-onnx-hybrid |
Recipe | oga-hybrid |
### NPU

#### Qwen-2.5-7B-Instruct-NPU

```bash
lemonade-server pull Qwen-2.5-7B-Instruct-NPU
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-7B-Instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix |
Recipe | oga-npu |
#### Qwen-2.5-1.5B-Instruct-NPU

```bash
lemonade-server pull Qwen-2.5-1.5B-Instruct-NPU
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-1.5B-Instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix |
Recipe | oga-npu |
#### DeepSeek-R1-Distill-Llama-8B-NPU

```bash
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-NPU
```

Key | Value |
---|---|
Checkpoint | amd/DeepSeek-R1-Distill-Llama-8B-awq-g128-int4-asym-bf16-onnx-ryzen-strix |
Recipe | oga-npu |
#### Mistral-7B-v0.3-Instruct-NPU

```bash
lemonade-server pull Mistral-7B-v0.3-Instruct-NPU
```

Key | Value |
---|---|
Checkpoint | amd/Mistral-7B-Instruct-v0.3-awq-g128-int4-asym-bf16-onnx-ryzen-strix |
Recipe | oga-npu |
#### Phi-3.5-Mini-Instruct-NPU

```bash
lemonade-server pull Phi-3.5-Mini-Instruct-NPU
```

Key | Value |
---|---|
Checkpoint | amd/Phi-3.5-mini-instruct-awq-g128-int4-asym-bf16-onnx-ryzen-strix |
Recipe | oga-npu |
### CPU

#### Qwen2.5-0.5B-Instruct-CPU

```bash
lemonade-server pull Qwen2.5-0.5B-Instruct-CPU
```

Key | Value |
---|---|
Checkpoint | amd/Qwen2.5-0.5B-Instruct-quantized_int4-float16-cpu-onnx |
Recipe | oga-cpu |
#### Phi-3-Mini-Instruct-CPU

```bash
lemonade-server pull Phi-3-Mini-Instruct-CPU
```

Key | Value |
---|---|
Checkpoint | amd/Phi-3-mini-4k-instruct_int4_float16_onnx_cpu |
Recipe | oga-cpu |
#### Qwen-1.5-7B-Chat-CPU

```bash
lemonade-server pull Qwen-1.5-7B-Chat-CPU
```

Key | Value |
---|---|
Checkpoint | amd/Qwen1.5-7B-Chat_uint4_asym_g128_float16_onnx_cpu |
Recipe | oga-cpu |
#### DeepSeek-R1-Distill-Llama-8B-CPU

```bash
lemonade-server pull DeepSeek-R1-Distill-Llama-8B-CPU
```

Key | Value |
---|---|
Checkpoint | amd/DeepSeek-R1-Distill-Llama-8B-awq-asym-uint4-g128-lmhead-onnx-cpu |
Recipe | oga-cpu |
Labels | reasoning |
#### DeepSeek-R1-Distill-Qwen-7B-CPU

```bash
lemonade-server pull DeepSeek-R1-Distill-Qwen-7B-CPU
```

Key | Value |
---|---|
Checkpoint | amd/DeepSeek-R1-Distill-Qwen-7B-awq-asym-uint4-g128-lmhead-onnx-cpu |
Recipe | oga-cpu |
Labels | reasoning |
## Naming Convention

Each Lemonade model name combines the model name from the base checkpoint with the backend the model runs on. For example, if the base checkpoint is `meta-llama/Llama-3.2-1B-Instruct` and it has been optimized to run on Hybrid, the resulting name is `Llama-3.2-1B-Instruct-Hybrid`.
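As a quick illustration of this rule, here is a tiny helper (the function name is ours, not part of Lemonade) that derives a Lemonade name from a checkpoint and a backend suffix:

```python
def lemonade_name(checkpoint: str, backend: str) -> str:
    """Combine a checkpoint's model name with a backend suffix,
    mirroring the naming convention described above."""
    base = checkpoint.split("/")[-1]  # drop the org prefix, keep the model name
    return f"{base}-{backend}"

print(lemonade_name("meta-llama/Llama-3.2-1B-Instruct", "Hybrid"))
# Llama-3.2-1B-Instruct-Hybrid
```

Note that this mirrors only the stated convention; some published checkpoints carry extra quantization suffixes that the Lemonade name omits.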
## Model Storage and Management

Lemonade Server relies on Hugging Face Hub to download and store models on your system. By default, Hugging Face Hub downloads models to `C:\Users\YOUR_USERNAME\.cache\huggingface\hub`.

For example, the Lemonade Server `Llama-3.2-3B-Instruct-Hybrid` model will end up at `C:\Users\YOUR_USERNAME\.cache\huggingface\hub\models--amd--Llama-3.2-3B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid`. To uninstall that model, simply delete that folder.

You can change the directory used by Hugging Face Hub by setting the `HF_HOME` or `HF_HUB_CACHE` environment variables.
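The Hub cache layout is predictable: each checkpoint lives in a folder named `models--<org>--<model>`. The sketch below (the helper name is ours) resolves where a given checkpoint is stored, honoring the `HF_HUB_CACHE` override mentioned above:

```python
import os

def model_cache_dir(checkpoint: str, hub_cache: str = None) -> str:
    """Resolve the local Hugging Face Hub folder for a checkpoint.

    Mirrors the default cache layout; HF_HUB_CACHE overrides the location.
    """
    if hub_cache is None:
        hub_cache = os.environ.get(
            "HF_HUB_CACHE",
            os.path.join(os.path.expanduser("~"), ".cache", "huggingface", "hub"),
        )
    # Hub repo folders replace the "/" in "org/model" with "--"
    return os.path.join(hub_cache, "models--" + checkpoint.replace("/", "--"))

print(model_cache_dir("amd/Llama-3.2-3B-Instruct-awq-g128-int4-asym-fp16-onnx-hybrid"))
```

Deleting the returned folder uninstalls that model, as described above.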
## Installing Additional Models

Once you've installed Lemonade Server, you can install any model on this list using the `pull` command in the `lemonade-server` CLI.

Example:

```bash
lemonade-server pull Qwen2.5-0.5B-Instruct-CPU
```

> Note: `lemonade-server` is a utility that is added to your PATH when you install Lemonade Server with the GUI installer. If you are using Lemonade Server from a Python environment, use the `lemonade-server-dev pull` command instead.
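After pulling a model, you can query it by its Lemonade name through the server's OpenAI-compatible API. This is a minimal sketch: the endpoint path (`/api/v1/chat/completions`) is an assumption based on that compatibility, so check it against your server's documentation.

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    # "model" takes a Lemonade model name from this document.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_chat_request("Qwen2.5-0.5B-Instruct-CPU", "Say hello in one sentence.")

# With the server running (assumed endpoint path):
# req = urllib.request.Request(
#     "http://localhost:8000/api/v1/chat/completions",
#     data=json.dumps(body).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```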