
llama.cpp Backend Options

Lemonade uses llama.cpp as its primary LLM inference backend, supporting multiple hardware acceleration options. This document explains the available backends and how to choose between them.

Available Backends

• CPU — runs inference entirely on the CPU; works on any machine and serves as the universal fallback.
• Vulkan — cross-vendor GPU acceleration; recommended for NVIDIA and Intel GPUs, and as a fallback for AMD GPUs.
• ROCm — AMD's GPU compute stack; offers the best performance on supported AMD hardware (Radeon RX 6000/7000 and Ryzen AI iGPUs).
• Metal — GPU acceleration for Apple Silicon.
• System

ROCm Channel Configuration

The ROCm backend supports three channels to balance stability, performance, and access to latest features:

Preview Channel (Default)

Lemonade-specific builds published in the lemonade-sdk/llama.cpp repository. This is the default channel.

{
  "rocm_channel": "preview"
}

Stable Channel

Upstream llama.cpp releases from ggml-org/llama.cpp, prioritizing stability over the newest features.

{
  "rocm_channel": "stable"
}

Nightly Channel

Experimental nightly builds from lemonade-sdk/llamacpp-rocm, for early access to the latest changes.

{
  "rocm_channel": "nightly"
}

Changing Channels

To switch between channels, update your config.json:

{
  "rocm_channel": "stable"
}

Or use the Lemonade CLI:

# Switch to stable channel
lemonade config set rocm_channel=stable

# Switch to preview channel (default)
lemonade config set rocm_channel=preview

# Switch to nightly channel (experimental)
lemonade config set rocm_channel=nightly

After changing channels, you’ll need to reinstall the ROCm backend:

lemonade backends install llamacpp:rocm

Pinning to a Specific Version Tag

You can pin llamacpp.rocm_bin to a specific release tag instead of using "builtin" or "latest". Each channel downloads from a different GitHub repository, so you must set the correct channel before setting a specific tag.

Channel             Repository                  Tag format
preview (default)   lemonade-sdk/llama.cpp      Lemonade-specific build tags
stable              ggml-org/llama.cpp          Upstream tags, e.g. b4888
nightly             lemonade-sdk/llamacpp-rocm  Nightly tags, e.g. b1260

Always set rocm_channel to the correct channel before setting rocm_bin to a specific tag. If the tag does not exist in the current channel’s repository, the download will fail with HTTP 404.

Example — pin to a specific nightly build:

# 1. Switch to the nightly channel first
lemonade config set rocm_channel=nightly

# 2. Then pin to the desired nightly tag
lemonade config set llamacpp.rocm_bin=b1260
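Because each channel resolves tags against a different repository, a mismatched channel/tag pairing is the usual cause of the HTTP 404 above. The mapping can be sketched in shell; the channel-to-repository pairs come from the table above, while the GitHub releases URL pattern is the standard GitHub one, not something specific to Lemonade:

```shell
# Resolve the release-tag URL a given channel/tag pair would point at.
# Channel-to-repository mapping is taken from the table above.
channel="nightly"
tag="b1260"

case "$channel" in
  preview) repo="lemonade-sdk/llama.cpp" ;;
  stable)  repo="ggml-org/llama.cpp" ;;
  nightly) repo="lemonade-sdk/llamacpp-rocm" ;;
esac

# If this page 404s, the tag does not exist in that channel's repository
# and the backend download will fail the same way.
echo "https://github.com/${repo}/releases/tag/${tag}"
```

Opening the printed URL in a browser before pinning is a quick way to confirm the tag exists in the channel you selected.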

Choosing the Right Backend

Decision Tree

  1. Do you have an NVIDIA or Intel GPU?
    • Use Vulkan
  2. Do you have an AMD GPU?
    • For Radeon RX 6000/7000 or Ryzen AI iGPU:
      • Try ROCm first for best performance
      • Fall back to Vulkan if you encounter issues
    • For older AMD GPUs (RX 5000 and earlier):
      • Use Vulkan (ROCm not supported)
  3. Do you have Apple Silicon?
    • Use Metal
  4. No GPU or unsupported GPU?
    • Use CPU
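The decision tree above can be sketched as a small shell helper. Note that only llamacpp:rocm appears earlier in this document; the other backend identifiers (llamacpp:vulkan, llamacpp:metal, llamacpp:cpu) are assumed here by analogy with that naming pattern and should be checked against your Lemonade version's documentation:

```shell
# Sketch of the decision tree: map a GPU vendor to a backend identifier.
# Only "llamacpp:rocm" is confirmed by this document; the other
# identifiers are assumed to follow the same naming pattern.
gpu="amd"   # one of: nvidia, intel, amd, amd-legacy, apple, none

case "$gpu" in
  nvidia|intel) backend="llamacpp:vulkan" ;;
  amd)          backend="llamacpp:rocm" ;;    # fall back to Vulkan on issues
  amd-legacy)   backend="llamacpp:vulkan" ;;  # RX 5000 and earlier: no ROCm
  apple)        backend="llamacpp:metal" ;;
  *)            backend="llamacpp:cpu" ;;
esac

echo "lemonade backends install ${backend}"
```

Running the printed command installs the chosen backend, following the same pattern as the ROCm install command shown earlier.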

ROCm Channel Selection

Platform Specifics

Linux

Windows

macOS