Embeddable Lemonade Guide

Embeddable Lemonade is a portable build of the lemond service that you can bundle into your app.

Who is this for?

Use Embeddable Lemonade instead of a global Lemonade Service when you want a cohesive end-to-end experience for users of your app.

What’s in the release artifact?

Embeddable Lemonade is a zip/tarball artifact shipped in Lemonade releases.

Note: see Building from Source for instructions on building your own Embeddable Lemonade, including for other Linux distros.

Each archive has the following contents:

Customization Overview

While you can ship Embeddable Lemonade as-is, there are many opportunities to customize it before packaging it into your app.

How it Works

Many of the customization options rely on lemond’s config.json file, a persistent store of settings. Learn more about the individual settings in the configuration guide.

config.json is automatically generated based on the values in resources/defaults.json the first time lemond starts. The positional arg lemond DIR determines where config.json and other runtime files (e.g., backend binaries) will be located.
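As a rough illustration of how an app might launch lemond with the positional DIR argument, here is a minimal Python sketch. The helper names `lemond_command` and `start_lemond` are hypothetical, and the sketch assumes the lemond executable sits at the root of the bundle folder (named `lemond.exe` on Windows), as in the layout shown later in this guide.

```python
import os
import subprocess
from pathlib import Path


def lemond_command(bundle_dir: Path) -> list[str]:
    """Build the argv for `lemond DIR`, with DIR set to the bundle folder."""
    # Assumption: the executable lives at the root of the bundle folder.
    exe_name = "lemond.exe" if os.name == "nt" else "lemond"
    exe = bundle_dir / exe_name
    # The positional DIR argument tells lemond where config.json and other
    # runtime files are located; here it is the bundle folder itself.
    return [str(exe), str(bundle_dir)]


def start_lemond(bundle_dir: Path) -> subprocess.Popen:
    """Start lemond as a child process of the app (hypothetical helper)."""
    return subprocess.Popen(lemond_command(bundle_dir), cwd=bundle_dir)
```

Calling `start_lemond(Path("path/to/bundle"))` from your app would then run lemond against that folder. Keeping the command construction in its own function makes it easy to check the arguments without starting a process.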

In the examples in this guide, we start lemond ./ to place these files in the same directory as lemond itself. Then:

  1. Use the lemonade CLI’s config set command to programmatically customize the contents of config.json (you can also edit config.json manually if you prefer).
  2. Use lemonade backends install to pre-download the backends you want to bundle in your app.
  3. Edit server_models.json and backend_versions.json to fully customize the experience for your users.
  4. Optionally, delete the lemonade CLI and defaults.json files to minimize the footprint of your app.

Finally, you can place the fully-configured Embeddable Lemonade folder into your app’s installer.
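The optional cleanup in step 4 could be scripted as part of a packaging pipeline. The sketch below is one possible approach, not part of Lemonade itself: the function name `prune_optional_files` is hypothetical, the file locations are taken from the paths mentioned in this guide, and the check for an existing config.json is an added precaution, since defaults.json is what lemond uses to generate it on first start.

```python
from pathlib import Path

# Files this guide describes as optional once customization is finished.
OPTIONAL_FILES = [
    "lemonade",                 # CLI on Linux
    "lemonade.exe",             # CLI on Windows
    "resources/defaults.json",  # used to generate config.json on first start
]


def prune_optional_files(bundle_dir: Path) -> list[Path]:
    """Delete optional files from the bundle and return what was removed."""
    # Precaution (an assumption of this sketch): only prune after lemond
    # has already generated config.json from defaults.json.
    if not (bundle_dir / "config.json").is_file():
        raise FileNotFoundError("config.json not found; start lemond once first")
    removed = []
    for rel in OPTIONAL_FILES:
        target = bundle_dir / rel
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed
```

Running this against a staged copy of the folder, rather than your working copy, keeps the CLI available for later reconfiguration.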

Deployment-Ready Layout

Once you’ve finished customization, you’ll have a portable Lemonade folder ready for deployment with a layout like this:

=== "Windows (cmd.exe)"

```text
lemond.exe                      # App runs lemond as a subprocess
lemonade.exe                    # Optional: CLI management for lemond
LICENSE                         # Lemonade license file
config.json                     # Persistent customized settings for lemond
recipe_options.json             # Per-model customization (e.g., llama args)

resources\
    |- server_models.json       # Customized lemond models list
    |- backend_versions.json    # Customized version numbers for llamacpp, etc.

bin\                            # Pre-downloaded backends bundled into app
    |- llamacpp\                # GPU LLMs, embedding, and reranking
        |- rocm\
            |- llama-server.exe
        |- vulkan\
            |- llama-server.exe
    |- ryzenai-server\          # NPU LLMs
    |- flm\                     # NPU LLMs, embedding, and ASR
    |- sdpp\                    # GPU image generation
    |- whispercpp\              # NPU and GPU ASR

models\                         # Hugging Face standard layout for models
    |- models--unsloth--Qwen3-0.6B-GGUF\
extra_models\                   # Additional GGUF files
    |- my_custom_model.gguf
```

=== "Linux (bash)"

```text
lemond                          # App runs lemond as a subprocess
lemonade                        # Optional: CLI management for lemond
LICENSE                         # Lemonade license file
config.json                     # Persistent customized settings for lemond
recipe_options.json             # Per-model customization (e.g., llama args)

resources/
    |- server_models.json       # Customized lemond models list
    |- backend_versions.json    # Customized version numbers for llamacpp, etc.

bin/                            # Pre-downloaded backends bundled into app
    |- llamacpp/                # GPU LLMs, embedding, and reranking
        |- rocm/
            |- llama-server
        |- vulkan/
            |- llama-server
    |- ryzenai-server/          # NPU LLMs
    |- flm/                     # NPU LLMs, embedding, and ASR
    |- sdpp/                    # GPU image generation
    |- whispercpp/              # NPU and GPU ASR

models/                         # Hugging Face standard layout for models
    |- models--unsloth--Qwen3-0.6B-GGUF/
extra_models/                   # Additional GGUF files
    |- my_custom_model.gguf
```
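Before handing the folder to your installer, it can be useful to sanity-check it against the layout above. The following Python sketch shows one way to do that. The function name `missing_bundle_files` is hypothetical, and the choice of which files to treat as required is an assumption made for illustration; adjust the list to match what your app actually ships.

```python
import os
from pathlib import Path


def missing_bundle_files(bundle_dir: Path) -> list[str]:
    """Return the expected files that are absent from the bundle folder."""
    exe = "lemond.exe" if os.name == "nt" else "lemond"
    # Assumption: these files are treated as required for this check.
    expected = [
        exe,
        "config.json",
        "resources/server_models.json",
        "resources/backend_versions.json",
    ]
    return [rel for rel in expected if not (bundle_dir / rel).is_file()]
```

An empty list means every expected file was found. A packaging script could fail the build when the list is non-empty.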

In-Depth Customization

Refer to the detailed guides for each of the following subjects: