This guide covers everything you need to build, test, and contribute to Lemonade from source. Whether you’re fixing a bug, adding a feature, or just exploring the codebase, this document will help you get started.
Lemonade consists of these main executables: `lemond` (the HTTP server) and `lemonade` (the CLI client).

All Platforms:

Windows:

Linux:
A helper script is available that sets up the build environment on popular Linux distributions and macOS. It prompts to install dependencies via the native package manager and creates the build directory.

Linux / macOS:

```sh
./setup.sh
```

Windows:

```powershell
./setup.ps1
```
Build by running:

Linux / macOS:

```sh
cmake --build --preset default
```

Windows (Visual Studio 2022):

```powershell
cmake --build --preset windows
```

Windows (Visual Studio 2026):

```powershell
cmake --build --preset vs18
```
Windows:

- `build/Release/lemond.exe` - HTTP server
- `build/Release/LemonadeServer.exe` - GUI app (embedded server + system tray)
- `build/Release/lemonade.exe` - CLI client
- `build/Release/lemonade-server.exe` - Legacy shim (deprecated)

Linux / macOS:

- `build/lemond` - HTTP server
- `build/lemonade` - CLI client
- `build/lemonade-tray` - System tray client (macOS always; Linux when AppIndicator3 found)
- `build/lemonade-server` - Legacy shim (deprecated)

Resources are copied to `build/Release/resources/` on Windows and `build/resources/` on Linux/macOS (web UI files, model registry, backend version configuration).

The tray menu's "Open app" option and the `lemonade run` command can launch the Tauri desktop app. To include it in your build:
Prerequisites:

- Linux: `libwebkit2gtk-4.1-dev`, `libsoup-3.0-dev`, `libjavascriptcoregtk-4.1-dev`, `librsvg2-dev`, `libayatana-appindicator3-dev` (the setup.sh script checks for these and prompts to install)

Build the Tauri app using CMake:
Linux / macOS:

```sh
cmake --build --preset default --target tauri-app
```

Windows (Visual Studio 2022):

```powershell
cmake --build --preset windows --target tauri-app
```

Windows (Visual Studio 2026):

```powershell
cmake --build --preset vs18 --target tauri-app
```
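On Linux, you can check up front that the prerequisite dev libraries are discoverable. This is a hedged sketch: the pkg-config module names below are assumptions mapped from the `-dev` package names listed above, and are not taken from the project's build scripts.

```shell
# Check each assumed pkg-config module name for the Tauri prerequisites.
if ! command -v pkg-config >/dev/null 2>&1; then
  echo "pkg-config not installed"
else
  for mod in webkit2gtk-4.1 libsoup-3.0 javascriptcoregtk-4.1 librsvg-2.0 ayatana-appindicator3-0.1; do
    if pkg-config --exists "$mod"; then
      echo "found: $mod"
    else
      echo "missing: $mod"
    fi
  done
fi
```

If any module reports missing, install the corresponding package (or rerun `setup.sh`, which performs an equivalent check).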
This will:

- Run `npm ci` in `src/app/` to install webpack and the Tauri CLI
- Build the renderer into `src/app/dist/renderer/`
- Run `cargo tauri build` to compile the Rust host against that renderer bundle
- Produce `build/app/lemonade-app[.exe|.app]`

First build is slow. The cold path downloads ~80 Rust crates and compiles them with LTO; expect several minutes the first time. Incremental rebuilds (cargo cache hot, no Rust changes) take under 30 seconds.
Hot reload during UI iteration. Going through CMake rebuilds the cargo binary on every change. For frontend iteration, run the Tauri CLI’s dev mode directly — it watches the renderer with webpack and only re-runs cargo when Rust source changes:
```sh
cd src/app
npm run dev
```

This is dramatically faster (<1s per renderer change) and is the right loop for any work that doesn't touch `src-tauri/`.
The tray app searches for the Tauri app in these locations:
- `../app/lemonade-app.exe` (relative to the `bin/` directory)
- `../../build/app/lemonade-app.exe` (from `build/Release/`)
- `/opt/share/lemonade-server/app/lemonade-app`
- `../app/lemonade-app` (from `build/`)

If not found, the "Open app" menu option is hidden but everything else works.
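The search order above can be sketched as a simple loop over candidate paths (a hedged illustration, not the tray app's actual C++ code; on Windows the candidates carry an `.exe` suffix):

```shell
# Walk the documented candidate locations and report the first executable hit.
found=""
for candidate in ../app/lemonade-app ../../build/app/lemonade-app \
    /opt/share/lemonade-server/app/lemonade-app; do
  if [ -x "$candidate" ]; then
    found="$candidate"
    break
  fi
done
echo "${found:-lemonade-app not found; Open app entry would be hidden}"
```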
Windows:

If `npm run dev` or cargo commands fail with "program not found", ensure `%USERPROFILE%\.cargo\bin` is in your PATH. Add it permanently from PowerShell with a duplicate-safe update, then restart your IDE or terminal:

```powershell
$cargoBin = Join-Path $env:USERPROFILE ".cargo\bin"
$userPath = [System.Environment]::GetEnvironmentVariable("PATH", "User")
if (($userPath -split ";") -notcontains $cargoBin) {
    [System.Environment]::SetEnvironmentVariable("PATH", "$cargoBin;$userPath", "User")
}
```

If you see `link.exe` or `msvcrt.lib` errors, launch from a Visual Studio Developer shell and confirm that `where link` lists the Visual Studio linker before Git's `link.exe`.

Linux:
- `lemond` is always headless on Linux (GTK-free, daemon-friendly); use `lemond` to start the server directly
- `lemonade-tray` is a separate binary for the system tray, auto-detected at build time: it is built if AppIndicator3 libraries are found (GTK3 is only needed for non-glib variants); pass `-DREQUIRE_LINUX_TRAY=ON` to require it
- Tray dependencies: `ayatana-appindicator-glib-devel` (preferred, no GTK3 needed), `ayatana-appindicator3-devel`, or `libappindicator-gtk3-devel` (the latter two also require `gtk3-devel`)
- `/opt/bin`
- `~/.cache/lemonade/bin/`
- `~/.cache/huggingface/` (follows HF conventions)
- `$XDG_RUNTIME_DIR/lemonade/` when set and writable, otherwise `/tmp/`

macOS (beta):
A `.pkg` installer is available from the releases page. Binaries are ad-hoc signed (`codesign --sign -`) so they run without an Apple Developer certificate. To sign with a real Developer ID, export `DEVELOPER_ID_APPLICATION_IDENTITY` before configuring CMake (see Building and Notarizing for Distribution).

Prerequisites:
Building:
```sh
cmake --build build --config Release --target wix_installers
```
Installer Output:
Creates `lemonade-server-minimal.msi` which:

- Installs to `%LOCALAPPDATA%\lemonade_server\`, adds to user PATH, no UAC required
- Or per-machine: `%PROGRAMFILES%\Lemonade Server\`, adds to system PATH, requires elevation
- Includes the system tray app (`lemonade-tray.exe`)

Available Installers:

- `lemonade-server-minimal.msi` - Server only (~3 MB)
- `lemonade.msi` - Full installer with Tauri desktop app (~10 MB)

Prerequisites:
Building:
```sh
cd build
cpack
```
Package Output:
Creates `lemonade-server_<VERSION>_amd64.deb` (e.g., `lemonade-server_9.0.3_amd64.deb`) which:

- Installs executables to `/opt/bin/`
- Installs resources to `/opt/share/lemonade-server/`
- Installs a desktop entry under `/opt/share/applications/`
- Depends on `libcurl4`, `libssl3`, `libz1`, `unzip`, `fonts-katex`
- Optionally uses `ffmpeg` for whisper.cpp audio resampling and/or transcoding, plus a Chromium-compatible browser for lemonade-web-app
- Places llama.cpp backends in the `/opt/share/lemonade-server/llama/` directory

Installation:
```sh
# Replace <VERSION> with the actual version (e.g., 9.0.0)
sudo apt install ./lemonade-server_<VERSION>_amd64.deb
```
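After installation you can confirm that the package is registered with dpkg. This is a hedged check; the package name is taken from the `.deb` filename above:

```shell
# Print the installed version if the package is present.
if dpkg -s lemonade-server >/dev/null 2>&1; then
  dpkg -s lemonade-server | grep '^Version'
else
  echo "lemonade-server is not installed"
fi
```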
Uninstallation:
```sh
sudo dpkg -r lemonade-server
```
Post-Installation:
The executables will be available in PATH:
```sh
lemonade --help
lemond --help
```
The RPM build is very similar to the Debian instructions above, with minor changes.
Building:
```sh
cd build
cpack -G RPM
```
Package Output:
Creates `lemonade-server-<VERSION>.x86_64.rpm` (e.g., `lemonade-server-9.1.2.x86_64.rpm`); resources are installed to the same locations as the .deb package above.
Installation:
```sh
# Replace <VERSION> with the actual version (e.g., 9.0.0)
sudo dnf install ./lemonade-server-<VERSION>.x86_64.rpm
```
Uninstallation:
```sh
sudo dnf remove lemonade-server
```
Post-Installation:
Same as for the .deb package above.
macOS:
For access with a private key:

```sh
xcrun notarytool store-credentials AC_PASSWORD --apple-id "[email protected]" --team-id "your-team-id" --private-key "/path/to/AuthKey_XXXXXX.p8"
```

Or, for access with an API password:

```sh
xcrun notarytool store-credentials AC_PASSWORD --apple-id "[email protected]" --team-id "your-team-id" --password ""
```

Get your team ID at: https://developer.apple.com/account
```sh
# Install Xcode command line tools
xcode-select --install

# Navigate to the C++ source directory
cd src/cpp

# Create and enter build directory
mkdir build
cd build

# Configure with CMake
cmake ..

# Build with all cores
cmake --build . --config Release -j
```
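Once the build finishes, the server binary can be sanity-checked from the build directory. A hedged sketch; `--version` is among the options `lemond` documents later in this guide:

```shell
# Report the built server's version, or note that the binary is absent.
if [ -x ./lemond ]; then
  ./lemond --version
else
  echo "lemond not built yet"
fi
```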
The build system provides several CMake targets for different build configurations:

- `lemond`: The main HTTP server executable that handles LLM inference requests
- `package-macos`: Creates a signed macOS installer package (.pkg) using productbuild
- `notarize_package`: Builds and submits the package to Apple for notarization and staples the ticket
- `tauri-app`: Builds the Tauri desktop application (Rust host + React renderer)
- `prepare_tauri_app`: Prepares the Tauri .app bundle for inclusion in the macOS installer

To build a notarized macOS installer for distribution:
```sh
export DEVELOPER_ID_APPLICATION_IDENTITY="Developer ID Application: Your Name (TEAMID)"
export DEVELOPER_ID_INSTALLER_IDENTITY="Developer ID Installer: Your Name (TEAMID)"
export AC_PASSWORD="your-app-specific-password"

xcrun notarytool store-credentials "AC_PASSWORD" \
  --apple-id "[email protected]" \
  --team-id "YOURTEAMID" \
  --password "your-app-specific-password"

cd src/cpp/build
cmake --build . --config Release --target package-macos
cmake --build . --config Release --target notarize_package
```
The notarization process submits the package to Apple and staples the resulting ticket to the installer.
Note: The package is signed with hardened runtime entitlements during the build process for security.
Install the Dev Containers extension. Then run `>Dev Containers: Open Workspace in Container` or `>Dev Containers: Open Folder in Container` from the command bar in the IDE, and it should reopen the Visual Studio Code project.

```json
"cmake.debugConfig": {
    "args": [
        "--llamacpp", "cpu"
    ]
}
```
When debugging `lemonade`, you may pass a subcommand (e.g., `run MODEL`) as arguments.

- `CMake: Select a Kit` — select a kit or scan for kits (two options should be available: gcc or clang)
- `CMake: Configure`

Optional commands are:

- `CMake: Build Target` — select a CMake target to build
- `CMake: Set Launch/Debug Target` — select the CMake target you want to build/debug
- `CMake: Debug`
- `CMake: Delete Cache and Reconfigure`

A `.vscode/settings.json` file lets you set custom arguments for launching the debugger via the `cmake.debugConfig` key.

Note: For running Lemonade as a containerized application (as an alternative to the MSI-based distribution), see DOCKER_GUIDE.md.
src/cpp/
├── CPackRPM.cmake # RPM packaging configuration
├── DOCKER_GUIDE.md # Docker containerization guide
├── Extra-Models-Dir-Spec.md # Extra models directory specification
├── Multi-Model-Spec.md # Multi-model loading specification
├── postinst # Debian package post-install script
├── postinst-full # Debian package post-install script (full version)
├── resources/ # Configuration and data files (self-contained)
│ ├── backend_versions.json # llama.cpp/whisper version configuration
│ ├── server_models.json # Model registry (available models)
│ └── static/ # Web UI assets
│ ├── index.html # Server landing page (with template variables)
│ └── favicon.ico # Site icon
│
├── installer/ # WiX MSI installer (Windows)
│ ├── Product.wxs.in # WiX installer definition template
│ ├── installer_banner_wix.bmp # Left-side banner (493×312)
│ └── top_banner.bmp # Top banner with lemon icon (493×58)
│
├── server/ # Server implementation
│ ├── main.cpp # Entry point, CLI routing
│ ├── server.cpp # HTTP server (cpp-httplib)
│ ├── router.cpp # Routes requests to backends
│ ├── model_manager.cpp # Model registry, downloads, caching
│ ├── cli_parser.cpp # Command-line argument parsing (CLI11)
│ ├── recipe_options.cpp # Recipe option handling
│ ├── wrapped_server.cpp # Base class for backend wrappers
│ ├── streaming_proxy.cpp # Server-Sent Events for streaming
│ ├── system_info.cpp # NPU/GPU device detection
│ ├── lemonade.manifest.in # Windows manifest template
│ ├── version.rc # Windows version resource
│ │
│ ├── backends/ # Model backend implementations
│ │ ├── backend_utils.cpp # Shared backend utilities
│ │ ├── llamacpp_server.cpp # Wraps llama.cpp for LLM inference (CPU/GPU)
│ │ ├── fastflowlm_server.cpp # Wraps FastFlowLM for NPU inference
│ │ ├── ryzenaiserver.cpp # Wraps RyzenAI server for hybrid NPU
│ │ ├── sd_server.cpp # Wraps Stable Diffusion for image generation
│ │ └── whisper_server.cpp # Wraps whisper.cpp for audio transcription (CPU/NPU)
│ │
│ └── utils/ # Utility functions
│ ├── http_client.cpp # HTTP client using libcurl
│ ├── json_utils.cpp # JSON file I/O
│ ├── process_manager.cpp # Cross-platform process management
│ ├── path_utils.cpp # Path manipulation
│ ├── wmi_helper.cpp # Windows WMI for NPU detection
│ └── wmi_helper.h # WMI helper header
│
├── include/lemon/ # Public headers
│ ├── server.h # HTTP server interface
│ ├── router.h # Request routing
│ ├── model_manager.h # Model management
│ ├── cli_parser.h # CLI argument parsing
│ ├── recipe_options.h # Recipe option definitions
│ ├── wrapped_server.h # Backend wrapper base class
│ ├── streaming_proxy.h # Streaming proxy
│ ├── system_info.h # System information
│ ├── model_types.h # Model type definitions
│ ├── audio_types.h # Audio type definitions
│ ├── error_types.h # Error type definitions
│ ├── server_capabilities.h # Server capability definitions
│ ├── single_instance.h # Single instance enforcement
│ ├── version.h.in # Version header template
│ ├── backends/ # Backend headers
│ │ ├── backend_utils.h # Backend utilities
│ │ ├── llamacpp_server.h # LlamaCpp backend
│ │ ├── fastflowlm_server.h # FastFlowLM backend
│ │ ├── ryzenaiserver.h # RyzenAI backend
│ │ ├── sd_server.h # Stable Diffusion backend
│ │ └── whisper_server.h # Whisper backend
│ └── utils/ # Utility headers
│ ├── http_client.h # HTTP client
│ ├── json_utils.h # JSON utilities
│ ├── process_manager.h # Process management
│       ├── path_utils.h           # Path utilities
│       └── network_beacon.h       # Broadcasts a beacon on port 13305 to network multicast
│
└── tray/ # System tray application
├── CMakeLists.txt # Tray-specific build config
├── main.cpp # Entry point (WinMain on Windows, main on macOS/Linux)
├── tray_ui.h # TrayUI class header
├── tray_ui.cpp # TrayUI class — menu, HTTP, icon, app launch (~500 lines)
├── agent_launcher.cpp # Agent (claude/codex) launcher (shared with CLI)
├── version.rc # Windows version resource
└── platform/ # Platform-specific implementations
├── windows_tray.cpp # Win32 system tray API
├── macos_tray.mm # Objective-C++ NSStatusBar
├── linux_tray.cpp # GTK/AppIndicator
└── tray_factory.cpp # Platform detection
The Lemonade Server C++ implementation uses a client-server architecture:
A pure HTTP server that:
- Exposes the API under `/api/v0` and `/api/v1`

Key Layers:
Multi-Model Support:

- `--max-loaded-models N` (default: 1)

A console application for terminal users:

- Subcommands: `list`, `pull`, `delete`, `run`, `status`, `logs`, `launch`, `backends`, `scan`
- Talks to `lemond` via HTTP endpoints

A GUI application for desktop users that exposes the server via a system tray icon:

- Windows: `LemonadeServer.exe` — a SUBSYSTEM:WINDOWS app that embeds the server and shows a system tray icon. No console window.
- Linux: `lemonade-tray` — tray application (requires GTK3 + AppIndicator3). Connects to an already-running server if one is found; otherwise starts one (via systemd if a unit is installed, or by spawning `lemond` directly).

The `lemonade` client communicates with the `lemond` server via HTTP:

- Model management: `/api/v1/models`, `/api/v1/pull`, `/api/v1/delete`
- Load/unload: `/api/v1/load`, `/api/v1/unload`
- Health and internal: `/api/v1/health`, `/internal/shutdown`, `/internal/set`, `/internal/config`, `/internal/cleanup-cache`
- Inference: `/api/v1/chat/completions`, `/api/v1/completions`, `/api/v1/audio/transcriptions`

The client automatically:
Single-Instance Protection:

- `LemonadeServer.exe` holds a system-wide mutex (`Global\LemondMutex`). A second launch shows a "Server is already running" dialog and exits.
- `lemonade-tray` acquires an exclusive `flock()` on a lock file in the runtime directory to prevent duplicate tray instances.

Server Discovery:

The `lemonade` CLI auto-discovers the running server via UDP beacon broadcast, falling back to the default port if no beacon is found.

Network Beacon based broadcasting:
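The port fallback described above can be sketched with a health probe. This is a hedged illustration, not the CLI's actual discovery code; `/api/v1/health` and the default port 13305 appear elsewhere in this guide:

```shell
# If no beacon was heard, probe the default port directly.
PORT="${LEMONADE_PORT:-13305}"  # LEMONADE_PORT is a hypothetical override for this sketch
if curl -sf "http://localhost:${PORT}/api/v1/health" >/dev/null 2>&1; then
  echo "server reachable on port ${PORT}"
else
  echo "no server on port ${PORT}"
fi
```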
These endpoints are for first-party Lemonade software only (CLI, tray app, desktop app). They are not part of the public API, may change without notice, and must not be relied upon by third-party integrations.
Internal endpoints are restricted to loopback (127.0.0.1 / ::1) — requests from non-localhost addresses receive 403 Forbidden.
| Method | Path | Description |
|---|---|---|
| POST | `/internal/shutdown` | Unloads all models and shuts down the server |
| POST | `/internal/set` | Unified config setter (see below) |
| GET | `/internal/config` | Returns the full runtime config snapshot |
| POST | `/internal/cleanup-cache` | Cleans up orphaned files in the Hugging Face cache |
`POST /internal/set`

Accepts a JSON object with one or more keys to update atomically. Returns `{"status":"success","updated":{...}}` on success, or 400 with an error message on validation failure.
Server-level keys (trigger immediate side effects):
| Key | Type | Side Effect |
|---|---|---|
| `port` | int (1–65535) | HTTP rebind |
| `host` | string | HTTP rebind |
| `log_level` | string (`trace`, `debug`, `info`, `warning`, `error`, `fatal`, `none`) | Reconfigures log filter |
| `global_timeout` | int (positive) | Updates default HTTP client timeout |
| `no_broadcast` | bool | Stops or starts UDP beacon |
| `extra_models_dir` | string | Updates model manager search path |
Deferred keys (affect the next model load or eviction decision, no immediate side effect):
| Key | Type |
|---|---|
| `max_loaded_models` | int (-1 or positive) |
| `ctx_size` | int (positive) |
| `llamacpp_backend` | string |
| `llamacpp_args` | string |
| `sdcpp_backend` | string |
| `whispercpp_backend` | string |
| `whispercpp_args` | string |
| `steps` | int (positive) |
| `cfg_scale` | number |
| `width` | int (positive) |
| `height` | int (positive) |
| `flm_args` | string |
Example:
```sh
curl -X POST http://localhost:13305/internal/set \
  -H "Content-Type: application/json" \
  -d '{"ctx_size": 8192, "max_loaded_models": 3, "log_level": "debug"}'
```
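Since validation failures come back as HTTP 400, a script can capture the status code to distinguish success from rejection. A hedged sketch assuming the server is on the default port:

```shell
# Capture only the HTTP status: 200 means the update was applied, 400 means validation failed.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' -X POST http://localhost:13305/internal/set \
  -H "Content-Type: application/json" \
  -d '{"log_level": "debug"}' 2>/dev/null)
if [ "$STATUS" = "200" ]; then
  echo "config updated"
else
  echo "update failed (HTTP $STATUS)"
fi
```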
`GET /internal/config`

Returns the full runtime configuration as a flat JSON object containing all server-level and recipe option keys with their current values.
Example:
```sh
curl http://localhost:13305/internal/config
```
All dependencies are automatically fetched by CMake via FetchContent:
Platform-specific SSL backends are used (Schannel on Windows, SecureTransport on macOS, OpenSSL on Linux).
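The FetchContent pattern looks roughly like this. The repository URL and pinned tag are illustrative assumptions for the sketch; cpp-httplib is one of the dependencies named in the project layout:

```cmake
# Illustrative FetchContent usage; URL and tag are assumptions, not the project's actual pins.
include(FetchContent)
FetchContent_Declare(
  httplib
  GIT_REPOSITORY https://github.com/yhirose/cpp-httplib.git
  GIT_TAG        v0.15.3  # hypothetical pinned tag
)
FetchContent_MakeAvailable(httplib)
```

Pinning each dependency to a fixed tag keeps builds reproducible without requiring system-installed packages.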
The lemond executable is a pure HTTP server without any command-based interface:
```sh
# Start server with default options
./lemond

# Start server with custom port
./lemond --port 8080

# Available options:
#   [cache_dir]     Path to lemonade cache directory (optional)
#   --port PORT     Port number (default: 13305)
#   --host HOST     Bind address (default: localhost)
#   --version, -v   Show version
#   --help, -h      Show help
```
All other server settings are managed via lemonade config set (see Server Configuration).
The `lemonade` executable is the command-line interface for terminal users. It talks to `lemond` via HTTP endpoints:

```sh
# List available models
./lemonade list

# Pull a model
./lemonade pull Llama-3.2-1B-Instruct-CPU

# Delete a model
./lemonade delete Llama-3.2-1B-Instruct-CPU

# Check server status
./lemonade status

# Run a model (loads model and opens browser)
./lemonade run Llama-3.2-1B-Instruct-CPU

# View server logs
./lemonade logs

# List recipes and backends
./lemonade backends
```
The tray application provides a system tray icon for desktop users:
Platform support:
- Windows: `LemonadeServer.exe` — a SUBSYSTEM:WINDOWS app that embeds `lemond` and shows a system tray icon. No separate console process. Auto-starts via the Windows startup folder.
- Linux: `lemonade-tray` — available when compiled with GTK3 + AppIndicator3 support (auto-detected at build time). Connects to an already-running server if one is found; otherwise starts one (via systemd if a unit is installed, or by spawning `lemond` directly).

What it does (Linux):

- Starts the server if one is not already running (via systemd if a unit is installed, or by spawning `lemond` directly)

When to use:
System Tray Features (when running):
UI Improvements:
When running LemonadeServer.exe or lemond:
- Logs are written to a file (`%TEMP%\lemonade-server.log` on Windows). When `lemond` runs as the systemd service, logs go to the journal instead.
- Use `lemonade logs` to open the desktop app's logs view

Logs UI Features:

- `/logs/stream`

Run the commands from the Usage section above to verify basic functionality.
The C++ implementation is tested using the existing Python test suite.
Prerequisites:
```sh
pip install -r test/requirements.txt
```

Python integration tests (from the `test/` directory, ordered least to most complex):
| Test File | Description |
|---|---|
| `server_cli.py` | CLI commands (version, list, pull, status, delete, serve, stop, run) |
| `server_endpoints.py` | HTTP endpoints (health, models, pull, load, unload, system-info, stats) |
| `server_llm.py` | LLM inference (chat completions, embeddings, reranking) |
| `server_whisper.py` | Audio transcription (whisper models) |
| `server_sd.py` | Image generation (Stable Diffusion, ~2–3 min per image on CPU) |
Running tests:
```sh
# CLI tests (no inference backend needed)
python test/server_cli.py

# Endpoint tests (no inference backend needed)
python test/server_endpoints.py

# LLM tests (specify wrapped server and backend)
python test/server_llm.py --wrapped-server llamacpp --backend vulkan

# Audio transcription tests
python test/server_whisper.py

# Image generation tests (slow)
python test/server_sd.py
```
The tests auto-discover the server binary from the build directory. Use --server-binary to override if needed.
See the .github/workflows/ directory for CI/CD test configurations.
Note: The Python tests should now use lemonade-server.exe as the entry point since it provides the CLI interface.
Code conventions and resource locations:

- `#pragma once` in headers
- `lemon::` namespace
- `docs/api/`
- `src/cpp/resources/server_models.json`
- `src/cpp/resources/static/`
- `src/cpp/resources/backend_versions.json`

This project is licensed under the Apache 2.0 License. All dependencies use permissive licenses (MIT, BSD, Apache 2.0, curl license).