This guide covers everything you need to build, test, and contribute to Lemonade from source. Whether you’re fixing a bug, adding a feature, or just exploring the codebase, this document will help you get started.
Lemonade consists of these main executables:
lemonade-server.exe serveAll Platforms:
Windows:
Linux:
A helper script is available that will set up the build environment on popular Linux distributions and macOS. This will prompt to install dependencies via native package managers and create the build directory.
Linux / macOS
./setup.sh
Windows
./setup.ps1
Build by running:
Linux / macOS
cmake --build --preset default
Windows
cmake --build --preset windows
build/Release/lemonade-router.exe - HTTP serverbuild/Release/lemonade-server.exe - Console CLI clientbuild/Release/lemonade-tray.exe - GUI tray launcherbuild/Release/lemonade-log-viewer.exe - Log file viewerbuild/lemonade-router - HTTP serverbuild/lemonade-server - Console CLI clientbuild/Release/resources/ on Windows, build/resources/ on Linux/macOS (web UI files, model registry, backend version configuration)The tray menu’s “Open app” option and the lemonade-server run command can launch the Electron desktop app. To include it in your build:
Build the Electron app using CMake (requires Node.js 20+):
Linux
cmake --build --preset default --target electron-app
Windows
cmake --build --preset windows --target electron-app
This will:
The tray app searches for the Electron app in these locations:
../app/Lemonade.exe (relative to bin/ directory)../app/win-unpacked/Lemonade.exe (from build/Release/)/usr/local/share/lemonade-server/app/lemonade../app/linux-unpacked/lemonade (from build/)If not found, the “Open app” menu option is hidden but everything else works.
To create a standalone AppImage package that can run on any Linux distribution:
cmake --build --preset default --target appimage
This will:
The generated AppImage will be located in:
build/app-appimage/Lemonade-<version>-<arch>.AppImageThe AppImage is a self-contained executable that includes all dependencies and can be run on any Linux distribution without installation. Simply make it executable and run it:
chmod +x build/app-appimage/Lemonade-*.AppImage
./build/app-appimage/Lemonade-*.AppImage
Windows:
Linux:
lemonade-server serve (headless mode is automatic)/tmp/lemonade-router.pid) for reliable process management/opt/bin~/.cache/lemonade/bin/~/.cache/huggingface/ (follows HF conventions)macOS (beta):
.pkg installer is available from the releases pagePrerequisites:
Building:
Using PowerShell script (recommended):
cd src\cpp
.\build_installer.ps1
Manual build using CMake:
cd src\cpp\build
cmake --build . --config Release --target wix_installer
Installer Output:
Creates lemonade-server-minimal.msi which:
%LOCALAPPDATA%\lemonade_server\, adds to user PATH, no UAC required%PROGRAMFILES%\Lemonade Server\, adds to system PATH, requires elevationlemonade-tray.exe)Available Installers:
lemonade-server-minimal.msi - Server only (~3 MB)lemonade.msi - Full installer with Electron desktop app (~105 MB)Installation:
For detailed installation instructions including silent install, custom directories, and all-users installation, see the Server Integration Guide.
Prerequisites:
Building:
cd build
cpack
Package Output:
Creates lemonade-server_<VERSION>_amd64.deb (e.g., lemonade-server_9.0.3_amd64.deb) which:
/opt/bin/ (executables)/opt/share/lemonade-server//opt/share/applications//opt/share/lemonade-server/llama/ directoryInstallation:
# Replace <VERSION> with the actual version (e.g., 9.0.0)
sudo apt install ./lemonade-server_<VERSION>_amd64.deb
Uninstallation:
sudo dpkg -r lemonade-server
Post-Installation:
The executables will be available in PATH:
lemonade-server --help
lemonade-router --help
# Start server in headless mode:
lemonade-server serve --no-tray
# Or just:
lemonade-server serve
Very similar to the Debian instructions above with minor changes
Building:
cd build
cpack -G RPM
Package Output:
Creates lemonade-server-<VERSION>.x86_64.rpm (e.g., lemonade-server-9.1.2.x86_64.rpm) and
resources are installed as per DEB version above
Installation:
# Replace <VERSION> with the actual version (e.g., 9.0.0)
sudo dnf install ./lemonade-server-<VERSION>.x86_64.rpm
Uninstallation:
sudo dnf remove lemonade-server
Post-Installation:
Same as .deb above
macOS:
For access with P
xcrun notarytool store-credentials AC_PASSWORD --apple-id "[email protected]" --team-id "your-team-id" --private-key "/path/to/AuthKey_XXXXXX.p8"
or For access with API password
xcrun notarytool store-credentials AC_PASSWORD --apple-id "[email protected]" --team-id "your-team-id" --password ""
Get your team id at: https://developer.apple.com/account
# Install Xcode command line tools
xcode-select --install
# Navigate to the C++ source directory
cd src/cpp
# Create and enter build directory
mkdir build
cd build
# Configure with CMake
cmake ..
# Build with all cores
cmake --build . --config Release -j
The build system provides several CMake targets for different build configurations:
lemonade-router: The main HTTP server executable that handles LLM inference requestspackage-macos: Creates a signed macOS installer package (.pkg) using productbuildnotarize_package: Builds and submits the package to Apple for notarization and staples the ticketelectron-app: Builds the Electron-based GUI applicationprepare_electron_app: Prepares the Electron app for inclusion in the installerTo build a notarized macOS installer for distribution:
export DEVELOPER_ID_APPLICATION_IDENTITY="Developer ID Application: Your Name (TEAMID)"
export DEVELOPER_ID_INSTALLER_IDENTITY="Developer ID Installer: Your Name (TEAMID)"
export AC_PASSWORD="your-app-specific-password"
xcrun notarytool store-credentials "AC_PASSWORD" \
--apple-id "[email protected]" \
--team-id "YOURTEAMID" \
--password "your-app-specific-password"
cd src/cpp/build
cmake --build . --config Release --target package-macos
cmake --build . --config Release --target notarize_package
The notarization process will:
Note: The package is signed with hardened runtime entitlements during the build process for security.
Dev Containers and install it.>Dev Containers: Open Workspace in Container or >Dev Containers: Open Folder in Container which you can do in the command bar in the IDE and it should reopen the visual studio code project."cmake.debugConfig": {
"args": [
"--llamacpp", "cpu"
]
}
Cmake: Select a Kit
Select a kit or Scan for kit. (Two options should be available gcc or clang)
Cmake: Configure
Optional commands are:
Cmake: Build Target
use this to select a cmake target to build
Cmake: Set Launch/Debug target
use this to select/set your cmake target you want to build/debug
Cmake: Debug
Cmake: Delete Cache and Reconfigure ```
.vscode/settings.json in which you may set custom args for launching the debug in the json key cmake.debugConfigNote
For running Lemonade as a containerized application (as an alternative to the MSI-based distribution), see
DOCKER_GUIDE.md.
src/cpp/
├── build_installer.ps1 # Installer build script
├── CopyElectronApp.cmake # CMake module to copy Electron app to build output
├── CPackRPM.cmake # RPM packaging configuration
├── DOCKER_GUIDE.md # Docker containerization guide
├── Extra-Models-Dir-Spec.md # Extra models directory specification
├── Multi-Model-Spec.md # Multi-model loading specification
├── postinst # Debian package post-install script
├── postinst-full # Debian package post-install script (full version)
├── resources/ # Configuration and data files (self-contained)
│ ├── backend_versions.json # llama.cpp/whisper version configuration
│ ├── server_models.json # Model registry (available models)
│ └── static/ # Web UI assets
│ ├── index.html # Server landing page (with template variables)
│ └── favicon.ico # Site icon
│
├── installer/ # WiX MSI installer (Windows)
│ ├── Product.wxs.in # WiX installer definition template
│ ├── installer_banner_wix.bmp # Left-side banner (493×312)
│ └── top_banner.bmp # Top banner with lemon icon (493×58)
│
├── server/ # Server implementation
│ ├── main.cpp # Entry point, CLI routing
│ ├── server.cpp # HTTP server (cpp-httplib)
│ ├── router.cpp # Routes requests to backends
│ ├── model_manager.cpp # Model registry, downloads, caching
│ ├── cli_parser.cpp # Command-line argument parsing (CLI11)
│ ├── recipe_options.cpp # Recipe option handling
│ ├── wrapped_server.cpp # Base class for backend wrappers
│ ├── streaming_proxy.cpp # Server-Sent Events for streaming
│ ├── system_info.cpp # NPU/GPU device detection
│ ├── lemonade.manifest.in # Windows manifest template
│ ├── version.rc # Windows version resource
│ │
│ ├── backends/ # Model backend implementations
│ │ ├── backend_utils.cpp # Shared backend utilities
│ │ ├── llamacpp_server.cpp # Wraps llama.cpp for LLM inference (CPU/GPU)
│ │ ├── fastflowlm_server.cpp # Wraps FastFlowLM for NPU inference
│ │ ├── ryzenaiserver.cpp # Wraps RyzenAI server for hybrid NPU
│ │ ├── sd_server.cpp # Wraps Stable Diffusion for image generation
│ │ └── whisper_server.cpp # Wraps whisper.cpp for audio transcription (CPU/NPU)
│ │
│ └── utils/ # Utility functions
│ ├── http_client.cpp # HTTP client using libcurl
│ ├── json_utils.cpp # JSON file I/O
│ ├── process_manager.cpp # Cross-platform process management
│ ├── path_utils.cpp # Path manipulation
│ ├── wmi_helper.cpp # Windows WMI for NPU detection
│ └── wmi_helper.h # WMI helper header
│
├── include/lemon/ # Public headers
│ ├── server.h # HTTP server interface
│ ├── router.h # Request routing
│ ├── model_manager.h # Model management
│ ├── cli_parser.h # CLI argument parsing
│ ├── recipe_options.h # Recipe option definitions
│ ├── wrapped_server.h # Backend wrapper base class
│ ├── streaming_proxy.h # Streaming proxy
│ ├── system_info.h # System information
│ ├── model_types.h # Model type definitions
│ ├── audio_types.h # Audio type definitions
│ ├── error_types.h # Error type definitions
│ ├── server_capabilities.h # Server capability definitions
│ ├── single_instance.h # Single instance enforcement
│ ├── version.h.in # Version header template
│ ├── backends/ # Backend headers
│ │ ├── backend_utils.h # Backend utilities
│ │ ├── llamacpp_server.h # LlamaCpp backend
│ │ ├── fastflowlm_server.h # FastFlowLM backend
│ │ ├── ryzenaiserver.h # RyzenAI backend
│ │ ├── sd_server.h # Stable Diffusion backend
│ │ └── whisper_server.h # Whisper backend
│ └── utils/ # Utility headers
│ ├── http_client.h # HTTP client
│ ├── json_utils.h # JSON utilities
│ ├── process_manager.h # Process management
│ |── path_utils.h # Path utilities
| |── network_beacon.h # Helps broadcast a beacon on port 8000 to network multicast
│
└── tray/ # System tray application
├── CMakeLists.txt # Tray-specific build config
├── main.cpp # Tray entry point (lemonade-server)
├── tray_launcher.cpp # GUI launcher (lemonade-tray)
├── log-viewer.cpp # Log file viewer (lemonade-log-viewer)
├── server_manager.cpp # Manages lemonade-router process
├── tray_app.cpp # Main tray application logic
├── lemonade-server.manifest.in # Windows manifest template
├── version.rc # Windows version resource
└── platform/ # Platform-specific implementations
├── windows_tray.cpp # Win32 system tray API
├── macos_tray.mm # Objective-C++ NSStatusBar
├── linux_tray.cpp # GTK/AppIndicator
└── tray_factory.cpp # Platform detection
The Lemonade Server C++ implementation uses a client-server architecture:
A pure HTTP server that:
/api/v0 and /api/v1)Key Layers:
Multi-Model Support:
--max-loaded-models N (default: 1)A console application for terminal users:
list, pull, delete, run, status, stop, serve)lemonade-router via HTTP endpointslemonade-router with appropriate optionsserve commandCommand Types:
A minimal WIN32 GUI application for desktop users:
lemonade-server.exe serveThe lemonade-server client communicates with lemonade-router server via HTTP:
/api/v1/models, /api/v1/pull, /api/v1/delete/api/v1/load, /api/v1/unload/api/v1/health, /internal/shutdown/api/v1/chat/completions, /api/v1/completions, /api/v1/audio/transcriptionsThe client automatically:
Single-Instance Protection:
lemonade-router, lemonade-server serve, lemonade-tray) enforces single-instance using system-wide mutexesserve command is blocked when a server is runningstatus, list, pull, delete, stop can run alongside an active server/tmp/lemonade-router.pid) for efficient server discovery and port detection
Network Beacon based broadcasting:
All dependencies are automatically fetched by CMake via FetchContent:
Platform-specific SSL backends are used (Schannel on Windows, SecureTransport on macOS, OpenSSL on Linux).
The lemonade-router executable is a pure HTTP server without any command-based interface:
# Start server with default options
./lemonade-router
# Start server with custom options
./lemonade-router --port 8080 --ctx-size 8192 --log-level debug
# Available options:
# --port PORT Port number (default: 8000)
# --host HOST Bind address (default: localhost)
# --ctx-size SIZE Context size (default: 4096)
# --log-level LEVEL Log level: critical, error, warning, info, debug, trace
# --llamacpp BACKEND LlamaCpp backend: vulkan, rocm, metal
# --max-loaded-models N Maximum models per type slot (default: 1)
# --version, -v Show version
# --help, -h Show help
The lemonade-server executable is the command-line interface for terminal users:
lemonade-router via HTTP endpoints# List available models
./lemonade-server list
# Pull a model
./lemonade-server pull Llama-3.2-1B-Instruct-CPU
# Delete a model
./lemonade-server delete Llama-3.2-1B-Instruct-CPU
# Check server status
./lemonade-server status
# Stop the server
./lemonade-server stop
# Run a model (starts persistent server with tray and opens browser)
./lemonade-server run Llama-3.2-1B-Instruct-CPU
# Start persistent server (with tray on Windows/macOS, headless on Linux)
./lemonade-server serve
# Start persistent server without tray (headless mode, explicit on all platforms)
./lemonade-server serve --no-tray
# Start server with custom options
./lemonade-server serve --port 8080 --ctx-size 8192
Available Options:
--port PORT - Server port (default: 8000)--host HOST - Server host (default: localhost)--ctx-size SIZE - Context size (default: 4096)--log-level LEVEL - Logging verbosity: info, debug (default: info)--log-file PATH - Custom log file location--server-binary PATH - Path to lemonade-router executable--no-tray - Run without tray (headless mode)--max-loaded-models N - Maximum number of models to keep loaded per type slot (default: 1)Note: lemonade-router is always launched with --log-level debug for optimal troubleshooting. Use --log-level debug on lemonade-server commands to see client-side debug output.
The lemonade-tray executable is a simple GUI launcher for desktop users:
lemonade-server.exe serve in tray modeWhat it does:
lemonade-server.exe in the same directoryserve commandWhen to use:
System Tray Features (when running):
UI Improvements:
When running lemonade-server.exe serve:
%TEMP%\lemonade-server.log)lemonade-log-viewer.exe
Log Viewer Features:
Run the commands from the Usage section above to verify basic functionality.
The C++ implementation is tested using the existing Python test suite.
Prerequisites:
pip install -r test/requirements.txtPython integration tests (from test/ directory, ordered least to most complex):
| Test File | Description |
|---|---|
server_cli.py |
CLI commands (version, list, pull, status, delete, serve, stop, run) |
server_endpoints.py |
HTTP endpoints (health, models, pull, load, unload, system-info, stats) |
server_llm.py |
LLM inference (chat completions, embeddings, reranking) |
server_whisper.py |
Audio transcription (whisper models) |
server_sd.py |
Image generation (Stable Diffusion, ~2-3 min per image on CPU) |
Running tests:
# CLI tests (no inference backend needed)
python test/server_cli.py
# Endpoint tests (no inference backend needed)
python test/server_endpoints.py
# LLM tests (specify wrapped server and backend)
python test/server_llm.py --wrapped-server llamacpp --backend vulkan
# Audio transcription tests
python test/server_whisper.py
# Image generation tests (slow)
python test/server_sd.py
The tests auto-discover the server binary from the build directory. Use --server-binary to override if needed.
See the .github/workflows/ directory for CI/CD test configurations.
Note: The Python tests should now use lemonade-server.exe as the entry point since it provides the CLI interface.
#pragma oncelemon:: namespacedocs/server/server_spec.mdsrc/cpp/resources/server_models.jsonsrc/cpp/resources/static/src/cpp/resources/backend_versions.jsonThis project is licensed under the Apache 2.0 License. All dependencies use permissive licenses (MIT, BSD, Apache 2.0, curl license).