pals serve

Run an OpenAI-compatible HTTP server backed by your ProgressPals swarm.

$ pals serve [OPTIONS] [MODEL]

What it does

Starts an HTTP server on your machine that speaks the OpenAI wire format. Any client that can talk to api.openai.com — the openai Python SDK, LangChain, LiteLLM, OpenWebUI, Cursor, Aider, Continue — works unchanged. Internally, requests are translated to ProgressPals RPCs and routed through your swarm.

Endpoints exposed

  • POST /v1/chat/completions — streaming + non-streaming
  • GET /v1/models — returns the swarm’s current model
  • GET /healthz — liveness probe (returns 200 once the server is ready). Useful for ops/monitoring; bypasses --api-key auth.
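A deployment script usually wants to wait for GET /healthz to return 200 before sending traffic. The following is a minimal sketch of such a wait loop using only the Python standard library. The function names, the retry counts, and the assumption that /healthz lives at the server root (not under /v1) are illustrative and not part of the ProgressPals CLI.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url, timeout=2.0):
    """Return True if GET <base_url>/healthz answers 200.

    Connection errors, timeouts, and non-2xx responses all count as "not ready".
    """
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(base_url, attempts=30, delay=1.0, probe=is_ready, sleep=time.sleep):
    """Poll the liveness probe until it succeeds or the attempts run out.

    `probe` and `sleep` are injectable so the loop can be exercised without a server.
    """
    for attempt in range(attempts):
        if probe(base_url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Because the probe bypasses --api-key auth, a call such as wait_until_ready("http://127.0.0.1:8080") needs no token.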

Default-deny on public binding

pals serve binds to 127.0.0.1 by default. If you pass --host 0.0.0.0 (or any non-loopback address), you must also pass --api-key. The CLI refuses to start otherwise — an accidentally-exposed open endpoint with your swarm behind it is a serious security footgun.
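The rule above can be expressed as a small startup check. This is a sketch of the idea only, not the CLI's actual implementation: is_loopback and check_bind are hypothetical names, and treating unresolved hostnames as non-loopback is an assumption made to stay on the safe side.

```python
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """True for loopback addresses such as 127.0.0.1 or ::1 (and the name 'localhost')."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: assume it may be reachable from outside.
        return False


def check_bind(host: str, api_key: Optional[str]) -> None:
    """Refuse a non-loopback bind unless an API key is configured."""
    if not is_loopback(host) and not api_key:
        raise ValueError(f"refusing to bind {host} without --api-key")
```

Note that 0.0.0.0 is the unspecified address, not a loopback address, so it falls on the "requires a key" side of the check.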

Arguments

model
TEXT (optional)
Hugging Face model id (e.g. meta-llama/Llama-3.1-8B-Instruct). Falls back to config.default_model.

Options

--peer
TEXT (repeatable)
Bootstrap multiaddr for an existing swarm peer. Falls back to config.default_peers.
--host
TEXT
Interface to bind. Pass 0.0.0.0 to expose on all interfaces — requires --api-key.
Default: 127.0.0.1
Env: PROGRESSPALS_SERVE_HOST
--port, -p
INTEGER
TCP port to bind.
Default: 8080
Env: PROGRESSPALS_SERVE_PORT
--api-key
TEXT
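Optional bearer token that clients must present in an Authorization: Bearer <key> header. Required when --host is not a loopback address. Set it through the environment variable rather than the flag so the key does not show up in /proc/<pid>/cmdline, where other local users can read it.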
Env: PROGRESSPALS_SERVE_API_KEY
--config-dir
TEXT
Read config from this directory.
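When pals serve is launched from another program, the flags and environment variables listed above can be combined so that the API key never appears on the command line. The sketch below only builds the argument list and environment; serve_command is a hypothetical helper, not part of ProgressPals.

```python
import os


def serve_command(model, host="127.0.0.1", port=8080, api_key=None, base_env=None):
    """Build (argv, env) for `pals serve`, keeping the API key out of argv."""
    argv = ["pals", "serve", model, "--host", host, "--port", str(port)]
    env = dict(os.environ if base_env is None else base_env)
    if api_key is not None:
        # Passed via the environment so it is not visible in /proc/<pid>/cmdline.
        env["PROGRESSPALS_SERVE_API_KEY"] = api_key
    return argv, env


# Usage (requires the pals CLI to be installed):
#   import subprocess
#   argv, env = serve_command("meta-llama/Llama-3.1-8B", host="0.0.0.0", api_key="...")
#   server = subprocess.Popen(argv, env=env)
```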

Examples

Localhost-only (the default)

local serve
$ pals serve meta-llama/Llama-3.1-8B
✓ listening on http://127.0.0.1:8080/v1

Talk to it from Python (OpenAI SDK)

client.py
$ python -
>>> from openai import OpenAI
>>> client = OpenAI(
...     base_url="http://localhost:8080/v1",
...     api_key="any-string",  # the SDK needs a non-empty value; the server checks it only when started with --api-key
... )
>>> client.chat.completions.create(
...     model="meta-llama/Llama-3.1-8B",
...     messages=[{"role": "user", "content": "hi"}],
... )
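For streaming, the OpenAI SDK accepts stream=True on the same call. On the wire, the OpenAI format delivers a streamed reply as server-sent events: each data: line carries a JSON chunk with the new text in choices[0].delta.content, and the stream ends with data: [DONE]. The sketch below parses such lines with the standard library only. It assumes pals serve follows that format exactly, and iter_deltas is an illustrative name.

```python
import json


def iter_deltas(lines):
    """Yield text fragments from OpenAI-style SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

The function accepts any iterable of lines, so it can be fed an open HTTP response object directly or a list of captured lines.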

LAN exposure with an API key

LAN serve
$ export PROGRESSPALS_SERVE_API_KEY=$(openssl rand -hex 32)
$ pals serve meta-llama/Llama-3.1-8B --host 0.0.0.0
✓ listening on http://0.0.0.0:8080/v1
API key required for all routes
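A client on the LAN then has to send the same key as a bearer token. The following sketch builds such a request with the standard library. chat_request is a hypothetical helper and the server address in the usage comment is a placeholder for your own host.

```python
import json
import urllib.request


def chat_request(base_url, model, prompt, api_key):
    """Build a POST /chat/completions request carrying the bearer token."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Usage (placeholder address; read the key from the environment, not from source code):
#   import os
#   req = chat_request("http://192.168.1.20:8080/v1", "meta-llama/Llama-3.1-8B",
#                      "hi", os.environ["PROGRESSPALS_SERVE_API_KEY"])
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```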

Activation encryption is automatic

pals serve reads swarm_secret from config.json and sets the env var the inference client uses to derive the AES-256 key. You don’t do anything special — encryption is transparent.