CLI Reference
pals serve
Run an OpenAI-compatible HTTP server backed by your ProgressPals swarm.
$ pals serve [OPTIONS] [MODEL]
What it does
Starts an HTTP server on your machine that speaks the OpenAI wire format. Any client that can talk to api.openai.com (the openai Python SDK, LangChain, LiteLLM, OpenWebUI, Cursor, Aider, Continue) works once its base URL points at this server; no other changes are needed. Internally, requests are translated to ProgressPals RPCs and routed through your swarm.
Endpoints exposed
POST /v1/chat/completions: streaming and non-streaming
GET /v1/models: returns the swarm’s current model
GET /healthz: liveness probe (returns 200 once the server is ready). Useful for ops/monitoring; bypasses --api-key auth.
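A script that launches pals serve can poll the liveness probe before sending traffic. The following is a minimal sketch using only the Python standard library. It assumes /healthz lives at the server root (as the endpoint list suggests) and that the server uses the default host and port. The function name wait_until_ready and its parameters are illustrative and not part of ProgressPals.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url="http://127.0.0.1:8080", timeout=30.0,
                     interval=0.5, opener=urllib.request.urlopen):
    """Poll GET /healthz until it answers 200 or `timeout` seconds pass.

    `opener` is injectable so the loop can be exercised without a server.
    """
    url = base_url.rstrip("/") + "/healthz"
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener(url, timeout=interval) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or answered with a non-2xx status
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Because the probe bypasses --api-key auth, the sketch sends no Authorization header.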
Default-deny on public binding
pals serve binds to 127.0.0.1 by default. If you pass --host 0.0.0.0 (or any non-loopback address), you must also pass --api-key. The CLI refuses to start otherwise: an accidentally exposed open endpoint with your swarm behind it is a serious security footgun.
Arguments
model TEXT (optional)
HuggingFace model id (e.g. meta-llama/Llama-3.1-8B-Instruct). Falls back to config.default_model.
Options
--peer TEXT (repeatable)
Bootstrap multiaddr for an existing swarm peer. Falls back to config.default_peers.
--host TEXT
Interface to bind. Pass 0.0.0.0 to expose on all interfaces; this requires --api-key.
Default: 127.0.0.1
Env: PROGRESSPALS_SERVE_HOST
--port, -p INTEGER
TCP port to bind.
Default: 8080
Env: PROGRESSPALS_SERVE_PORT
--api-key TEXT
Optional bearer token clients must present as Authorization: Bearer <key>. Required when --host is not a loopback address. Pass via env var to keep it out of /proc/<pid>/cmdline.
Env: PROGRESSPALS_SERVE_API_KEY
--config-dir TEXT
Read config from this directory.
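The rule tying --host to --api-key amounts to a start-up check: a non-loopback bind address without a key is refused. The sketch below illustrates that rule with Python's ipaddress module. It is not the CLI's actual implementation. The names is_loopback and check_bind are hypothetical, and treating an unparseable hostname as non-loopback is an assumption made here to stay on the safe side.

```python
import ipaddress


def is_loopback(host: str) -> bool:
    """Return True only for addresses that stay on the local machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Assumption: a hostname we cannot parse as an IP is treated as public.
        return False


def check_bind(host: str, api_key) -> None:
    """Raise ValueError when a public bind is requested without an API key."""
    if not is_loopback(host) and not api_key:
        raise ValueError(
            f"refusing to bind {host} without --api-key "
            "(or PROGRESSPALS_SERVE_API_KEY)"
        )


check_bind("127.0.0.1", None)      # allowed: loopback needs no key
check_bind("0.0.0.0", "s3cret")    # allowed: public bind with a key
```

Note that 0.0.0.0 is the unspecified address, not a loopback address, so it falls on the refused side of the check unless a key is supplied.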
Examples
Localhost-only (the default)
local serve
$ pals serve meta-llama/Llama-3.1-8B
✓ listening on http://127.0.0.1:8080/v1
Talk to it from Python (OpenAI SDK)
client.py
$ python -
>>> from openai import OpenAI
>>> client = OpenAI(
... base_url="http://localhost:8080/v1",
... api_key="any-string",
... )
>>> client.chat.completions.create(
... model="meta-llama/Llama-3.1-8B",
... messages=[{"role":"user","content":"hi"}],
... )
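The chat endpoint also supports streaming. The sketch below assumes pals serve emits the standard OpenAI server-sent-event framing: one `data: {json}` line per chunk, ending with `data: [DONE]`. It uses only the standard library, and the helper names iter_deltas and stream_chat are illustrative.

```python
import json
import urllib.request


def iter_deltas(lines):
    """Yield text fragments from OpenAI-style SSE lines (bytes or str)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields carry no text
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_chat(prompt, model="meta-llama/Llama-3.1-8B",
                base_url="http://localhost:8080/v1", api_key="any-string"):
    """POST a streaming chat request and yield text as it arrives."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/chat/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        yield from iter_deltas(resp)  # an HTTP response iterates line by line
```

With a server running, `for piece in stream_chat("hi"): print(piece, end="", flush=True)` prints tokens as they arrive. With the openai SDK, passing stream=True to client.chat.completions.create gives the same effect.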
LAN exposure with an API key
LAN serve
$ export PROGRESSPALS_SERVE_API_KEY=$(openssl rand -hex 32)
$ pals serve meta-llama/Llama-3.1-8B --host 0.0.0.0
✓ listening on http://0.0.0.0:8080/v1
API key required for all routes
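Once a key is set, clients must send it as a bearer token. The sketch below builds an authenticated GET /v1/models request. It assumes the client shell has the same PROGRESSPALS_SERVE_API_KEY variable exported, which is a convenience chosen for this example and not a documented client convention. The LAN address in the usage note is a placeholder, and models_request and list_models are hypothetical names.

```python
import json
import os
import urllib.request


def models_request(base_url="http://127.0.0.1:8080/v1", env=os.environ):
    """Build GET /v1/models, attaching the bearer token when one is set."""
    key = env.get("PROGRESSPALS_SERVE_API_KEY")
    headers = {"Authorization": f"Bearer {key}"} if key else {}
    return urllib.request.Request(base_url.rstrip("/") + "/models",
                                  headers=headers)


def list_models(base_url="http://127.0.0.1:8080/v1"):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(models_request(base_url)) as resp:
        return json.load(resp)
```

From another machine on the LAN this would be called as `list_models("http://192.168.1.10:8080/v1")`, substituting the server's real address.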
Activation encryption is automatic
pals serve reads swarm_secret from config.json and sets the environment variable that the inference client uses to derive the AES-256 key. No extra configuration is needed; encryption is transparent.