CLI Reference

pals serve

Run an OpenAI-compatible HTTP server backed by your ProgressPals swarm.

$ pals serve [OPTIONS] [MODEL]

What it does

Starts an HTTP server on your machine that speaks the OpenAI wire format. Any client that can talk to api.openai.com (the openai Python SDK, LangChain, LiteLLM, OpenWebUI, Cursor, Aider, Continue) works without code changes once its base URL points at this server. Internally, requests are translated to ProgressPals RPCs and routed through your swarm.

Endpoints exposed

POST /v1/chat/completions — streaming + non-streaming
GET /v1/models — returns the swarm’s current model
GET /healthz — liveness probe (returns 200 once the server is ready). Useful for ops/monitoring; bypasses --api-key auth.
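
A readiness check against the liveness probe can be sketched with only the Python standard library. The helper names (endpoint_url, wait_until_ready) and the retry and timeout values are illustrative assumptions, not part of ProgressPals; the default address mirrors the --host and --port defaults documented on this page. Because /healthz bypasses --api-key auth, no token is sent.

```python
import time
import urllib.error
import urllib.request


def endpoint_url(host: str = "127.0.0.1", port: int = 8080, path: str = "/healthz") -> str:
    """Build the URL for one of the exposed endpoints."""
    return f"http://{host}:{port}{path}"


def wait_until_ready(url: str, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll the probe until it answers 200 or the attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not reachable yet; retry after a short pause
        time.sleep(delay)
    return False
```

A deployment script might call wait_until_ready(endpoint_url()) after launching pals serve and only then start sending requests to /v1/chat/completions.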

Default-deny on public binding

pals serve binds to 127.0.0.1 by default. If you pass --host 0.0.0.0 (or any other non-loopback address), you must also pass --api-key. The CLI refuses to start otherwise, because an accidentally exposed open endpoint with your swarm behind it is a serious security risk.
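
The rule above can be illustrated with a small check built on Python's ipaddress module. This is a sketch of the idea only, not the CLI's actual implementation: is_loopback and check_bind are hypothetical names, and treating unresolved hostnames as non-loopback is an assumption made here to err on the safe side.

```python
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """Return True for loopback addresses such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; treat unknown hostnames as non-loopback (assumption).
        return False


def check_bind(host: str, api_key: Optional[str]) -> None:
    """Raise ValueError when a non-loopback bind is requested without a key."""
    if not is_loopback(host) and not api_key:
        raise ValueError(f"refusing to bind {host} without --api-key")
```

Note that 0.0.0.0 is the unspecified address, not a loopback address, so it falls on the "key required" side of the check.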

Arguments

model

TEXT (optional)

HuggingFace model id (e.g. meta-llama/Llama-3.1-8B-Instruct). Falls back to config.default_model.

Options

--peer

TEXT (repeatable)

Bootstrap multiaddr for an existing swarm peer. Falls back to config.default_peers.

--host

TEXT

Interface to bind. Pass 0.0.0.0 to listen on all interfaces; any non-loopback address requires --api-key.

Default: 127.0.0.1

Env: PROGRESSPALS_SERVE_HOST

--port, -p

INTEGER

TCP port to bind.

Default: 8080

Env: PROGRESSPALS_SERVE_PORT

--api-key

TEXT

Bearer token that clients must present as Authorization: Bearer <key>. Optional on loopback; required when --host is not a loopback address. Prefer the environment variable over the flag so the key does not appear in /proc/<pid>/cmdline.

Env: PROGRESSPALS_SERVE_API_KEY

--config-dir

TEXT

Read config from this directory.

Examples

Localhost-only (the default)


$ pals serve meta-llama/Llama-3.1-8B
✓ listening on http://127.0.0.1:8080/v1

Talk to it from Python (OpenAI SDK)


$ python -
>>> from openai import OpenAI
>>> client = OpenAI(
...     base_url="http://localhost:8080/v1",
...     api_key="any-string",
... )
>>> client.chat.completions.create(
...     model="meta-llama/Llama-3.1-8B",
...     messages=[{"role": "user", "content": "hi"}],
... )
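
Since /v1/chat/completions also supports streaming, it may help to see what a client does with the streamed response. The sketch below assumes the server emits standard OpenAI-style server-sent events ("data: {json}" lines ending with "data: [DONE]"), which is what "OpenAI wire format" implies but is not spelled out on this page. The function name iter_stream_text and the sample lines are hand-written for illustration, not captured from a real server.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by OpenAI-style streaming chunks."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events (assumed shape, not real server output).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # prints "Hello"
```

With the openai SDK you would not parse this by hand: passing stream=True to client.chat.completions.create returns an iterator of chunks whose choices[0].delta.content carries the same fragments.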

LAN exposure with an API key


$ export PROGRESSPALS_SERVE_API_KEY=$(openssl rand -hex 32)
$ pals serve meta-llama/Llama-3.1-8B --host 0.0.0.0
✓ listening on http://0.0.0.0:8080/v1
API key required for all routes
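
From another machine on the LAN, a client has to send the same key as a bearer token. The following sketch builds such a request with the standard library so the header layout is visible. The function name build_chat_request and the address 192.168.1.50 are placeholders assumed for illustration; substitute the address of the machine running pals serve.

```python
import json
import os
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat completion request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# 192.168.1.50 is a placeholder LAN address; the key is read from the same
# environment variable the server uses, with a dummy fallback.
req = build_chat_request(
    "http://192.168.1.50:8080/v1",
    os.environ.get("PROGRESSPALS_SERVE_API_KEY", "replace-me"),
    "meta-llama/Llama-3.1-8B",
    "hi",
)
# To send it: json.load(urllib.request.urlopen(req, timeout=30))
```

With the openai SDK the equivalent is to pass the key as api_key and the LAN URL as base_url when constructing the client; the SDK adds the Authorization header itself.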

Activation encryption is automatic

pals serve reads swarm_secret from config.json and sets the environment variable that the inference client uses to derive the AES-256 key. No extra setup is needed; encryption is transparent.