Documentation menu
Help

Troubleshooting.

Most failures fall into one of the patterns below. If yours doesn’t, raise the log level with PROGRESSPALS_LOGGING=1 (the default) and re-run the command — the extra output is usually enough to diagnose.

Install errors

ERROR: No matching distribution found for progresspals

Your Python is too old or too new. ProgressPals supports 3.10 and 3.11. Check with python3 --version and use python3.11 -m venv ... if needed.

RuntimeError: bitsandbytes not compiled with GPU support

Install the CUDA-matched torch wheel before installing ProgressPals. On Apple Silicon this message is informational — bitsandbytes falls back to a CPU path automatically.

torch.cuda.is_available() returns False

Your driver and your torch wheel don’t agree about CUDA version. nvidia-smi shows the driver version; pip show torch shows the installed wheel’s target. Reinstall a matching torch wheel from https://download.pytorch.org/whl/cuXXX.

Account / auth errors

401 Unauthorized on pals swarm create / pals invite

Your Supabase access token is missing or expired. Re-export PROGRESSPALS_SUPABASE_ACCESS_TOKEN and PROGRESSPALS_SUPABASE_REFRESH_TOKEN from your account dashboard.

pals login says invite token is invalid

The most common causes:

  • The token was revoked by the operator.
  • The token has expired (operators can set --expires-hours).
  • All --max-uses slots have been consumed (another joiner used the last one).
  • You copy-pasted with extra whitespace. Re-copy without surrounding characters.

Ask the operator to pals invite list and confirm the token’s status, or to mint a fresh one with pals invite create.

Swarm connectivity

pals join hangs on “connecting to peer”

The operator’s multiaddr may not be reachable from your network. Confirm:

  • The operator ran pals create with --public (default is loopback-only).
  • Their firewall / NAT allows inbound on the multiaddr’s port.
  • You can nc -zv HOST PORT from your machine.

Peer accepted at handshake then RPC fails

You likely connected before the server’s allow-list refresh propagated your peer ID. The poll cadence is ~30 seconds; wait a cycle and retry.

InvalidTag from cryptography

The swarm secret on your machine doesn’t match the operator’s. Make sure you redeemed the invite for this swarm — not a stale token from a previous one — and re-run pals login.

pals serve errors

pals serve refuses to start on --host 0.0.0.0

Intentional. Binding a non-loopback interface without an API key would expose your team’s OpenAI endpoint to anyone on the network. Pass --api-key (or set PROGRESSPALS_SERVE_API_KEY) and retry.

503 Service Unavailable from /v1/chat/completions

Not enough peers are online to cover the model. Check pals dash — every transformer layer needs at least one peer hosting it. Bring more pals online or use a smaller model.

401 Invalid API key

Client must send Authorization: Bearer <key> matching PROGRESSPALS_SERVE_API_KEY. Headers are case-insensitive; values are not.

Always useful

Crank up the logs. PROGRESSPALS_LOGGING=1 pals <command> (the default) prints the chain of events that led to the failure. Most issues become obvious from the debug output alone.

Next steps