GPU CLI

GPU CLI runs local commands on remote NVIDIA GPUs by prefixing with gpu. It provisions a pod, syncs your code, streams logs, and syncs outputs back: uv run python train.py becomes gpu run uv run python train.py.

Quick diagnostics

gpu doctor --json       # Check if setup is healthy (daemon, auth, provider keys)
gpu status --json       # See running pods and costs
gpu inventory --json    # See available GPUs and pricing

Command families

Getting started

| Command | Purpose | |---|---| | gpu login | Browser-based authentication | | gpu logout [-y] | Remove session | | gpu init [--gpu-type T] [--force] | Initialize project config | | gpu upgrade | Open subscription upgrade page |

Running code

| Command | Purpose | |---|---| | gpu run <command> | Execute on remote GPU (main command) | | gpu run -d <command> | Run detached (background) | | gpu run -a <job_id> | Reattach to running job | | gpu run --cancel <job_id> | Cancel a running job | | gpu status [--json] | Show project status, pods, costs | | gpu logs [-j JOB] [-f] [--tail N] [--json] | View job output | | gpu attach <job_id> | Reattach to job output stream | | gpu stop [POD_ID] [-y] | Stop active pod |

Key gpu run flags: --gpu-type, --gpu-count <1-8>, --min-vram, --rebuild, -o/--output, --no-output, --sync, -p/--publish <PORT>, -e <KEY=VALUE>, -i/--interactive.

GPU inventory

| Command | Purpose | |---|---| | gpu inventory [--available] [--min-vram N] [--max-price P] [--json] | List GPUs with pricing |

Volumes

| Command | Purpose | |---|---| | gpu volume list [--detailed] [--json] | List network volumes | | gpu volume create [--name N] [--size GB] [--datacenter DC] | Create volume | | gpu volume delete <VOL> [--force] | Delete volume | | gpu volume extend <VOL> --size <GB> | Increase size | | gpu volume set-global <VOL> | Set default volume | | gpu volume status [--volume V] [--json] | Volume usage | | gpu volume migrate <VOL> --to <DC> | Migrate to datacenter | | gpu volume sync <SRC> <DEST> | Sync between volumes |

Vault (encrypted storage)

| Command | Purpose | |---|---| | gpu vault list [--json] | List encrypted output files | | gpu vault export <PATH> <DEST> | Export decrypted file | | gpu vault stats [--json] | Storage usage stats |

Configuration

| Command | Purpose | |---|---| | gpu config show [--json] | Show merged config | | gpu config validate | Validate against schema | | gpu config schema | Print JSON schema | | gpu config set <KEY> <VALUE> | Set global config option | | gpu config get <KEY> | Get global config value |

Authentication

| Command | Purpose | |---|---| | gpu auth login [--profile P] | Authenticate with cloud provider | | gpu auth logout | Remove credentials | | gpu auth status | Show auth status | | gpu auth add <HUB> | Add hub credentials (hf, civitai) | | gpu auth remove <HUB> | Remove hub credentials | | gpu auth hubs | List configured hubs |

Organizations

| Command | Purpose | |---|---| | gpu org list | List organizations | | gpu org create <NAME> | Create organization | | gpu org switch [SLUG] | Set active org context | | gpu org invite <EMAIL> | Invite member | | gpu org service-account create --name N | Create service token | | gpu org service-account list | List service accounts | | gpu org service-account revoke <ID> | Revoke token |

LLM inference

| Command | Purpose | |---|---| | gpu llm run [--ollama\|--vllm] [--model M] [-y] | Launch LLM inference | | gpu llm info [MODEL] [--url URL] [--json] | Show model info |

ComfyUI workflows

| Command | Purpose | |---|---| | gpu comfyui list [--json] | Browse available workflows | | gpu comfyui info <WORKFLOW> [--json] | Show workflow details | | gpu comfyui validate <WORKFLOW> [--json] | Pre-flight checks | | gpu comfyui run <WORKFLOW> | Run workflow on GPU | | gpu comfyui generate "<PROMPT>" | Text-to-image generation | | gpu comfyui stop [WORKFLOW] [--all] | Stop ComfyUI pod |

Notebooks

| Command | Purpose | |---|---| | gpu notebook [FILE] [--run] [--new NAME] | Run Marimo notebook on GPU |

Alias: gpu nb

Serverless endpoints

| Command | Purpose | |---|---| | gpu serverless deploy [--template T] [--json] | Deploy endpoint | | gpu serverless status [ENDPOINT] [--json] | Endpoint status | | gpu serverless logs [ENDPOINT] | View request logs | | gpu serverless list [--json] | List all endpoints | | gpu serverless delete [ENDPOINT] | Delete endpoint | | gpu serverless warm [--cpu\|--gpu] | Pre-warm endpoint |

Templates

| Command | Purpose | |---|---| | gpu template list [--json] | Browse official templates | | gpu template clear-cache | Clear cached templates |

Daemon control

| Command | Purpose | |---|---| | gpu daemon status [--json] | Show daemon health | | gpu daemon start | Start daemon | | gpu daemon stop | Stop daemon | | gpu daemon restart | Restart daemon | | gpu daemon logs [-f] [-n N] | View daemon logs |

Tools and utilities

| Command | Purpose | |---|---| | gpu dashboard | Interactive TUI for pods and jobs | | gpu doctor [--json] | Diagnostic checks | | gpu agent-docs | Print agent reference to stdout | | gpu update [--check] | Update CLI | | gpu changelog [VERSION] | View release notes | | gpu issue ["desc"] | Report issue | | gpu desktop | Desktop app management | | gpu support | Open community Discord |

Common workflows

Setup: gpu login then gpu init
Run job: gpu run python train.py --epochs 10
With specific GPU: gpu run --gpu-type "RTX 4090" python train.py
Detached job: gpu run -d python long_training.py then gpu status --json
Check status: gpu status --json
View logs: gpu logs --json
Stop pods: gpu stop -y
LLM inference: gpu llm run --ollama --model llama3 -y
ComfyUI: gpu comfyui run flux_schnell
Diagnose issues: gpu doctor --json

gpu run is pod-reuse oriented: after a command completes, the next gpu run reuses the existing pod until you gpu stop or cooldown ends.

JSON output

Most commands support --json for machine-readable output. Structured data goes to stdout; human-oriented status and progress messages go to stderr.

Commands with --json: status, logs, doctor, inventory, config show, daemon status, volume list, volume status, vault list, vault stats, comfyui list, comfyui info, comfyui validate, serverless deploy, serverless status, serverless list, template list, llm info.

Exit codes

| Code | Meaning | Recovery | |---|---|---| | 0 | Success | Proceed | | 1 | General error | Read stderr | | 2 | Usage error | Fix command syntax | | 10 | Auth required | gpu auth login | | 11 | Quota exceeded | gpu upgrade or wait | | 12 | Not found | Check resource ID | | 13 | Daemon unavailable | gpu daemon start, retry | | 14 | Timeout | Retry | | 15 | Cancelled | Re-run if needed | | 130 | Interrupted | Re-run if needed |

Configuration

Project config: gpu.toml, gpu.jsonc, or pyproject.toml [tool.gpu]
Global config: ~/.gpu-cli/config.toml (via gpu config set/get)
Sync model: .gitignore controls upload; outputs patterns control download
Secrets and credentials must stay in the OS keychain, never plaintext project files
CI env vars: GPU_RUNPOD_API_KEY, GPU_SSH_PRIVATE_KEY, GPU_SSH_PUBLIC_KEY

References

Project generation and task setup: references/create.md
Debugging and common failures: references/debug.md
Config schema and field examples: references/config.md
Cost and GPU selection guidance: references/optimize.md
Persistent storage and volumes: references/volumes.md

gpu-cli

GPU CLI

Quick diagnostics

Command families

Getting started

Running code

GPU inventory

Volumes

Vault (encrypted storage)

Configuration

Authentication

Organizations

LLM inference

ComfyUI workflows

Notebooks

Serverless endpoints

Templates

Daemon control

Tools and utilities

Common workflows

JSON output

Exit codes

Configuration

References