Nano Banana — Image Generation

Generate and edit AI images via the Gemini API (Nano Banana model).

When to Use

✅ USE this skill when:

User asks to generate/create an image from a text description
User asks to edit an existing image (style transfer, object removal, text replacement)
User needs a quick visual, thumbnail, icon, or concept art
User says "draw", "generate an image", "make me a picture", "create art"

❌ DON'T use this skill when:

User wants to analyze an existing image (use the image tool)
User wants screenshots (use browser/screen tools)
User wants to edit PDFs (use nano-pdf skill)

Setup

Requires a Google AI API key with Gemini access. The key is already configured in auth-profiles.

API Key: Stored in ~/.openclaw/agents/codex/agent/auth-profiles.json under google:default
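A pre-flight check for the key can be sketched as follows. The layout of auth-profiles.json is not documented here, so this sketch assumes only that an entry named google:default appears somewhere in the JSON. The helper names `find_profile` and `has_google_profile` are hypothetical.

```python
import json
from pathlib import Path

# Path taken from the Setup section above.
AUTH_PROFILES = Path("~/.openclaw/agents/codex/agent/auth-profiles.json").expanduser()


def find_profile(data, name="google:default"):
    """Depth-first search for a key called `name`.

    Assumption: the file is JSON, but its nesting is unknown, so every
    dict and list is searched rather than a fixed location.
    """
    if isinstance(data, dict):
        if name in data:
            return data[name]
        children = data.values()
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_profile(child, name)
        if found is not None:
            return found
    return None


def has_google_profile(path=AUTH_PROFILES):
    """Return True if the file is readable JSON containing the profile entry."""
    try:
        data = json.loads(Path(path).read_text())
    except (OSError, json.JSONDecodeError):
        return False
    return find_profile(data) is not None
```

This only confirms that the entry exists. It does not verify that the key is accepted by the API.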

Free tier limits: 100 requests per month, 10 requests per minute, standard resolution.

Usage

Text-to-Image Generation

Use the generation script:

python3 ~/.openclaw/workspace/skills/nano-banana/scripts/generate.py \
  --prompt "A futuristic cyberpunk city at night with neon signs" \
  --output /tmp/generated_image.png

Options

| Flag | Description | Default |
|------|-------------|---------|
| --prompt | Text description of the image to generate | Required |
| --output | Output file path | /tmp/nano_banana_output.png |
| --aspect | Aspect ratio: 1:1, 16:9, 9:16, 4:3, 3:4 | 1:1 |
| --style | Style hint appended to prompt (e.g., "photorealistic", "anime", "pixel art") | None |
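When the script is invoked from code rather than typed by hand, the options above can be assembled into an argument list. The following is a minimal sketch. The flags, defaults, and aspect ratios come from the table, while the helper name `build_generate_command` and its validation are illustrative assumptions, not part of the skill.

```python
import os

# Script path as shown in the usage example above.
GENERATE_SCRIPT = os.path.expanduser(
    "~/.openclaw/workspace/skills/nano-banana/scripts/generate.py"
)
# Aspect ratios listed in the options table.
ASPECTS = {"1:1", "16:9", "9:16", "4:3", "3:4"}


def build_generate_command(prompt, output="/tmp/nano_banana_output.png",
                           aspect="1:1", style=None):
    """Build the argv list for generate.py (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("--prompt is required")
    if aspect not in ASPECTS:
        raise ValueError(f"unsupported aspect ratio: {aspect}")
    cmd = ["python3", GENERATE_SCRIPT,
           "--prompt", prompt, "--output", output, "--aspect", aspect]
    if style:
        # --style is optional; omit it entirely when no hint is given.
        cmd += ["--style", style]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list instead of a shell string avoids quoting problems with prompts that contain quotes or spaces.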

Image Editing

For editing existing images:

python3 ~/.openclaw/workspace/skills/nano-banana/scripts/edit.py \
  --image /path/to/input.png \
  --prompt "Remove the background and replace with a sunset" \
  --output /tmp/edited_image.png
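Because an edit request depends on an input file, it may be worth validating the paths before spending a request. The sketch below assumes only the three flags shown in the command above. The helper name `build_edit_command` and the overwrite check are hypothetical additions.

```python
import os

# Script path as shown in the editing example above.
EDIT_SCRIPT = os.path.expanduser(
    "~/.openclaw/workspace/skills/nano-banana/scripts/edit.py"
)


def build_edit_command(image, prompt, output):
    """Build the argv list for edit.py after basic path checks (hypothetical helper)."""
    if not os.path.isfile(image):
        raise FileNotFoundError(f"input image not found: {image}")
    if os.path.abspath(image) == os.path.abspath(output):
        # Guard against clobbering the original; drop this check if
        # in-place editing is actually wanted.
        raise ValueError("output would overwrite the input image")
    return ["python3", EDIT_SCRIPT,
            "--image", image, "--prompt", prompt, "--output", output]
```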

Workflow

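Parse the user's request into an image description.
Run the generate script with that description as the prompt (or the edit script when the user supplies an image to modify).
If the script succeeds, share the output file path with the user.
On error, check whether the API key is valid, whether a rate limit was hit, and whether the prompt is too vague.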
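The run-then-triage steps above can be sketched as follows. This assumes the scripts exit with a non-zero status and write the error to stderr, which is not confirmed here. The keyword matching in `triage` is a guess at what the error text might contain, and both function names are hypothetical.

```python
import subprocess


def triage(stderr):
    """Map error text to one of the checks from the workflow (heuristic)."""
    text = stderr.lower()
    if "429" in text or "rate limit" in text or "quota" in text:
        return "rate limit hit: wait and retry"
    if "401" in text or "403" in text or "api key" in text:
        return "check the API key"
    return "unrecognized error: check the prompt and the script output"


def run_and_report(cmd, output_path):
    """Run a generate/edit command; return (ok, output path or triage hint)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return True, output_path
    return False, triage(result.stderr)
```

On success the second element is the file path to share back. On failure it is a short hint about what to check first.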

Rate Limits

Free tier: 100 requests/month, 10 requests/minute
If you get HTTP 429 (Too Many Requests) errors, wait before retrying
Keep a rough count of requests used; don't burn through the 100-request monthly quota iterating on the same prompt
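The wait-and-retry advice and the usage count can be sketched together. The numbers come from the limits above: 10 requests per minute means requests should be at least 6 seconds apart. The `RateLimited` exception is a hypothetical stand-in for however a 429 is surfaced, and the doubling backoff is one common choice rather than anything the API prescribes.

```python
import time

MONTHLY_LIMIT = 100      # free-tier requests per month
MIN_INTERVAL = 60 / 10   # 10 requests/minute -> 6 seconds between requests


class RateLimited(Exception):
    """Hypothetical signal that a request was rejected with HTTP 429."""


def call_with_backoff(call, max_attempts=4, sleep=time.sleep):
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    delay = MIN_INTERVAL
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay *= 2


class UsageCounter:
    """In-memory request counter; it does not persist across sessions."""

    def __init__(self, limit=MONTHLY_LIMIT):
        self.limit = limit
        self.used = 0

    def record(self):
        """Count one request and return how many remain."""
        if self.used >= self.limit:
            raise RuntimeError("monthly free-tier quota exhausted")
        self.used += 1
        return self.limit - self.used
```

Whether a rejected request counts against the monthly quota is not stated in this document, so the counter leaves that decision to the caller.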

Tips

Be specific in prompts: "A calico cat sitting on a red velvet throne, oil painting style" works better than "a cat"
For consistent style, append style keywords: "digital art", "watercolor", "3D render", "studio photo"
The model handles complex scenes well but may struggle with text in images
For text-heavy images, use the edit endpoint with text-replace capability
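The first two tips can be sketched as a small prompt helper. Both function names are hypothetical, and the four-word threshold in `looks_vague` is an arbitrary heuristic chosen for illustration, not a rule of the model.

```python
def build_prompt(subject, styles=()):
    """Append style keywords to a subject, skipping blanks and duplicates."""
    parts = [subject.strip().rstrip(",.")]
    seen = set()
    for style in styles:
        key = style.strip().lower()
        if key and key not in seen:
            seen.add(key)
            parts.append(style.strip())
    return ", ".join(parts)


def looks_vague(prompt, min_words=4):
    """Flag very short prompts; word count is only a rough proxy for detail."""
    return len(prompt.split()) < min_words
```

For a single style hint, the --style flag of the generate script serves the same purpose. A helper like this is mainly useful for reusing the same set of keywords across several prompts.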