Nano Banana — Image Generation
Generate and edit AI images via the Gemini API (Nano Banana model).
When to Use
✅ USE this skill when:
- User asks to generate/create an image from a text description
- User asks to edit an existing image (style transfer, object removal, text replacement)
- User needs a quick visual, thumbnail, icon, or concept art
- User says "draw", "generate an image", "make me a picture", "create art"
❌ DON'T use this skill when:
- User wants to analyze an existing image (use the image tool)
- User wants screenshots (use browser/screen tools)
- User wants to edit PDFs (use nano-pdf skill)
Setup
Requires a Google AI API key with Gemini access. The key is already configured in auth-profiles.
API Key: Stored in ~/.openclaw/agents/codex/agent/auth-profiles.json under google:default
Free tier limits: 100 requests/month, standard resolution, 10 req/min rate limit.
Usage
Text-to-Image Generation
Use the generation script:
python3 ~/.openclaw/workspace/skills/nano-banana/scripts/generate.py \
--prompt "A futuristic cyberpunk city at night with neon signs" \
--output /tmp/generated_image.png
Options
| Flag | Description | Default |
|------|-------------|---------|
| --prompt | Text description of the image to generate | Required |
| --output | Output file path | /tmp/nano_banana_output.png |
| --aspect | Aspect ratio: 1:1, 16:9, 9:16, 4:3, 3:4 | 1:1 |
| --style | Style hint appended to prompt (e.g., "photorealistic", "anime", "pixel art") | None |
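When the script is called from other code rather than typed by hand, the flags in the table can be assembled into an argument list. The following is a minimal sketch, assuming the script accepts exactly the flags listed above; `build_generate_command` and `ALLOWED_ASPECTS` are hypothetical names introduced for illustration, not part of the skill.

```python
from pathlib import Path

# Script path taken from the usage example above.
GENERATE_SCRIPT = Path.home() / ".openclaw/workspace/skills/nano-banana/scripts/generate.py"

# Aspect ratios listed in the options table.
ALLOWED_ASPECTS = {"1:1", "16:9", "9:16", "4:3", "3:4"}


def build_generate_command(prompt, output=None, aspect=None, style=None):
    """Return the argv list for generate.py (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("--prompt is required")
    if aspect is not None and aspect not in ALLOWED_ASPECTS:
        raise ValueError(f"unsupported aspect ratio: {aspect}")
    cmd = ["python3", str(GENERATE_SCRIPT), "--prompt", prompt]
    # Flags left as None are omitted so the script's documented defaults apply.
    if output is not None:
        cmd += ["--output", str(output)]
    if aspect is not None:
        cmd += ["--aspect", aspect]
    if style is not None:
        cmd += ["--style", style]
    return cmd
```

Passing the resulting list to `subprocess.run(cmd, check=True)` avoids shell quoting problems with prompts that contain quotes or special characters.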
Image Editing
For editing existing images:
python3 ~/.openclaw/workspace/skills/nano-banana/scripts/edit.py \
--image /path/to/input.png \
--prompt "Remove the background and replace with a sunset" \
--output /tmp/edited_image.png
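The edit call can be wrapped the same way. This sketch assumes only the three flags shown in the command above; `build_edit_command` is a hypothetical helper, and the up-front check for the input file is an added convenience, not documented script behavior.

```python
from pathlib import Path

# Script path taken from the editing example above.
EDIT_SCRIPT = Path.home() / ".openclaw/workspace/skills/nano-banana/scripts/edit.py"


def build_edit_command(image, prompt, output):
    """Return the argv list for edit.py (hypothetical helper)."""
    image_path = Path(image)
    # Fail early instead of spending a request on a missing input file.
    if not image_path.is_file():
        raise FileNotFoundError(f"input image not found: {image_path}")
    return [
        "python3", str(EDIT_SCRIPT),
        "--image", str(image_path),
        "--prompt", prompt,
        "--output", str(output),
    ]
```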
Workflow
- User requests an image → parse the description
- Run the generate script with the prompt
- If generation succeeds, share the output file path with the user
- On error, check: API key valid? Rate limit hit? Prompt too vague?
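The steps above can be sketched as a single function. This is an illustration under stated assumptions: how the script reports failures is not documented here, so the sketch assumes a non-zero exit code with details on stderr, and the substring checks are guesses. `run_generation` is a hypothetical name.

```python
import subprocess
from pathlib import Path


def run_generation(cmd, output_path, runner=subprocess.run):
    """Run a generate command and return (ok, message)."""
    result = runner(cmd, capture_output=True, text=True)
    output = Path(output_path)
    if result.returncode == 0 and output.is_file():
        # Success: hand the file path back to the user.
        return True, str(output)
    # Assumption: error details appear on stderr. The checks mirror the
    # workflow list: rate limit, API key, then prompt quality.
    stderr = (result.stderr or "").lower()
    if "429" in stderr or "rate limit" in stderr:
        hint = "rate limit hit; wait and retry"
    elif "401" in stderr or "403" in stderr or "api key" in stderr:
        hint = "API key may be invalid"
    else:
        hint = "unrecognized failure; try a more specific prompt"
    return False, hint
```

The `runner` parameter exists so the function can be exercised with a stand-in for `subprocess.run` without spending real requests.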
Rate Limits
- Free tier: 100 requests/month, 10 requests/minute
- If you get HTTP 429 (Too Many Requests) errors, wait and retry
- Keep a running count of requests; don't burn through the 100-request monthly quota on iterations of the same prompt
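One way to follow these limits in code is a retry loop with increasing delays plus an explicit request counter. This is a sketch, not the skill's own logic: `call_with_retry`, `RequestBudget`, and the `call` argument (any function returning a `(status_code, payload)` pair) are hypothetical. The 6-second base delay is derived from the 10 requests/minute limit; whether rejected requests count against the monthly quota is not stated, so the counter is kept separate from the retry loop.

```python
import time


def call_with_retry(call, max_attempts=4, base_delay=6.0, sleep=time.sleep):
    """Call `call()` again with growing pauses while it returns status 429."""
    for attempt in range(max_attempts):
        status, payload = call()
        if status != 429:
            return status, payload
        if attempt < max_attempts - 1:
            # 10 requests/minute is one request per 6 s; double on each retry.
            sleep(base_delay * 2 ** attempt)
    # Still rate limited after the final attempt: return the last response.
    return status, payload


class RequestBudget:
    """Running count of requests against the 100/month free-tier quota."""

    def __init__(self, monthly_limit=100):
        self.monthly_limit = monthly_limit
        self.used = 0

    @property
    def remaining(self):
        return self.monthly_limit - self.used

    def spend(self):
        if self.used >= self.monthly_limit:
            raise RuntimeError("monthly request quota exhausted")
        self.used += 1
```

Calling `budget.spend()` before each request makes the remaining quota visible, which helps when deciding whether another iteration on the same prompt is worth a request.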
Tips
- Be specific in prompts: "A calico cat sitting on a red velvet throne, oil painting style" > "a cat"
- For consistent style, append style keywords: "digital art", "watercolor", "3D render", "studio photo"
- The model handles complex scenes well but may struggle with text in images
- For text-heavy images, use the edit endpoint with text-replace capability
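The first two tips amount to composing a prompt from a specific subject plus style keywords. A minimal sketch of that composition follows; `compose_prompt` is a hypothetical helper, and joining the parts with commas is one reasonable convention rather than a requirement of the model.

```python
def compose_prompt(subject, details=(), style_keywords=()):
    """Join a subject, extra details, and style keywords into one prompt."""
    parts = [subject.strip()]
    # Skip blank entries so stray empty strings do not produce ", ," runs.
    parts += [d.strip() for d in details if d.strip()]
    parts += [s.strip() for s in style_keywords if s.strip()]
    return ", ".join(parts)


prompt = compose_prompt(
    "A calico cat sitting on a red velvet throne",
    style_keywords=["oil painting style"],
)
print(prompt)
```

Reusing the same `style_keywords` list across several prompts is a simple way to keep a set of images visually consistent.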