Photo Studio

Generate professional AI-enhanced portraits and group photos using Seedream 4.5 AI model.

Quick Start

# Interactive mode - easiest way to start
python scripts/main.py generate --photo path/to/your/photo.jpg

# Non-interactive mode - for agent integration
python scripts/main.py generate --photo "$USER_PHOTO" --scenario portrait --non-interactive

Core Workflow

Select scenario from 9 options: celebrity, portrait, couple, family, edit, fusion, series, poster, free
Provide inputs: photos, styles, templates, prompts based on scenario
Generate images: CLI preprocesses photos, calls Seedream 4.5 API, saves results to output/images/
Review and save: View, reorder, regenerate, or confirm images

Essential Commands

Generate Images

# Celebrity photos with characters
python scripts/main.py generate --photo "$USER_PHOTO" --scenario celebrity --non-interactive

# Portrait photos with style
python scripts/main.py generate --photo "$USER_PHOTO" --scenario portrait --style "职业商务照" --non-interactive

# Couple photos with pose and background
python scripts/main.py generate --photos "$PHOTO1,$PHOTO2" --scenario couple --pose "手牵手面向镜头" --background "海滩日落" --non-interactive

# Family photos with template
python scripts/main.py generate --photos "$PHOTO1,$PHOTO2,$PHOTO3" --scenario family --template "温馨家庭聚会" --non-interactive

# Edit images (change clothing, material, background, style, enhance)
python scripts/main.py generate --photo "$USER_PHOTO" --scenario edit --template change-clothing --clothing "运动外套" --non-interactive

# Fuse images (outfit, person-scenery, brand, multi-person)
python scripts/main.py generate --photos "$PHOTO1,$PHOTO2" --scenario fusion --template outfit-fusion --non-interactive

# Create series (seasons, brand kit, character states, story sequence)
python scripts/main.py generate --photo "$USER_PHOTO" --scenario series --template seasons --count 4 --non-interactive

# Design poster (movie, event, product)
python scripts/main.py generate --photo "$USER_PHOTO" --scenario poster --template movie-poster --non-interactive

# Free mode with custom prompt
python scripts/main.py generate --photo "$USER_PHOTO" --scenario free --prompt "A futuristic cyberpunk portrait" --non-interactive

List Available Options

# List all scenarios
python scripts/main.py list-scenarios

# List styles for portrait/couple/family/celebrity
python scripts/main.py list-styles --scenario <scenario_id>

# List couple poses
python scripts/main.py list-poses

# List family templates
python scripts/main.py list-templates

# List backgrounds for couple/family
python scripts/main.py list-backgrounds --scenario <scenario_id>

# List characters
python scripts/main.py list-characters

Configuration and Utilities

# View configuration
python scripts/main.py config --show

# Update configuration
python scripts/main.py config --set generation.default_image_count=3

# Add custom character
python scripts/main.py add-character "Character Name" "Description" --scene "Scene"

# Clean temporary files
python scripts/main.py cleanup

Scenarios Overview

| Scenario | Photos Required | Key Options | |----------|----------------|-------------| | Celebrity | 1 | characters, count | | Portrait | 1 | style, count | | Couple | 2 | pose, background, count | | Family | 1-6 | template, background, count | | Edit | 1 | template (5 options), template-specific params | | Fusion | 1-6 | template (4 options), template-specific params | | Series | 1 | template (4 options), count (4/6/8/10) | | Poster | 1 | template (3 options), template-specific params | | Free | 1-14 | prompt, negative-prompt, count |

Environment Setup

# Install dependencies
pip install -r requirements.txt

# Set API key (required for operation)
# API key environment variable name: ARK_API_KEY
# API will return error if key is not properly configured

# Mock mode for testing without API (optional)
export MOCK_API=true

Configuration

Key settings in config.json:

generation.image_width / generation.image_height - Image dimensions (default: 2048)
generation.default_image_count - Default number of images (default: 5)
scenarios.default_scenario - Default scenario (default: celebrity)

File Structure

photo-studio-skill/
 ├── SKILL.md                    # This file
 ├── scripts/                    # Executable CLI tools
 │   └── main.py                # Main entry point
 ├── data/                       # Scenario templates and options
 ├── references/                 # Feature documentation
 │   ├── celebrity.md           # Celebrity photos with movie characters
 │   ├── portrait.md            # Professional personal portraits
 │   ├── couple.md               # Couple/friend portraits
 │   ├── family.md               # Family group photos
 │   ├── edit.md                # Image editing
 │   ├── fusion.md              # Multi-photo fusion
 │   ├── series.md              # Series creation
 │   ├── poster.md              # Poster design
 │   └── free.md                # Free mode with custom prompts
 ├── output/images/             # Generated images
 ├── temp/                      # Temporary files
 ├── logs/                      # Error logs
 ├── config.json                # Configuration settings
 ├── requirements.txt           # Python dependencies
 ├── AGENTS.md                  # Agent development guidelines
 └── README.md                  # Project documentation

References

Load these reference files when working with specific features:

Feature Modules:

references/celebrity.md - Celebrity photos with movie characters
references/portrait.md - Professional personal portraits with various styles
references/couple.md - Couple or friend portraits with poses and backgrounds
references/family.md - Family group photos with templates
references/edit.md - Image editing (clothing, material, background, style, enhancement)
references/fusion.md - Multi-photo fusion (outfit, person-scenery, brand, composite)
references/series.md - Series creation (seasons, brand kit, character states, story)
references/poster.md - Poster design (movie, event, product)
references/free.md - Free mode with custom prompts

Technical Notes

Image Generation

Model: Seedream 4.5 (doubao-seedream-4.5-251128)
Resolution: 2048x2048 (configurable)
Supports 1-14 reference photos
Uses image-to-image generation with user photos as reference
Processing time: ~10-20 seconds per image

Multi-Photo Scenarios

Couple and family scenarios use multi-reference image fusion
Person count controlled via prompt descriptions (not precise)

Mock Mode Benefits

No API costs
Fast testing (500ms instead of 10-20 seconds)
No network dependency
Consistent test results

Troubleshooting

Image generation fails:

Check internet connection
Verify API key is properly configured (see Environment Setup)
Ensure photos are clear and well-lit (≥1024×1024 recommended)
Check logs/ directory for detailed errors

Common issues:

Large photos require more processing time
API rate limits may apply
Person count in group photos is controlled via prompt (not precise)