Category: Other (API key required)

Stella Selfie

Generate persona-consistent selfie images and send them to any OpenClaw channel. Supports the Gemini, fal, and laozhang.ai providers and multi-reference avatar blending.

Author: tower1229 (clawhub)

Generate persona-consistent selfie images using Google Gemini, fal (xAI Grok Imagine), or laozhang.ai and send them to messaging channels via OpenClaw. Supports multi-reference avatar blending for strong character consistency.

When to Use

  • User says "send a pic", "send me a photo", "send a selfie", "发张照片", "发自拍"
  • User says "show me what you look like...", "send a pic of you...", "展示你在..."
  • User describes a scene: "send a pic wearing...", "send a pic at...", "穿着...发张图"
  • User wants the agent to appear in a specific outfit, location, or situation

Prompt Modes

Mode 1: Mirror Selfie (default)

Best for: outfit showcases, full-body shots, fashion content

A mirror selfie of this person, [user's context], showing full body reflection.

Mode 2: Direct Selfie

Best for: close-up portraits, location shots, emotional expressions

A selfie of this person, [user's context], looking into the lens.

Mode 3: Third-Person Photo

Best for: non-selfie viewpoints, including explicit third-person requests and scenes that should not read as a selfie

A natural third-person photo of this person, [user's context], natural composition, not a selfie.

Mode Selection Logic

| Signal | Auto-Select Mode |
| ------ | ---------------- |
| Strong user keywords: outfit, wearing, clothes, dress, suit, fashion | mirror |
| Strong user keywords: full-body, mirror, reflection, pose, show the look | mirror |
| Strong user keywords: selfie, close-up, portrait, face, eyes, smile, looking into the lens | direct |
| Strong user keywords: third-person, not a selfie, candid shot, 他拍, 路拍, 抓拍 | third_person |
| Legacy keywords: travel photo, tourist photo, 旅拍, 打卡照, 风景合影 | third_person |

Default policy:

  • Interpret explicit user requirements first: camera style, outfit emphasis, body framing, scene, pose, and expression.
  • Use mirror by default for outfit / full-body / self-presentation requests, even if the user did not explicitly mention a mirror.
  • Use direct by default for selfie requests focused on face, emotion, immediacy, or in-the-moment presence.
  • Use third_person only when the user explicitly asks for a non-selfie style or clearly describes a shot that should not read as a selfie.

Default mode when no keywords match and timeline is unavailable: mirror
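
The keyword table and default policy above can be sketched as a small lookup. This is only an illustration, not Stella's actual implementation: the function name selectMode, the plain substring matching, and the order of checks (third_person first, then mirror, then direct) are all assumptions. Checking third_person first keeps "not a selfie" from matching the direct keyword "selfie".

```typescript
type Mode = "mirror" | "direct" | "third_person";

// Keyword lists copied from the signal table; the check order below is an assumption.
const THIRD_PERSON_KEYWORDS: string[] = [
  "third-person", "not a selfie", "candid shot", "他拍", "路拍", "抓拍",
  "travel photo", "tourist photo", "旅拍", "打卡照", "风景合影",
];
const MIRROR_KEYWORDS: string[] = [
  "outfit", "wearing", "clothes", "dress", "suit", "fashion",
  "full-body", "mirror", "reflection", "pose", "show the look",
];
const DIRECT_KEYWORDS: string[] = [
  "selfie", "close-up", "portrait", "face", "eyes", "smile",
  "looking into the lens",
];

// Naive case-insensitive substring match (hypothetical helper).
function containsAny(text: string, keywords: string[]): boolean {
  const lower = text.toLowerCase();
  return keywords.some((k) => lower.indexOf(k) !== -1);
}

function selectMode(message: string): Mode {
  if (containsAny(message, THIRD_PERSON_KEYWORDS)) return "third_person";
  if (containsAny(message, MIRROR_KEYWORDS)) return "mirror";
  if (containsAny(message, DIRECT_KEYWORDS)) return "direct";
  return "mirror"; // documented default when no keyword matches
}
```

Substring matching is deliberately simple and will produce false positives (for example "dress" inside "address"), so a real agent should weigh the user's overall intent as the default policy describes.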

Resolution Keywords

| User says | Resolution |
| ----------------------------------- | ---------- |
| (default) | 1K |
| 2k, 2048, medium res, 中等分辨率 | 2K |
| 4k, high res, ultra, 超清, 高分辨率 | 4K |
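
As a rough sketch of the resolution table, the mapping could look like the following. The function name parseResolution is hypothetical, and checking the 4K keywords before the 2K keywords when both appear is an assumption the table does not settle.

```typescript
type Resolution = "1K" | "2K" | "4K";

// Keywords copied from the resolution table.
const RES_2K_KEYWORDS: string[] = ["2k", "2048", "medium res", "中等分辨率"];
const RES_4K_KEYWORDS: string[] = ["4k", "high res", "ultra", "超清", "高分辨率"];

function parseResolution(message: string): Resolution {
  const lower = message.toLowerCase();
  const has = (keys: string[]): boolean =>
    keys.some((k) => lower.indexOf(k) !== -1);
  if (has(RES_4K_KEYWORDS)) return "4K"; // assumption: 4K wins if both match
  if (has(RES_2K_KEYWORDS)) return "2K";
  return "1K"; // documented default
}
```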

Step-by-Step Instructions

Step 1: Collect User Input

Determine from the user's message:

  • Explicit context (optional): scene, outfit, location, activity — detect from keywords
  • Mode (optional): mirror, direct, or third_person — auto-detect from explicit user intent if not specified
  • Target channel: Where to send (e.g., #general, @username, channel ID)
  • Channel provider (optional): Which platform (discord, telegram, whatsapp, slack)
  • Resolution (optional): 1K / 2K / 4K — default 1K
  • Count (optional): How many images — default 1, only increase if explicitly requested
  • Has explicit scene?: Does the request contain any specific scene/outfit/location/activity keywords?

Step 2: Enrich with Timeline Context Or Recent Scene Recall

timeline_resolve is an optional enhancement, not a prerequisite.

  • If timeline_resolve is unavailable in the current environment, skip this step and proceed with Stella's default behavior.
  • If the request is a current-state Sparse prompt — for example "发张自拍", "发张照片", "想看看你", "send a selfie", "send a photo", "show me what you look like" — and timeline_resolve is available, load and follow references/timeline-integration.md.
  • If the current request clearly refers back to a single recently resolved timeline scene in the current conversation, load and follow references/timeline-integration.md even if the photo request itself is not Sparse.
  • If the user already provided a clear standalone scene, outfit, location, activity, or camera requirement and it is not a callback to a recently resolved timeline scene, do not use timeline enhancement. Follow the default policy directly.
  • When you do call timeline_resolve, do not freely rewrite the request into output-slot questions. Use the fixed query rules in references/timeline-integration.md.
  • Only enable Nano Banana real-world grounding when the prompt can explicitly include a concrete city plus an exact local date/time anchor from timeline data. If those anchors are missing, do not claim real-world synchronization.
  • If the timeline response has fact.status === "empty" or lacks result.consumption, or if any error occurs, immediately fall back to Step 3 without mentioning the timeline failure to the user.

Never block image generation on timeline availability. Timeline enrichment is best-effort and should only be used for current-state Sparse prompts or explicit callbacks to a recently resolved timeline scene.
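
The gating rules in this step reduce to two checks: whether to call timeline_resolve at all, and whether its response is usable. The sketch below is illustrative only. The response shape is an assumption built from the two field names mentioned above (fact.status and result.consumption), and the names TimelineSignals, shouldUseTimeline, and isUsableTimeline are hypothetical.

```typescript
// Hypothetical shape: only fact.status and result.consumption are named in the text.
interface TimelineResponse {
  fact?: { status?: string };
  result?: { consumption?: unknown };
}

interface TimelineSignals {
  toolAvailable: boolean;    // timeline_resolve exists in this environment
  isSparse: boolean;         // current-state Sparse prompt such as "send a selfie"
  isRecentCallback: boolean; // refers back to one recently resolved timeline scene
}

// Timeline is used only for Sparse prompts or explicit callbacks, and only if available.
function shouldUseTimeline(s: TimelineSignals): boolean {
  if (!s.toolAvailable) return false;
  return s.isSparse || s.isRecentCallback;
}

// Pass null when the call threw; every unusable case means "fall back to Step 3".
function isUsableTimeline(response: TimelineResponse | null): boolean {
  if (response === null) return false;
  if (response.fact !== undefined && response.fact.status === "empty") return false;
  if (response.result === undefined) return false;
  return response.result.consumption !== undefined;
}
```

Both functions return a plain boolean so that the caller can fall through to prompt assembly silently, which matches the rule that timeline failures are never surfaced to the user.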

Step 3: Assemble Prompt

Select mode from the default policy first.

If the request is Sparse, and you loaded references/timeline-integration.md and obtained usable timeline context, apply its Sparse-only merge and prompt rules.

When that timeline enrichment includes outdoor real-world grounding, keep the grounding clause as a separate strong instruction sentence rather than a soft atmosphere phrase like "Make it feel like...".

Otherwise, use the user's explicit context directly and keep Stella's original fallback behavior:

[mirror]  A mirror selfie of this person, [user's explicit context if any], showing full body reflection.
[direct]  A selfie of this person, [user's explicit context if any], looking into the lens.
[third_person] A natural third-person photo of this person, [user's explicit context if any], natural composition, not a selfie.
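
A minimal sketch of the fallback templates above, assuming that the context clause is simply dropped when the user gave no explicit context. The function name assemblePrompt and the head/tail split are hypothetical; the template wording is copied from the three lines above.

```typescript
type PromptMode = "mirror" | "direct" | "third_person";

// [opening clause, closing clause] for each mode, taken from the fallback templates.
const TEMPLATES: Record<PromptMode, [string, string]> = {
  mirror: ["A mirror selfie of this person", "showing full body reflection."],
  direct: ["A selfie of this person", "looking into the lens."],
  third_person: [
    "A natural third-person photo of this person",
    "natural composition, not a selfie.",
  ],
};

function assemblePrompt(mode: PromptMode, context: string): string {
  const [head, tail] = TEMPLATES[mode];
  const trimmed = context.trim();
  // Assumption: with no explicit context, the middle clause is omitted entirely.
  const parts = trimmed.length > 0 ? [head, trimmed, tail] : [head, tail];
  return parts.join(", ");
}
```

For example, assemblePrompt("direct", "at the beach") yields "A selfie of this person, at the beach, looking into the lens."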

Step 4: Generate Image

Run the Stella script:

node {baseDir}/dist/scripts/skill.js \
  --prompt "<ASSEMBLED_PROMPT>" \
  --target "<TARGET_CHANNEL>" \
  --channel "<CHANNEL_PROVIDER>" \
  --caption "<CAPTION_TEXT>" \
  --resolution "<1K|2K|4K>" \
  --count <NUMBER>
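
When the command is launched from code rather than typed into a shell, building the arguments as an array avoids quoting problems with prompts and captions. The sketch below uses only the flags shown above. The names SkillOptions and buildSkillArgs are hypothetical, and omitting --channel and --caption when they are not set is an assumption about how the script treats optional inputs.

```typescript
interface SkillOptions {
  prompt: string;
  target: string;
  channel?: string; // optional per Step 1
  caption?: string;
  resolution?: "1K" | "2K" | "4K";
  count?: number;
}

// Returns the argument list to pass to `node`, starting with the script path.
function buildSkillArgs(baseDir: string, o: SkillOptions): string[] {
  const args: string[] = [
    baseDir + "/dist/scripts/skill.js",
    "--prompt", o.prompt,
    "--target", o.target,
  ];
  // Assumption: optional flags can be left out when no value was collected.
  if (o.channel !== undefined) args.push("--channel", o.channel);
  if (o.caption !== undefined) args.push("--caption", o.caption);
  args.push("--resolution", o.resolution !== undefined ? o.resolution : "1K");
  args.push("--count", String(o.count !== undefined ? o.count : 1));
  return args;
}
```

The resulting array can be handed to a process launcher that takes an argument list, such as Node's child_process.execFile with "node" as the executable.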

Step 5: Confirm Result

After the script completes, confirm to the user:

  • Image was generated successfully
  • Image was sent to the target channel
  • If any error occurred, send a concise actionable failure message

Environment Variables

Stella supports multiple providers and a gateway-backed send path, so its sensitive runtime environment variables are explicitly declared in metadata.openclaw.requires.env for OpenClaw's env-injection allowlist. The skill also sets metadata.openclaw.always: true, so these declarations do not become hard load-time gates. Actual credential validation remains runtime-driven inside skill.js, based on the selected provider.

| Variable | Required | Description |
| -------------------- | ------------------------------- | ------------------------------------------------------- |
| GEMINI_API_KEY | Required (if Provider=gemini) | Google Gemini API key |
| FAL_KEY | Required (if Provider=fal) | fal.ai API key |
| LAOZHANG_API_KEY | Required (if Provider=laozhang) | laozhang.ai API key (sk-xxx); get it at api.laozhang.ai |
| Provider | Optional | Image provider: gemini, fal, or laozhang |
| AvatarBlendEnabled | Optional | Enable or disable multi-reference avatar blending |
| AvatarMaxRefs | Optional | Maximum number of reference images to blend |

Credential requirements are provider-specific:

  • Default Provider=gemini: requires GEMINI_API_KEY
  • Provider=fal: requires FAL_KEY
  • Provider=laozhang: requires LAOZHANG_API_KEY
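
The provider-specific requirement above amounts to a lookup from provider to key name. The sketch below is not the validation code inside skill.js; the names resolveProvider and missingCredential are hypothetical, and treating an unset or unrecognized Provider value as gemini is an assumption based on the documented default.

```typescript
type Provider = "gemini" | "fal" | "laozhang";

// Key names copied from the environment variable table.
const KEY_FOR_PROVIDER: Record<Provider, string> = {
  gemini: "GEMINI_API_KEY",
  fal: "FAL_KEY",
  laozhang: "LAOZHANG_API_KEY",
};

// Assumption: anything other than "fal" or "laozhang" falls back to the default provider.
function resolveProvider(raw: string | undefined): Provider {
  if (raw === "fal") return "fal";
  if (raw === "laozhang") return "laozhang";
  return "gemini";
}

// Returns the name of the missing key, or null when the selected provider is configured.
function missingCredential(env: Record<string, string | undefined>): string | null {
  const keyName = KEY_FOR_PROVIDER[resolveProvider(env["Provider"])];
  const value = env[keyName];
  return value !== undefined && value.length > 0 ? null : keyName;
}
```

Returning the key name makes it easy to produce the concise, actionable failure message that Step 5 asks for.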

Media File Handling (Gemini)

When Provider=gemini, Stella writes generated files to:

  • ~/.openclaw/workspace/stella-selfie/

After successful send, Stella deletes the local file immediately. If send fails, the file is kept for debugging.
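
The delete-on-success, keep-on-failure rule can be sketched as follows. The send and removeFile parameters are hypothetical callbacks standing in for the real delivery and filesystem calls, and the synchronous style is chosen only to keep the example short.

```typescript
// send and removeFile are hypothetical stand-ins for the real delivery and filesystem calls.
function deliverAndCleanup(
  filePath: string,
  send: (path: string) => void,
  removeFile: (path: string) => void,
): boolean {
  try {
    send(filePath);
  } catch (sendError) {
    // Send failed: leave the file in place for debugging.
    return false;
  }
  // Send succeeded: delete the local file immediately.
  removeFile(filePath);
  return true;
}
```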

Skill Environment Options

Configure in your OpenClaw openclaw.json under skills.entries.stella-selfie.env:

| Option | Default | Description |
| -------------------- | -------- | ---------------------------------------------- |
| Provider | gemini | Image provider: gemini, fal, or laozhang |
| AvatarBlendEnabled | true | Enable multi-reference avatar blending |
| AvatarMaxRefs | 3 | Maximum number of reference images to blend |
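
To illustrate how the defaults in this table apply, here is a sketch that turns the values set under skills.entries.stella-selfie.env into typed options. It assumes the values arrive as strings, as environment variables normally do. The names StellaOptions and resolveOptions are hypothetical, as is the rule that any value other than "false" keeps blending enabled.

```typescript
interface StellaOptions {
  provider: "gemini" | "fal" | "laozhang";
  avatarBlendEnabled: boolean;
  avatarMaxRefs: number;
}

function resolveOptions(env: Record<string, string | undefined>): StellaOptions {
  // Provider: default gemini.
  const rawProvider = env["Provider"];
  let provider: StellaOptions["provider"] = "gemini";
  if (rawProvider === "fal") provider = "fal";
  if (rawProvider === "laozhang") provider = "laozhang";

  // AvatarBlendEnabled: default true; assumption: only the string "false" disables it.
  const rawBlend = env["AvatarBlendEnabled"];
  const avatarBlendEnabled =
    rawBlend === undefined ? true : rawBlend.toLowerCase() !== "false";

  // AvatarMaxRefs: default 3; assumption: invalid or non-positive values fall back to 3.
  const rawMax = env["AvatarMaxRefs"];
  const parsed = rawMax === undefined ? NaN : parseInt(rawMax, 10);
  const avatarMaxRefs = isNaN(parsed) || parsed < 1 ? 3 : parsed;

  return { provider, avatarBlendEnabled, avatarMaxRefs };
}
```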

Note for Provider=fal users: fal's image editing API only accepts HTTP/HTTPS image URLs. Local file paths (from Avatar / AvatarsDir) are not supported. Configure AvatarsURLs in IDENTITY.md with public URLs of your reference images to enable image editing with fal.

Note for Provider=laozhang users: laozhang.ai uses the Google-native Gemini API format (gemini-3-pro-image-preview). It requires local reference images from Avatar / AvatarsDir and does not use AvatarsURLs. Supports 1K/2K/4K resolution and 10 aspect ratios. Get your API key at api.laozhang.ai — remember to configure a billing mode in the token settings before use.
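
The two notes above mean that the source of reference images depends on the provider. A minimal sketch of that choice, with hypothetical names IdentityRefs and referencesFor, and with Gemini grouped with laozhang as a local-file provider per the data flow described in this document:

```typescript
interface IdentityRefs {
  localPaths: string[]; // resolved from Avatar / AvatarsDir
  urls: string[];       // parsed from AvatarsURLs
}

function referencesFor(
  provider: "gemini" | "fal" | "laozhang",
  refs: IdentityRefs,
): string[] {
  if (provider === "fal") {
    // fal accepts only HTTP/HTTPS image URLs; local paths are dropped.
    return refs.urls.filter((u) => /^https?:\/\//i.test(u));
  }
  // gemini and laozhang use local reference images and ignore AvatarsURLs.
  return refs.localPaths.slice();
}
```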

Delivery Path

  • Stella sends via openclaw message send.
  • Delivery auth and routing are handled by the local OpenClaw installation, not by skill-level gateway tokens.

External Endpoints And Data Flow

| Endpoint / path | When used | Data sent |
| ----------------------------------- | ------------------- | -------------------------------------------------------------------------------- |
| Google Gemini API | Provider=gemini | Prompt text and selected local reference images from Avatar / AvatarsDir |
| fal API | Provider=fal | Prompt text and public reference image URLs from AvatarsURLs |
| laozhang.ai API (api.laozhang.ai) | Provider=laozhang | Prompt text and local reference images (Avatar / AvatarsDir, uploaded as base64) |
| Local OpenClaw CLI | Always for delivery | Target channel, target id, caption text, and generated media path/URL |

Security And Privacy

  • Stella reads ~/.openclaw/workspace/IDENTITY.md and local avatar files to build reference context.
  • Under Provider=gemini, selected local avatar images are uploaded to Gemini as part of normal image generation.
  • Under Provider=fal, only public http/https avatar URLs are sent; local avatar files are not uploaded to fal directly.
  • Under Provider=laozhang, local avatar files from Avatar / AvatarsDir are base64-encoded and uploaded to laozhang.ai.
  • Generated files (Gemini and laozhang) are written to ~/.openclaw/workspace/stella-selfie/ and deleted after successful send.

User Configuration

Before using this skill, you must configure your OpenClaw workspace. See templates/SOUL.fragment.md for the recommended capability snippet to add to your SOUL.md.

Required: IDENTITY.md

Add the following fields to ~/.openclaw/workspace/IDENTITY.md:

Avatar: ./assets/avatar-main.png
AvatarsDir: ./avatars
AvatarsURLs: https://cdn.example.com/ref1.jpg, https://cdn.example.com/ref2.jpg

  • Avatar: Path to your primary reference image (relative to workspace root)
  • AvatarsDir: Directory containing multiple reference photos of the same character (different styles, scenes, outfits)
  • AvatarsURLs: Comma-separated public URLs of reference images — required for Provider=fal (local files are not supported by fal's API)
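
As an illustration of how these three fields might be read, the sketch below parses plain "Key: value" lines like the example above. It assumes the fields appear exactly in that form, which may not cover every IDENTITY.md layout. The names IdentityFields and parseIdentity are hypothetical. Splitting on the first colon only keeps the "https://" in each URL intact.

```typescript
interface IdentityFields {
  avatar: string | null;
  avatarsDir: string | null;
  avatarsURLs: string[];
}

function parseIdentity(text: string): IdentityFields {
  const out: IdentityFields = { avatar: null, avatarsDir: null, avatarsURLs: [] };
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf(":"); // first colon separates key from value
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    if (key === "Avatar") {
      out.avatar = value;
    } else if (key === "AvatarsDir") {
      out.avatarsDir = value;
    } else if (key === "AvatarsURLs") {
      // Comma-separated list of public URLs.
      out.avatarsURLs = value
        .split(",")
        .map((u) => u.trim())
        .filter((u) => u.length > 0);
    }
  }
  return out;
}
```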

Required: avatars/ Directory

Place your reference photos in ~/.openclaw/workspace/avatars/:

  • Use jpg, jpeg, png, or webp format
  • All photos should be of the same character
  • Different styles, scenes, outfits, and expressions work best
  • Images are selected by creation time (newest first)
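
The selection rules above (allowed formats, newest first) combined with the AvatarMaxRefs cap could be sketched like this. The AvatarFile shape and the function name selectReferences are hypothetical, and matching extensions case-insensitively is an assumption.

```typescript
interface AvatarFile {
  path: string;
  createdAtMs: number; // creation time in epoch milliseconds
}

const ALLOWED_EXTENSIONS: string[] = ["jpg", "jpeg", "png", "webp"];

function selectReferences(files: AvatarFile[], maxRefs: number): string[] {
  return files
    .filter((f) => {
      const dot = f.path.lastIndexOf(".");
      if (dot === -1) return false;
      // Assumption: extension matching ignores case.
      const ext = f.path.slice(dot + 1).toLowerCase();
      return ALLOWED_EXTENSIONS.indexOf(ext) !== -1;
    })
    .sort((a, b) => b.createdAtMs - a.createdAtMs) // newest first
    .slice(0, maxRefs)
    .map((f) => f.path);
}
```

Because filter returns a new array, the sort does not reorder the caller's input list.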

Required: SOUL.md

Add the Stella capability block to ~/.openclaw/workspace/SOUL.md. See README.md ("4. SOUL.md") for the copy/paste snippet.

Installation

clawhub install stella-selfie

After installation, complete the configuration steps above before using the skill.