🎬 X Post to Video Agent - v1.0.0
Create professional AI-generated videos with your own digital human avatar! Now supporting both exact-script generation (V1) and prompt-based Agent Generation (V2).
🎯 What You'll Build
- V1 (Standard): Generate videos with AI avatars speaking an exact, pre-written script.
- V2 (Video Agent): Pass a natural language prompt, and the built-in LLM writes the script and renders the video in one step!
- Support for multiple languages, portrait, and landscape formats.
📋 Prerequisites
- HeyGen Account (Creator plan or above)
  - Get API key from Settings → API
- Custom Avatar (optional)
  - Upload training video to create your digital twin
  - Or use HeyGen's stock avatars
🚀 Quick Start Setup
Step 1: Get Your API Key
HEYGEN_API_KEY="your_api_key_here"
Step 2: List Available Avatars
curl -X GET "https://api.heygen.com/v2/avatars" \
-H "X-Api-Key: $HEYGEN_API_KEY" | jq '.data.avatars[:5]'
Step 3: List Available Voices (Required for V1)
curl -X GET "https://api.heygen.com/v2/voices" \
-H "X-Api-Key: $HEYGEN_API_KEY" | jq '.data.voices[:5]'
📽️ Version 1: Standard Video Generation
Use Case: When you already have a perfectly written script (e.g., from cinematic-script-writer) and want the avatar to speak it verbatim.
curl -X POST "https://api.heygen.com/v2/video/generate" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "video_inputs": [{
      "character": {
        "type": "avatar",
        "avatar_id": "YOUR_AVATAR_ID",
        "avatar_style": "normal"
      },
      "voice": {
        "type": "text",
        "input_text": "Hello! This is my exact script.",
        "voice_id": "YOUR_VOICE_ID"
      }
    }],
    "dimension": {
      "width": 1280,
      "height": 720
    }
  }'
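Inline JSON in `-d '...'` breaks as soon as the script contains quotes or line breaks. A minimal sketch of one way around that, assuming `jq` is installed (it is already used in Steps 2 and 3); `build_v1_payload` is a hypothetical helper name, not part of the HeyGen API.

```shell
# Hypothetical helper: builds the V1 request body with jq so that quotes and
# newlines inside the script text are escaped correctly.
build_v1_payload() {
  local avatar_id="$1" voice_id="$2" script_text="$3"
  jq -n \
    --arg avatar_id "$avatar_id" \
    --arg voice_id "$voice_id" \
    --arg text "$script_text" \
    '{
      video_inputs: [{
        character: {type: "avatar", avatar_id: $avatar_id, avatar_style: "normal"},
        voice: {type: "text", input_text: $text, voice_id: $voice_id}
      }],
      dimension: {width: 1280, height: 720}
    }'
}

# Usage (not run automatically):
#   build_v1_payload "YOUR_AVATAR_ID" "YOUR_VOICE_ID" 'She said "hi".' |
#     curl -X POST "https://api.heygen.com/v2/video/generate" \
#       -H "X-Api-Key: $HEYGEN_API_KEY" \
#       -H "Content-Type: application/json" \
#       --data-binary @-
```

The payload fields mirror the request above; only the escaping step is new.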
🤖 Workflow Pipeline: X Post to Video Agent
Step 1: Topic & Count Input
The skill prompts the user to provide a specific topic and the number (N) of trending posts they want to explore.
Step 2: Search Trending X Posts
The agent uses its tools (e.g., Web Search or X API) to search for the top N most viral/trending X (Twitter) posts matching the requested topic. It presents the found posts to the user in a clear list.
Important: The list MUST include a short summary and a direct, clickable X/Twitter URL (e.g., https://x.com/user/status/ID) for every post, NOT just the raw post ID.
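One way to render such a list is sketched below. It assumes the search results arrive as tab-separated `handle`, `post_id`, `summary` rows, which is a hypothetical intermediate format rather than something this skill defines; `format_post_list` is likewise an illustrative name.

```shell
# Hypothetical helper: reads "handle<TAB>post_id<TAB>summary" rows on stdin
# and prints a numbered list with a summary and a direct URL for each post.
format_post_list() {
  local n=0 handle post_id summary tab
  tab="$(printf '\t')"
  while IFS="$tab" read -r handle post_id summary; do
    n=$((n + 1))
    printf '%d. %s\n   https://x.com/%s/status/%s\n' "$n" "$summary" "$handle" "$post_id"
  done
}

# Usage:
#   printf 'alice\t123\tFirst summary\n' | format_post_list
```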
Step 3: User Selection
The user reviews the list and selects one or more X posts from the results.
Step 4: Deep Systemic Thinking (Gemini - The 11 Dimensions + 12 Viral Iron Rules)
For each selected post, fetch the full text and use Gemini to analyze and radically reconstruct it using a combined framework of the "11 Content Insight Dimensions" and the "12 Iron Rules of 1M+ Viral Content".
Part A: The 11 Content Insight Dimensions (Structural Analysis)
- Core Viewpoint: Only one sharp, central argument.
- Sub-viewpoints: 2-3 angles that make the core viewpoint three-dimensional.
- Persuasion Strategy: Data-driven / Story-driven / Authority endorsement / Analogy & Metaphor / Social proof.
- Emotion Triggers: Resonance, urgency, curiosity, superiority, fear, or belonging.
- Golden Quotes: Memorable punchlines that can spread independently of context.
- Emotional Curve: Design rhythmic ups and downs like a movie.
- Emotional Levels: Surface info → Mid-level values → Deep identity alignment.
- Argument Diversity: Alternate between stories, data, analogies, and rhetorical questions.
- Perspective Shifting: Strategic switching between 1st, 2nd, and 3rd person.
- Language Style: Sentence length, colloquialism, rhetorical density, sense of rhythm.
- Interaction Hooks: Design elements that compel viewers to reply (questions, fill-in-the-blanks, controversy, voting).
Part B: The 12 Viral Iron Rules (Viral Amplification)
- Trend Hijacking: Tie the core message to immediate, massive trends.
- Unique Angle: Identify the 3 most common ways people talk about this, discard them, and strictly use the 4th, most unexpected angle.
- 0.5s Curiosity Hook: The opening must instantly hook the viewer with a paradox, hidden secret, or suspense.
- Brutal Clarity & Polarized Emotion: Abandon neutrality. Be fiercely opinionated, sharp, and emotionally distinct.
- Subvert Common Sense: Present a shocking claim, but strictly back it up with logic/data.
- Visceral Resonance: Make the target audience feel, "This is exactly about my life/struggles!"
- Value & "Chicken Soup": End with a philosophical or empowering takeaway that gives the viewer a sense of elevation.
- Hyper-Relatability: Relentlessly use "You", "I", and "We". Force the macro-topic into the viewer's personal reality.
- Hardcore Utility: Embed dense, actionable "dry goods" (tools, numbers, step-by-step methods).
- Entertaining Delivery: Use self-deprecation, modern slang, sharp metaphors, and zero corporate jargon.
- FOMO Drive: Inject a strong "If you skip this, you will lose money/opportunities" psychological push.
- High-Status Sharing: Make the content so insightful that viewers feel smart for sharing it.
(Gemini must deeply restructure the raw text through both frameworks before proceeding to scriptwriting).
Step 5: Generate Structured Script (ai-video-script)
Feed the deeply refined content from Step 4 into the ai-video-script skill framework. Generate a comprehensive, professional video script.
- Length: The video length should adapt to the depth and length of the original X post, ideally between 1.5 and 4 minutes (WeChat Video / YouTube Shorts format).
- Structure: It must include a strong hook, scene descriptions, and highly engaging Mandarin voiceover.
- Ending CTA: At the end of the video, add a scene instruction to visually display "任何问题微号:crg368711" on the screen only. Do NOT include this phrase in the spoken voiceover script.
Step 6: Automate HeyGen Video Generation
Extract the spoken text and scene instructions from the generated script, and pass them as the prompt to the HeyGen Video Agent API (/v1/video_agent/generate).
Optimized Prompt for HeyGen Video Agent:
When sending the final payload to HeyGen, format the prompt field like this:
Act as an energetic AI tech influencer. Read the following script exactly, matching the excited tone.
[Insert the generated scene instructions and voiceover from ai-video-script here]
(Ensure "任何问题微号:crg368711" is placed ONLY as a [Visual] instruction at the end, and omitted from the spoken text).
(Note: Do NOT include QR code instructions in the video generation.)
curl -X POST "https://api.heygen.com/v1/video_agent/generate" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "<THE_FORMATTED_PROMPT_ABOVE>",
    "avatar_id": "YOUR_AVATAR_ID",
    "orientation": "landscape"
  }'
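The prompt assembly and the "visual only" rule for the CTA can be checked before anything is sent. The sketch below assumes each `[Scene: ...]` or `[Visual: ...]` instruction sits on its own line of the generated script; `build_agent_prompt` and `cta_is_visual_only` are hypothetical helper names.

```shell
# On-screen CTA text from Step 5; it must never appear in spoken lines.
CTA_TEXT='任何问题微号:crg368711'

# Prepends the persona instruction to the generated script.
build_agent_prompt() {
  printf '%s\n\n%s\n' \
    'Act as an energetic AI tech influencer. Read the following script exactly, matching the excited tone.' \
    "$1"
}

# Returns 0 only if every line mentioning the CTA is a whole-line
# [Visual: ...] instruction (assumed layout: one instruction per line).
cta_is_visual_only() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *"$CTA_TEXT"*)
        case "$line" in
          '[Visual:'*']') ;;   # CTA inside a visual instruction: fine
          *) return 1 ;;       # CTA leaked into spoken text
        esac ;;
    esac
  done <<EOF
$1
EOF
  return 0
}

# Usage (not run automatically; jq handles the JSON escaping):
#   cta_is_visual_only "$script" || { echo "CTA found in spoken text" >&2; exit 1; }
#   jq -n --arg prompt "$(build_agent_prompt "$script")" --arg avatar_id "YOUR_AVATAR_ID" \
#     '{prompt: $prompt, avatar_id: $avatar_id, orientation: "landscape"}' |
#     curl -X POST "https://api.heygen.com/v1/video_agent/generate" \
#       -H "X-Api-Key: $HEYGEN_API_KEY" -H "Content-Type: application/json" \
#       --data-binary @-
```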
Step 7: Background Polling & Notification
Execute the wait_for_video.sh script to poll the status in the background and notify the user on Telegram with the HeyGen dashboard link when finished.
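The contents of wait_for_video.sh are not shown here. As a rough sketch, its notification half could look like the following, using the Telegram Bot API `sendMessage` method. `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` are assumed environment variable names, and both function names are hypothetical.

```shell
# Builds the notification text with the HeyGen dashboard link.
build_done_message() {
  printf 'HeyGen video finished: https://app.heygen.com/video/%s' "$1"
}

# Sends the message through the Telegram Bot API (sendMessage).
# TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are assumed to be set by the caller.
notify_telegram() {
  curl -sS -X POST "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" \
    --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=$(build_done_message "$1")"
}

# Usage (not run automatically):
#   notify_telegram "$VIDEO_ID"
```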
Step 8: Automated X (Twitter) Repost / Distribution
Once the background polling script notifies the Agent/User that video generation is complete, the Agent must automatically create a repost (quote tweet or reply) to the original X post.
- Content Rules for Repost:
  - Use ONLY the spoken video text (strip out all [Scene: ...] and [Visual: ...] bracketed instructions).
  - Include the direct HeyGen video link: https://app.heygen.com/video/[VIDEO_ID]
- Execution: The Agent will use the X API (requires Write/User-Context Authentication) to publish this tweet automatically, completing the content distribution loop.
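The content rules above can be sketched with `sed`. This assumes the bracketed instructions never contain a nested `]`; `strip_instructions` and `build_repost_text` are hypothetical helper names.

```shell
# Removes [Scene: ...] and [Visual: ...] instructions, trims each line,
# and drops lines that end up empty.
strip_instructions() {
  sed -E 's/\[(Scene|Visual):[^]]*\]//g; s/^[[:space:]]+//; s/[[:space:]]+$//' | sed '/^$/d'
}

# Spoken text followed by the direct HeyGen video link.
build_repost_text() {
  local script_body="$1" video_id="$2"
  printf '%s\n%s\n' \
    "$(printf '%s\n' "$script_body" | strip_instructions)" \
    "https://app.heygen.com/video/${video_id}"
}

# Usage (not run automatically):
#   build_repost_text "$script" "$VIDEO_ID"
```

Publishing the result is left out of the sketch. One option is the X API v2 `POST /2/tweets` endpoint, with the text in `text` and the original post's ID in `quote_tweet_id`.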
(Optional Fields for V2: orientation can be "landscape" or "portrait")
Check Video Status (Works for both V1 & V2)
Both endpoints return a video_id. Poll the status endpoint below with that ID to get the final MP4 download link.
VIDEO_ID="your_video_id"
curl -X GET "https://api.heygen.com/v1/video_status.get?video_id=$VIDEO_ID" \
-H "X-Api-Key: $HEYGEN_API_KEY"
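A single status call rarely suffices, so a polling loop is the usual pattern. The sketch below assumes the response carries `.data.status` (with `completed` and `failed` as terminal values) and `.data.video_url`; treat those field names as assumptions and check them against an actual response. The function names, interval, and retry count are illustrative.

```shell
# Fetches the raw status JSON for one video.
fetch_status_json() {
  curl -sS -X GET "https://api.heygen.com/v1/video_status.get?video_id=$1" \
    -H "X-Api-Key: $HEYGEN_API_KEY"
}

# Polls until status is "completed" (prints the video URL, returns 0) or
# "failed" (returns 1); gives up after max_tries attempts (returns 2).
# Field names .data.status and .data.video_url are assumed, not confirmed here.
poll_video() {
  local video_id="$1" interval="${2:-15}" max_tries="${3:-80}"
  local try=0 body status
  while [ "$try" -lt "$max_tries" ]; do
    try=$((try + 1))
    if body="$(fetch_status_json "$video_id")"; then
      status="$(printf '%s' "$body" | jq -r '.data.status // empty')"
      case "$status" in
        completed) printf '%s' "$body" | jq -r '.data.video_url'; return 0 ;;
        failed)    return 1 ;;
      esac
    fi
    sleep "$interval"
  done
  return 2
}

# Usage (not run automatically):
#   url="$(poll_video "$VIDEO_ID")" && echo "Download: $url"
```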
📐 Video Dimensions
| Format | V1 (dimension object) | V2 (orientation string) | Use Case |
|--------|-----------------------|-------------------------|----------|
| Landscape | width: 1280, height: 720 | "landscape" | YouTube, Website |
| Portrait | width: 720, height: 1280 | "portrait" | TikTok, Reels, Shorts |
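When one script drives both versions, the mapping in the table can be kept in one place. A small sketch, with `orientation_to_dimension` as a hypothetical helper name:

```shell
# Maps a V2 orientation string to the matching V1 dimension object.
orientation_to_dimension() {
  case "$1" in
    landscape) printf '{"width": 1280, "height": 720}\n' ;;
    portrait)  printf '{"width": 720, "height": 1280}\n' ;;
    *) printf 'unknown orientation: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Usage:
#   orientation_to_dimension portrait
```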
💰 Cost Estimate
| Plan | Price | Credits |
|------|-------|---------|
| Creator | $29/month | 15 min/month |
| Business | $89/month | 30 min/month |
| Per-minute overage | ~$1-2/min | - |
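As a worked example of the table, a rough monthly estimate is the plan price plus overage minutes times the overage rate. The sketch assumes whole minutes and linear overage billing at $1 to $2 per minute, which the table gives only approximately; `estimate_monthly_cost` is a hypothetical helper.

```shell
# Prints a low-high monthly estimate in USD for a plan and minutes used.
# Assumes overage is billed per whole minute at $1 (low) to $2 (high).
estimate_monthly_cost() {
  local plan="$1" minutes="$2" price included extra=0
  case "$plan" in
    creator)  price=29; included=15 ;;
    business) price=89; included=30 ;;
    *) printf 'unknown plan: %s\n' "$plan" >&2; return 1 ;;
  esac
  if [ "$minutes" -gt "$included" ]; then
    extra=$((minutes - included))
  fi
  printf '%d-%d\n' "$((price + extra))" "$((price + 2 * extra))"
}

# Usage:
#   estimate_monthly_cost creator 20   # 5 overage minutes: prints 34-39
```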
Made with 🦞 by Littl3Lobst3r · Updated by OpenClaw