Category: Data & Analytics (requires API key)

youtube copy of yt

Fetches YouTube video transcripts via the APIFY API, using residential proxies to get past bot detection. Supports text and JSON output formats.

Author: inaorhubclawhub

youtube-apify-transcript

Fetch YouTube transcripts via APIFY API (works from cloud IPs, bypasses YouTube bot detection).

Why APIFY?

YouTube blocks transcript requests from cloud IPs (AWS, GCP, etc.). APIFY runs the request through residential proxies, bypassing bot detection reliably.

Free Tier

  • $5/month free credits (~714 videos)
  • No credit card required
  • Perfect for personal use

Cost

  • $0.007 per video (less than 1 cent!)
  • Track usage at: https://console.apify.com/billing
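
The free-tier figure follows directly from the per-video price. A minimal sketch of that arithmetic, assuming the credits are spent only on transcript fetches at the quoted rate (the function names are illustrative, not part of the skill):

```python
PRICE_PER_VIDEO_USD = 0.007  # per-video price quoted above
FREE_CREDITS_USD = 5.00      # monthly free credits quoted above


def videos_covered(credits_usd, price_usd=PRICE_PER_VIDEO_USD):
    """Return how many whole videos the given credits pay for."""
    # Work in tenths of a cent so the division is exact integer math.
    return round(credits_usd * 1000) // round(price_usd * 1000)


def overage_usd(videos, credits_usd=FREE_CREDITS_USD, price_usd=PRICE_PER_VIDEO_USD):
    """Return the amount owed beyond the free credits, in USD."""
    cost_mills = videos * round(price_usd * 1000)
    credit_mills = round(credits_usd * 1000)
    return max(0, cost_mills - credit_mills) / 1000


print(videos_covered(FREE_CREDITS_USD))  # 714
print(overage_usd(1000))                 # 2.0
```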

Setup

  1. Create a free APIFY account: https://apify.com/
  2. Get your API token: https://console.apify.com/account/integrations
  3. Set the environment variable:
# Add to ~/.bashrc or ~/.zshrc
export APIFY_API_TOKEN="apify_api_YOUR_TOKEN_HERE"

# Or use .env file (never commit this!)
echo 'APIFY_API_TOKEN=apify_api_YOUR_TOKEN_HERE' >> .env
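
Before running anything, it can help to confirm that the token is actually visible to Python. The sketch below is one way to do that; `get_apify_token` is a hypothetical helper, not a function the skill is known to provide, and it reads only the process environment (it does not parse a `.env` file).

```python
import os


def get_apify_token(env=None):
    """Return the APIFY token from the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get("APIFY_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "APIFY_API_TOKEN is not set; export it in your shell before running the script"
        )
    return token


if __name__ == "__main__":
    try:
        get_apify_token()
        print("APIFY_API_TOKEN found")
    except RuntimeError as exc:
        print(exc)
```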

Usage

Basic Usage

# Get transcript as text
python3 scripts/fetch_transcript.py "https://www.youtube.com/watch?v=VIDEO_ID"

# Short URL also works
python3 scripts/fetch_transcript.py "https://youtu.be/VIDEO_ID"
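
Both URL forms carry the same video ID. How `fetch_transcript.py` extracts it is not shown here, so the following is only a sketch of one plausible approach using the standard library; `extract_video_id` is an illustrative name and the accepted host list is an assumption.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch URL or a youtu.be short URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        # Long form: the ID is the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Not a recognised YouTube video URL: {url!r}")
    return video_id


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))  # dQw4w9WgXcQ
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))                 # dQw4w9WgXcQ
```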

Options

# Output to file
python3 scripts/fetch_transcript.py "URL" --output transcript.txt

# JSON format (includes timestamps)
python3 scripts/fetch_transcript.py "URL" --json

# Both: JSON to file
python3 scripts/fetch_transcript.py "URL" --json --output transcript.json

# Specify language preference
python3 scripts/fetch_transcript.py "URL" --lang de

Output Formats

Text (default):

Hello and welcome to this video.
Today we're going to talk about...

JSON (--json):

{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
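
The JSON form is convenient for post-processing. As a small sketch, assuming the field names shown in the sample above (`transcript`, `start`, `text`), the segments can be turned into timestamped lines; the helper names are illustrative.

```python
import json

SAMPLE = """
{
  "video_id": "dQw4w9WgXcQ",
  "title": "Video Title",
  "transcript": [
    {"start": 0.0, "duration": 2.5, "text": "Hello and welcome"},
    {"start": 2.5, "duration": 3.0, "text": "to this video"}
  ],
  "full_text": "Hello and welcome to this video..."
}
"""


def format_timestamp(seconds):
    """Format a start offset in seconds as HH:MM:SS (fractions are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def timestamped_lines(data):
    """Return one '[HH:MM:SS] text' line per transcript segment."""
    return [f"[{format_timestamp(seg['start'])}] {seg['text']}" for seg in data["transcript"]]


data = json.loads(SAMPLE)
for line in timestamped_lines(data):
    print(line)
# [00:00:00] Hello and welcome
# [00:00:02] to this video
```

To use it on real output, replace `SAMPLE` with the contents of a file written by `--json --output transcript.json`.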

Error Handling

The script handles common errors:

  • Invalid YouTube URL
  • Video has no transcript
  • API quota exceeded
  • Network errors
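
When calling the script from other Python code, these failures need to surface to the caller. The sketch below assumes the script exits with a non-zero status and writes a message to stderr on error; that behaviour is not documented here, so verify it before relying on it. `run_transcript_command` is a hypothetical wrapper.

```python
import subprocess
import sys


def run_transcript_command(command):
    """Run a command and return its stdout, raising if it exits non-zero."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: the script reports the failure reason on stderr.
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Example call (requires the script and a valid APIFY_API_TOKEN):
# text = run_transcript_command(
#     [sys.executable, "scripts/fetch_transcript.py", "https://youtu.be/VIDEO_ID"]
# )
```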

Metadata

metadata:
  clawdbot:
    emoji: "📹"
    requires:
      env: ["APIFY_API_TOKEN"]
      bins: ["python3"]