Category: Data & Analytics · No API Key required

Pinterest Scraper

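Scrapes Pinterest boards, user profiles, or search results, with support for infinite scrolling, image quality options, deduplication, resumable scraping, and Telegram album delivery.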

Author: kexu9 · clawhub

Full-featured Pinterest image scraper with automatic scrolling and multiple output options.

When This Skill Activates

This skill triggers when the user wants to download images from Pinterest.

Reasoning Framework

| Step | Action | Why |
|------|--------|-----|
| 1 | EXTRACT | Parse Pinterest URL to determine board/user/search |
| 2 | LAUNCH | Start Playwright browser with stealth options |
| 3 | SCROLL | Incrementally load images (Pinterest uses infinite scroll) |
| 4 | COLLECT | Extract image URLs with quality selection |
| 5 | DEDUP | Hash-based duplicate detection |
| 6 | DOWNLOAD | Save images to output folder |
| 7 | NOTIFY | Optional: send to Telegram |
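
Step 1 (EXTRACT) can be sketched as follows. This is a minimal illustration, not the skill's actual parser: the function name `classify_pinterest_url` and the returned dictionary shape are hypothetical, and it assumes the common URL layouts `/search/pins/?q=...`, `/<user>/`, and `/<user>/<board>/`. Other layouts (short links, profile sub-pages) would need extra handling.

```python
from urllib.parse import urlparse, parse_qs


def classify_pinterest_url(url: str) -> dict:
    """Classify a Pinterest URL as 'search', 'user', or 'board' (hypothetical helper)."""
    parsed = urlparse(url)
    # Assumption: any host containing "pinterest." counts as Pinterest.
    if "pinterest." not in parsed.netloc.lower():
        raise ValueError(f"Not a Pinterest URL: {url}")

    # Split the path into non-empty segments, e.g. "/user/board/" -> ["user", "board"].
    parts = [p for p in parsed.path.split("/") if p]

    if parts and parts[0] == "search":
        query = parse_qs(parsed.query).get("q", [""])[0]
        return {"kind": "search", "query": query}
    if len(parts) == 1:
        return {"kind": "user", "user": parts[0]}
    if len(parts) >= 2:
        return {"kind": "board", "user": parts[0], "board": parts[1]}
    raise ValueError(f"Cannot classify URL: {url}")
```

For example, `classify_pinterest_url("https://www.pinterest.com/someuser/some-board/")` yields a `board` entry with the user and board slugs.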


Setup

pip install playwright requests
playwright install chromium

Decision Tree

What are you trying to do?

├── Download images from a board/user
│   └── Use: -u "URL" -s [scrolls]
│
├── Get highest quality possible
│   └── Use: -q originals
│
├── Get smaller/faster downloads
│   └── Use: -q 736x or 236x
│
├── Send images to phone
│   └── Use: --telegram --token X --chat Y
│
├── Resume interrupted scrape
│   └── Use: --resume
│
└── Debug issues
    └── Use: -v (verbose logging)

Quality Selection Decision

| Quality | Use Case | File Size |
|---------|----------|-----------|
| originals | Best quality, archiving | Largest |
| 736x | Good balance | Medium |
| 474x | Thumbnail quality | Small |
| 236x | Preview only | Smallest |
| all | Save every version | Largest total |
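
One way quality selection could work is by rewriting the size segment of the image URL. The sketch below assumes image URLs follow the layout `https://i.pinimg.com/<size>/...`; the helper names `with_quality` and `expand_quality` are hypothetical, and the source does not say how the script itself selects sizes.

```python
import re

QUALITIES = ("originals", "736x", "474x", "236x")

# Assumed URL layout: https://i.pinimg.com/<size>/<path-to-file>
_SIZE_SEGMENT = re.compile(r"^(https?://i\.pinimg\.com/)([^/]+)(/.+)$")


def with_quality(url: str, quality: str) -> str:
    """Return `url` with its size segment replaced by `quality`."""
    if quality not in QUALITIES:
        raise ValueError(f"Unknown quality: {quality}")
    match = _SIZE_SEGMENT.match(url)
    if match is None:
        return url  # leave URLs that do not fit the assumed layout untouched
    prefix, _old_size, rest = match.groups()
    return f"{prefix}{quality}{rest}"


def expand_quality(url: str, quality: str) -> list:
    """Expand the 'all' option into one URL per size; otherwise return one URL."""
    sizes = QUALITIES if quality == "all" else (quality,)
    return [with_quality(url, size) for size in sizes]
```

The original file may use a different extension than its resized versions, so a real downloader would likely need a fallback when the rewritten URL returns an error.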


Usage

Command Line

python scrape_pinterest.py -u "URL" [options]

| Option | Description | Default |
|--------|-------------|---------|
| -u, --url | Pinterest URL (required) | - |
| -s, --scrolls | Number of scrolls | 50 |
| -o, --output | Output folder | ./pinterest_output |
| -q, --quality | Quality: originals/736x/474x/236x/all | originals |
| -v, --verbose | Enable verbose logging | false |
| --telegram | Send images to Telegram | false |
| --token | Telegram bot token | - |
| --chat | Telegram chat ID | - |
| --resume | Resume from previous scrape | false |
| --dedup | Skip duplicates | true |
| --no-dedup | Disable deduplication | - |
| --telegram-only | Only send existing files | false |

Common Examples

# Basic scrape (50 scrolls, originals, output to ./pinterest_output)
python scrape_pinterest.py -u "URL"

# Verbose mode (logs to console + scrape.log)
python scrape_pinterest.py -u "URL" -v

# More scrolls, custom output, medium quality
python scrape_pinterest.py -u "URL" -s 100 -o ./output -q 736x -v

# With Telegram delivery
python scrape_pinterest.py -u "URL" --telegram --token "TOKEN" --chat "CHAT_ID"

# Resume interrupted scrape
python scrape_pinterest.py -u "URL" --resume -v

# Show help
python scrape_pinterest.py --help

Python API

This tool is CLI-only. To use it from Python code, run it as a subprocess:

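import subprocess

# Run the scraper as a child process; replace 'URL' with a real Pinterest URL
result = subprocess.run(
    ['python3', 'scrape_pinterest.py', '-u', 'URL', '-s', '50', '-q', 'originals'],
    cwd='./scripts',      # working directory for the child process
    capture_output=True,  # collect stdout/stderr instead of printing them
    text=True             # decode output as str rather than bytes
)

print(result.returncode)  # 0 = success
print(result.stdout)
if result.returncode != 0:
    print(result.stderr)  # error details when the scrape fails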
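
If the scraper is called from several places, a small wrapper can keep the argument list in one spot. The sketch below is an assumption-laden convenience layer: `build_command` and `run_scraper` are hypothetical names, and only flags listed in the options table are used. It launches the script with `sys.executable` so the child process runs under the same interpreter (and installed packages) as the caller.

```python
import subprocess
import sys


def build_command(url, scrolls=50, output="./pinterest_output",
                  quality="originals", verbose=False, resume=False):
    """Build the argv list for scrape_pinterest.py from keyword options."""
    cmd = [sys.executable, "scrape_pinterest.py", "-u", url,
           "-s", str(scrolls), "-o", output, "-q", quality]
    if verbose:
        cmd.append("-v")
    if resume:
        cmd.append("--resume")
    return cmd


def run_scraper(url, cwd="./scripts", **options):
    """Run the scraper; raises subprocess.CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(url, **options), cwd=cwd,
                          capture_output=True, text=True, check=True)
```

Passing the arguments as a list (rather than a shell string) avoids quoting problems with URLs that contain `&` or `?`.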

Features

| Feature | Description |
|---------|-------------|
| Infinite Scroll | Automatic scrolling loads more images |
| Quality Options | originals/736x/474x/236x/all |
| Telegram | Send directly to Telegram |
| Deduplication | Hash-based duplicate detection |
| Resume | Continue from previous scrape |
| URL Types | Boards, user profiles, search results |
| Verbose Logging | -v flag, logs to console + scrape.log |


Verbose Logging

Use -v or --verbose for detailed logging:

python scrape_pinterest.py -u "URL" -v

What gets logged:

  • Scroll progress (every 10 scrolls)
  • Images found per scroll
  • Download progress (X/Y)
  • Telegram send status
  • Errors and warnings

Log destinations:

  • Console: INFO level
  • scrape.log: DEBUG level (detailed)
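
The console/file split described above maps naturally onto two handlers of Python's standard `logging` module. This is a sketch of one possible setup, not the script's actual configuration: the function name `configure_logging`, the logger name, and the message format are all assumptions.

```python
import logging


def configure_logging(log_path="scrape.log", verbose=True):
    """Send INFO and above to the console and DEBUG and above to a log file."""
    logger = logging.getLogger("pinterest_scraper")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)  # let handlers decide what to keep
    logger.handlers.clear()         # avoid duplicate handlers on repeated calls

    console = logging.StreamHandler()
    console.setLevel(logging.INFO if verbose else logging.WARNING)
    logger.addHandler(console)

    if verbose:
        file_handler = logging.FileHandler(log_path, encoding="utf-8")
        file_handler.setLevel(logging.DEBUG)
        file_handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(file_handler)
    return logger
```

With this arrangement a `logger.debug(...)` call appears only in the file, while `logger.info(...)` appears in both places.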

Troubleshooting

Problem: No images downloaded

  • Cause: Too few scrolls, or Pinterest did not finish loading the page
  • Fix: Increase -s value (try 100-200)
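
To illustrate why the scroll count matters, here is a sketch of a scroll-and-collect loop. It assumes `page` is a Playwright sync-API `Page` and uses its `eval_on_selector_all`, `mouse.wheel`, and `wait_for_timeout` methods; the function name, the `img` selector, the scroll distance, and the idle cutoff are illustrative assumptions rather than the script's actual values.

```python
def collect_image_urls(page, scrolls=50, pause_ms=1000, max_idle=5):
    """Scroll a Playwright page and gather unique <img> sources in first-seen order.

    Stops early when `max_idle` consecutive scrolls reveal no new images.
    """
    seen = {}  # dict keeps insertion order and gives fast membership checks
    idle = 0
    for _ in range(scrolls):
        # Read the src of every <img> currently in the DOM.
        srcs = page.eval_on_selector_all("img", "els => els.map(e => e.src)")
        before = len(seen)
        for src in srcs:
            if src:
                seen.setdefault(src, None)
        idle = idle + 1 if len(seen) == before else 0
        if idle >= max_idle:
            break
        page.mouse.wheel(0, 2000)        # scroll down to trigger more loading
        page.wait_for_timeout(pause_ms)  # give the page time to fetch new pins
    return list(seen)
```

Collecting on every iteration, rather than once at the end, guards against pages that drop off-screen images from the DOM as the user scrolls. With Playwright installed, `page` would come from `sync_playwright()`, then `chromium.launch()`, `new_page()`, and `goto(url)`.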

Problem: "Browser not found"

  • Cause: Playwright not installed
  • Fix: playwright install chromium

Problem: SSL certificate errors (Mac)

  • Cause: macOS SSL issues
  • Fix: Use verify=False in requests calls (note that this disables certificate verification)

Problem: Duplicate images

  • Cause: Deduplication disabled or failed
  • Fix: Use --dedup flag (default: on)
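
The source describes deduplication only as "hash-based". One plausible reading, sketched below, is to hash the downloaded bytes and skip any content already seen. The function name `dedupe_by_content` and the choice of SHA-256 are assumptions; the script may hash URLs instead, or use a different algorithm.

```python
import hashlib


def dedupe_by_content(blobs):
    """Return (digest, data) pairs for each byte string seen for the first time."""
    seen = set()
    unique = []
    for data in blobs:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # identical content was already kept
        seen.add(digest)
        unique.append((digest, data))
    return unique
```

Hashing content catches the same image served under two different URLs, at the cost of downloading it before the duplicate can be detected.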

Problem: Resume not working

  • Cause: State file missing or URL changed
  • Fix: Use same URL as original, check .scrape_state.json
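
The dependence on the URL can be illustrated with a sketch of how such a state file might be read and written. Everything about the format here is hypothetical: the source names the file `.scrape_state.json` but does not document its fields or its location, so the `url` and `downloaded` keys and the choice of the output directory are assumptions.

```python
import json
from pathlib import Path

STATE_FILE = ".scrape_state.json"


def load_state(output_dir, url):
    """Load saved progress, or start fresh if the file is missing or for another URL."""
    path = Path(output_dir) / STATE_FILE
    fresh = {"url": url, "downloaded": []}
    if not path.exists():
        return fresh
    state = json.loads(path.read_text(encoding="utf-8"))
    if state.get("url") != url:
        return fresh  # a different target invalidates the old progress
    return state


def save_state(output_dir, state):
    """Write state via a temporary file so an interrupted write cannot corrupt it."""
    path = Path(output_dir) / STATE_FILE
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)
```

Under this scheme, resuming with a changed URL silently starts over, which matches the symptom described above.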

Problem: Telegram not sending

  • Cause: Invalid token/chat ID, rate limiting
  • Fix: Verify the bot token and check the chat ID; Telegram sends are limited to 100 images per batch
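
The source gives a 100-image batch limit without saying how a batch is split into albums. The Telegram Bot API's `sendMediaGroup` method accepts 2 to 10 items per call, so the sketch below assumes albums of at most 10 photos. The helper names `chunk` and `build_media_group` are hypothetical, and the payload uses the `attach://<name>` form for files uploaded in the same request.

```python
def chunk(items, size=10):
    """Split a list into consecutive sub-lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_media_group(file_names):
    """Build the `media` array for one sendMediaGroup call (photos only)."""
    return [{"type": "photo", "media": f"attach://{name}"} for name in file_names]
```

A trailing chunk with a single image cannot be sent as an album and would need `sendPhoto` instead. Pausing between calls may also help when rate limiting is the cause.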

Problem: Verbose logs not writing

  • Cause: File permission issue
  • Fix: Check write permissions in output directory

Self-Check

  • [ ] Pinterest URL is valid (board/user/search)
  • [ ] Playwright installed: playwright install chromium
  • [ ] Quality selected appropriately for use case
  • [ ] Output directory exists or is writable
  • [ ] For Telegram: token and chat ID correct
  • [ ] For resume: using same URL as original scrape

Notes

  • Pinterest loads dynamically - scrolling required for more images
  • Use verify=False in requests calls to work around macOS SSL errors (this disables certificate verification)
  • State saved to .scrape_state.json for resume
  • Telegram limited to 100 images per batch
  • Verbose mode writes detailed logs to scrape.log

Quick Reference

| Task | Command |
|------|---------|
| Basic scrape | python scrape_pinterest.py -u "URL" |
| Verbose debug | python scrape_pinterest.py -u "URL" -v |
| High quality | python scrape_pinterest.py -u "URL" -q originals |
| Fast/small | python scrape_pinterest.py -u "URL" -q 236x |
| Send to Telegram | python scrape_pinterest.py -u "URL" --telegram --token X --chat Y |
| Resume | python scrape_pinterest.py -u "URL" --resume |
| Custom output | python scrape_pinterest.py -u "URL" -o ./myfolder |