返回 Skill 列表
extension
分类: 数据与分析无需 API Key

Site Summarizer

URL抓取与摘要生成。抓取URL,提取内容,生成摘要。可选缓存,支持自定义目录和TTL。适用于网页内容...

person作者: cloudcompilehubclawhub

Site Summarizer v4.1.0

Features

  • Content extraction from HTML pages
  • Automatic summarization
  • Keyword extraction and language detection
  • Optional caching with env vars

Output

{
  "success": true,
  "content": "...",
  "summary": "...",
  "metadata": {"title": "...", "description": "...", "author": "..."},
  "analysis": {"language": "en", "keywords": [...], "word_count": N, "read_time_min": N},
  "status": 200,
  "from_cache": false
}

Environment Variables

  • SITE_SUMMARIZER_CACHE_DIR - Cache directory (default: ~/.cache/site-summarizer)
  • SITE_SUMMARIZER_CACHE_TTL - Cache TTL in seconds (default: 3600)
  • SITE_SUMMARIZER_HIDE_IP - Set to "true" to hide resolved IP in output

Usage

python fetch_and_summarize.py <url>

v4.1.0 Fixes

  • Fixed redirect header parsing
  • Fixed regex patterns for redaction
  • Added optional IP hiding via env var
  • Code cleanup and bug fixes