โ† Back to skills
extension
Category: Data & AnalyticsNo API key required

finviz-crawler

Continuous financial news crawler for finviz.com with SQLite storage, article extraction, and query tool. Use when monitoring financial markets, building new...

personAuthor: camopelhubclawhub

finviz-crawler

Why This Skill?

๐Ÿ“ฐ Your own financial news database โ€” most finance skills just wrap an API for one-shot queries. This skill runs continuously, building a local archive of every headline and article from Finviz. Query your history anytime โ€” no API limits, no missing data.

๐Ÿ†“ No API key, no subscription โ€” scrapes finviz.com directly using Crawl4AI + RSS. Bloomberg, Reuters, Yahoo Finance, CNBC articles extracted automatically. Zero cost.

๐Ÿค– Built for AI summarization โ€” the query tool outputs clean text/JSON optimized for LLM digests. Pair with an OpenClaw cron job for automated morning briefings, evening wrap-ups, or weekly investment summaries.

๐Ÿ’พ Auto-cleanup โ€” configurable expiry automatically deletes old articles from both the database and disk. Set --expiry-days 30 to keep a month of history, or 0 to keep everything forever.

๐Ÿ”„ Daemon architecture โ€” runs as a background service that starts/stops with OpenClaw. No manual intervention needed after setup. Works with systemd (Linux) and launchd (macOS).

Install

python3 scripts/install.py

Works on macOS, Linux, and Windows. Installs Python packages (crawl4ai, feedparser), sets up Playwright browsers, creates data directories, and verifies everything.

Manual install

pip install crawl4ai feedparser
crawl4ai-setup  # or: python -m playwright install chromium

Usage

Run the crawler

# Default: ~/workspace/finviz/, 7-day expiry
python3 scripts/finviz_crawler.py

# Custom paths and settings
python3 scripts/finviz_crawler.py --db /path/to/finviz.db --articles-dir /path/to/articles/

# Keep 30 days of articles
python3 scripts/finviz_crawler.py --expiry-days 30

# Never auto-delete (keep everything)
python3 scripts/finviz_crawler.py --expiry-days 0

# Custom crawl interval (default: 300s)
python3 scripts/finviz_crawler.py --sleep 600

Query articles

# Last 24 hours of headlines
python3 scripts/finviz_query.py --hours 24

# Titles only (compact, good for LLM summarization)
python3 scripts/finviz_query.py --hours 12 --titles-only

# With full article content
python3 scripts/finviz_query.py --hours 12 --with-content

# List downloaded articles with content status
python3 scripts/finviz_query.py --list-articles --hours 24

# Database stats
python3 scripts/finviz_query.py --stats

Manage tickers

# List all tracked tickers
python3 scripts/finviz_query.py --list-tickers

# Add single ticker (auto-generates keywords from symbol)
python3 scripts/finviz_query.py --add-ticker NVDA

# Add with custom keywords
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen huang"

# Add multiple tickers (batch)
python3 scripts/finviz_query.py --add-ticker NVDA TSLA AAPL
python3 scripts/finviz_query.py --add-ticker "NVDA:nvidia,jensen" "TSLA:tesla,elon musk"

# Remove tickers (batch)
python3 scripts/finviz_query.py --remove-ticker NVDA TSLA

# Custom DB path
python3 scripts/finviz_query.py --list-tickers --db /path/to/finviz.db

Tickers are stored in the tickers table inside finviz.db alongside articles. The crawler reads this table each cycle to know which ticker pages to scrape.

Configuration

| Setting | CLI flag | Env var | Default | |---------|----------|---------|---------| | Database path | --db | โ€” | ~/workspace/finviz/finviz.db | | Articles directory | --articles-dir | โ€” | ~/workspace/finviz/articles/ | | Crawl interval | --sleep | โ€” | 300 (5 min) | | Article expiry | --expiry-days | FINVIZ_EXPIRY_DAYS | 7 days | | Timezone | โ€” | FINVIZ_TZ or TZ | System default |

๐Ÿ’ฌ Chat Commands (OpenClaw Agent)

When this skill is installed, the agent recognizes /finviz as a shortcut:

| Command | Action | |---------|--------| | /finviz list | Show tracked tickers | | /finviz add NVDA, TSLA | Add tickers to track | | /finviz remove NVDA | Remove a ticker | | /finviz stats | Show article/ticker counts | | /finviz help | Show available commands |

The agent runs these via the finviz_query.py CLI internally.

๐Ÿ“ฑ PrivateApp Dashboard

A companion mobile dashboard is available in PrivateApp โ€” a personal PWA dashboard for your home server.

The Finviz app provides:

  • Headlines browser with time-range filters (12h / 24h / Week)
  • Ticker-specific news filtering
  • LLM-powered summaries on demand

Install PrivateApp, and the Finviz dashboard is built-in โ€” no extra setup needed.

Architecture

Crawler daemon (finviz_crawler.py):

  • Crawls finviz.com/news.ashx headlines every 5 minutes
  • Fetches article content via Crawl4AI (Playwright) or RSS (paywalled sites)
  • Bot/paywall detection rejects garbage content
  • Per-domain rate limiting, user-agent rotation
  • Deduplicates via SHA-256 title hash
  • Auto-expires old articles (configurable)
  • Clean shutdown on SIGTERM/SIGINT

Query tool (finviz_query.py):

  • Read-only SQLite queries (no HTTP, stdlib only)
  • Filter by time window, export titles or full content
  • Designed for LLM summarization pipelines

Run as a service (optional)

systemd (Linux)

[Unit]
Description=Finviz News Crawler

[Service]
ExecStart=python3 /path/to/scripts/finviz_crawler.py --expiry-days 30
Restart=on-failure
RestartSec=30

[Install]
WantedBy=default.target

launchd (macOS)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key><string>com.finviz.crawler</string>
    <key>ProgramArguments</key>
    <array>
        <string>python3</string>
        <string>/path/to/scripts/finviz_crawler.py</string>
        <string>--expiry-days</string>
        <string>30</string>
    </array>
    <key>RunAtLoad</key><true/>
    <key>KeepAlive</key><true/>
</dict>
</plist>

Data layout

~/workspace/finviz/
โ”œโ”€โ”€ finviz.db          # SQLite: articles + tickers (single DB)
โ”œโ”€โ”€ articles/          # Full article content as .md files
โ”‚   โ”œโ”€โ”€ market/        # General market headlines
โ”‚   โ”œโ”€โ”€ nvda/          # Per-ticker articles
โ”‚   โ””โ”€โ”€ tsla/
โ””โ”€โ”€ summaries/         # LLM summary cache (.json)

Cron integration

Pair with an OpenClaw cron job for automated digests:

Schedule: 0 6 * * * (6 AM daily)
Task: Query last 24h โ†’ LLM summarize โ†’ deliver to Matrix/Telegram/Discord