Product Pricing Scraper
Scrape, normalize, and compare product prices across e-commerce sites.
Quick Start
"Compare prices for [product name] across Amazon, eBay, and Walmart"
"Scrape pricing from [URL]"
"Track price changes for [product]"
Workflow
- Identify target — accept product name (search across sites) or specific URL(s)
- Scrape — use
scripts/scrape_prices.pyfor each target - Normalize — strip currency symbols, unify units, convert currencies if needed
- Compare — rank by price, flag outliers, compute average/median
- Output — JSON or CSV via
--formatflag
Scraping Strategies
Static pages (most product listings)
Use web_fetch + cheerio-style parsing. No browser needed.
Dynamic pages (JS-rendered prices)
Use the browser tool with snapshot + act to navigate and extract.
Anti-bot considerations
- Randomize delays (2-5s between requests)
- Rotate User-Agent strings
- Respect robots.txt — check before scraping
- Limit concurrent requests per domain
Price Normalization
- Strip currency symbols and whitespace
- Convert per-unit pricing (e.g., "$2.99/oz" → unit price)
- Handle price ranges (take lowest)
- Flag "was/now" discounted prices — record both
Output Schema
{
"query": "product name or url",
"scraped_at": "ISO-8601",
"results": [
{
"source": "amazon",
"url": "https://...",
"title": "Product Title",
"price": 29.99,
"currency": "USD",
"unit_price": null,
"original_price": 39.99,
"in_stock": true,
"rating": 4.5,
"review_count": 1234
}
],
"summary": {
"lowest": { "source": "amazon", "price": 29.99 },
"highest": { "source": "walmart", "price": 34.99 },
"average": 32.49,
"median": 31.99
}
}
Scripts
scripts/scrape_prices.py— main scraper: accepts product name or URL list, outputs structured JSON- See
references/selectors.mdfor site-specific CSS selectors
Ethics & Compliance
- Respect robots.txt and rate limits
- Do not scrape behind login walls without explicit permission
- Do not collect personal data
- Check site ToS before bulk scraping
Scan to join WeChat group