Web to Markdown
Extract clean article content from any URL and convert it to readable Markdown.
Quick Start
web2md https://example.com/article
web2md https://example.com/article -o article.md
web2md https://example.com/article --summary
Installation
pip install web2md
Common Patterns
Save an article the user shared
web2md <url> -o ~/Documents/reading/<slug>.md
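The `<slug>` placeholder above has to come from somewhere. One way to derive it from the URL is sketched below; `slug_from_url` and its `max_length` parameter are hypothetical names used for illustration and are not part of web2md.

```python
import re
from urllib.parse import urlparse


def slug_from_url(url: str, max_length: int = 60) -> str:
    """Build a filesystem-friendly slug from the last path segment of a URL.

    Hypothetical helper: web2md itself may name files differently.
    """
    parsed = urlparse(url)
    path = parsed.path.rstrip("/")
    # Fall back to the host name when the URL has no path segment.
    last = path.rsplit("/", 1)[-1] or parsed.netloc
    # Drop a common page extension such as .html or .php.
    last = re.sub(r"\.(html?|php|aspx?)$", "", last, flags=re.IGNORECASE)
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", last.lower()).strip("-")
    return slug[:max_length].rstrip("-") or "article"
```

The result can then be substituted into the output path, for example `~/Documents/reading/my-first-post.md`.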
Extract with AI summary (needs Ollama running; AI features require the paid CLI)
web2md <url> --summary --ai -o article.md
Batch convert a reading list (batch mode requires the paid CLI)
web2md urls.txt --batch -d output/
How It Works
web2md uses readability-lxml to extract the main article content (stripping ads, nav, sidebars) and markdownify to convert the HTML to clean Markdown. It preserves headings, links, lists, and text formatting while removing clutter.
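The two-stage pipeline described above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not web2md's actual source: the function names (`fetch_html`, `article_to_markdown`, `add_title`), the User-Agent header, and the choice of ATX headings are all illustrative. Only `readability.Document` and `markdownify.markdownify` come from the libraries named in the text.

```python
from urllib.request import Request, urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it as text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def add_title(title: str, body: str) -> str:
    """Prepend the page title as a top-level heading, if there is one."""
    title = " ".join(title.split())  # normalise stray whitespace
    return f"# {title}\n\n{body}\n" if title else f"{body}\n"


def article_to_markdown(html: str) -> str:
    """Extract the main article from raw HTML and render it as Markdown."""
    # Third-party imports stay local so this module loads without them.
    from readability import Document      # pip install readability-lxml
    from markdownify import markdownify   # pip install markdownify

    doc = Document(html)
    article_html = doc.summary()  # main content with ads, nav, sidebars removed
    body = markdownify(article_html, heading_style="ATX").strip()
    return add_title(doc.short_title(), body)
```

With both libraries installed, `article_to_markdown(fetch_html("https://example.com/article"))` returns the Markdown text, which can be written to a file much as `-o article.md` does.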
What It's Good For
- Saving articles for offline reading
- Extracting content for LLM context
- Archiving blog posts and news articles
- Building a personal knowledge base from web content
- Research: batch-convert sources to Markdown
Limits (Free Tier)
- Single URL at a time
- No image download
- No AI summarization (heuristic fallback only)
Batch mode, image download, and AI features require the paid CLI ($19): https://web2md.dev