Web to Markdown

Extract clean article content from any URL and convert it to readable Markdown.

Quick Start

web2md https://example.com/article
web2md https://example.com/article -o article.md
web2md https://example.com/article --summary

Installation

pip install web2md

Common Patterns

Save an article the user shared

web2md <url> -o ~/Documents/reading/<slug>.md
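
The source does not say how the <slug> placeholder above is chosen. One way to derive it from the URL is sketched below; slug_from_url is a hypothetical helper written for illustration, not part of web2md.

```python
import re
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Derive a filesystem-friendly slug from a URL (hypothetical helper)."""
    parsed = urlparse(url)
    # Keep only the non-empty path segments, e.g. ["blog", "My_First-Post.html"].
    segments = [s for s in parsed.path.split("/") if s]
    if segments:
        # Use the last segment and drop a trailing extension such as ".html".
        base = re.sub(r"\.[A-Za-z0-9]{1,5}$", "", segments[-1])
    else:
        # Bare domain: fall back to the host name.
        base = parsed.netloc
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", base.lower()).strip("-")
    return slug or "article"
```

With this sketch, https://example.com/blog/My_First-Post.html maps to my-first-post, so the command becomes web2md https://example.com/blog/My_First-Post.html -o ~/Documents/reading/my-first-post.md.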

Extract with an AI summary (needs Ollama running; AI features require the paid CLI)

web2md <url> --summary --ai -o article.md

Batch convert a reading list (batch mode requires the paid CLI)

web2md urls.txt --batch -d output/

How It Works

web2md uses readability-lxml to extract the main article content, stripping ads, navigation, and sidebars, and then uses markdownify to convert the remaining HTML to Markdown. Headings, links, lists, and text formatting are preserved.

What It's Good For

Saving articles for offline reading
Extracting content for LLM context
Archiving blog posts and news articles
Building a personal knowledge base from web content
Research: batch-convert sources to Markdown

Limits (Free Tier)

Single URL at a time
No image download
No AI summarization (heuristic fallback only)

For batch mode, image download, and AI features: https://web2md.dev (paid CLI, $19)