Webclaw
Web content extraction for LLM pipelines: clean markdown or structured JSON from any URL using browser-grade TLS fingerprinting, no headless browser required. CLI, REST API, and MCP server.
Install
npx create-webclaw
Tools · 10
- scrape Extract one URL as markdown, text, JSON, LLM format, or HTML
- crawl Follow same-origin links and extract discovered pages
- map Discover URLs without extracting every page
- batch Scrape multiple URLs in parallel
- extract Convert page content into structured data
- summarize Summarize a page
- diff Compare page content snapshots
- brand Extract colors, fonts, logos, and metadata
- search Search the web and scrape results
- research Multi-source research workflow
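Since Webclaw is offered as an MCP server, a client would reach these tools through the protocol's standard `tools/call` request (JSON-RPC 2.0). The sketch below only builds that request for one of the ten tool names listed above. The argument names `url` and `format` are hypothetical and not taken from Webclaw's documentation; the server's `tools/list` response is where the real input schema would come from.

```python
import json

# The ten tool names from the list above.
WEBCLAW_TOOLS = {
    "scrape", "crawl", "map", "batch", "extract",
    "summarize", "diff", "brand", "search", "research",
}


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request (JSON-RPC 2.0) for one tool."""
    if tool not in WEBCLAW_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical argument names (assumption): check the server's advertised
# schema before relying on them.
payload = build_tool_call(
    "scrape", {"url": "https://example.com", "format": "markdown"}
)
print(payload)
```

Validating the tool name locally gives an early error for typos before anything is sent to the server.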
Environment variables
- WEBCLAW_API_KEY
- OLLAMA_HOST
- OPENAI_API_KEY
- OPENAI_BASE_URL
- ANTHROPIC_API_KEY
- ANTHROPIC_BASE_URL
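The listing does not say which of these variables are required or what each one controls. As a minimal sketch, the helper below only reports which of the six names are set in the current environment before launching the server. `env_status` is a hypothetical helper name, not part of Webclaw.

```python
import os

# Variable names exactly as given in the listing above.
WEBCLAW_ENV_VARS = [
    "WEBCLAW_API_KEY",
    "OLLAMA_HOST",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
    "ANTHROPIC_API_KEY",
    "ANTHROPIC_BASE_URL",
]


def env_status(environ=None) -> dict:
    """Map each variable name to True if it is set to a non-empty value."""
    env = os.environ if environ is None else environ
    return {name: bool(env.get(name)) for name in WEBCLAW_ENV_VARS}


# Print a set / not set line per variable without revealing any values.
for name, is_set in env_status().items():
    print(f"{name}: {'set' if is_set else 'not set'}")
```

Passing a plain dictionary as `environ` makes the check easy to exercise without touching the real environment.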
Links
★ 1,192 GitHub stars