simplecrawler
simplecrawler is an event-driven web crawler for Node.js (v1.1.9; stable, with infrequent releases). It provides flexible queue and cache mechanisms with extensible backends, respects robots.txt automatically, and discovers links on its own. Compared with alternatives such as node-crawler or puppeteer, it is lightweight, exposes an EventEmitter-based API, can freeze and defrost its queue to disk, and preserves binary data by handling response bodies as buffers. It suits archiving, analysis, and large-scale crawling.
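The event-driven style described above can be sketched roughly as follows. This is a minimal illustration, not the package's canonical usage: the helper name `buildCrawler`, the start URL, and the numeric limits are assumptions made for the example, and the crawler constructor is passed in as a parameter so the wiring can be exercised without the package installed. The properties and events used (`interval`, `maxConcurrency`, `maxDepth`, `respectRobotsTxt`, `fetchcomplete`, `complete`) follow the simplecrawler 1.x README as far as I am aware; check them against the version you install.

```javascript
// Hypothetical helper: wires up a crawler instance and collects page sizes.
// `Crawler` is expected to be the constructor exported by simplecrawler.
function buildCrawler(Crawler, startUrl) {
  const crawler = new Crawler(startUrl);

  // Illustrative limits (assumed values, tune for the target site).
  crawler.interval = 1000;        // milliseconds between request batches
  crawler.maxConcurrency = 2;     // parallel requests
  crawler.maxDepth = 2;           // start page plus one level of links
  crawler.respectRobotsTxt = true;

  const pages = [];

  // The body arrives as a Buffer, so binary resources are not corrupted
  // by an implicit string conversion.
  crawler.on("fetchcomplete", (queueItem, responseBuffer) => {
    pages.push({ url: queueItem.url, bytes: responseBuffer.length });
  });

  crawler.on("complete", () => {
    console.log("Crawl finished, fetched " + pages.length + " resources");
  });

  return { crawler, pages };
}

// Usage (requires `npm install simplecrawler` and network access):
// const Crawler = require("simplecrawler");
// const { crawler } = buildCrawler(Crawler, "http://example.com/");
// crawler.start();
```

Passing the constructor in keeps the sketch independent of the network. In a real script, requiring `simplecrawler` directly and calling `crawler.start()` is the simpler route.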
Resources
package: simplecrawler