Streaming WARC (and ARC) IO library

library 1.8.1 · python
verified May 22, 2026

warcio is a Python library (v1.8.1) for fast, low-level, streaming input and output of Web ARChive (WARC) and ARC files, conforming to the WARC 1.0 and 1.1 ISO standards. It processes an archive as a stream of records, one at a time, rather than loading entire files. Developed by Webrecorder, it supports both reading existing archives and capturing HTTP and HTTPS traffic directly into WARC files. The library is actively maintained; recent updates added support for reading from remote locations such as S3 and HTTPS URLs.
