Streaming WARC (and ARC) IO library

library · 1.8.1 · python

✓ verified May 22, 2026

warcio is a Python library (v1.8.1) for fast, low-level, streaming input and output of Web ARChive (WARC) and ARC files. It follows the WARC 1.0 and 1.1 formats (ISO 28500) and processes an archive as a stream of records rather than as a whole file. Developed by Webrecorder, it supports both reading existing archives and capturing HTTP and HTTPS traffic directly into WARC files. The library is actively maintained, and recent updates add support for remote sources such as S3 and HTTPS.

indexed Sun Apr 12 · updated Wed May 27

Resources

github: github.com/webrecorder/warcio

package: pypi.org/project/warcio/

API endpoints

full doc /v1/registry/warcio

install /v1/registry/warcio/install

compatibility /v1/registry/warcio/compatibility