Extruct

library 0.18.0 · python

✓ verified May 24, 2026

http-networking · data serialization

Extruct is a Python library for extracting embedded metadata from HTML markup. It supports W3C HTML Microdata, embedded JSON-LD, Microformats (via mf2py), Facebook's Open Graph, RDFa (experimental, via rdflib), and Dublin Core Metadata (DC-HTML-2003). The library is actively maintained; the current stable version is 0.18.0.

indexed Fri Apr 17 · updated Mon Jun 01

Resources

github github.com/scrapinghub/extruct

package pypi.org/project/extruct/

API endpoints

full doc /v1/registry/extruct

install /v1/registry/extruct/install

compatibility /v1/registry/extruct/compatibility