Simhash Python Library

library 2.1.2 ·python

✓ verified May 26, 2026

The `simhash` library provides a Python implementation of the Simhash Algorithm, a technique for quickly finding near-duplicate documents or comparing the similarity of two texts or data objects. It's highly useful for tasks like large-scale content deduplication, spam detection, and content recommendation, offering a fast way to identify perceptually similar items. The current version is 2.1.2, and it follows an irregular release cadence based on contributions and bug fixes.

Traffic · last 30 days ↓12% vs prev 7d · indexed Fri Apr 17 · updated Mon Jun 01

total hits 17

actors 7 distinct systems

last hit 3d ago MJ12bot

GPTBot

MetaBot

Script

ClaudeBot

Search engines

top countries 🇺🇸 United States · 🇨🇦 Canada · 🇫🇷 France · 🇳🇴 Norway · 🇩🇪 Germany

Resources

githubgithub.com/1e0ng/simhash ↗

packagepypi.org/project/simhash/ ↗

homepageleons.im/posts/a-python-implementation-of-simhash-algorithm/ ↗

API endpoints

full doc /v1/registry/simhash

install /v1/registry/simhash/install

compatibility /v1/registry/simhash/compatibility