Splink

library 4.0.16 · python
verified May 23, 2026

Splink is a Python package for fast, accurate, and scalable probabilistic record linkage (entity resolution). It enables users to deduplicate and link records from datasets that lack unique identifiers, leveraging unsupervised learning based on the Fellegi-Sunter model. Splink supports various SQL backends like DuckDB, Apache Spark, and AWS Athena, allowing it to scale to datasets of 100 million records or more, and provides a suite of interactive visualizations for model understanding and diagnostics.
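The Fellegi-Sunter scoring idea mentioned above can be illustrated with a minimal standard-library sketch. This is not Splink's API: the field names, the m and u probabilities, and the prior are hypothetical values chosen for illustration, whereas Splink estimates such parameters from the data without labels. The names `PARAMS`, `match_weight`, and `match_probability` are likewise assumptions of this example.

```python
import math

# Hypothetical parameters (assumed for illustration, not taken from Splink):
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
PARAMS = {
    "first_name": {"m": 0.90, "u": 0.01},
    "dob": {"m": 0.95, "u": 0.001},
    "city": {"m": 0.80, "u": 0.10},
}


def match_weight(m: float, u: float, agrees: bool) -> float:
    """Return the log2 Bayes factor contributed by one field comparison."""
    if agrees:
        return math.log2(m / u)
    return math.log2((1 - m) / (1 - u))


def match_probability(agreement: dict, prior: float = 0.001) -> float:
    """Combine the prior and per-field weights into a match probability.

    `agreement` maps each field name to True (agrees) or False (disagrees).
    `prior` is the assumed probability that a random record pair is a match.
    Fields are treated as conditionally independent, as in the basic model.
    """
    total = math.log2(prior / (1 - prior))
    for field, p in PARAMS.items():
        total += match_weight(p["m"], p["u"], agreement[field])
    odds = 2 ** total
    return odds / (1 + odds)


if __name__ == "__main__":
    all_agree = {"first_name": True, "dob": True, "city": True}
    name_only = {"first_name": True, "dob": False, "city": False}
    print(match_probability(all_agree))
    print(match_probability(name_only))
```

Agreement on a field that rarely agrees by chance (low u) adds a large positive weight, while disagreement on a field that almost always agrees for true matches (high m) subtracts weight. Summing these weights with the prior gives the pairwise score that a threshold then turns into a link decision.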
