{"id":9745,"library":"fingerprints","title":"Fingerprints","description":"The `fingerprints` library is a utility for generating stable, deterministic fingerprints for entities: simplified, normalized string keys derived from a name or address. It is commonly used in data matching and deduplication, particularly within the 'opensanctions' ecosystem. The current version is 1.3.1; releases are irregular and typically correspond to bug fixes or minor feature enhancements.","status":"active","version":"1.3.1","language":"en","source_language":"en","source_url":"https://github.com/opensanctions/fingerprints.git","tags":["fingerprinting","deduplication","entity-matching","data-quality","identity"],"install":[{"cmd":"pip install fingerprints","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Used for text cleaning and standardization before fingerprint generation, ensuring consistency across inputs.","package":"normality","optional":false}],"imports":[{"symbol":"generate","correct":"from fingerprints import generate"}],"quickstart":{"code":"from fingerprints import generate\n\n# Generate a fingerprint for a person name\nperson_fp = generate(\"Angela Merkel\")\nprint(f\"Person Fingerprint: {person_fp}\")\n\n# Generate a fingerprint for an organization name\norg_fp = generate(\"Global Corp Inc.\")\nprint(f\"Organization Fingerprint: {org_fp}\")\n\n# Fingerprints are deterministic: case and token order do not affect the default output\nsame_person_fp = generate(\"merkel ANGELA\")\nprint(f\"Same Person Fingerprint: {same_person_fp}\")\nassert person_fp == same_person_fp\n","lang":"python","description":"This quickstart demonstrates how to import and use the `generate` function to create 
fingerprints for person and organization names. It shows that the library normalizes inputs internally, so equivalent spellings of a name yield identical fingerprints."},"warnings":[{"fix":"Pass the text to fingerprint (typically an entity name) as the first positional argument. If other attributes such as country or birth date must contribute to a match key, combine them with the fingerprint in your own code.","message":"`generate` takes a single text value, not a set of attribute keys. Its parameters are the input string plus optional flags such as `keep_order`. Passing attributes such as `name=`, `country=`, or `id_number=` as keyword arguments raises a `TypeError`, and entities that share a name but differ in other attributes receive the same fingerprint.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Use fingerprints as keys for *exact* comparison after normalization. For fuzzy matching or similarity detection, apply other libraries or algorithms (e.g., Levenshtein distance, Jaccard similarity).","message":"Fingerprints are normalized strings, not fuzzy matches. `generate` returns a simplified form of the input (lowercased, with tokens sorted by default), not a cryptographic hash. It is not designed for fuzzy matching (e.g., matching 'John Doe' to 'Jon Doh'). Differences that normalization does not cover, such as typos, result in different fingerprints.","severity":"gotcha","affected_versions":"All versions"},{"fix":"In production environments that require stable fingerprints, pin the exact version of `normality` used in your project, and re-test `fingerprints` output whenever `normality` is updated.","message":"Dependency on `normality` version and its behavior. The `fingerprints` library relies heavily on `normality` for data cleaning and standardization. 
Changes, bug fixes, or new normalization rules in newer versions of `normality` can subtly alter how text is processed, so the same input may yield different fingerprints across `normality` versions or environments.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Install the library using pip: `pip install fingerprints`","cause":"The `fingerprints` library is not installed in your Python environment or is not accessible from the current path.","error":"ModuleNotFoundError: No module named 'fingerprints'"},{"fix":"Pass the text to fingerprint as the first argument, for example: `from fingerprints import generate; generate(\"Example Entity\")`.","cause":"The `generate` function was called without its required text argument. It expects the string to fingerprint as the first positional argument (e.g., `generate(\"Alice\")`).","error":"TypeError: generate() missing 1 required positional argument: 'text'"},{"fix":"Remove the `country` keyword and pass only the text to fingerprint, e.g., `generate(\"Example Entity\")`. Handle country codes separately in your own code.","cause":"`generate` has no `country` parameter. It accepts the text to fingerprint and a small set of optional flags, so any other keyword argument is rejected.","error":"TypeError: generate() got an unexpected keyword argument 'country'"}]}