{"id":2666,"library":"presidio-anonymizer","title":"Presidio Anonymizer","description":"Presidio Anonymizer is a Python-based module designed for anonymizing detected Personally Identifiable Information (PII) entities in text. It offers a range of built-in operators (e.g., replace, mask, redact, hash, encrypt) and supports custom anonymization logic. It also includes deanonymization capabilities for reversible operations like decryption. The library is actively maintained by Microsoft, with frequent releases, and is currently at version 2.2.362.","status":"active","version":"2.2.362","language":"en","source_language":"en","source_url":"https://github.com/Microsoft/presidio","tags":["privacy","anonymization","data-protection","PII","NLP","security","Microsoft"],"install":[{"cmd":"pip install presidio-anonymizer","lang":"bash","label":"Install core anonymizer"},{"cmd":"pip install presidio-analyzer \"spacy[en]\"\npython -m spacy download en_core_web_lg","lang":"bash","label":"Recommended for full PII detection & anonymization"}],"dependencies":[{"reason":"Required Python version.","package":"python","version":">=3.10,<4.0","optional":false},{"reason":"Often used in conjunction for PII detection before anonymization. Provides the RecognizerResult objects needed by AnonymizerEngine.","package":"presidio-analyzer","optional":true},{"reason":"Required by presidio-analyzer for NLP capabilities and language models (e.g., en_core_web_lg) to detect PII.","package":"spacy","optional":true}],"imports":[{"symbol":"AnonymizerEngine","correct":"from presidio_anonymizer import AnonymizerEngine"},{"note":"While presidio-analyzer also has a RecognizerResult, it's a separate type. 
For direct use with AnonymizerEngine, import from `presidio_anonymizer.entities` to avoid mypy type errors.","wrong":"from presidio_analyzer.recognizer_result import RecognizerResult","symbol":"RecognizerResult","correct":"from presidio_anonymizer.entities import RecognizerResult"},{"symbol":"OperatorConfig","correct":"from presidio_anonymizer.entities import OperatorConfig"}],"quickstart":{"code":"from presidio_anonymizer import AnonymizerEngine\nfrom presidio_anonymizer.entities import RecognizerResult, OperatorConfig\n\n# Sample text and mock analyzer results (typically from presidio-analyzer)\n# start/end are character offsets into `text` (end is exclusive)\ntext = \"My name is John Doe and my phone number is 123-456-7890.\"\nanalyzer_results = [\n    RecognizerResult(entity_type=\"PERSON\", start=11, end=19, score=0.9),\n    RecognizerResult(entity_type=\"PHONE_NUMBER\", start=43, end=55, score=0.8),\n]\n\n# Initialize the anonymizer engine\nanonymizer = AnonymizerEngine()\n\n# Define anonymization operators\n# Here, PERSON will be replaced with \"<PERSON>\", and the last 10 characters of PHONE_NUMBER will be masked\noperators = {\n    \"PERSON\": OperatorConfig(\"replace\", {\"new_value\": \"<PERSON>\"}),\n    \"PHONE_NUMBER\": OperatorConfig(\"mask\", {\n        \"masking_char\": \"*\",\n        \"chars_to_mask\": 10,\n        \"from_end\": True\n    })\n}\n\n# Perform anonymization\nanonymized_result = anonymizer.anonymize(\n    text=text,\n    analyzer_results=analyzer_results,\n    operators=operators\n)\n\nprint(f\"Original text: {text}\")\nprint(f\"Anonymized text: {anonymized_result.text}\")","lang":"python","description":"This quickstart demonstrates how to use `AnonymizerEngine` to anonymize text. It manually defines `RecognizerResult` objects, which would typically be generated by `presidio-analyzer`, and applies a different operator to each entity type."},"warnings":[{"fix":"If referential integrity (same hash for the same value) is required, you must explicitly provide a consistent 'salt' parameter to the 'hash' operator config. 
Use a secure method to generate and store this salt.","message":"The default behavior of the 'hash' operator changed in version 2.2.361. It now uses a random salt by default for enhanced security, which means the same PII value will yield different hashes across calls or entities. This breaks referential integrity unless a salt is explicitly provided.","severity":"breaking","affected_versions":">=2.2.361"},{"fix":"Always import `RecognizerResult` specifically from `presidio_anonymizer.entities` when passing results to `AnonymizerEngine`. You may need to cast or convert `presidio-analyzer`'s `RecognizerResult` objects if you generate them from `AnalyzerEngine` and need to satisfy strict type checking.","message":"When using `presidio-analyzer` and `presidio-anonymizer` together in a typed Python environment, `mypy` might report type errors due to `RecognizerResult` existing in both packages with incompatible types. The `AnonymizerEngine` expects `RecognizerResult` from `presidio_anonymizer.entities`.","severity":"gotcha","affected_versions":">=2.2.354"},{"fix":"If specific country-specific recognizers are required, ensure they are explicitly enabled or configured within your `presidio-analyzer` setup.","message":"Starting from version 2.2.359, many country-specific recognizers (e.g., SgFinRecognizer, AuAbnRecognizer) are disabled by default to prevent false positives when they are not explicitly needed. Users expecting these to work out-of-the-box might find them inactive.","severity":"gotcha","affected_versions":">=2.2.359"},{"fix":"Upgrade to version 2.2.362 or later. Ensure thorough testing of anonymization output, especially for texts with repeated PII entities separated by whitespace.","message":"In versions prior to 2.2.362, `AnonymizerEngine` could fail to anonymize all instances of an entity if multiple identical entities were separated only by spaces, potentially leading to PII leakage (e.g., 'email1@example.com email2@example.com'). 
A fix was implemented in 2.2.362.","severity":"gotcha","affected_versions":"<2.2.362"},{"fix":"For a complete solution, install `presidio-analyzer` and a spaCy language model using `pip install presidio-analyzer \"spacy[en]\"` and `python -m spacy download en_core_web_lg`.","message":"While `presidio-anonymizer` is a standalone package, a complete PII detection and anonymization pipeline typically requires `presidio-analyzer` for detection and an underlying NLP engine (like spaCy with a language model such as `en_core_web_lg`). Not installing these dependencies will prevent the full workflow from functioning.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}