Presidio Anonymizer

2.2.362 · active · verified Fri Apr 10

Presidio Anonymizer is a Python-based module designed for anonymizing detected Personally Identifiable Information (PII) entities in text. It offers a range of built-in operators (e.g., replace, mask, redact, hash, encrypt) and supports custom anonymization logic. It also includes deanonymization capabilities for reversible operations like decryption. The library is actively maintained by Microsoft, with frequent releases, and is currently at version 2.2.362.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to use `AnonymizerEngine` to anonymize text. It manually defines `RecognizerResult` objects, which would typically be generated by `presidio-analyzer`, and configures a different built-in operator for each entity type: `replace` for `PERSON` and `mask` for `PHONE_NUMBER`.

from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import RecognizerResult, OperatorConfig

# Sample text and mock analyzer results (typically from presidio-analyzer)
text = "My name is John Doe and my phone number is 123-456-7890."
analyzer_results = [
    RecognizerResult(entity_type="PERSON", start=11, end=19, score=0.9),
    RecognizerResult(entity_type="PHONE_NUMBER", start=43, end=55, score=0.8),
]

# Initialize the anonymizer engine
anonymizer = AnonymizerEngine()

# Define anonymization operators
# Here, PERSON will be replaced with "<PERSON>", and PHONE_NUMBER will be masked
operators = {
    "PERSON": OperatorConfig("replace", {"new_value": "<PERSON>"}),
    "PHONE_NUMBER": OperatorConfig("mask", {
        "masking_char": "*", 
        "chars_to_mask": 10, 
        "from_end": True
    })
}

# Perform anonymization
anonymized_result = anonymizer.anonymize(
    text=text,
    analyzer_results=analyzer_results,
    operators=operators
)

print(f"Original text: {text}")
print(f"Anonymized text: {anonymized_result.text}")
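Hand-written offsets are easy to get wrong, and the anonymizer acts on whatever span it is given. The stdlib-only sketch below derives the spans for the quickstart text instead of counting characters by hand. `find_span` is a hypothetical helper, not part of Presidio, and it assumes end-exclusive offsets that match Python slicing.

```python
from typing import Tuple


def find_span(text: str, value: str) -> Tuple[int, int]:
    """Return (start, end) of the first occurrence of value, end-exclusive."""
    start = text.find(value)
    if start == -1:
        raise ValueError(f"{value!r} not found in text")
    return start, start + len(value)


text = "My name is John Doe and my phone number is 123-456-7890."

person_span = find_span(text, "John Doe")
phone_span = find_span(text, "123-456-7890")

# Slicing with the computed offsets should give back the entity text.
print(person_span, text[person_span[0]:person_span[1]])
print(phone_span, text[phone_span[0]:phone_span[1]])
```

The returned pairs can be passed as `start` and `end` when building mock `RecognizerResult` objects for tests.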
