EMR Validator

1.0.2 · active · verified Thu Apr 16

EMR Validator (`emrvalidator`) is a Python library for validating healthcare data. Validation rules are defined in an Excel-based schema and applied to data files such as CSV. The current version is 1.0.2. The project is actively maintained, with minor releases for bug fixes and enhancements.
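
To make the idea of schema-driven rules concrete, here is a minimal standalone sketch of how a mandatory-field rule and an integer-type rule might be applied to CSV rows. It uses only the Python standard library and is not `emrvalidator`'s implementation. The `check_row` helper and the layout of `RULES` are hypothetical, with rule names loosely borrowed from the schema columns in the quickstart.

```python
import csv
import io

# Hypothetical in-memory rules, mirroring the "Data Type" and "Is Mandatory"
# schema columns from the quickstart. Not emrvalidator's internal format.
RULES = {
    "PatientID": {"type": "STRING", "mandatory": True},
    "Age": {"type": "INTEGER", "mandatory": True},
    "AdmissionDate": {"type": "DATETIME", "mandatory": False},
}


def check_row(row, rules):
    """Return a list of human-readable problems found in one CSV row."""
    problems = []
    for column, rule in rules.items():
        value = (row.get(column) or "").strip()
        if not value:
            if rule["mandatory"]:
                problems.append(f"{column}: missing mandatory value")
            continue  # nothing more to check on an empty field
        if rule["type"] == "INTEGER":
            try:
                int(value)
            except ValueError:
                problems.append(f"{column}: {value!r} is not an integer")
    return problems


SAMPLE_CSV = """PatientID,Age,AdmissionDate
P001,30,2023-01-15
P002,abc,
,25,2024-03-20
"""

# Map 1-based data row numbers to the problems found in that row.
report = {}
for number, row in enumerate(csv.DictReader(io.StringIO(SAMPLE_CSV)), start=1):
    problems = check_row(row, RULES)
    if problems:
        report[number] = problems

for number, problems in report.items():
    print(number, problems)
```

Keeping the rules as data rather than code is what lets a schema file drive the checks: adding a column or tightening a rule changes the schema, not the validation logic.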

Quickstart

This quickstart sets up and runs a basic validation with `EMRValidator`. To keep the example runnable, it first writes a dummy schema (`registry_schema.xlsx`) and a dummy data file (`registry_data.csv`) to the system temporary directory. It then initializes `EMRValidator` with those paths, runs the validation, and prints the summary, the invalid records, and the validated records. In a real application, `schema_path` and `data_path` point to your own files.

import os
import tempfile

import pandas as pd
from emrvalidator import EMRValidator

# --- Dummy file creation for runnable example START ---
# In a real scenario, you would have these files pre-existing.
schema_data = {
    "Column Name": ["PatientID", "Name", "Age", "AdmissionDate"],
    "Data Type": ["STRING", "STRING", "INTEGER", "DATETIME"],
    "Is Mandatory": ["YES", "YES", "YES", "NO"],
    "Allowed Values": ["", "", "", ""],
    "Min Length": ["", "2", "0", ""],
    "Max Length": ["", "50", "120", ""],
    "Regex Pattern": ["", "", "", ""]
}
schema_df = pd.DataFrame(schema_data)

# Files go in the system temp directory for demonstration; replace with your actual file paths
temp_dir = tempfile.gettempdir()
schema_path = os.path.join(temp_dir, "registry_schema.xlsx")
data_path = os.path.join(temp_dir, "registry_data.csv")

# Writing the .xlsx schema with this engine requires the openpyxl package
with pd.ExcelWriter(schema_path, engine='openpyxl') as writer:
    schema_df.to_excel(writer, index=False, sheet_name='Sheet1')

data_csv_content = """PatientID,Name,Age,AdmissionDate
P001,Alice,30,2023-01-15
P002,Bob,25,
P003,Charlie,40,2024-03-20
"""
with open(data_path, 'w') as f:
    f.write(data_csv_content)
# --- Dummy file creation for runnable example END ---

# Initialize the EMRValidator
# Replace 'schema_path' and 'data_path' with your actual file paths
validator = EMRValidator(schema_path=schema_path, data_path=data_path)

# Run the validation
validation_result = validator.validate()

# Get summary of validation
summary = validator.get_summary()
print("Validation Summary:")
print(summary)

# Get invalid records
invalid_records = validator.get_invalid_records()
if not invalid_records.empty:
    print("\nInvalid Records:")
    print(invalid_records)
else:
    print("\nNo invalid records found.")

# Get validated records
validated_records = validator.get_validated_records()
if not validated_records.empty:
    print("\nValidated Records:")
    print(validated_records)

# Clean up temporary files (optional, for demonstration)
os.remove(schema_path)
os.remove(data_path)
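
The cleanup above is skipped if validation raises an exception, and fixed file names in the shared temp directory can collide between runs. One alternative, sketched here with only the standard library, is `tempfile.TemporaryDirectory`, which removes the directory and its contents when the `with` block exits. The `run_validation` function is a hypothetical stand-in for the `EMRValidator` calls in the quickstart, not part of the library.

```python
import os
import tempfile


def run_validation(data_path):
    """Hypothetical stand-in for the EMRValidator calls; counts data rows."""
    with open(data_path, encoding="utf-8") as f:
        return sum(1 for _ in f) - 1  # subtract the header line


with tempfile.TemporaryDirectory() as temp_dir:
    data_path = os.path.join(temp_dir, "registry_data.csv")
    with open(data_path, "w", encoding="utf-8") as f:
        f.write("PatientID,Name,Age,AdmissionDate\nP001,Alice,30,2023-01-15\n")
    row_count = run_validation(data_path)

# The directory and everything in it are gone once the block exits,
# even if run_validation had raised an exception.
print(row_count, os.path.exists(data_path))
```

Any result needed afterwards, such as a summary or the invalid records, has to be read inside the `with` block, since the files no longer exist once it exits.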
