Semantic Link Functions for Phone Numbers
This library integrates the `phonenumbers` package with Semantic Link to validate and enrich phone numbers in Microsoft Fabric DataFrames (`FabricDataFrame`). It operates on phone number columns and adds columns for validity, number type, and formatted representations, which supports data quality checks and analysis. The current version is 0.14.0; releases appear to track the broader `semantic-link-functions` project.
Warnings
- gotcha This library has peer dependencies on `phonenumbers` and `sempy`, which must be installed separately. If either is missing, imports fail at runtime with `ModuleNotFoundError` or a similar error.
- gotcha The library is primarily designed for use with `FabricDataFrame` within Microsoft Fabric. While it might accept `pandas.DataFrame` in some contexts, full integration, expected performance, and access to Fabric-specific features are achieved when working with `sempy.fabric.FabricDataFrame`.
- gotcha The `validate_phone_number` function offers optional parameters like `country_code`, `region_code`, and `strict`. The default validation behavior might not align with specific regional or strictness requirements without explicit configuration, potentially leading to unexpected validation results.
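The effect of a default region can be illustrated with the underlying `phonenumbers` package on its own. This is a minimal sketch, not the implementation of `validate_phone_number`: `to_e164` is a hypothetical helper, and it only shows how a missing region changes whether a national-format number can be parsed at all.

```python
from typing import Optional


def to_e164(raw: str, region: Optional[str] = None) -> Optional[str]:
    """Return the E.164 form of `raw`, or None if it cannot be parsed.

    Hypothetical helper for illustration; it parses but does not validate.
    """
    # Imported lazily so the error surfaces only when the helper is called
    # without the peer dependency installed.
    import phonenumbers

    try:
        # `region` is a two-letter region code such as "US"; with None,
        # only numbers written in international form ("+...") can be parsed.
        parsed = phonenumbers.parse(raw, region)
    except phonenumbers.NumberParseException:
        return None
    return phonenumbers.format_number(parsed, phonenumbers.PhoneNumberFormat.E164)
```

Under these assumptions, `to_e164("206-555-0100")` returns `None` because no region is available to resolve the country code, while `to_e164("206-555-0100", "US")` returns `"+12065550100"`. A successful parse is not the same as a valid number, so a validity check would still be needed on top of this.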
Install
- pip install semantic-link-functions-phonenumbers
- pip install phonenumbers sempy
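Because the peer dependencies are installed separately, it can help to confirm they are importable before running a notebook. A minimal sketch using only the standard library, assuming the import names `phonenumbers`, `sempy`, and `semantic_link_functions_phonenumbers` that appear on this page; `missing_modules` is a hypothetical helper, not part of the library:

```python
import importlib.util

# Top-level import names assumed from the Install and Quickstart sections.
REQUIRED_MODULES = ("phonenumbers", "sempy", "semantic_link_functions_phonenumbers")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing peer dependencies:", ", ".join(missing))
else:
    print("All peer dependencies are importable.")
```

The check only looks modules up without importing them, so it is cheap to run at the top of a notebook.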
Imports
- validate_phone_number
from semantic_link_functions_phonenumbers import validate_phone_number
Quickstart
import pandas as pd
from sempy.fabric import FabricDataFrame # sempy is a required dependency for FabricDataFrame
from semantic_link_functions_phonenumbers import validate_phone_number
# In a real Microsoft Fabric environment, a FabricDataFrame
# would typically be loaded from a Lakehouse table or similar source.
# For demonstration, we create one from a pandas DataFrame.
data = {'phone': ['+12065550100', '123-456-7890', 'invalid phone number']}
df = FabricDataFrame(pd.DataFrame(data))
# Validate phone numbers in the 'phone' column
# The function adds new columns prefixed with the source column name
# (e.g., 'phone_is_valid', 'phone_number_type', 'phone_e164_format')
df_validated = validate_phone_number(df, 'phone')
print("Original DataFrame:\n", df)
print("\nValidated DataFrame (selected columns):\n")
print(df_validated[['phone', 'phone_is_valid', 'phone_number_type', 'phone_e164_format']].head())
# Example of filtering invalid numbers
# print(df_validated[~df_validated['phone_is_valid']])
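The filtering step can be sketched with plain pandas. The DataFrame below is a stand-in for the validated output, with illustrative values that were not produced by the library, and the `phone_is_valid` column name is taken from the Quickstart. The one design choice is to treat a missing flag as invalid before negating the mask.

```python
import pandas as pd

# Stand-in for the output of validate_phone_number; values are illustrative only.
validated = pd.DataFrame(
    {
        "phone": ["+12065550100", "123-456-7890", "invalid phone number"],
        "phone_is_valid": [True, False, False],
    }
)

# Treat a missing flag as invalid so the mask can be negated safely.
is_valid = validated["phone_is_valid"].fillna(False).astype(bool)

valid_rows = validated[is_valid]
invalid_rows = validated[~is_valid]

print(f"{len(valid_rows)} valid, {len(invalid_rows)} invalid")
```

Keeping the invalid rows in their own frame makes it easy to review or export them instead of silently dropping them.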