Snowfakery
Snowfakery is a Python tool for generating fake data with relational integrity. Users describe the data in a YAML recipe, including the relationships between 'tables' (objects), and Snowfakery generates randomized records from it. Currently at version 4.2.1, it is actively developed, with frequent minor releases and occasional major releases that address Python compatibility and core features.
Common errors
-
ERROR: Package 'snowfakery' requires a different Python version: 3.11+ but you have 3.10.x
cause: Attempting to install or run Snowfakery v4.0.0+ on an unsupported Python version (3.10 or older).
fix: Upgrade your Python environment to 3.11 or newer, or install an older version of Snowfakery: `pip install 'snowfakery<4'`.
-
yaml.scanner.ScannerError: while scanning a simple key
cause: Incorrect YAML syntax, often due to improper indentation, missing colons, or invalid key structures.
fix: Review your Snowfakery YAML definition for syntax errors. Check indentation first, then confirm that every key ends with a colon. A YAML linter will catch most of these.
-
NameError: name 'my_custom_generator' is not defined
cause: A custom generator or function is referenced in the YAML but has not been registered or is not in scope.
fix: Define the custom function in a Snowfakery plugin and declare that plugin in the recipe with a `- plugin:` entry before the function is used. If the definition lives in another recipe file, make sure that file is included.
Warnings
- breaking Snowfakery v4.0.0 dropped support for Python versions 3.8, 3.9, and 3.10. Users must upgrade to Python 3.11 or newer to use Snowfakery 4.x.x.
- breaking With Snowfakery v3.0.0 and the `snowfakery_version: 3` declaration in YAML, formula outputs can be types other than strings. Previously, all formula outputs were implicitly converted to strings. This change affects how formulas interact with fields expecting non-string types.
- gotcha Snowfakery's core functionality relies heavily on its custom YAML schema. Incorrect indentation, misspelled keywords, or invalid generator/function usage are common sources of errors.
- gotcha Features like `find_record` and SObject Upserts/Updates are specifically designed for integration with Salesforce and often require the CumulusCI framework or specific Salesforce connection setup. They will not work out-of-the-box in a generic Python environment.
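The Python 3.11 requirement noted above can be checked before Snowfakery is imported, so a script fails with a clear message rather than an install or import error. The following is a minimal sketch using only the standard library; the names `MIN_PYTHON`, `python_is_supported`, and `require_supported_python` are hypothetical helpers for illustration, not part of Snowfakery.

```python
import sys

# Minimum interpreter version for Snowfakery 4.x, taken from the warning above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True if the given (or current) interpreter meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


def require_supported_python():
    """Raise RuntimeError with a remediation hint on an unsupported Python."""
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        needed = ".".join(str(part) for part in MIN_PYTHON)
        raise RuntimeError(
            f"Snowfakery 4.x needs Python {needed}+, found {found}. "
            "Upgrade Python or run: pip install 'snowfakery<4'"
        )
```

Calling `require_supported_python()` at the top of a script, before `from snowfakery.api import generate_data`, keeps the failure close to its cause.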
Install
-
pip install snowfakery
Imports
- generate_data
from snowfakery import generate_data
from snowfakery.api import generate_data
Quickstart
import io
from snowfakery.api import generate_data

# Define a simple Snowfakery recipe in YAML.
# Indentation is significant: fields nest under the object they belong to.
data_model_yaml = """
- object: Contact
  count: 2
  fields:
    FirstName:
      fake: FirstName
    LastName:
      fake: LastName
    # Formulas use ${{ }} and can refer to earlier fields of the same object
    Email: ${{FirstName}}.${{LastName}}@example.com
"""

# Use io.StringIO to simulate file input and output for a quick example
input_stream = io.StringIO(data_model_yaml)
output_stream = io.StringIO()

# Generate data
generate_data(
    yaml_file=input_stream,
    output_format="json",  # other text formats include "txt" and "sql"
    output_file=output_stream,
)

# Print the generated JSON data
print(output_stream.getvalue())