PyYAML Include
PyYAML Include extends PyYAML with a mechanism for including other YAML files in a YAML document. It supports local files as well as remote files over HTTP, S3, SFTP, and other protocols through its integration with `fsspec`. The current stable version is 2.2. The project is actively developed, with releases at irregular intervals that add features and occasionally introduce breaking changes.
Warnings
- breaking Version 2.0 introduced significant breaking changes, including a change in the package's import namespace from `yamlinclude` to `yaml_include` and the core `YamlIncludeConstructor` class being renamed to `Constructor`. Code written for `v1.x` is not directly compatible with `v2.x`.
- breaking Future versions (e.g., v2.3 and beyond) are expected to drop support for Python 3.8 and below. v2.2 still supports Python >=3.8, so plan to upgrade your Python environment if you are still on 3.8.
- gotcha Using shell-style wildcards (`**`, `*`, `?`) in include paths can cause slow loads and high memory consumption, especially with large directory trees or remote filesystems. Every matched file is loaded fully into memory; nothing is loaded lazily.
- gotcha PyYAML's `yaml.load()` can execute arbitrary code when it is used with an unsafe loader on untrusted input. `pyyaml-include` examples often use `yaml.full_load()`; for input you do not control, prefer `yaml.safe_load()` or an explicit `yaml.SafeLoader`.
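The loader warning can be illustrated with plain PyYAML. The following is a minimal sketch, assuming only PyYAML is installed; the payload string is a made-up example of untrusted input, not something taken from the library's documentation.

```python
import yaml

# Hypothetical untrusted document: the Python-specific tag asks an unsafe
# loader to call os.getcwd() while the document is being constructed.
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError:
    # SafeLoader has no constructor for python/* tags, so it raises a
    # ConstructorError (a subclass of yaml.YAMLError) instead of executing.
    rejected = True

print("rejected by safe_load:", rejected)
```

The include tag can presumably be registered on `yaml.SafeLoader` in the same way as on `yaml.FullLoader`; treat that as an assumption to check against the library's own documentation.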
Install
- pip install "pyyaml-include"
- pip install "pyyaml-include" "fsspec[http,s3]"
Imports
- Constructor
from yaml_include import Constructor
- yaml_include
import yaml_include
Quickstart
import yaml
import yaml_include
import os
# Create dummy YAML files for the example
with open("database.yml", "w") as f:
    f.write("host: localhost\nport: 5432\nname: mydb\n")
with open("config.yml", "w") as f:
    f.write("database: !inc database.yml\napp:\n name: MyApp\n")
# Register the include tag with a YAML Loader
# FullLoader is used here for trusted local files; prefer yaml.SafeLoader for untrusted sources
yaml.add_constructor("!inc", yaml_include.Constructor(), yaml.FullLoader)
# Load the main YAML file
with open('config.yml', 'r') as f:
    data = yaml.full_load(f)
print(data)
# Expected output: {'database': {'host': 'localhost', 'port': 5432, 'name': 'mydb'}, 'app': {'name': 'MyApp'}}
# Clean up dummy files
os.remove("database.yml")
os.remove("config.yml")
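The quickstart includes a single file. When an include path uses wildcards instead, it can help to check how much a pattern matches before loading, since every matched file is read fully into memory. The sketch below uses only the standard library; the helper name `summarize_matches` and the sample files are hypothetical, and the standard `glob` module may not match exactly the same set of files as the library's own path resolution.

```python
import glob
import os
import tempfile
from typing import Tuple


def summarize_matches(pattern: str) -> Tuple[int, int]:
    """Return (file count, total size in bytes) for files matching a glob pattern."""
    paths = [p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p)]
    return len(paths), sum(os.path.getsize(p) for p in paths)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical sample files: two YAML files and one unrelated text file.
    for name in ("a.yml", "b.yml", "notes.txt"):
        with open(os.path.join(root, name), "w") as f:
            f.write("key: value\n")

    # Only the two .yml files should be counted.
    count, total_bytes = summarize_matches(os.path.join(root, "*.yml"))

print(f"{count} files, {total_bytes} bytes would be loaded")
```

A check like this is cheap on a local disk. For remote filesystems, listing matches has its own cost, so a narrower pattern is usually the better first step.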