py-walk
py-walk is a Python library that filters filesystem paths using .gitignore-style patterns. It aims for 100% compatibility with Git's wildmatch pattern syntax, which makes it useful for applications that need to mimic Git's file exclusion behavior. The current version is 0.3.3. The project is actively maintained, with a slow but steady release cadence that includes recent patch updates.
Warnings
- gotcha Do not confuse `py-walk` with `os.walk` from Python's standard library. Both traverse directory trees, but `py-walk` filters paths against `.gitignore`-style patterns and iterates only over the paths those rules select. `os.walk` applies no filtering: it yields a `(dirpath, dirnames, filenames)` tuple for every directory in the tree.
- gotcha While `py-walk` aims for 100% compatibility with Git's `wildmatch` pattern syntax, the project's README encourages reporting any divergences. This suggests that in some complex or edge-case scenarios, its behavior might not perfectly mirror `git check-ignore`.
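One way to look for such divergences is to ask Git itself about a path and compare the answer with what `py-walk` reports. The sketch below uses only the standard library and the `git` command line. The helper name `git_ignores` is hypothetical and not part of `py-walk`. It relies on the documented exit codes of `git check-ignore -q`: 0 when the path is ignored, 1 when it is not.

```python
import subprocess


def git_ignores(repo_dir, path):
    """Return True if Git ignores `path` inside the repository at `repo_dir`.

    Hypothetical helper for cross-checking; requires `git` on PATH.
    `path` is interpreted relative to `repo_dir` because of `-C`.
    """
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "check-ignore", "-q", "--", str(path)],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:  # path is ignored
        return True
    if result.returncode == 1:  # path is not ignored
        return False
    # Any other exit code signals an error (for example, not a repository).
    raise RuntimeError(f"git check-ignore failed: {result.stderr.strip()}")
```

Running this for each path of interest and comparing the result with the parser's `match` result for the same pattern file gives a short list of candidate divergences to report upstream.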
Install
pip install py-walk
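To confirm which version ended up installed, the standard library's `importlib.metadata` can be queried. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the same name used in the `pip` command, `py-walk`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="py-walk"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version()
print(f"py-walk version: {found or 'not installed'}")
```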
Imports
- walk
from py_walk import walk
- get_parser_from_text
from py_walk import get_parser_from_text
- get_parser_from_file
from py_walk import get_parser_from_file
Quickstart
import os
from py_walk import walk, get_parser_from_text
# Create a dummy directory structure for demonstration
for subdir in ('data', 'src', '__pycache__'):
    os.makedirs(os.path.join('my_project', subdir), exist_ok=True)
with open('my_project/data/temp.bin', 'w') as f:
    f.write('binary data')
with open('my_project/data/foo.bin', 'w') as f:
    f.write('important binary data')
with open('my_project/src/main.py', 'w') as f:
    f.write('print("Hello")')
with open('my_project/__pycache__/cache.pyc', 'w') as f:
    f.write('cache')
with open('my_project/.env', 'w') as f:
    f.write('ENV_VAR=value')
patterns = """
**/data/*.bin
!**/data/foo.bin
# exclude python cache and env files
__pycache__/
*.py[cod]
.env
"""
# Example 1: Walk a directory with patterns
print("\n--- Walking 'my_project' with patterns ---")
for path in walk('my_project', ignore=patterns):
    print(f"Included: {path}")
# Example 2: Manually check paths against a parser
print("\n--- Manually checking paths ---")
parser = get_parser_from_text(patterns, base_dir='my_project')
paths_to_check = [
    'my_project/data/temp.bin',
    'my_project/data/foo.bin',
    'my_project/src/main.py',
    'my_project/__pycache__/cache.pyc',
    'my_project/.env',
    'my_project/README.md',  # This file doesn't exist, but the pattern can still be checked
]
for p in paths_to_check:
    if parser.match(p):
        print(f"'{p}' IS ignored")
    else:
        print(f"'{p}' is NOT ignored")
# Cleanup dummy directory (optional)
import shutil
# shutil.rmtree('my_project') # Uncomment to clean up
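For contrast with the filtered walk above, the following sketch shows what the standard library's `os.walk` returns for a similar tree: every file, with no pattern filtering. It uses only the standard library. The helper name `list_all_files` and the temporary directory layout are illustrative assumptions.

```python
import os
import tempfile


def list_all_files(root):
    """Return every file under `root` as a sorted list of relative paths."""
    found = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(found)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "data"))
    os.makedirs(os.path.join(root, "__pycache__"))
    for rel in ("data/temp.bin", "__pycache__/cache.pyc", ".env"):
        with open(os.path.join(root, rel), "w") as f:
            f.write("x")
    all_files = list_all_files(root)

# Cache and env files are all present: nothing has been excluded.
print(all_files)
```

Any exclusion logic on top of `os.walk` has to be written by hand, which is the work `py-walk` takes over when Git-compatible patterns are needed.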