{"id":9795,"library":"greenery","title":"Greenery","description":"Greenery is a Python library for manipulating regular expressions by converting them into finite state machines (FSMs). It can test whether a string matches, enumerate matching strings, and compute unions, intersections, and differences of regular expressions. The current version is 4.2.2, and the project typically ships several releases per year.","status":"active","version":"4.2.2","language":"en","source_language":"en","source_url":"https://github.com/qntm/greenery","tags":["regex","regular expressions","finite state machine","fsm","pattern matching"],"install":[{"cmd":"pip install greenery","lang":"bash","label":"Install latest version"}],"dependencies":[],"imports":[{"note":"Used to parse a regular expression string into a greenery Pattern object.","symbol":"parse","correct":"from greenery import parse"},{"note":"Allows direct manipulation of finite state machine objects. In 4.x the class is named `Fsm`.","symbol":"Fsm","correct":"from greenery import Fsm"},{"note":"For constructing Pattern objects directly, though `parse()` is the more common entry point. In 4.x the class is exported from the top-level package; the `greenery.lego` module belongs to the 3.x API.","symbol":"Pattern","correct":"from greenery import Pattern"}],"quickstart":{"code":"from greenery import parse\n\n# Parse a regular expression string into a Pattern object\nre_pattern = parse('a(b|c)*d')\n\n# Check whether a string matches the pattern\nassert re_pattern.matches('ad')\nassert re_pattern.matches('abd')\nassert re_pattern.matches('abbcd')\nassert not re_pattern.matches('aed')\n\n# strings() returns a generator over the strings matched by the pattern.\n# 'a(b|c)*d' matches infinitely many strings, so never exhaust it; take a bounded sample.\nstrings_generator = re_pattern.strings()\n\n# Get a few strings (it 
can be an infinite generator)\nsome_strings = [next(strings_generator) for _ in range(5)]\nprint(f\"Some strings matched by 'a(b|c)*d': {some_strings}\")\n\n# Demonstrate intersection of two patterns\npattern1 = parse('a.*b')\npattern2 = parse('axb')\n\n# `&` intersects two patterns; reduce() simplifies the resulting pattern\nintersection_pattern = (pattern1 & pattern2).reduce()\nprint(f\"Intersection of 'a.*b' and 'axb': {intersection_pattern}\")\nassert intersection_pattern.matches('axb')","lang":"python","description":"This quickstart parses a regular expression string into a Greenery Pattern object, checks whether strings match it, and samples a few of the strings it generates. It also shows a basic pattern intersection using the `&` operator followed by `reduce()`."},"warnings":[{"fix":"Explicitly pass the `language` argument, typically a `frozenset` of characters. For combining FSMs, use `fsm1.alphabet | fsm2.alphabet` to ensure all relevant characters are included. Example: `fsm1.intersection(fsm2, language=fsm1.alphabet | fsm2.alphabet)`.","message":"Version 4.0.0 introduced a mandatory `language` parameter for many core FSM methods (e.g., `FSM.intersection`, `FSM.union`, `Pattern.strings`, `Pattern.matches`). Code written for versions < 4.0.0 will raise `TypeError` if these methods are called without the `language` argument.","severity":"breaking","affected_versions":"4.0.0+"},{"fix":"Always define and pass an explicit `language` (a `frozenset` of characters) that accurately reflects the intended alphabet of your regular expression, particularly for operations like intersection or when generating strings. For example, `language=frozenset('abc')` or `language=frozenset(string.ascii_letters + string.digits)`.","message":"The `language` parameter fundamentally affects the behavior of many operations. 
If not explicitly specified, it defaults to a `frozenset()` of characters seen so far, which can lead to unexpected results, especially when dealing with character classes (e.g., `[0-9]`) or when expecting an infinite language. Always consider the full character set your regex is intended to operate within.","severity":"gotcha","affected_versions":"4.0.0+"},{"fix":"Before iterating, check `if my_pattern.is_finite():` to confirm if the language is finite. If not, or if you only need a sample, use `itertools.islice` to limit the number of strings fetched: `import itertools; some_strings = list(itertools.islice(my_pattern.strings(), 100))`.","message":"`Pattern.strings()` can yield an infinite number of strings if the regular expression matches an infinite language (e.g., `a*`). Iterating over it without a limit or explicitly checking `Pattern.is_finite()` can lead to an infinite loop, consuming all system resources.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-17T00:00:00.000Z","next_check":"2026-07-16T00:00:00.000Z","problems":[{"fix":"Update method calls to explicitly include the `language` argument, which should be a `frozenset` of characters. Example: `(pattern1 & pattern2).reduce(language=pattern1.alphabet | pattern2.alphabet)`.","cause":"Attempting to use FSM or Pattern methods like `intersection()`, `union()`, `difference()`, or `reduce()` from `greenery` version 4.0.0 or later without providing the mandatory `language` argument.","error":"TypeError: intersection() missing 1 required positional argument: 'language'"},{"fix":"Consult the official `greenery` documentation or GitHub repository for the equivalent modern method. For `matches_any`, the `Pattern.matches()` method is now used for individual string matching.","cause":"Using an API method (like `matches_any`, `is_null`, etc.) that was deprecated, renamed, or removed in a major version upgrade, likely version 4.x. 
The API underwent significant changes.","error":"AttributeError: 'Pattern' object has no attribute 'matches_any'"},{"fix":"Ensure that all FSMs involved in an operation (like union or intersection) share a consistent and sufficiently broad `language` (alphabet) that encompasses all characters used by both. Provide this unified `language` explicitly when creating the FSMs or performing the operation.","cause":"This error can occur during FSM operations if the alphabets of the involved FSMs are not compatible, often when one FSM's alphabet is not a subset of another's, especially if they were created with different or implied `language` arguments.","error":"ValueError: not a subset"}]}