Python Lex & Yacc

3.11 · verified Tue May 12 · auth: no · python · install: verified · quickstart: verified · abandoned

PLY (Python Lex-Yacc) is a pure-Python implementation of the lex and yacc tools commonly used to write parsers and compilers. It implements the LALR(1) parsing algorithm and offers extensive error reporting and diagnostic information. While version 3.11 is the latest stable release, the project's author announced its abandonment on December 21, 2025, with no further maintenance expected.

pip install ply

Common errors

error WARNING: N shift/reduce conflicts

cause The grammar is ambiguous (or not LALR(1)): in some state the parser cannot decide whether to shift the next input token onto the stack or reduce a grammar rule using symbols already on the stack. PLY resolves these conflicts in favor of shifting by default, but warns because that may not be the intended parse for your grammar.

fix

Define operator precedence and associativity with a `precedence` tuple in your parser module, e.g. `precedence = (('left', 'PLUS', 'MINUS'), ('left', 'TIMES', 'DIVIDE'))`, or rewrite the ambiguous grammar rules.
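A minimal sketch of the precedence fix, assuming PLY 3.11 is importable as `ply`. The reduced token set and the `calc` helper are illustrative only. The `expression : expression OP expression` rules are ambiguous on their own; the `precedence` tuple tells the parser how to settle each shift/reduce decision. `debug=False` and `write_tables=False` are passed only so the example does not write `parser.out` or `parsetab.py`.

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ('NUMBER', 'PLUS', 'MINUS', 'TIMES')

t_PLUS = r'\+'
t_MINUS = r'-'
t_TIMES = r'\*'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    # Skip anything the token rules do not recognize.
    t.lexer.skip(1)

# Lowest precedence first; 'left' makes a - b - c group as (a - b) - c.
precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES'),
)

def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    else:
        p[0] = p[1] * p[3]

def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]

def p_error(p):
    raise SyntaxError(f"unexpected input: {p!r}")

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)

def calc(text):
    # Hypothetical helper: parse one expression and return its value.
    return parser.parse(text, lexer=lexer)

print(calc('2 + 3 * 4'))   # 14: TIMES binds tighter than PLUS
print(calc('10 - 4 - 3'))  # 3: MINUS is left-associative
```

Swapping the two rows of `precedence` would make `2 + 3 * 4` evaluate as `(2 + 3) * 4`, which is a quick way to confirm that the table, not the rule order, drives the result.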

error WARNING: N reduce/reduce conflicts

cause The grammar is ambiguous: in some state two or more grammar rules could be reduced for the same stack contents and lookahead token. PLY picks the rule that appears first in the grammar, which is rarely a deliberate choice.

fix

Rewrite the grammar to remove the conflict. Unlike shift/reduce conflicts, reduce/reduce conflicts generally cannot be resolved with precedence rules; they require restructuring the productions, typically by removing or merging rules that derive the same input in two different ways.
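As an illustration of that kind of restructuring, here is a small sketch assuming PLY 3.11 is importable as `ply`. The assignment grammar and token names are made up for the example. The conflicting form is shown only in a comment; the code builds the repaired grammar.

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ('ID', 'NUMBER', 'EQUALS', 'PLUS')

t_ID = r'[A-Za-z_][A-Za-z0-9_]*'
t_EQUALS = r'='
t_PLUS = r'\+'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

precedence = (('left', 'PLUS'),)

# Conflicting version (not built here):
#   assignment : ID EQUALS NUMBER
#              | ID EQUALS expression
# After "ID EQUALS NUMBER" the parser could reduce either by
# "assignment : ID EQUALS NUMBER" or by "expression : NUMBER".
# Dropping the redundant first alternative leaves a single path.
def p_assignment(p):
    'assignment : ID EQUALS expression'
    p[0] = (p[1], p[3])

def p_expression_plus(p):
    'expression : expression PLUS expression'
    p[0] = p[1] + p[3]

def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]

def p_error(p):
    raise SyntaxError(f"unexpected input: {p!r}")

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)

print(parser.parse('x = 3', lexer=lexer))          # ('x', 3)
print(parser.parse('total = 3 + 4', lexer=lexer))  # ('total', 7)
```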

error Syntax error at token %s

cause The input does not conform to the grammar: the parser received a token that no rule allows at that position.

fix

Define a `p_error(p)` function in your parser module to report syntax errors with diagnostic information, e.g. `print(f"Syntax error at {p.value!r}")`. Note that `p` is `None` when the error occurs at end of input, so guard against that before reading `p.value`. Then check your grammar and lexer rules to confirm they recognize the expected input.

error SyntaxError: unexpected EOF while parsing

cause The input ended prematurely: the parser needed more tokens to complete a grammar rule. In PLY this shows up as `p_error` being called with `None` as the token argument.

fix

Ensure the input is complete and well-formed according to the grammar. In `p_error`, check `if p is None:` explicitly to handle unexpected end of input and report an appropriate message.
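The two syntax-error cases above can be handled in one `p_error`. The sketch below assumes PLY 3.11 is importable as `ply`. The one-rule grammar, the `errors` list and the `check` helper are illustrative, and messages are collected rather than printed so they are easy to inspect.

```python
import ply.lex as lex
import ply.yacc as yacc

tokens = ('NUMBER', 'PLUS')

t_PLUS = r'\+'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

def p_sum(p):
    'sum : NUMBER PLUS NUMBER'
    p[0] = p[1] + p[3]

errors = []  # illustrative: collected instead of printed

def p_error(p):
    if p is None:
        # PLY passes None when the input ends before a rule is complete.
        errors.append('unexpected end of input')
    else:
        errors.append(
            f'syntax error at {p.value!r} (type {p.type}, position {p.lexpos})'
        )

lexer = lex.lex()
parser = yacc.yacc(debug=False, write_tables=False)

def check(text):
    """Parse text and return (result, error messages) for that call."""
    errors.clear()
    result = parser.parse(text, lexer=lexer)
    return result, list(errors)

print(check('1 + 2'))  # valid input, no messages
print(check('1 +'))    # input ends early, p_error receives None
print(check('1 + +'))  # second '+' is not allowed at that position
```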

error Illegal character '%s'

cause The lexer encountered a character that does not match any token regular expression and is not listed in `t_ignore`.

fix

Define a `t_error(t)` function in your lexer module to handle illegal characters. It should typically report the error, e.g. `print(f"Illegal character '{t.value[0]}' at line {t.lineno}")`, and then call `t.lexer.skip(1)` to skip the offending character and continue lexing.
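A minimal sketch of that handler, assuming PLY 3.11 is importable as `ply`. The token set and the `illegal` list are illustrative. Inside `t_error`, `t.value` holds the remaining input, so `t.value[0]` is the offending character.

```python
import ply.lex as lex

tokens = ('NUMBER', 'PLUS')

t_PLUS = r'\+'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

illegal = []  # illustrative: (character, line) pairs instead of printing

def t_error(t):
    illegal.append((t.value[0], t.lineno))
    t.lexer.skip(1)  # move past the bad character and keep lexing

lexer = lex.lex()
lexer.input('3 $ 4\n+ ? 5')

values = [tok.value for tok in lexer]
print(values)   # [3, 4, '+', 5]
print(illegal)  # [('$', 1), ('?', 2)]
```

Without the `skip(1)` call the lexer position would not advance, and PLY raises a `LexError` instead of continuing.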

Warnings

breaking The author officially announced the abandonment of the PLY project on December 21, 2025. No further maintenance is expected. Users are advised to consider other parsing libraries or vendor PLY into their projects.

fix Migrate to an alternative parsing library (e.g., SLY, Lark) or copy the PLY source code directly into your project if continued use is necessary, understanding it will not receive updates or bug fixes.

breaking Later development of PLY (PLY 4.0, the GitHub version) requires Python 3.6 or greater and drops Python 2 support entirely. The 3.11 release on PyPI is the last that still runs on Python 2 (2.6+) as well as Python 3.

fix Run your project on Python 3.6 or newer. A legacy Python 2.x project can stay on the 3.11 release, with the understanding that it will receive no updates, or migrate to Python 3.

gotcha PLY reads lexer regular expressions and parser grammar rules from function docstrings. When Python runs with `-OO`, docstrings are discarded (plain `-O` only removes assertions), which breaks PLY's introspection-based rule discovery.

fix Pass `optimize=1` to `lex.lex()` and `yacc.yacc()`, and run the program once in normal mode first so that PLY writes the `lextab.py` and `parsetab.py` table files. Later runs load the rules from those files instead of from docstrings, which keeps PLY working when docstrings are stripped.
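To see why docstring-based rules disappear, the following standard-library sketch (no PLY required) runs the same tiny rule function under different interpreter flags and prints its `__doc__`. The `SNIPPET` string and the `doc_under` helper are illustrative names.

```python
import subprocess
import sys

# A PLY-style token rule whose regex lives in the docstring.
SNIPPET = r'''
def t_NUMBER(t):
    r"\d+"
    return t
print(t_NUMBER.__doc__)
'''

def doc_under(flags):
    """Run SNIPPET in a child interpreter with the given flags; return its output."""
    completed = subprocess.run(
        [sys.executable, *flags, '-c', SNIPPET],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.strip()

print(doc_under([]))       # \d+
print(doc_under(['-O']))   # \d+  (assertions removed, docstrings kept)
print(doc_under(['-OO']))  # None (docstrings discarded)
```

Under `-OO` the function still exists, but the regular expression PLY would read from it is gone, which is the situation the pre-generated table files work around.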

gotcha The author strongly recommends against `pip install ply` and suggests copying the `ply` directory directly into your project (vendoring) instead. PLY is a specialized, zero-dependency package, and the author does not want it to be a link in a software supply chain that could be disrupted by occasional breaking changes or by the project's abandonment.

fix Instead of `pip install ply`, manually copy the `ply` directory (containing `lex.py` and `yacc.py`) from the PLY GitHub repository into your project's source tree. Then, import using `from myproject.ply import lex` or `from ply import lex` if copied directly to your project root.

deprecated Using module-level functions like `lex.input()` and `lex.token()` directly (without first creating a lexer object) is discouraged and may be removed in future versions. These functions operate on the last lexer created, which can lead to unexpected behavior in applications with multiple lexers or complex control flow.

fix Always create an explicit lexer object (e.g., `my_lexer = lex.lex()`) and then call methods on that object (e.g., `my_lexer.input(data)`, `my_lexer.token()`).
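A short sketch of working with explicit lexer objects, assuming PLY 3.11 is importable as `ply`. The token set is illustrative. It uses `clone()`, a method on PLY lexer objects that returns a copy with its own input and position, to show that two lexers can be advanced independently.

```python
import ply.lex as lex

tokens = ('NUMBER', 'NAME')

t_NAME = r'[A-Za-z_]+'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_error(t):
    t.lexer.skip(1)

base = lex.lex()

# Each clone keeps its own input string and scan position.
first = base.clone()
second = base.clone()

first.input('1 2 3')
second.input('a b')

# Interleave calls: neither lexer disturbs the other.
seen = [
    first.token().value,
    second.token().value,
    first.token().value,
    second.token().value,
]
print(seen)  # [1, 'a', 2, 'b']
```

With the module-level `lex.input()` and `lex.token()` functions, both inputs would go through whichever lexer was built last, so this kind of interleaving is not possible.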

Install compatibility verified last tested: 2026-05-12

| python | os / libc | status | wheel | install | import | disk |
|---|---|---|---|---|---|---|
| 3.10 | alpine (musl) | | wheel | - | 0.02s | 18.1M |
| 3.10 | alpine (musl) | | - | - | 0.02s | 18.1M |
| 3.10 | slim (glibc) | | wheel | 1.5s | 0.01s | 19M |
| 3.10 | slim (glibc) | | - | - | 0.01s | 19M |
| 3.11 | alpine (musl) | | wheel | - | 0.03s | 20.0M |
| 3.11 | alpine (musl) | | - | - | 0.04s | 20.0M |
| 3.11 | slim (glibc) | | wheel | 1.6s | 0.03s | 21M |
| 3.11 | slim (glibc) | | - | - | 0.03s | 21M |
| 3.12 | alpine (musl) | | wheel | - | 0.03s | 11.9M |
| 3.12 | alpine (musl) | | - | - | 0.04s | 11.9M |
| 3.12 | slim (glibc) | | wheel | 1.5s | 0.04s | 12M |
| 3.12 | slim (glibc) | | - | - | 0.04s | 12M |
| 3.13 | alpine (musl) | | wheel | - | 0.05s | 11.6M |
| 3.13 | alpine (musl) | | - | - | 0.03s | 11.5M |
| 3.13 | slim (glibc) | | wheel | 1.5s | 0.04s | 12M |
| 3.13 | slim (glibc) | | - | - | 0.03s | 12M |
| 3.9 | alpine (musl) | | wheel | - | 0.01s | 17.6M |
| 3.9 | alpine (musl) | | - | - | 0.02s | 17.6M |
| 3.9 | slim (glibc) | | wheel | 1.8s | 0.02s | 18M |
| 3.9 | slim (glibc) | | - | - | 0.01s | 18M |

Imports

lex
```
from ply import lex
```
yacc
```
from ply import yacc
```
Lexer
wrong
```
from ply.lex import Lexer
```
correct
```
import ply.lex as lex
lexer = lex.lex()  # builds the lexer from the rules defined in this module
```
Lexer object is typically instantiated via `lex.lex()` after defining rules, not directly imported as a class.
Parser
wrong
```
from ply.yacc import Parser
```
correct
```
import ply.yacc as yacc
parser = yacc.yacc()  # builds the parser from the grammar rules defined in this module
```
Parser object is typically instantiated via `yacc.yacc()` after defining grammar, not directly imported as a class.

Quickstart verified last tested: 2026-04-24

This quickstart demonstrates a simple calculator using PLY's lexer and parser. It defines tokens, regular expression rules for lexical analysis, and grammar rules for parsing. The lexer (`lex.lex()`) and parser (`yacc.yacc()`) are then built and used to process input.

import ply.lex as lex
import ply.yacc as yacc

# --- LEXER --- 

# List of token names.
tokens = (
    'NUMBER',
    'PLUS',
    'MINUS',
    'TIMES',
    'DIVIDE',
    'LPAREN',
    'RPAREN'
)

# Regular expression rules for simple tokens
t_PLUS    = r'\+'
t_MINUS   = r'-'
t_TIMES   = r'\*'
t_DIVIDE  = r'/'
t_LPAREN  = r'\('
t_RPAREN  = r'\)'

# A regular expression rule with some action code
def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

# Define a rule so we can track line numbers
def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

# A string containing ignored characters (spaces and tabs)
t_ignore  = ' \t'

# Error handling rule
def t_error(t):
    print(f"Illegal character '{t.value[0]}' at line {t.lexer.lineno}")
    t.lexer.skip(1)

# Build the lexer.
# PLY reads the token regexes from docstrings, which Python discards under -OO.
# To support that mode, build with optimize=1 during a normal run first so the
# table module is written, for example:
# lexer = lex.lex(optimize=1, lextab='lextab', outputdir='.')
lexer = lex.lex()

# --- PARSER --- 

# Precedence rules for the arithmetic operators
precedence = (
    ('left', 'PLUS', 'MINUS'),
    ('left', 'TIMES', 'DIVIDE'),
)

# Grammar rules for expressions
def p_expression_binop(p):
    '''expression : expression PLUS expression
                  | expression MINUS expression
                  | expression TIMES expression
                  | expression DIVIDE expression'''
    if p[2] == '+':
        p[0] = p[1] + p[3]
    elif p[2] == '-':
        p[0] = p[1] - p[3]
    elif p[2] == '*':
        p[0] = p[1] * p[3]
    elif p[2] == '/':
        if p[3] == 0:
            print("Division by zero!")
            p[0] = None # Or raise an error
        else:
            p[0] = p[1] / p[3]

def p_expression_group(p):
    'expression : LPAREN expression RPAREN'
    p[0] = p[2]

def p_expression_number(p):
    'expression : NUMBER'
    p[0] = p[1]

def p_error(p):
    if p:
        print(f"Syntax error at token '{p.type}' value '{p.value}' line {p.lineno}")
    else:
        print("Syntax error at EOF")

# Build the parser
parser = yacc.yacc()

# Test it out
while True:
    try:
        s = input('calc > ')
    except EOFError:
        break
    if not s:
        continue
    result = parser.parse(s)
    if result is not None:
        print(result)