tokenize-rt

6.2.0 · active · verified Thu Apr 09

tokenize-rt is a Python library that provides a wrapper around the standard library's `tokenize` module, ensuring proper round-tripping of Python source code. It extends the standard token set with `ESCAPED_NL` and `UNIMPORTANT_WS` tokens, making it especially useful for refactoring tools that need to preserve whitespace and exact source representation. The library is actively maintained, with version 6.2.0 released on May 23, 2025, and generally follows a regular release cadence.

Warnings

Install

pip install tokenize-rt

Imports

from tokenize_rt import Offset, Token, src_to_tokens, tokens_to_src

Quickstart

This quickstart demonstrates the core round-tripping functionality of `tokenize-rt`. It tokenizes a given Python source string using `src_to_tokens`, then converts the token stream back into source using `tokens_to_src`, verifying that the output matches the input.

from tokenize_rt import src_to_tokens, tokens_to_src

def roundtrip_code(code: str) -> str:
    tokens = src_to_tokens(code)
    # You can now inspect or modify 'tokens'
    # For example, let's print them
    for token in tokens:
        print(f"Token(name={token.name!r}, src={token.src!r}, line={token.offset.line}, offset={token.offset.utf8_byte_offset})")
    
    # Convert back to source
    return tokens_to_src(tokens)

example_code = '''\
def foo(bar):
    if bar:  # a comment
        return "hello world"
'''

roundtripped_code = roundtrip_code(example_code)
print("\nOriginal Code:\n", example_code)
print("\nRoundtripped Code:\n", roundtripped_code)
assert example_code == roundtripped_code
