{"id":1747,"library":"tokenize-rt","title":"tokenize-rt","description":"tokenize-rt is a Python library that provides a wrapper around the standard library's `tokenize` module, ensuring proper round-tripping of Python source code. It extends the standard token set with `ESCAPED_NL` and `UNIMPORTANT_WS` tokens, making it especially useful for refactoring tools that need to preserve whitespace and exact source representation. The library is actively maintained, with version 6.2.0 released on May 23, 2025, and generally follows a regular release cadence.","status":"active","version":"6.2.0","language":"en","source_language":"en","source_url":"https://github.com/asottile/tokenize-rt","tags":["tokenization","refactoring","AST","parser","python-source"],"install":[{"cmd":"pip install tokenize-rt","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core runtime dependency, requires Python >=3.9.","package":"python","optional":false}],"imports":[{"note":"Function to convert source text to a list of tokens.","symbol":"src_to_tokens","correct":"from tokenize_rt import src_to_tokens"},{"note":"Function to convert a list of tokens back to source text.","symbol":"tokens_to_src","correct":"from tokenize_rt import tokens_to_src"},{"note":"NamedTuple representing a token, with fields `offset`, `name`, and `src`.","symbol":"Token","correct":"from tokenize_rt import Token"},{"note":"Constant for the backslash-escaped newline token.","symbol":"ESCAPED_NL","correct":"from tokenize_rt import ESCAPED_NL"},{"note":"Constant for the unimportant whitespace token.","symbol":"UNIMPORTANT_WS","correct":"from tokenize_rt import UNIMPORTANT_WS"}],"quickstart":{"code":"from tokenize_rt import src_to_tokens, tokens_to_src\n\ndef roundtrip_code(code: str) -> str:\n    tokens = src_to_tokens(code)\n    # You can now inspect or modify 'tokens'\n    # For example, let's print them\n    for token in tokens:\n        print(f\"Token(name={token.name!r}, src={token.src!r}, offset={token.offset})\")\n\n    # Convert back to source\n    return tokens_to_src(tokens)\n\nexample_code = 'def foo(bar):\\n    if bar:  # a comment\\n        return \"hello world\"\\n'\n\nroundtripped_code = roundtrip_code(example_code)\nprint(\"\\nOriginal Code:\\n\", example_code)\nprint(\"\\nRoundtripped Code:\\n\", roundtripped_code)\nassert example_code == roundtripped_code","lang":"python","description":"This quickstart demonstrates the core round-tripping functionality of `tokenize-rt`. It tokenizes a given Python source string using `src_to_tokens`, then converts the token stream back into source using `tokens_to_src`, verifying that the output matches the input."},"warnings":[{"fix":"Be aware of the specific token types and normalizations added by `tokenize-rt` when porting code from or comparing with `stdlib.tokenize`.","message":"tokenize-rt intentionally introduces additional token types (`ESCAPED_NL`, `UNIMPORTANT_WS`) and normalizes certain aspects (e.g., string prefixes, Python 2 literals in Python 3). This means its token stream will differ from that produced by the standard library's `tokenize` module, and direct comparisons or expectations of identical token streams should be adjusted.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Read the file as text and pass the resulting string: `with open('your_file.py') as f: tokens = src_to_tokens(f.read())`.","message":"Unlike `stdlib.tokenize.tokenize`, which expects a callable `readline` function returning lines as *bytes*, `src_to_tokens` takes the complete source as a single `str`. Do not pass a file object or its `readline` method; read the file's text first and pass the string.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Test your application thoroughly across Python 3.11 and 3.12 if f-string tokenization details are critical. `tokenize-rt` generally handles this, but custom logic relying on specific token sequences might be affected.","message":"The underlying `stdlib.tokenize` module introduced a breaking change in f-string tokenization between Python 3.11 and Python 3.12 due to the formalization of PEP 701. While `tokenize-rt` aims to round-trip reliably, tools built on fine-grained f-string token introspection might need careful review when moving between these Python versions.","severity":"breaking","affected_versions":"Python 3.11 to 3.12 runtime environments (and tokenize-rt versions used within them)."}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}