Javalang: Pure Python Java Parser
Javalang is a pure Python library designed for working with Java source code. It provides a lexer and parser specifically targeting Java 8, with its implementation based on the Java Language Specification. The library's current version is 0.13.0, released on March 28, 2020.
Common errors
- javalang.tokenizer.LexerError: Unknown token: ('var', <position>)
  Cause: Attempting to parse Java code that uses the `var` keyword (introduced in Java 10) or other post-Java 8 language features.
  Fix: Rewrite the Java code to be compatible with Java 8 syntax, or use a parsing library that supports newer Java versions. If you must use `javalang`, make sure the input is strictly Java 8 compliant.
- javalang.parser.JavaSyntaxError: Unexpected token 'extends'
  Cause: Parsing a newer Java feature such as records (Java 16) or sealed classes (Java 17), which rely on keywords or syntax that the Java 8-targeted parser does not recognize.
  Fix: Restrict Java input to Java 8 syntax. `javalang` does not support newer Java features like sealed classes or records.
- AttributeError: 'NoneType' object has no attribute 'name'
  Cause: This typically occurs while traversing the AST when an expected node or attribute is `None`. The AST for a given Java construct may not have the shape you anticipated, so the code ends up reading an attribute from a child that is absent.
  Fix: Check for `None` before accessing attributes of AST nodes, especially for optional elements of Java syntax (for example, `tree.package` is `None` if no package is declared). Also make sure the input Java code is syntactically valid and complete for `javalang`. Example: `if tree.package: print(tree.package.name)`.
Warnings
- breaking: Javalang explicitly targets Java 8. Attempting to parse Java code that uses newer language features (e.g., the `var` keyword from Java 10, `switch` expressions from Java 14, text blocks from Java 15, sealed classes from Java 17) will likely result in `LexerError` or `JavaSyntaxError`.
- gotcha: The `javalang.parse.parse()` function requires a complete compilation unit (a full, valid Java source file). It cannot parse isolated code snippets such as a single method, statement, or declaration without an enclosing type declaration. A package declaration, by contrast, is optional.
- deprecated: The `javalang` library has not seen a new release since March 2020. It still works for Java 8, but active development appears to have ceased, so new features, bug fixes, or compatibility updates for recent Python or Java versions are unlikely.
Install
- pip install javalang
Imports
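- javalang.parse.parse
  import javalang
  tree = javalang.parse.parse(java_code)
- javalang.tree.CompilationUnit
  from javalang import tree  # javalang.parse.parse() returns an instance of tree.CompilationUnit
- javalang.tree.Node
  from javalang import tree  # tree.Node is the base class of AST nodes, useful when traversing
  Note: `import javalang.ast.Node` is not valid Python because `Node` is a class, not a module. Use `from javalang.ast import Node` if you want the class directly.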
Quickstart
import javalang

java_code = """
package com.example;

public class MyClass {
    // Simple field
    int myField = 10;

    public static void main(String[] args) {
        System.out.println("Hello, Javalang!");
    }
}
"""

try:
    tree = javalang.parse.parse(java_code)
    print(f"Parsed package: {tree.package.name}")
    print(f"Parsed class: {tree.types[0].name}")
    # Iterate through nodes to find methods and fields
    for path, node in tree:
        if isinstance(node, javalang.tree.MethodDeclaration):
            print(f"  Method found: {node.name}")
        elif isinstance(node, javalang.tree.FieldDeclaration):
            print(f"  Field found: {node.declarators[0].name}")
except javalang.tokenizer.LexerError as e:
    print(f"Lexer Error: {e}")
except javalang.parser.JavaSyntaxError as e:
    print(f"Syntax Error: {e}")
except Exception as e:
    print(f"An unexpected error occurred: {e}")