Salesforce Apex Language Parser

2.17.0 verified Sun Apr 19 auth: no javascript

apex-parser is a JavaScript parser for the Salesforce Apex language, available as an NPM module for Node.js and as a Maven package for the JVM. It handles Apex classes and triggers along with inline SOQL (Salesforce Object Query Language) and SOSL (Salesforce Object Search Language) queries. It is built on an ANTLR4 grammar and returns a low-level parse tree; it performs no semantic analysis, leaving that to downstream tools. Because Apex is case-insensitive, the library supplies a custom input stream, `CaseInsensitiveInputStream`, so that the grammar matches keywords regardless of case. The current stable version is 2.17.0; releases typically bring bug fixes, support for new Apex/SOQL/SOSL features, and dependency updates.
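To illustrate why a custom input stream is needed for a case-insensitive language, here is a minimal sketch of the general technique. It is not apex-parser's actual `CaseInsensitiveInputStream`: the class `FoldingStream`, its methods `lookahead` and `textBetween`, and the helper `matchesKeyword` are hypothetical names invented for this example. The idea is that the lexer's lookahead sees case-folded characters, while text extraction still returns the source exactly as written.

```typescript
// Hypothetical illustration of a case-folding character stream.
// Not the apex-parser implementation; all names are invented for this sketch.
class FoldingStream {
  private readonly chars: string[];
  private pos = 0;

  constructor(source: string) {
    // Split by code point so surrogate pairs stay together.
    this.chars = Array.from(source);
  }

  // What a lexer would match against: the lower-cased character `offset`
  // positions ahead (1 = next character), or undefined at end of input.
  lookahead(offset: number): string | undefined {
    const ch = this.chars[this.pos + offset - 1];
    return ch === undefined ? undefined : ch.toLowerCase();
  }

  // Advance past the current character.
  consume(): void {
    if (this.pos < this.chars.length) this.pos++;
  }

  // What downstream tools see: the original text, inclusive of both ends.
  textBetween(start: number, stop: number): string {
    return this.chars.slice(start, stop + 1).join('');
  }
}

// Returns true if the stream's upcoming characters spell `keyword`, ignoring case.
function matchesKeyword(stream: FoldingStream, keyword: string): boolean {
  const expected = Array.from(keyword.toLowerCase());
  return expected.every((ch, i) => stream.lookahead(i + 1) === ch);
}

const foldingStream = new FoldingStream('PUBLIC Class Foo {}');
const isPublic = matchesKeyword(foldingStream, 'public');
const originalText = foldingStream.textBetween(0, 5);
console.log(isPublic, originalText);
```

Folding only the lookahead keeps the grammar simple (keywords are written once, in lower case) without losing the original spelling of identifiers and string literals.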

error Error: MismatchedTokenException: expecting 'class' got 'CLASS'
cause The input stream provided to the lexer/parser is case-sensitive, but Apex is a case-insensitive language.
fix
Wrap your input string in `new CaseInsensitiveInputStream(yourApexCode)` before creating the lexer.
error TypeError: Cannot read properties of undefined (reading 'CommonTokenStream') when importing from 'antlr4ts'
cause A version mismatch between `antlr4ts` installed directly in your project and the version `apex-parser` expects internally.
fix
Replace `import { CommonTokenStream } from 'antlr4ts';` with `import { CommonTokenStream } from 'apex-parser';`.
error SyntaxError: Unexpected token 'export' or 'import'
cause Attempting to use ES module `import` syntax in a CommonJS-only environment (e.g., older Node.js versions or misconfigured bundlers).
fix
Ensure your Node.js environment supports ES modules (Node.js >=12, using a `.mjs` extension or `"type": "module"` in `package.json`), or switch to CommonJS `require()` syntax: `const { ApexLexer } = require('apex-parser');`.
breaking The way character positions are counted changed. Before 2.12.0 (JVM) and 2.14.0 (Node), input was read through `ANTLRInputStream`, which produced UTF-16 character positions. From those versions on, input is read through `CharStream`, and positions count Unicode code points. The two schemes diverge only in source that contains characters outside the Basic Multilingual Plane (such as emoji), each of which occupies two UTF-16 code units. Tools that rely on precise character offsets may be affected.
fix Ensure your code correctly interprets character positions based on Unicode code points if upgrading from older versions. Test any existing tools that consume character offsets.
gotcha Apex and its embedded languages (SOQL/SOSL) are case-insensitive. Failing to use `CaseInsensitiveInputStream` will lead to incorrect parsing results as the parser expects a case-insensitive stream.
fix Always wrap your input string with `new CaseInsensitiveInputStream(yourCodeString)` before passing it to the lexer.
gotcha To avoid version conflicts with the underlying `antlr4ts` library, it is strongly recommended to import shared ANTLR components like `CommonTokenStream` and `ParseTreeWalker` directly from `apex-parser` rather than importing them from `antlr4ts`.
fix Change `import { CommonTokenStream } from 'antlr4ts';` to `import { CommonTokenStream } from 'apex-parser';`.
gotcha SOSL FIND clauses use different quoting characters when embedded within Apex (single quotes `'`) versus when used in the API format (braces `{}`). There are alternative parser rules (e.g., `soslLiteralAlt`) available to handle these differences.
fix Consult the `SOSLParserTest` examples or the grammar for specific rules to use when parsing SOSL FIND queries in different contexts.
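For tools affected by the change in character positions noted above, one option is to convert a code-point offset into a UTF-16 index before slicing a JavaScript string. The sketch below is plain TypeScript with no dependency on apex-parser. `codePointToUtf16Index` is a hypothetical helper name, and the code assumes the offset you hold is counted in Unicode code points.

```typescript
// Hypothetical helper: map an offset counted in Unicode code points
// to the equivalent index in a JavaScript (UTF-16) string.
function codePointToUtf16Index(text: string, codePointOffset: number): number {
  let utf16Index = 0;
  let remaining = codePointOffset;
  while (remaining > 0 && utf16Index < text.length) {
    const cp = text.codePointAt(utf16Index);
    // Code points above U+FFFF occupy two UTF-16 code units (a surrogate pair).
    utf16Index += cp !== undefined && cp > 0xffff ? 2 : 1;
    remaining--;
  }
  return utf16Index;
}

// Apex-like source with an emoji (U+1F600, outside the Basic Multilingual Plane).
const apexSource = "String s = '\u{1F600}'; Integer i;";

// Offset of "Integer" counted in code points, as a code-point-based stream would report it.
const codePointOffset = Array.from(apexSource).indexOf('I');

// The same location expressed as a UTF-16 index, suitable for String.prototype.slice.
const utf16Index = codePointToUtf16Index(apexSource, codePointOffset);
const sliced = apexSource.slice(utf16Index, utf16Index + 'Integer'.length);
console.log(codePointOffset, utf16Index, sliced);
```

Here the two offsets differ by one because a single emoji precedes the target; in source with only BMP characters they are identical and no conversion is needed.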
npm install apex-parser
yarn add apex-parser
pnpm add apex-parser

Demonstrates how to initialize the Apex Lexer and Parser, feed it Apex code, and obtain the root parse tree context for further traversal.

import { ApexLexer, CaseInsensitiveInputStream, CommonTokenStream, ApexParser } from 'apex-parser';

// The return type is inferred from parser.compilationUnit(), so callers keep full typing.
function parseApexCode(code: string) {
  const lexer = new ApexLexer(new CaseInsensitiveInputStream(code));
  const tokens = new CommonTokenStream(lexer);
  const parser = new ApexParser(tokens);
  
  // Optionally add error listeners for more robust error handling
  // parser.removeErrorListeners();
  // parser.addErrorListener(new MyErrorListener());

  // The root context for a class file is compilationUnit()
  const context = parser.compilationUnit();
  
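  // ANTLR recovers from syntax errors instead of throwing, so reaching this
  // point does not by itself mean the input was valid; check the error count.
  console.log(`Parsed with ${parser.numberOfSyntaxErrors} syntax error(s).`);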
  // You can traverse the parse tree from the 'context' object.
  // Example: Printing the tree structure (simplified, for illustration)
  // console.log(context.toStringTree(parser.ruleNames));
  
  return context;
}

const apexClassCode = `
public class MyExampleClass {
    public void myMethod(String param) {
        // Example SOQL query
        List<Account> accounts = [SELECT Id, Name FROM Account WHERE Name = :param];
        System.debug('Accounts found: ' + accounts.size());
    }
}
`;

const parseTree = parseApexCode(apexClassCode);
// In a real application, you would now use ANTLR4 visitors or listeners
// to traverse 'parseTree' and extract information or perform analysis.
console.log(`Root context type: ${parseTree.constructor.name}`);
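
As one way to start exploring the result, the sketch below walks any tree whose nodes expose `childCount` and `getChild(i)`, which is the shape ANTLR parse-tree nodes are assumed to have here, and tallies nodes by constructor name. `TreeNode`, `countNodeTypes`, and the mock classes are hypothetical and exist only so the example runs without the parser installed. With a real parse tree you would pass in the context returned by `parseApexCode`. For rule-specific logic, the generated listener or visitor is usually the better fit.

```typescript
// Minimal structural view of a parse-tree node (assumed shape; hypothetical name).
interface TreeNode {
  readonly childCount: number;
  getChild(i: number): TreeNode;
}

// Depth-first walk that counts nodes per constructor name.
function countNodeTypes(
  root: TreeNode,
  counts: Map<string, number> = new Map<string, number>()
): Map<string, number> {
  const name = root.constructor.name;
  counts.set(name, (counts.get(name) ?? 0) + 1);
  for (let i = 0; i < root.childCount; i++) {
    countNodeTypes(root.getChild(i), counts);
  }
  return counts;
}

// Mock nodes standing in for real parse-tree contexts (illustration only).
class MockNode implements TreeNode {
  private readonly children: TreeNode[];

  constructor(children: TreeNode[] = []) {
    this.children = children;
  }

  get childCount(): number {
    return this.children.length;
  }

  getChild(i: number): TreeNode {
    const child = this.children[i];
    if (child === undefined) throw new RangeError(`no child at index ${i}`);
    return child;
  }
}
class ClassDeclarationMock extends MockNode {}
class MethodDeclarationMock extends MockNode {}

const mockTree = new ClassDeclarationMock([
  new MethodDeclarationMock(),
  new MethodDeclarationMock(),
]);
const nodeCounts = countNodeTypes(mockTree);
console.log(Array.from(nodeCounts.entries()));
```

A tally like this is a quick sanity check that the tree has the expected shape before investing in a dedicated listener or visitor.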