PEG.js Parser Generator
PEG.js is a parser generator for JavaScript that produces fast parsers based on the Parsing Expression Grammar (PEG) formalism. Developers define a grammar for a custom language or complex data format, and PEG.js generates a JavaScript parser from it. The latest stable version, 0.10.0, was released in August 2016. Development of PEG.js itself has largely ceased; its successor, Peggy.js (npm: `peggy`), maintains API compatibility and is actively developed. Key differentiators of PEG.js include its simple, expressive grammar syntax, excellent error reporting, and the ability to combine lexical and syntactic analysis in a single grammar. It can be used programmatically via a JavaScript API or through a command-line interface, and it can emit parsers in several module formats: CommonJS (the command-line default), AMD, UMD, or a global variable.
Common errors
- TypeError: pegjs.buildParser is not a function
  cause: Calling `buildParser`, which was renamed to `generate` in v0.10.0 and no longer exists under the old name.
  fix: Change `pegjs.buildParser(...)` to `pegjs.generate(...)`.
- Error: Infinite loop detected in grammar.
  cause: The grammar contains a construct that could make the generated parser loop forever. This check was introduced in v0.9.0.
  fix: Review the grammar for left recursion and for repetition (`*`, `+`) of an expression that can match without consuming input. Restructure the rules so that every match makes progress.
- ReferenceError: PEG is not defined
  cause: In a browser environment, the global API name changed from `PEG` to `peg` in v0.10.0, or the UMD/global output format was not correctly integrated.
  fix: Use `peg` instead of `PEG` for the global variable. Ensure the generated parser script is included correctly and the format option (e.g., `globals`) is set appropriately during generation.
- SyntaxError: Expected [rule description] but [found token] found.
  cause: The input string does not conform to the grammar at the reported location.
  fix: Debug the input string and the grammar. Use the error's `location` property (`start` and `end`, each with `offset`, `line`, and `column`) to pinpoint the exact mismatch. Consider adding more robust error handling or `error()` calls within grammar actions.
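The `location` data on a parse error can be turned into a readable message. The following is a minimal sketch, not part of PEG.js: `formatParseError` is a hypothetical helper, and `fakeError` is a hand-built stand-in (including its message text) for an error thrown by `parser.parse()`, assuming the `location.start` shape with 1-based `line` and `column` that PEG.js 0.10.0 reports.

```javascript
// Hypothetical helper: formats a PEG.js-style SyntaxError as "line:column: message"
// followed by the offending source line and a caret under the failing column.
// Assumes err.location.start carries 1-based `line` and `column` fields.
function formatParseError(err, input) {
  // Errors not thrown by the parser may lack `location`; fall back to the plain message.
  if (!err || !err.location || !err.location.start) {
    return String(err && err.message ? err.message : err);
  }
  const { line, column } = err.location.start;
  const sourceLine = input.split(/\r?\n/)[line - 1] || '';
  const caret = ' '.repeat(Math.max(column - 1, 0)) + '^';
  return `${line}:${column}: ${err.message}\n${sourceLine}\n${caret}`;
}

// Hand-built stand-in for an error thrown by parser.parse(); illustrative only.
const fakeError = {
  message: 'Expected "(" or [0-9] but "*" found.',
  location: {
    start: { offset: 5, line: 1, column: 6 },
    end: { offset: 6, line: 1, column: 7 }
  }
};

console.log(formatParseError(fakeError, '10 + * 5'));
```

In real use, the helper would be called from the `catch` block around `parser.parse(input)`, passing the caught error and the same input string.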
Warnings
- breaking In v0.10.0, the primary API function `pegjs.buildParser` was renamed to `pegjs.generate`. Code using the old function name will break. Additionally, the global variable name for browser use changed from `PEG` to `peg`.
- breaking In v0.8.0, all internal identifiers in generated parser code were prefixed with `peg$` to discourage their direct use and avoid conflicts. Code that directly accessed non-prefixed internal variables within generated parsers will likely break.
- gotcha v0.10.0 introduced the `format` option for choosing the module format of generated parser source. The command-line tool defaults to CommonJS, but in the JavaScript API the option only takes effect when `output` is `'source'`. Specifying `format` explicitly (e.g., `format: 'commonjs'`, `format: 'umd'`, `format: 'globals'`) is clearer and recommended, especially if distributing the generated parser.
- gotcha PEG.js v0.10.0 was released in August 2016 and is no longer actively maintained. For ongoing development, modern JavaScript features, and TypeScript support, consider migrating to its direct successor, Peggy.js (`npm install peggy`), which is API compatible and actively maintained.
- deprecated Tracing support, introduced in v0.9.0, was marked as experimental. Its API and behavior were subject to change, and users should not rely on its stability across minor versions.
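Code that must run against both pre-0.10.0 and 0.10.0 releases can pick the right function name at runtime. The sketch below is one possible approach, not something PEG.js provides: `resolveGenerate` is a hypothetical helper, and `legacyApi` and `currentApi` are stand-in objects that only mimic the two API shapes. A real call would pass the result of `require('pegjs')`.

```javascript
// Hypothetical helper: returns the parser-generating function from a PEG.js-like
// module object, preferring `generate` (0.10.0) and falling back to `buildParser`
// (earlier releases).
function resolveGenerate(api) {
  if (api && typeof api.generate === 'function') {
    return api.generate.bind(api);
  }
  if (api && typeof api.buildParser === 'function') {
    return api.buildParser.bind(api);
  }
  throw new TypeError('Object exposes neither generate() nor buildParser()');
}

// Stand-in objects mimicking the two API shapes; they do not parse anything.
const legacyApi = { buildParser: (grammar) => ({ source: 'buildParser', grammar }) };
const currentApi = { generate: (grammar) => ({ source: 'generate', grammar }) };

console.log(resolveGenerate(legacyApi)('start = "a"').source);
console.log(resolveGenerate(currentApi)('start = "a"').source);
```

Pinning the `pegjs` version in `package.json` avoids the need for such a shim; it is mainly useful in tooling that accepts whichever version the host project has installed.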
Install
- npm install pegjs
- yarn add pegjs
- pnpm add pegjs
Imports
- pegjs
  import pegjs from 'pegjs';
  const pegjs = require('pegjs');
- generate
  const parser = pegjs.buildParser(grammarString, options); // pre-0.10.0 name; fails in 0.10.0
  const parser = pegjs.generate(grammarString, options); // 0.10.0 name
- parser.parse
  const result = parser.parse('input string');
Quickstart
const pegjs = require('pegjs');

// Define a simple grammar for arithmetic expressions.
// Backslashes are doubled so the grammar text contains \t, \n, \r as escape
// sequences rather than raw control characters inside the character class.
const grammar = `
  start = _ value:expression _ { return value; }

  expression = head:term tail:(_ ('+' / '-') _ term)* {
    // Each tail element is [whitespace, operator, whitespace, operand]
    return tail.reduce((acc, element) => {
      return element[1] === '+' ? acc + element[3] : acc - element[3];
    }, head);
  }

  term = head:factor tail:(_ ('*' / '/') _ factor)* {
    return tail.reduce((acc, element) => {
      return element[1] === '*' ? acc * element[3] : acc / element[3];
    }, head);
  }

  factor = number / '(' _ value:expression _ ')' { return value; }

  number = [0-9]+ { return parseInt(text(), 10); }

  _ "whitespace" = [ \\t\\n\\r]*
`;

// Generate the parser
try {
  // `format` is omitted here: it applies only when output is 'source'
  const parser = pegjs.generate(grammar, {
    output: 'parser',
    optimize: 'speed'
  });

  // Use the generated parser
  const input1 = "10 + 5 * (2 - 1)";
  const result1 = parser.parse(input1);
  console.log(`Input: "${input1}", Result: ${result1}`); // Expected: 15

  const input2 = "(100 / 2) - 15";
  const result2 = parser.parse(input2);
  console.log(`Input: "${input2}", Result: ${result2}`); // Expected: 35

  // Example of a parsing error
  const invalidInput = "10 + * 5";
  try {
    parser.parse(invalidInput);
  } catch (e) {
    console.error(`Error parsing "${invalidInput}": ${e.message}`);
  }
} catch (e) {
  console.error("Error generating parser: ", e.message);
}