Mbox File Parser for Node.js

2.0.0 · active · verified Tue Apr 21

The `node-mbox` library (current stable version 2.0.0) provides a fast, stream-based parser for mbox email archives in Node.js. It is designed to process large mbox files efficiently, reportedly handling a 1.5 GB file in about 20 seconds. The library parses only the mbox file structure: it emits each email message as a `Buffer` instance and does not attempt to parse the message content itself, a task typically left to companion libraries such as `mailparser`.

Version 2.0.0 introduces a completely new API that follows a more idiomatic Node.js stream approach and allows custom line splitters. The library's documentation notes that it is not 100% compliant with RFC 4155, which matters if strict RFC adherence is required. Major versions appear to bring significant API revisions.
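
To make the "structure only" idea concrete, here is a minimal sketch of what splitting an mbox archive into per-message Buffers involves. It is not `node-mbox`'s implementation: `splitMbox` is a hypothetical helper that works on an in-memory Buffer, assumes LF line endings, and treats every line beginning with `From ` as a message separator (mbox writers normally escape such lines in message bodies as `>From `).

```javascript
// Illustrative only: splits an in-memory mbox Buffer into per-message Buffers.
// splitMbox is a hypothetical helper, not part of node-mbox.
function splitMbox(mbox) {
  const SEPARATOR = Buffer.from('From ');
  const starts = [];
  let lineStart = 0;

  // Walk the buffer line by line and record every line that begins with "From ".
  while (lineStart < mbox.length) {
    if (mbox.subarray(lineStart, lineStart + SEPARATOR.length).equals(SEPARATOR)) {
      starts.push(lineStart);
    }
    const newline = mbox.indexOf(0x0a, lineStart); // next "\n" byte
    if (newline === -1) break;
    lineStart = newline + 1;
  }

  // Each message runs from its "From " line up to the next one (or end of input).
  return starts.map((start, i) => mbox.subarray(start, starts[i + 1] ?? mbox.length));
}

const sample = Buffer.from(
  'From MAILER-DAEMON Mon Apr 18 10:00:00 2022\n' +
  'Subject: Test Email 1\n\n' +
  'This is the body of test email 1.\n\n' +
  'From MAILER-DAEMON Mon Apr 18 10:01:00 2022\n' +
  'Subject: Test Email 2\n\n' +
  'This is the body of test email 2.\n'
);

const messages = splitMbox(sample);
console.log('messages found:', messages.length);
```

A streaming parser does the same work incrementally, which is why it can handle archives far larger than available memory; this sketch loads everything at once purely for clarity.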

Quickstart

This quickstart demonstrates three ways to feed an mbox file into `node-mbox`: piping a file read stream into an `Mbox` instance, piping the stream through a custom line splitter first, and wrapping the stream with `MboxStream`, which uses the default splitter. It then shows how to listen for the 'data', 'error', and 'finish' events to process the parsed email messages, each of which is delivered as a `Buffer`.

import { Mbox, MboxStream } from 'node-mbox';
import fs from 'fs';
import split from 'line-stream';

// 1. Pipe a file read stream into a new Mbox instance
const mboxFromFile = new Mbox();
fs.createReadStream('test/test-4-message.mbox').pipe(mboxFromFile);

// 2. Pass it a stream and use a custom line splitter
const mailboxStream = fs.createReadStream('test/test-4-message.mbox');
const splitter = split('\n');
const mboxFromCustomStream = mailboxStream.pipe(splitter).pipe(new Mbox());

// 3. Pass it a stream and use the default line splitter (same as #2 without explicit splitter)
const mboxFromDefaultStream = MboxStream(fs.createReadStream('test/test-4-message.mbox'), { includeMboxHeader: false });

const activeMbox = mboxFromDefaultStream; // Choose one for demonstration

activeMbox.on('data', function(msg) {
  // `msg` is a `Buffer` instance
  console.log('Got a message (first 100 chars):', msg.toString().substring(0, 100) + '...');
});

activeMbox.on('error', function(err) {
  console.log('Got an error:', err);
});

activeMbox.on('finish', function() {
  console.log('Done reading mbox file.');
});

// The examples above read test/test-4-message.mbox; normally you would point them at real mbox data.
// For quick testing, create a small two-message sample file first.
// printf is used because plain echo does not expand \n in every shell:
// mkdir -p test && printf 'From MAILER-DAEMON Mon Apr 18 10:00:00 2022\nSubject: Test Email 1\n\nThis is the body of test email 1.\n\nFrom MAILER-DAEMON Mon Apr 18 10:01:00 2022\nSubject: Test Email 2\n\nThis is the body of test email 2.\n' > test/test-4-message.mbox
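
Because each message arrives as a raw `Buffer`, a 'data' handler often only needs a few header fields rather than a full parse. The following is a rough sketch under stated assumptions: `summarizeMessage` is a hypothetical helper (not part of `node-mbox`), it assumes LF line endings, and it ignores folded headers and MIME encoding. For anything beyond a quick summary, a full parser such as `mailparser` is the better choice.

```javascript
// Hypothetical helper: summarise one mbox message Buffer without full MIME parsing.
function summarizeMessage(msg) {
  // Headers end at the first blank line; fall back to the whole message if none.
  const headerEnd = msg.indexOf('\n\n');
  const head = (headerEnd === -1 ? msg : msg.subarray(0, headerEnd)).toString('utf8');
  const lines = head.split('\n');

  // The mbox "From " separator line may or may not be present in the Buffer.
  const envelope = lines[0].startsWith('From ') ? lines.shift() : null;

  // Find the Subject header case-insensitively (folded headers are ignored).
  const subjectLine = lines.find((line) => /^subject:/i.test(line));
  const subject = subjectLine ? subjectLine.slice('subject:'.length).trim() : null;

  return {
    envelope,
    subject,
    // Bytes after the blank line that separates headers from body.
    bodyBytes: headerEnd === -1 ? 0 : msg.length - (headerEnd + 2),
  };
}

const sampleMessage = Buffer.from(
  'From MAILER-DAEMON Mon Apr 18 10:00:00 2022\n' +
  'Subject: Test Email 1\n\n' +
  'This is the body of test email 1.\n'
);

const summary = summarizeMessage(sampleMessage);
console.log(summary);
```

In the quickstart, this could be called from the handler as `activeMbox.on('data', (msg) => console.log(summarizeMessage(msg)))`, which avoids converting the entire message to a string just to preview it.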
