Node.js Stream Concatenation


The `stream-concat` library provides a simple, efficient mechanism for concatenating multiple Node.js readable streams into a single readable output stream. Currently at version `2.0.0`, it is actively maintained, though no specific release cadence is defined. Its core differentiator is that it accepts streams in two ways: an array of pre-existing streams, or, more efficiently, a user-supplied function that returns the next stream on demand. This on-the-fly stream creation significantly reduces memory consumption and improves performance with large datasets or many files, because the streams are not all buffered simultaneously. The library builds on Node.js's `Transform` stream and exposes its underlying options, such as `highWaterMark` and `objectMode`, for fine-grained control over stream behavior.

error TypeError: StreamConcat is not a constructor
cause Attempting to use `import StreamConcat from 'stream-concat'` in an ES Modules (ESM) context without proper CJS-ESM interop.
fix `stream-concat` is published as CommonJS. In CJS code, load it with `const StreamConcat = require('stream-concat');`. In an ESM file, which has no built-in `require`, create one with `createRequire` from `node:module` rather than assuming the default import resolves to the constructor.
breaking Versions `2.0.0` and newer require Node.js `v12` or higher. Older Node.js runtimes (`v0.12` to `v10`) are only supported by `stream-concat` versions prior to `1.0.0`. Ensure your Node.js runtime meets the `engines` requirement.
fix Upgrade your Node.js environment to version 12 or newer, or use an older `stream-concat` version if you are constrained to an older Node.js runtime.
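A minimal runtime guard (illustrative, not part of the library) can fail fast with a clear message instead of an obscure error later:

```javascript
// Illustrative guard: check the running Node.js major version against
// the engines requirement of stream-concat >= 2.0.0.
const [major] = process.versions.node.split('.').map(Number);
if (major < 12) {
  throw new Error(`stream-concat 2.x requires Node 12+, running ${process.version}`);
}
console.log(`Node ${process.version} satisfies the engines requirement`);
```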
gotcha When concatenating many or very large streams, providing an array of all streams directly to the `StreamConcat` constructor can lead to high memory consumption and reduced performance. All streams' read queues will be buffered simultaneously.
fix Instead of providing an array of streams, pass a function to the `StreamConcat` constructor that returns the next stream (or a Promise resolving to a stream) only when it's needed. This allows the library to manage buffering sequentially.
npm install stream-concat
yarn add stream-concat
pnpm add stream-concat

This example demonstrates how to concatenate multiple files efficiently using a `nextStream` function. This approach defers opening new streams until they are needed, optimizing memory usage compared to providing an array of all streams upfront.

const fs = require('fs');
const StreamConcat = require('stream-concat');

// Create some dummy files for demonstration
fs.writeFileSync('file1.csv', 'header1,data1a\nheader2,data1b\n');
fs.writeFileSync('file2.csv', 'header3,data2a\nheader4,data2b\n');
fs.writeFileSync('file3.csv', 'header5,data3a\nheader6,data3b\n');

const fileNames = ['file1.csv', 'file2.csv', 'file3.csv'];
let fileIndex = 0;

// Use a function to defer stream creation, reducing memory footprint
const nextStream = () => {
  if (fileIndex === fileNames.length) {
    return null;
  }
  console.log(`Reading: ${fileNames[fileIndex]}`);
  return fs.createReadStream(fileNames[fileIndex++]);
};

const output = fs.createWriteStream('combined.csv');

const combinedStream = new StreamConcat(nextStream);

// pipe() does not forward source errors to the destination,
// so listen on the concatenated stream as well
combinedStream.on('error', (err) => {
  console.error('Error reading input streams:', err);
});

combinedStream.pipe(output)
  .on('finish', () => {
    console.log('All files combined into combined.csv');
    // Clean up the dummy input files
    fileNames.forEach(file => fs.unlinkSync(file));
  })
  .on('error', (err) => {
    console.error('Error writing combined.csv:', err);
  });