Seekable Bzip2 Decoder
seek-bzip is a pure-JavaScript Node.js module (version 2.0.0, last published over 6 years ago) designed for random-access decoding of bzip2 data. Its primary differentiator is the ability to seek to and decompress individual blocks within a bzip2 file, provided an external index. It mainly operates synchronously, decoding buffers into other buffers. While it offers experimental support for Node.js streams, this functionality relies on the `fibers` package, which is deprecated and largely incompatible with modern Node.js versions (v16+), severely limiting its utility in current environments. The package also includes a `seek-bzip-table` tool for generating the necessary block indices.
Common errors
- TypeError: Bunzip.decode is not a function
  cause: Incorrect import statement (e.g., attempting to use ES module syntax like `import { decode } from 'seek-bzip'`), or calling a method on an undefined `Bunzip` object after a failed `require`.
  fix: Use the CommonJS `require` syntax: `const Bunzip = require('seek-bzip');`. Verify that `Bunzip` is correctly assigned before calling its methods.
- Error: Decoded size does not match expected size
  cause: The `expectedSize` parameter (second argument to `decode` or `decodeBlock`) was provided as a `Number`, but the actual size of the decompressed data did not match it. This can also indicate corrupted bzip2 data.
  fix: Either omit `expectedSize` if the size is unknown or non-critical, or make sure it equals the uncompressed data's byte length. Validate the integrity of your bzip2 source file.
- TypeError: input.seek is not a function
  cause: `Bunzip.decodeBlock` was given a custom stream object that does not implement the required `seek` method.
  fix: Add a `seek(position)` method that sets the stream's read position to your custom input object, or pass a `Buffer` to `decodeBlock` instead of a stream.
- TypeError: Cannot read properties of undefined (reading 'readByte'), or similar errors related to stream methods
  cause: The input passed to `Bunzip.decode` or `Bunzip.decodeBlock` is neither a `Buffer` nor a valid custom stream object implementing `readByte`. This often happens when the input is `undefined`, malformed, or of the wrong type.
  fix: Make sure the `input` argument is either a Node.js `Buffer` containing the compressed bzip2 data or a custom object that implements a `readByte()` method to fetch bytes sequentially from the source.
Warnings
- breaking The package's stream-based functionality relies on the `fibers` package, which is deprecated and largely incompatible with Node.js v16.0.0 and later. Attempting to use stream features will likely result in runtime errors.
- gotcha The `Bunzip.decodeBlock` method requires an 'out-of-band' index to specify the starting bit offset of the desired bzip2 block. This index is not automatically generated during compression by `seek-bzip` itself and must be provided externally.
- gotcha Input for `Bunzip.decode` and `Bunzip.decodeBlock` can be a `Buffer` or a custom 'stream' object. This custom stream object must implement specific methods (`readByte` for `decode`, `readByte` and `seek` for `decodeBlock`). Similarly, output can be a `Buffer` or a custom stream implementing `writeByte`.
- gotcha All core decoding operations (`Bunzip.decode`, `Bunzip.decodeBlock`, `Bunzip.table`) are synchronous. Processing large bzip2 files can block the Node.js event loop, leading to performance issues and unresponsiveness in applications.
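The custom stream objects mentioned above can be quite small. The sketch below wraps a `Buffer` in an object exposing `readByte` and `seek`, and collects output through `writeByte`. Only those three method names come from the text; the class names are hypothetical, and returning `-1` at end of input is an assumption about the expected end-of-stream convention.

```javascript
// Hypothetical adapter classes; only the method names come from the text above.
class BufferInput {
  constructor(buffer) {
    this.buffer = buffer;
    this.pos = 0;
  }

  // Return the next byte, or -1 at end of input (the -1 sentinel is an assumption).
  readByte() {
    return this.pos < this.buffer.length ? this.buffer[this.pos++] : -1;
  }

  // Move the read position to an absolute byte offset.
  seek(position) {
    if (!Number.isInteger(position) || position < 0 || position > this.buffer.length) {
      throw new RangeError(`seek position out of range: ${position}`);
    }
    this.pos = position;
  }
}

class ByteCollector {
  constructor() {
    this.bytes = [];
  }

  // Append one byte to the collected output.
  writeByte(byte) {
    this.bytes.push(byte & 0xff);
  }

  toBuffer() {
    return Buffer.from(this.bytes);
  }
}

// Exercise the adapters on their own: skip the first byte, copy the rest.
const input = new BufferInput(Buffer.from([10, 20, 30]));
const output = new ByteCollector();
input.seek(1);
for (let b = input.readByte(); b !== -1; b = input.readByte()) {
  output.writeByte(b);
}
console.log(output.toBuffer()); // <Buffer 14 1e>
```

Objects shaped like these would be passed wherever the decoder accepts a custom stream in place of a `Buffer`. Note that they do nothing about the synchronous-blocking concern, since every byte is still processed on the calling thread.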
Install
- npm install seek-bzip
- yarn add seek-bzip
- pnpm add seek-bzip
Imports
- Bunzip
  wrong: import Bunzip from 'seek-bzip';
  correct: const Bunzip = require('seek-bzip');
- Bunzip.decode
  wrong: import { decode } from 'seek-bzip';
  correct: const Bunzip = require('seek-bzip'); const decodedData = Bunzip.decode(compressedBuffer);
- Bunzip.decodeBlock
  wrong: import { decodeBlock } from 'seek-bzip';
  correct: const Bunzip = require('seek-bzip'); const blockData = Bunzip.decodeBlock(compressedBuffer, 123456);
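The `123456` passed to `decodeBlock` above stands for a block's starting bit offset, which has to come from an external index. Here is a minimal sketch of how such an index might be searched to find the block that holds a given uncompressed byte offset. The entry shape `{ bitOffset, size }` and both function names are hypothetical, and the numbers are made up for illustration.

```javascript
// Hypothetical index helpers; seek-bzip does not ship these.
// entries: [{ bitOffset, size }] in file order, where size is the block's
// uncompressed length in bytes.
function buildBlockIndex(entries) {
  let start = 0;
  return entries.map(({ bitOffset, size }) => {
    const block = { bitOffset, start, end: start + size };
    start = block.end;
    return block;
  });
}

// Binary search for the block containing an uncompressed byte offset.
function findBlock(index, offset) {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const block = index[mid];
    if (offset < block.start) {
      hi = mid - 1;
    } else if (offset >= block.end) {
      lo = mid + 1;
    } else {
      return { bitOffset: block.bitOffset, offsetInBlock: offset - block.start };
    }
  }
  return null; // offset lies outside the indexed data
}

const index = buildBlockIndex([
  { bitOffset: 32, size: 900000 },
  { bitOffset: 1234567, size: 900000 },
  { bitOffset: 2469134, size: 4321 },
]);
console.log(findBlock(index, 1000000));
// { bitOffset: 1234567, offsetInBlock: 100000 }
```

The returned `bitOffset` is the value that would go into the second argument of `Bunzip.decodeBlock`, and `offsetInBlock` says where the requested byte sits inside the decoded block.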
Quickstart
const Bunzip = require('seek-bzip');
const fs = require('fs');
const path = require('path');
// IMPORTANT: This package decodes bzip2. To truly test, you would need
// a valid `example.bz2` file created by a bzip2 compressor (e.g., `bzip2 -c original.txt > example.bz2`).
// For demonstration, we'll create a dummy file, but Bunzip.decode will fail if it's not actual bzip2.
const dummyCompressedPath = path.join(__dirname, 'dummy_example.bz2');
const dummyOriginalData = Buffer.from('BZ2h1A&SY99999999999', 'ascii'); // Not actual bzip2 data
fs.writeFileSync(dummyCompressedPath, dummyOriginalData);
const decompressedOutputPath = path.join(__dirname, 'decompressed_output.txt');
try {
  console.log(`Attempting to read dummy bzip2 data from: ${dummyCompressedPath}`);
  const compressedData = fs.readFileSync(dummyCompressedPath);
  console.log('Attempting to decode...');
  // This is expected to throw because dummyOriginalData is not valid bzip2.
  const data = Bunzip.decode(compressedData);
  fs.writeFileSync(decompressedOutputPath, data);
  console.log(`Decompressed data written to: ${decompressedOutputPath}`);
  console.log(`Decoded data length: ${data.length}`);
} catch (error) {
  console.error('Error during decompression:', error.message);
  console.warn('\nNOTE: The `dummy_example.bz2` created in this quickstart is NOT a real bzip2 file.');
  console.warn('`seek-bzip` requires actual bzip2 compressed data for successful decoding.');
  // Double quotes here so the single-quoted path inside the message does not end the string.
  console.warn("To test properly, create `example.bz2` with a bzip2 compressor, then replace `dummyCompressedPath` with `path.join(__dirname, 'example.bz2')`.");
} finally {
  // Remove the dummy input file whether or not decoding succeeded.
  if (fs.existsSync(dummyCompressedPath)) {
    fs.unlinkSync(dummyCompressedPath);
  }
  // Keep decompressed_output.txt for inspection if decoding succeeds;
  // uncomment to clean it up as well.
  // if (fs.existsSync(decompressedOutputPath)) {
  //   fs.unlinkSync(decompressedOutputPath);
  // }
}
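The quickstart decodes a whole file, while the package's distinguishing feature is block-level access. The sketch below shows how the two pieces might fit together: collect block positions with `Bunzip.table`, then decode one block with `Bunzip.decodeBlock`. The helper names are hypothetical, and the callback shape `(position, size)`, with `position` in bits and `size` in uncompressed bytes, is an assumption that should be checked against the package's README. The module is passed in as a parameter so the sketch runs against a stand-in object without the package installed.

```javascript
// Hypothetical helpers. `bunzip` is meant to be the module returned by
// require('seek-bzip'); it is injected so a stand-in can be used instead.
function collectBlockTable(bunzip, compressed) {
  const entries = [];
  // Assumed callback shape: (block start in bits, uncompressed size in bytes).
  bunzip.table(compressed, (position, size) => {
    entries.push({ bitOffset: position, size });
  });
  return entries;
}

// Decode only the n-th block listed in the table.
function decodeNthBlock(bunzip, compressed, table, n) {
  if (!Number.isInteger(n) || n < 0 || n >= table.length) {
    throw new RangeError(`block ${n} is not in the table`);
  }
  return bunzip.decodeBlock(compressed, table[n].bitOffset);
}

// Stand-in with the same method names, so the sketch runs on its own.
// It does no real decompression; the offsets and payloads are made up.
const fakeBunzip = {
  table(input, callback) {
    callback(32, 5);
    callback(400, 3);
  },
  decodeBlock(input, bitOffset) {
    return Buffer.from(bitOffset === 32 ? 'hello' : 'bye');
  },
};

const blockTable = collectBlockTable(fakeBunzip, Buffer.alloc(0));
console.log(blockTable.length); // 2
console.log(decodeNthBlock(fakeBunzip, Buffer.alloc(0), blockTable, 1).toString()); // bye
```

With the real package, `fakeBunzip` would be replaced by `require('seek-bzip')` and the empty buffers by the contents of a genuine `.bz2` file. Both calls remain synchronous, so large inputs would still block the event loop.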