Pelias Database Client
`pelias-dbclient` is a core Node.js module in the Pelias geocoder ecosystem. It provides a stream-based interface for bulk-inserting documents into Elasticsearch and serves as the final pipeline stage of Pelias import processes, turning mapped Pelias `Document` objects into Elasticsearch bulk operations. The current stable version is v3.3.0. Releases are feature-driven rather than on a fixed schedule; the v3.0.0 major version was released in March 2024. The module is tightly integrated with the Pelias data model, is built around streaming so that large imports do not have to fit in memory, and is open source as part of the broader Pelias open-data geocoding project. It uses the official `elasticsearch` client under the hood and is designed specifically for Node.js environments.
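To make "bulk operations" concrete, the sketch below shows the general shape of an Elasticsearch bulk request body: one action line followed by one source line per document. It is an illustration only, not how `pelias-dbclient` is implemented internally. The helper name `toBulkBody`, the record shape `{ _index, _id, data }`, and the index name `pelias` are assumptions made for the example.

```javascript
'use strict';

// Hypothetical helper (not part of pelias-dbclient): flattens mapped records
// into the alternating action/source array used by the Elasticsearch bulk API.
// The { _index, _id, data } record shape is an assumption for this sketch.
function toBulkBody(records) {
  const body = [];
  for (const record of records) {
    // Action line: which index and document id to write to.
    // No `_type` is included, since Elasticsearch 8 no longer accepts it.
    body.push({ index: { _index: record._index, _id: record._id } });
    // Source line: the document itself.
    body.push(record.data);
  }
  return body;
}

const body = toBulkBody([
  { _index: 'pelias', _id: 'sourceType:venue:1', data: { name: { default: 'Test Venue 1' } } },
  { _index: 'pelias', _id: 'sourceType:venue:2', data: { name: { default: 'Test Venue 2' } } }
]);

console.log(body.length); // 4 (two action lines and two source lines)
```

Batching many such pairs into a single request is what makes bulk insertion cheaper than indexing documents one at a time.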
Common errors
- TypeError: Cannot read properties of undefined (reading 'client') or Cannot read property 'indexName' of undefined
  cause The `pelias-config` module could not generate a valid configuration object; specifically, `esclient` or `schema.indexName` is missing.
  fix Verify that your Pelias configuration (`pelias.json` or environment variables) is set up correctly and accessible to the application, and that `pelias-config` can locate and parse it.
- ElasticsearchClientError: [es/search] 'type' is no longer supported
  cause A request used the `_type` field, which was deprecated in Elasticsearch v7 and removed in v8, and which `pelias-dbclient` v3.x no longer handles.
  fix Remove references to the `_type` field from your Pelias data models, import scripts, and custom code. Ensure `pelias-dbclient` is version 3.x or higher and Elasticsearch is v7 or newer.
- TypeError: dbclient is not a function
  cause The `pelias-dbclient` module is imported incorrectly, often by destructuring it as a named export when the module itself is the exported function.
  fix Use `const dbclient = require('pelias-dbclient');` for CommonJS or `import dbclient from 'pelias-dbclient';` for ESM. Do not use `{ dbclient }`.
- Error: 'pelias-dbclient' requires Node.js version >=10.0.0. You are running Node.js 8.x.x.
  cause `pelias-dbclient` is running on an unsupported Node.js version.
  fix Upgrade your Node.js environment to version 10.0.0 or higher. The recommended version for current `pelias-dbclient` releases is Node.js 16 or newer.
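The Node.js version error can be caught before an import starts. The following is a minimal sketch of such a preflight check. The helper `isSupportedNode` and the constant `MIN_MAJOR` are hypothetical names and are not provided by `pelias-dbclient`; the threshold of 10 is taken from the minimum version stated in the error above.

```javascript
'use strict';

// Hypothetical preflight helper (not provided by pelias-dbclient).
// MIN_MAJOR mirrors the documented minimum of Node.js 10.0.0.
const MIN_MAJOR = 10;

function isSupportedNode(version) {
  // Accepts strings such as "v16.20.2" or "8.17.0".
  const major = parseInt(String(version).replace(/^v/, '').split('.')[0], 10);
  // Unparseable input yields NaN, which is treated as unsupported.
  return Number.isInteger(major) && major >= MIN_MAJOR;
}

if (!isSupportedNode(process.version)) {
  console.error(`Node.js ${process.version} is too old; upgrade to ${MIN_MAJOR}.0.0 or newer.`);
  process.exitCode = 1;
}
```

Running a check like this at the top of an importer script gives a clear message instead of a failure deep inside a dependency.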
Warnings
- breaking Version 3.0.0 of `pelias-dbclient` dropped support for Elasticsearch v6 and removed the internal handling of the `_type` field. This change was necessary to support Elasticsearch v8, with v7 remaining the recommended version.
- breaking Version 2.14.0 introduced a breaking change by dropping official support for Node.js 8. The minimum required Node.js version is now 10.0.0.
- gotcha The `dbclient` stream is designed to emit its 'finish' event only after all documents have been successfully stored in Elasticsearch. This behavior is intentional to guarantee data persistence before subsequent operations (e.g., cleanup or indexing updates) are triggered.
- gotcha The Elasticsearch client configuration (`config.esclient`) and index name (`config.schema.indexName`) are sourced from `pelias-config`. Incorrect or missing Pelias configuration can lead to connection errors or documents being written to unintended indices.
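Because a missing or wrong configuration can silently send documents to the wrong index, it may help to check the generated config before building the pipeline. The sketch below is one way to do that. `findConfigProblems` is a hypothetical helper, not part of `pelias-dbclient` or `pelias-config`, and it only checks the two keys named above.

```javascript
'use strict';

// Hypothetical helper (not part of pelias-dbclient or pelias-config):
// returns a list of human-readable problems found in a Pelias config object.
function findConfigProblems(config) {
  if (!config || typeof config !== 'object') {
    return ['config is not an object'];
  }
  const problems = [];
  if (!config.esclient || typeof config.esclient !== 'object') {
    problems.push('esclient section is missing');
  }
  const indexName = config.schema && config.schema.indexName;
  if (typeof indexName !== 'string' || indexName.length === 0) {
    problems.push('schema.indexName is missing or empty');
  }
  return problems;
}

// In an importer this object would come from require('pelias-config').generate().
// A hand-written object is used here so the example runs on its own.
const exampleConfig = { esclient: { hosts: [] }, schema: {} };
console.log(findConfigProblems(exampleConfig));
// prints [ 'schema.indexName is missing or empty' ]
```

An importer could refuse to start whenever the returned list is non-empty, which surfaces the problem before any document is written.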
Install
- npm install pelias-dbclient
- yarn add pelias-dbclient
- pnpm add pelias-dbclient
Imports
- dbclient
  wrong: const { dbclient } = require('pelias-dbclient');
  correct: const dbclient = require('pelias-dbclient');
- streamFactory
  const streamFactory = require('pelias-dbclient');
  const stream = streamFactory();
- Document (from pelias-model)
  const Document = require('pelias-model').Document;
Quickstart
'use strict';

const streamify = require('stream-array');
const through = require('through2');
const Document = require('pelias-model').Document;
const dbMapper = require('pelias-model').createDocumentMapperStream;
const dbclient = require('pelias-dbclient');
const elasticsearch = require('elasticsearch');
const config = require('pelias-config').generate(); // Ensure pelias-config is properly set up

// Timestamp shared by every document in this run; used later to find stale documents
const timestamp = Date.now();

// Simulate an upstream data source
const stream = streamify([1, 2, 3, 4, 5])
  .pipe(through.obj((item, enc, next) => {
    // Create a Pelias Document for each item
    const uniqueId = [ 'docType', item ].join(':');
    const doc = new Document( 'sourceType', 'venue', uniqueId );
    doc.timestamp = timestamp;
    doc.setName('default', `Test Venue ${item}`);
    // setCentroid expects an object with numeric lon and lat properties
    doc.setCentroid({ lon: item * 0.1, lat: item * 0.2 });
    next(null, doc);
  }))
  .pipe(dbMapper()) // Map Pelias Document to Elasticsearch format
  .pipe(dbclient()); // Bulk-insert documents into Elasticsearch

// 'finish' fires only after all documents have been stored in Elasticsearch
stream.on('finish', () => {
  console.log('All documents processed and sent to Elasticsearch.');
  const client = new elasticsearch.Client(config.esclient);

  // Example of a post-import operation: clean up old documents,
  // i.e. those from this source that do not carry the current timestamp
  const options = {
    index: config.schema.indexName,
    body: {
      query: {
        bool: {
          must: [
            { term: { source: 'sourceType' } }
          ],
          must_not: [
            { term: { timestamp: timestamp } }
          ]
        }
      }
    }
  };

  client.deleteByQuery(options, (err, response) => {
    if (err) {
      console.error('Error during cleanup:', err);
    } else {
      console.log(`Cleaned up ${response.deleted} old elements.`);
    }
    client.close();
  });
});

stream.on('error', (err) => {
  console.error('Stream encountered an error:', err);
});