Express Sitemap XML Middleware

3.1.0 verified Thu Apr 23 auth: no javascript

express-sitemap-xml is an Express.js middleware that generates and serves sitemap.xml from a dynamic list of URLs returned by an async function. When the list holds more than 50,000 URLs, the per-file limit in the sitemap protocol, it splits the URLs across multiple sitemap files and serves a sitemap index at sitemap.xml. The current stable version is 3.1.0; recent releases added image support and base URL fixes, which suggests active maintenance. The middleware calls the URL-fetching function at most once every 24 hours and serves cached results for requests in between. Its sitemap-building API can also be used without Express for pure sitemap generation.
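The splitting rule described above can be sketched in a few lines. This is an illustration of the rule only, not the package's internal code: `planSitemapFiles` is a hypothetical name, and the `/sitemap-N.xml` file naming is taken from the notes further down this page.

```javascript
// Illustrative sketch only; not express-sitemap-xml's internal code.
const MAX_URLS_PER_SITEMAP = 50000; // per-file limit from the sitemap protocol

// Map each sitemap path to the entries it would contain.
function planSitemapFiles(urls, limit = MAX_URLS_PER_SITEMAP) {
  if (urls.length <= limit) {
    // Everything fits in one file, so /sitemap.xml is the sitemap itself.
    return { '/sitemap.xml': urls };
  }
  // Otherwise /sitemap.xml becomes an index listing the sub-sitemap paths.
  const files = { '/sitemap.xml': [] };
  for (let i = 0; i < urls.length; i += limit) {
    const path = `/sitemap-${i / limit}.xml`;
    files[path] = urls.slice(i, i + limit);
    files['/sitemap.xml'].push(path);
  }
  return files;
}

const urls = Array.from({ length: 120000 }, (_, i) => `/page/${i}`);
const plan = planSitemapFiles(urls);
console.log(Object.keys(plan));
// [ '/sitemap.xml', '/sitemap-0.xml', '/sitemap-1.xml', '/sitemap-2.xml' ]
```

With 120,000 URLs the sketch yields three sub-sitemaps (50,000, 50,000, and 20,000 entries) plus the index.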

error TypeError: expressSitemapXml is not a function
cause Incorrect import statement or mixing CommonJS `require` with ES module `import` syntax for the default export.
fix
If using ES modules: `import expressSitemapXml from 'express-sitemap-xml';` If using CommonJS: `const expressSitemapXml = require('express-sitemap-xml');`
error Error: Argument `getUrls` must be a function.
cause A non-function value was passed as the first argument. `expressSitemapXml` expects a function, normally an async one, that returns an array of URLs.
fix
Ensure the first argument is a function, preferably an async function, for example: `app.use(expressSitemapXml(async () => ['/'], 'https://example.com'))`.
error ReferenceError: require is not defined in ES module scope
cause Attempting to use `require()` in a project configured for ES modules (e.g., `"type": "module"` in `package.json`).
fix
Switch to ES module import syntax: `import express from 'express'; import expressSitemapXml from 'express-sitemap-xml';`
gotcha The `getUrls` function passed to `expressSitemapXml` is called at most once every 24 hours. If your URLs change more often and the sitemap must reflect those changes immediately, the built-in cache will not pick them up on its own.
fix Restart the application when the sitemap must update immediately, or implement your own invalidation outside the middleware. For most use cases the 24-hour cache is beneficial and needs no workaround.
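One possible shape for invalidation outside the middleware is a thin wrapper that swaps in a freshly created middleware instance on demand. This is a minimal sketch, not a feature of the package: `refreshable` and `refresh` are hypothetical names, and it rests on the assumption that each call to `expressSitemapXml(...)` returns an instance with its own, initially empty cache. Verify that against the version you use.

```javascript
// Sketch of a hot-swappable middleware wrapper. `refreshable` and `refresh`
// are hypothetical names, not part of express-sitemap-xml.
function refreshable(createMiddleware) {
  let current = createMiddleware();

  // Delegate every request to whichever instance is current.
  const handler = (req, res, next) => current(req, res, next);

  // Swap in a new instance. ASSUMPTION: a new instance starts with an
  // empty cache, so the next sitemap request calls getUrls again.
  handler.refresh = () => {
    current = createMiddleware();
  };

  return handler;
}

// Intended usage (assumes `app`, `getUrls`, and the package import exist):
//   const sitemap = refreshable(() => expressSitemapXml(getUrls, 'https://example.com'));
//   app.use(sitemap);
//   sitemap.refresh(); // e.g. right after publishing new content
```

The wrapper keeps the mount point stable, so Express never needs to be reconfigured; only the inner instance changes.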
gotcha When providing URLs as objects, the `priority` option is explicitly not supported. Google states that it ignores the `priority` value in sitemaps, so the package omits this option to avoid confusion and unnecessary data.
fix Do not include a `priority` field in URL objects. Focus on `lastMod` and `changeFreq` if you need to provide additional details beyond the URL itself.
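If your data source already carries a `priority` value, a small mapping step can drop it before the entries reach the middleware. The helper below is a hypothetical sketch (`toSitemapEntry` is not part of the package); it keeps only the `url`, `lastMod`, and `changeFreq` fields named on this page and passes plain strings through unchanged.

```javascript
// Hypothetical helper: keep only the fields this page describes as supported.
function toSitemapEntry(record) {
  if (typeof record === 'string') return record; // plain URL strings pass through

  const { url, lastMod, changeFreq } = record; // anything else, e.g. priority, is dropped
  const entry = { url };
  if (lastMod !== undefined) entry.lastMod = lastMod;
  if (changeFreq !== undefined) entry.changeFreq = changeFreq;
  return entry;
}

console.log(toSitemapEntry({ url: '/products', changeFreq: 'daily', priority: 0.8 }));
// { url: '/products', changeFreq: 'daily' }
```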
gotcha For sitemaps with more than 50,000 URLs, the package automatically generates a sitemap index (`sitemap.xml`) and multiple sitemap files (`sitemap-0.xml`, `sitemap-1.xml`, etc.). Ensure your `robots.txt` points to the main `sitemap.xml` file, as it will act as the index.
fix Always point your `robots.txt` to the root `sitemap.xml` (e.g., `Sitemap: https://yourdomain.com/sitemap.xml`). The package handles the internal linking to sub-sitemaps automatically.
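As a small illustration of the `robots.txt` entry, the sketch below builds the file contents from a base URL. `buildRobotsTxt` is a hypothetical helper, and the permissive `User-agent` / `Allow` lines are placeholder assumptions rather than a recommendation; only the `Sitemap:` line comes from the note above.

```javascript
// Hypothetical helper: build robots.txt contents that reference the root sitemap.
function buildRobotsTxt(base) {
  // Resolve /sitemap.xml against the site's base URL.
  const sitemapUrl = new URL('/sitemap.xml', base).href;
  return ['User-agent: *', 'Allow: /', `Sitemap: ${sitemapUrl}`].join('\n') + '\n';
}

console.log(buildRobotsTxt('https://example.com'));

// Possible Express usage (assumes an existing `app`):
//   app.get('/robots.txt', (req, res) => {
//     res.type('text/plain').send(buildRobotsTxt('https://example.com'));
//   });
```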
npm install express-sitemap-xml
yarn add express-sitemap-xml
pnpm add express-sitemap-xml

This quickstart sets up an Express server with the express-sitemap-xml middleware, demonstrating how to provide a dynamic list of URLs and serve the sitemap.xml file. It includes a simulated asynchronous URL fetching function and shows both simple URL strings and more detailed URL objects.

import express from 'express';
import expressSitemapXml from 'express-sitemap-xml';

const app = express();

// Simulate fetching URLs from a database
async function getUrlsFromDatabase() {
  // In a real application, you would fetch these dynamically
  console.log('Fetching URLs for sitemap...');
  return [
    '/',
    '/about',
    '/contact',
    { url: '/products', lastMod: new Date(), changeFreq: 'daily' },
    { url: '/blog/post-1', lastMod: new Date('2023-01-15T10:00:00Z'), changeFreq: 'monthly' }
  ];
}

// Use the sitemap middleware
// The getUrls function will be called at most once every 24 hours.
app.use(expressSitemapXml(getUrlsFromDatabase, 'https://example.com'));

// Other Express routes
app.get('/', (req, res) => res.send('Welcome to the homepage!'));
app.get('/about', (req, res) => res.send('About Us'));

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server running on http://localhost:${PORT}`);
  console.log(`Sitemap available at http://localhost:${PORT}/sitemap.xml`);
  console.log('Remember to add "Sitemap: https://example.com/sitemap.xml" to your robots.txt');
});