{"id":13301,"library":"html-metadata-parser","title":"HTML Metadata Parser","description":"html-metadata-parser is a JavaScript library for Node.js that extracts metadata from HTML documents. It scrapes Open Graph (OG) tags, standard HTML meta tags, and image URLs from a given web page. The library provides a single, promise-based `parser` function that takes a URL and returns a structured object containing the discovered `og`, `meta`, and `images` data. The current stable version is 2.0.4; development appears to be active, though there is no defined public release cadence. Its straightforward API for server-side metadata extraction suits applications such as link previews, social media card generation, and general web content analysis. The package also ships with TypeScript type definitions.","status":"active","version":"2.0.4","language":"javascript","source_language":"en","source_url":"https://github.com/nasa8x/html-metadata-parser","tags":["javascript","html metadata","html metadata parser","html metadata crawler","html metadata scraper","html meta","metadata parser","metadata crawler","metadata extract","typescript"],"install":[{"cmd":"npm install html-metadata-parser","lang":"bash","label":"npm"},{"cmd":"yarn add html-metadata-parser","lang":"bash","label":"yarn"},{"cmd":"pnpm add html-metadata-parser","lang":"bash","label":"pnpm"}],"dependencies":[],"imports":[{"note":"The `parser` function is a named export. 
Default import will result in `undefined`.","wrong":"import parser from 'html-metadata-parser';","symbol":"parser","correct":"import { parser } from 'html-metadata-parser';"},{"note":"When using CommonJS, `parser` must be destructured as it's a named export, not the default module export.","wrong":"const parser = require('html-metadata-parser');","symbol":"parser (CommonJS)","correct":"const { parser } = require('html-metadata-parser');"},{"note":"For TypeScript projects, it is recommended to import the `MetadataResult` type for type-checking the parser's output.","symbol":"MetadataResult (TypeScript Type)","correct":"import type { MetadataResult } from 'html-metadata-parser';"}],"quickstart":{"code":"import { parser, type MetadataResult } from 'html-metadata-parser';\n\nconst targetUrl = 'https://www.youtube.com/watch?v=eSzNNYk7nVU';\n\n(async () => {\n  try {\n    console.log(`Attempting to parse metadata from: ${targetUrl}`);\n    const result: MetadataResult = await parser(targetUrl);\n\n    if (result) {\n      console.log('Successfully parsed metadata:\\n');\n      console.log(JSON.stringify(result, null, 2));\n    } else {\n      console.log('No metadata was found.');\n      // Stop here: the property accesses below would throw on a null result.\n      return;\n    }\n\n    // Example of accessing specific metadata\n    console.log(`\\nOpen Graph Title: ${result.og?.title || 'N/A'}`);\n    console.log(`Meta Description: ${result.meta?.description || 'N/A'}`);\n\n  } catch (error) {\n    console.error('An error occurred during metadata parsing:');\n    if (error instanceof Error) {\n      console.error(error.message);\n    } else {\n      console.error(String(error));\n    }\n  }\n})();","lang":"typescript","description":"This quickstart demonstrates how to use `html-metadata-parser` to asynchronously fetch and parse metadata from a given URL, including error handling."},"warnings":[{"fix":"Wrap calls to `parser` in a `try...catch` block and handle specific network error codes or messages. 
Consider retries with a backoff strategy for transient issues.","message":"The library relies on network requests. Unreliable network connectivity, DNS issues, or target server downtime can lead to `FetchError` or `ENOTFOUND` errors. Implement robust error handling for network-related failures.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"For very large documents, consider alternative parsing strategies or optimize resource handling. Monitor memory and CPU usage in production environments and apply timeouts to prevent hanging operations.","message":"Parsing large or poorly structured HTML documents can be resource-intensive, potentially leading to increased memory usage or slow response times. Performance may vary significantly depending on the target URL's content.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Implement delays between requests, cache results, or consider using a proxy rotation service if frequent requests to the same domain are necessary. Always respect `robots.txt`.","message":"Repeated scraping of the same website can lead to IP blocking or rate limiting by the target server, resulting in HTTP 429 (Too Many Requests) or other access-denied errors.","severity":"gotcha","affected_versions":">=1.0.0"},{"fix":"Always use optional chaining or nullish coalescing (e.g., `result.og?.title`) when accessing properties of the parsed metadata object to prevent runtime errors.","message":"The `parser` function may return `null` or an object with `undefined` properties if specific metadata (e.g., Open Graph tags, meta description) is not present on the target page. 
Direct access without null-checking can lead to `TypeError`.","severity":"gotcha","affected_versions":">=1.0.0"}],"env_vars":null,"last_verified":"2026-04-19T00:00:00.000Z","next_check":"2026-07-18T00:00:00.000Z","problems":[{"fix":"Use optional chaining (`?.`) or nullish coalescing (`??`) when accessing metadata properties, e.g., `result.og?.title` or `result.meta?.description ?? 'No description'`. Always check if `result` itself is truthy.","cause":"Attempting to access a property on an `og` or `meta` object that does not exist or is `undefined` because the target page lacked that specific metadata.","error":"TypeError: Cannot read properties of undefined (reading 'title')"},{"fix":"Verify the URL is correct and accessible from your network. Check your DNS configuration or ensure there's no firewall blocking outbound requests. The error indicates the requested domain does not exist or cannot be reached.","cause":"The hostname could not be resolved (DNS error), or the URL is malformed/unreachable. This is a network-related issue.","error":"FetchError: request to https://example.com failed, reason: getaddrinfo ENOTFOUND example.com"},{"fix":"Use named import for ESM: `import { parser } from 'html-metadata-parser';` or named require for CommonJS: `const { parser } = require('html-metadata-parser');`","cause":"Incorrect import or require statement, often attempting to use `import parser from 'html-metadata-parser';` or `const parser = require('html-metadata-parser');` instead of named destructuring.","error":"TypeError: parser is not a function"}],"ecosystem":"npm","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null,"pypi_latest":null,"cli_name":"","cli_version":null}