Got Scraping
Got Scraping is an HTTP client built as an extension of the popular `got` library, designed for web scraping tasks: it makes requests appear browser-like out of the box. It integrates automatic browser-like header generation through the `header-generator` package, which lets you specify desired browser, device, locale, and operating-system profiles. It also simplifies proxy management, supporting HTTP and HTTPS proxies (including over HTTP/2), and performs automatic ALPN protocol negotiation with target servers. The package requires Node.js >=16 due to stability concerns with HTTP/2 in earlier versions. The current stable version is 4.2.1.

However, following its deprecation announcement, `got-scraping` is officially End-of-Life (EOL) and will no longer receive updates or support. For new projects, the maintainers strongly recommend `impit` (github.com/apify/impit), a modern, `fetch`-API-based HTTP client built on Rust's `reqwest` library, which offers a similar feature set and performance benefits.
Common errors
- ReferenceError: require is not defined
  cause: Using CommonJS `require()` inside an ES module context. (Conversely, calling `require('got-scraping')` from CommonJS fails with `ERR_REQUIRE_ESM`, because `got-scraping` v4+ is ESM-only.)
  fix: Use `import { gotScraping } from 'got-scraping';` in ES modules, or load the package from CommonJS with dynamic `import()`.
- TypeError: gotScraping is not a function
  cause: The imported value is not the callable `got` instance, typically because of an incorrect import (e.g., a default import instead of the named export) or because `new` was used; `gotScraping` is an already instantiated `got` instance, not a constructor.
  fix: Use the named import `import { gotScraping } from 'got-scraping';` and call it directly (`gotScraping(url)` or `gotScraping(options)`) or via its method shortcuts (`gotScraping.get(url)`, `gotScraping.post(options)`). Never use the `new` keyword.
- Error: Proxy CONNECT Error: 407 Proxy Authentication Required
  cause: The configured proxy server requires authentication, but no credentials were provided or the provided ones are incorrect.
  fix: Ensure the `proxyUrl` includes valid credentials, e.g. `http://username:password@proxy.example.com:port`, and verify they are correct for your proxy. If the username or password contains special characters, percent-encode them (e.g., with `encodeURIComponent`).
- Error: This module was compiled against Node.js version XX. You are currently running version YY. Please update Node.js to a compatible version.
  cause: `got-scraping` requires Node.js >=16, and you are running an older, incompatible version.
  fix: Upgrade to Node.js 16 or newer. A version manager such as `nvm` makes switching easy: `nvm install 16 && nvm use 16`.
Warnings
- breaking The `got-scraping` package has been officially declared End-of-Life (EOL) by its maintainers. It will no longer receive updates, bug fixes, or support. Users are strongly advised to migrate to the recommended alternative, `impit`, for new projects and consider migrating existing projects.
- breaking `got-scraping` is an ESM-only module since version 4. This means it can only be imported using `import` statements in ES modules or dynamically with `import()` in CommonJS contexts. Direct `require()` calls will result in errors.
- breaking The package requires Node.js version 16 or newer. Running `got-scraping` on older Node.js versions may lead to instability, particularly with HTTP/2 connections, or outright failures.
- gotcha When using proxies, ensure the `proxyUrl` is correctly formatted with scheme, host, port, and credentials if required (e.g., `http://username:password@myproxy.com:1234`). Incorrectly formatted URLs or invalid credentials will lead to proxy connection failures.
- gotcha By default, `got-scraping` generates a fresh set of browser-like headers for every request. To keep headers consistent across multiple requests, pass the same `sessionToken` object (any non-primitive value works, e.g., `{}`) with each request; requests sharing a token reuse the same generated headers.
Install
- npm install got-scraping
- yarn add got-scraping
- pnpm add got-scraping
Imports
- gotScraping
  import { gotScraping } from 'got-scraping';
  (In CommonJS, use dynamic import instead: `const { gotScraping } = await import('got-scraping');` — plain `require()` fails since v4 is ESM-only.)
- HeaderGeneratorOptions
  import type { HeaderGeneratorOptions } from 'got-scraping';
- getAgents
  import { getAgents } from 'got-scraping';
Quickstart
import { gotScraping } from 'got-scraping';
async function fetchExampleData() {
    try {
        const response = await gotScraping.get({
            url: 'https://httpbin.org/headers', // A simple endpoint to inspect request headers
            proxyUrl: process.env.PROXY_URL ?? 'http://username:password@proxy.example.com:8000',
            useHeaderGenerator: true,
            headerGeneratorOptions: {
                browsers: [
                    { name: 'chrome', minVersion: 90, maxVersion: 99 },
                ],
                devices: ['desktop'],
                locales: ['en-US', 'de-DE'],
                operatingSystems: ['windows', 'macos'],
            },
            // Standard Got options are also available
            timeout: { request: 30000 }, // 30-second total request timeout
            retry: { limit: 2, methods: ['GET'] }, // Retry GET requests up to 2 times
            headers: {
                'X-Custom-Header': 'Scraping-Demo-Request',
            },
        });

        console.log('Status Code:', response.statusCode);
        console.log('Response Body (first 500 chars):', response.body.substring(0, 500));
        // Further processing of `response.body` (e.g., JSON.parse) would happen here.
    } catch (error) {
        console.error('Failed to fetch data:', error instanceof Error ? error.message : error);
    }
}

fetchExampleData();