Tweet Harvest (Twitter Crawler)

2.7.1 · active · verified Wed Apr 22

Tweet Harvest is an actively maintained command-line (CLI) tool for scraping tweets from Twitter search results. It uses Playwright to automate a browser session, retrieves tweets matching the given keywords and date range, and exports them to CSV or XLSX. The current stable version is 2.7.1; minor releases are frequent and bring bug fixes, performance improvements, and new export options (XLSX export arrived in v2.7.0). Because Twitter does not allow unauthenticated search, the tool requires a valid Twitter `auth_token` cookie. Although primarily a CLI, it also exposes a programmatic API for Node.js applications, with functions to start a scrape and to process tweet data. Ongoing updates track changes to Twitter's interface and have added data-quality features such as ISO 8601 timestamps.
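To make the ISO 8601 point concrete, here is a minimal sketch of the kind of normalization involved. It assumes the raw value looks like Twitter's legacy `created_at` string (for example `Wed Oct 10 20:19:24 +0000 2018`); the page does not say which input format Tweet Harvest actually converts from, and `toIso8601` is a hypothetical helper, not part of the package.

```typescript
// Month abbreviations as they appear in the assumed legacy timestamp format.
const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Hypothetical helper (not part of tweet-harvest): converts a string such as
// "Wed Oct 10 20:19:24 +0000 2018" into an ISO 8601 UTC timestamp.
function toIso8601(createdAt: string): string {
  const pattern =
    /^\w{3} (\w{3}) (\d{2}) (\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2}) (\d{4})$/;
  const match = pattern.exec(createdAt.trim());
  if (!match) {
    throw new Error(`Unrecognized timestamp: ${createdAt}`);
  }
  const [, mon, day, hh, mm, ss, sign, offH, offM, year] = match;

  const monthIndex = MONTHS.indexOf(mon);
  if (monthIndex === -1) {
    throw new Error(`Unknown month: ${mon}`);
  }

  // Treat the wall-clock fields as UTC, then subtract the stated offset
  // to obtain the true UTC instant.
  const offsetMinutes =
    (sign === '-' ? -1 : 1) * (Number(offH) * 60 + Number(offM));
  const utcMs =
    Date.UTC(Number(year), monthIndex, Number(day),
             Number(hh), Number(mm), Number(ss)) - offsetMinutes * 60_000;

  return new Date(utcMs).toISOString();
}

console.log(toIso8601('Wed Oct 10 20:19:24 +0000 2018'));
// prints "2018-10-10T20:19:24.000Z"
```

ISO 8601 strings sort lexicographically in chronological order, which is convenient when the exported CSV or XLSX is later sorted or filtered by date.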

Common errors

Warnings

Install

Imports

Quickstart

Demonstrates programmatic use of `tweet-harvest`: scraping tweets that match a keyword within a date range. A Twitter auth token is required.

import { harvest } from 'tweet-harvest';
import type { Options } from 'tweet-harvest';

const twitterAuthToken = process.env.TWITTER_AUTH_TOKEN ?? ''; // Get this from your browser cookies

if (!twitterAuthToken) {
  console.error('TWITTER_AUTH_TOKEN environment variable is not set. Please provide a valid Twitter auth token from your browser cookies.');
  process.exit(1);
}

const options: Options = {
  keyword: 'AI ethics',
  from: '2023-01-01',
  to: '2023-12-31',
  filename: 'ai-ethics-tweets',
  limit: 100, // Limit to 100 tweets for this example
  exportFormat: 'csv',
  auth_token: twitterAuthToken,
  withReplies: false,
  withImages: false,
  withVideos: false
};

async function runHarvest() {
  console.log('Starting tweet harvest...');
  try {
    await harvest(options);
    // Use the configured export format rather than hard-coding the extension.
    console.log(`Successfully harvested tweets to ${options.filename}.${options.exportFormat}`);
  } catch (error) {
    console.error('Error during tweet harvest:', error);
    if (error instanceof Error && error.message.includes('auth_token')) {
      console.error('Ensure your TWITTER_AUTH_TOKEN is valid and up-to-date.');
    }
  }
}

runHarvest();
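The `from` and `to` values above are plain strings, so a mistyped or reversed range may not be noticed until a browser session is already running. The page does not say whether `tweet-harvest` validates these fields itself, so the following is only a sketch of an optional pre-flight check. `DateRange`, `parseDay`, and `daysInRange` are hypothetical names, not part of the package's API.

```typescript
// Hypothetical pre-flight check; none of these names come from tweet-harvest.
interface DateRange {
  from: string; // expected as YYYY-MM-DD
  to: string;   // expected as YYYY-MM-DD
}

const MS_PER_DAY = 86_400_000;

// Parses a YYYY-MM-DD string as midnight UTC and rejects impossible dates.
function parseDay(value: string): number {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    throw new Error(`Expected YYYY-MM-DD, got "${value}"`);
  }
  const ms = Date.parse(`${value}T00:00:00Z`);
  // Some engines roll "2023-02-30" over into March instead of returning NaN,
  // so also check that the parsed date round-trips to the same string.
  if (Number.isNaN(ms) || new Date(ms).toISOString().slice(0, 10) !== value) {
    throw new Error(`Not a real calendar date: "${value}"`);
  }
  return ms;
}

// Returns the number of days covered by the range, counting both endpoints.
function daysInRange(range: DateRange): number {
  const start = parseDay(range.from);
  const end = parseDay(range.to);
  if (start > end) {
    throw new Error(`"from" (${range.from}) is after "to" (${range.to})`);
  }
  return (end - start) / MS_PER_DAY + 1;
}

console.log(daysInRange({ from: '2023-01-01', to: '2023-12-31' }));
// prints 365
```

Calling `daysInRange(options)` before `harvest(options)` would fail fast on a bad range; the day count can also help in choosing a sensible `limit`.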
