JS Tiktoken

1.0.21 · active · verified Tue Apr 21

js-tiktoken is a pure JavaScript port of OpenAI's tiktoken library. It provides a BPE (Byte Pair Encoding) tokenizer intended primarily for OpenAI's models. The current version is 1.0.21; patch releases are frequent and mostly add new OpenAI models and their tokenizer configurations. Because it is implemented entirely in JavaScript, it runs in web browsers, edge environments, and Node.js applications where Python dependencies are not feasible. It also offers a "lite" mode: instead of bundling every encoding, developers can load only the ranks for the encodings they need, or fetch the rank data from a CDN at runtime. Either approach significantly reduces bundle size compared with the full library.
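As a rough illustration of the lite mode, the sketch below builds a tokenizer from a single rank set, first from the installed package and then from a CDN. The `js-tiktoken/lite` and `js-tiktoken/ranks/cl100k_base` entry points and the `tiktoken.pages.dev` URL pattern are assumed from the package README and should be checked against the installed version. The helper names `ranksUrl`, `loadBundledLite`, and `loadLiteFromCdn` are hypothetical.

```javascript
// Assumed CDN layout: one JSON file per encoding (verify before relying on it).
const RANKS_CDN_BASE = 'https://tiktoken.pages.dev/js';

// Hypothetical helper: build the CDN URL for an encoding's rank data.
function ranksUrl(encodingName) {
  return `${RANKS_CDN_BASE}/${encodeURIComponent(encodingName)}.json`;
}

// Option 1: load only one rank set from the installed package.
async function loadBundledLite() {
  const { Tiktoken } = await import('js-tiktoken/lite');
  const { default: ranks } = await import('js-tiktoken/ranks/cl100k_base');
  return new Tiktoken(ranks);
}

// Option 2: fetch the rank data at runtime so it is not bundled at all.
async function loadLiteFromCdn(encodingName, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(ranksUrl(encodingName));
  if (!res.ok) {
    throw new Error(`Failed to fetch ranks for '${encodingName}': HTTP ${res.status}`);
  }
  const ranks = await res.json();
  const { Tiktoken } = await import('js-tiktoken/lite');
  return new Tiktoken(ranks);
}
```

Either loader should return an object with the same `encode` and `decode` methods used in the Quickstart, for example `const enc = await loadBundledLite(); enc.encode('hello world');`. Dynamic `import()` is used here only so the sketch loads as either a script or a module; static imports of the same paths work equally well.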

Quickstart

Demonstrates how to obtain an encoding using both `getEncoding` for a specific scheme and `encodingForModel` for a model name, then encode and decode text, verifying the round trip. Includes error handling for unknown models.

import assert from 'node:assert';
import { getEncoding, encodingForModel } from 'js-tiktoken';

// Basic usage: Get an encoding directly
const enc = getEncoding('gpt2');
const encodedTokens = enc.encode('hello world');
console.log(`'gpt2' tokens for 'hello world': ${encodedTokens}`);
assert(enc.decode(encodedTokens) === 'hello world');

// Model-specific usage: Get encoding for a known model
const modelName = 'gpt-4'; // Or 'gpt-3.5-turbo', 'text-embedding-ada-002', etc.
try {
  const modelEnc = encodingForModel(modelName);
  const text = 'This is an example sentence for GPT-4 tokenization.';
  const tokens = modelEnc.encode(text);
  console.log(`\n'${modelName}' tokens: ${tokens.length}, tokens array: [${tokens.slice(0, 5)}..., ${tokens.slice(-5)}]`);
  const decoded = modelEnc.decode(tokens);
  console.log(`'${modelName}' decoded: ${decoded}`);
} catch (error) {
  console.error(`\nError getting encoding for model '${modelName}':`, error.message);
}
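The try/catch above only reports an unknown model. One possible extension, sketched here under stated assumptions, falls back to a default encoding instead. The helper name `resolveEncoding` and the choice of `cl100k_base` as the fallback are hypothetical. The js-tiktoken functions are passed in as an argument so the sketch can be exercised without the package installed.

```javascript
// Hypothetical helper: try the model name first, then fall back to a named encoding.
// `api` is expected to expose `encodingForModel` and `getEncoding`, as js-tiktoken does.
function resolveEncoding(api, modelName, fallbackEncoding = 'cl100k_base') {
  try {
    return { encoding: api.encodingForModel(modelName), usedFallback: false };
  } catch (error) {
    // Assumption: an unrecognized model name makes encodingForModel throw.
    return { encoding: api.getEncoding(fallbackEncoding), usedFallback: true };
  }
}
```

With the package installed, this could be called as `resolveEncoding({ getEncoding, encodingForModel }, 'my-custom-model')`. When `usedFallback` is true, token counts should be treated as approximate, since the model may use a different tokenizer than the fallback encoding.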
