node-llama-cpp
Run large language models (LLMs) locally from Node.js through llama.cpp bindings. Version 3.18.1 ships pre-built binaries for macOS, Linux, and Windows with Metal, CUDA, and Vulkan support, and falls back automatically to a source build via cmake (no node-gyp or Python required). It supports JSON schema enforcement, function calling, embedding, reranking, and chat sessions, and includes full TypeScript types. The package is under active development, with frequent releases synced to upstream llama.cpp. Key differentiators: zero-config GPU acceleration and built-in protection against special-token injection.
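As a rough illustration of the chat-session and JSON schema features mentioned above, the following sketch loads a local model and constrains its reply to a schema. The schema fields (`answer`, `confidence`), the helper name `askForJson`, and the model path are hypothetical placeholders chosen for this example; the library calls follow the v3 API as commonly documented, so check them against the release you install.

```typescript
// Illustrative schema (hypothetical fields): the reply must be an object
// with a free-text answer and one of three confidence labels.
const answerSchema = {
    type: "object",
    properties: {
        answer: {type: "string"},
        confidence: {enum: ["low", "medium", "high"]}
    }
} as const;

// Hypothetical helper: prompts a local model and returns the parsed JSON reply.
// modelPath is assumed to point at a GGUF model file already on disk.
async function askForJson(modelPath: string, question: string) {
    // node-llama-cpp is an ES module; a dynamic import also works from CommonJS
    // and defers loading the native bindings until a prompt is actually run.
    const {getLlama, LlamaChatSession} = await import("node-llama-cpp");

    // getLlama() selects a compute backend; no GPU flags are passed here.
    const llama = await getLlama();
    const model = await llama.loadModel({modelPath});
    const context = await model.createContext();
    const session = new LlamaChatSession({
        contextSequence: context.getSequence()
    });

    // Build a grammar from the schema so generation is restricted to matching JSON.
    const grammar = await llama.createGrammarForJsonSchema(answerSchema);
    const raw = await session.prompt(question, {grammar});

    // parse() converts the constrained text output into an object.
    return grammar.parse(raw);
}
```

A call such as `await askForJson("./models/my-model.gguf", "Is water wet?")` (the path is a placeholder) would then resolve to an object shaped like the schema. Keeping the schema in a separate `as const` constant is a design choice for this sketch: it lets the same definition be reused across prompts and lets TypeScript infer literal types from it.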