ctransformers
Python bindings for Transformer models implemented in C/C++ using the GGML library. Provides a high-level API for inference with popular models (LLaMA, Falcon, GPT-J, StarCoder, etc.) in GGML/GGUF format, with optional GPU acceleration (CUDA, Metal, ROCm). Latest version 0.2.27, active development.
pip install ctransformers

Common errors
error ModuleNotFoundError: No module named 'ctransformers'
cause ctransformers is not installed or not installed with the correct extra for your platform.
fix
Install with: pip install ctransformers. If you need GPU, use pip install ctransformers[cuda] (or [metal]/[rocm]).
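To confirm which version is installed, a quick check using only the standard library:

from importlib.metadata import version
print(version('ctransformers'))  # e.g., 0.2.27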
error AttributeError: module 'ctransformers' has no attribute 'AutoModelForCausalLM'
cause The import path is wrong, or you are using an older ctransformers version (<0.2.0).
fix
Use: from ctransformers import AutoModelForCausalLM. Ensure you have ctransformers >=0.2.0.
error ValueError: Unknown model type: '...'. Must be one of: ...
cause The model type could not be inferred, usually because the file name or format is non-standard.
fix
Specify model_type explicitly, e.g., AutoModelForCausalLM.from_pretrained('path/to/model', model_type='llama').
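A minimal sketch of the fix; the local path is hypothetical, and model_type must match the model's actual architecture.

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    './models/my-model.gguf',  # hypothetical non-standard file name; type cannot be inferred
    model_type='llama',        # state the architecture explicitly
)
print(llm('Hello', max_new_tokens=20))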
Warnings
deprecated The LLM.reset() method is deprecated since v0.2.27. Use the high-level API instead (e.g., create a new instance or rely on per-call generation parameters).
fix Remove calls to LLM.reset() and use the high-level API (AutoModelForCausalLM) instead.
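A minimal sketch of the replacement pattern, assuming the library's default reset-before-generate behavior; the model name follows the quickstart below.

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')
# Each high-level call resets model state before generating by default,
# so no explicit LLM.reset() is needed between unrelated prompts.
print(llm('First prompt:', max_new_tokens=20))
print(llm('Second, unrelated prompt:', max_new_tokens=20))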
breaking Older GGML (v1) models may not work with newer versions of ctransformers. GGUF v2 support was added in v0.2.25, and GGML v1 support may be removed gradually.
fix Use GGUF v2 models or convert your model to GGUF format using llama.cpp convert scripts.
gotcha CUDA, Metal, and ROCm support are optional extras; install with the appropriate extra ([cuda], [metal], or [rocm]) to enable GPU acceleration.
fix Install with the correct extra, e.g., pip install ctransformers[cuda]. Note: CUDA support is experimental for some model types.
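A sketch of GPU offloading, assuming a CUDA build installed via pip install ctransformers[cuda]; the Hub repo and file names are illustrative, and gpu_layers should be tuned to available VRAM.

from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    'TheBloke/Llama-2-7B-GGUF',           # illustrative Hub repo
    model_file='llama-2-7b.Q4_K_M.gguf',  # pick one quantization from the repo
    model_type='llama',
    gpu_layers=50,  # number of layers to offload to the GPU; 0 = CPU only
)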
gotcha The model type (model_type) must be specified explicitly if the model file name does not follow standard naming conventions; otherwise loading may fail.
fix Pass model_type parameter, e.g., AutoModelForCausalLM.from_pretrained('./model.gguf', model_type='llama').
Install
pip install ctransformers[cuda]
pip install ctransformers[metal]
pip install ctransformers[rocm]

Imports
AutoModelForCausalLM
wrong from transformers import AutoModelForCausalLM
correct from ctransformers import AutoModelForCausalLM
Quickstart
from ctransformers import AutoModelForCausalLM

# Downloads the GGML model from the Hugging Face Hub on first use
llm = AutoModelForCausalLM.from_pretrained('marella/gpt-2-ggml')
print(llm('AI is going to', max_new_tokens=50))
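Generation can also be streamed token by token via the stream flag:

for text in llm('AI is going to', stream=True):
    print(text, end='', flush=True)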