CLIP Interrogator

0.6.0 · active · verified Wed Apr 15

CLIP Interrogator is a Python library that generates text prompts from images using vision-language models such as CLIP and BLIP. It is particularly useful for producing prompts for text-to-image AI models. The current version is 0.6.0. Releases are infrequent but the project is active, and major updates have addressed model support and performance optimizations.

Warnings

Install

Imports

Quickstart

This quickstart initializes the CLIP Interrogator, downloads the necessary models on first run, and then generates a text prompt from a dummy image. It also applies the library's low-VRAM defaults when that method is available.

import os
from PIL import Image
from clip_interrogator import Config, Interrogator

# Create a dummy image so the example is runnable.
# Pillow is already imported above, so no fallback is needed here; writing
# plain text to a .jpg file would only make Image.open() fail later.
dummy_image_path = "dummy_image_for_ci.jpg"
Image.new("RGB", (224, 224), color="red").save(dummy_image_path)

# Configure CLIP Interrogator
ci_config = Config()
# Set model names (these will be downloaded on first run and cached)
ci_config.clip_model_name = "ViT-L-14/openai"
ci_config.caption_model_name = "blip-large" # Other options: blip-base, blip2-2.7b, blip2-flan-t5-xl, git-large-coco

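# Apply low-VRAM settings if available (recommended for GPUs with <12GB VRAM).
# This method was introduced in v0.5.4. Note that it may also switch
# caption_model_name to a smaller captioner, so set that attribute again
# afterwards if you need a specific model.
if hasattr(ci_config, 'apply_low_vram_defaults'):
    ci_config.apply_low_vram_defaults()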

# Initialize the Interrogator. This will download models if not already cached.
print("Initializing CLIP Interrogator (models may download on first run)...")
try:
    ci = Interrogator(ci_config)
    print("CLIP Interrogator initialized.")

    # Load an image
    image = Image.open(dummy_image_path).convert("RGB")

    # Perform interrogation
    prompt = ci.interrogate(image)
    print(f"Generated prompt: {prompt}")
except Exception as e:
    # Both model loading and interrogation happen inside the try block above
    print(f"Error during initialization or interrogation: {e}. Please ensure sufficient VRAM and disk space for models.")
finally:
    # Clean up the dummy image
    if os.path.exists(dummy_image_path):
        os.remove(dummy_image_path)
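
The VRAM handling in the quickstart can be factored into a small helper. The sketch below is illustrative only: `InterrogatorPlan`, `plan_for_vram`, and `apply_plan` are hypothetical names, not part of clip_interrogator. The 12 GB threshold comes from the comment in the quickstart, and choosing blip-base for smaller GPUs is an assumption. `apply_plan` touches only the attributes and the method already used in the quickstart, so it should work with a real `Config` object, but it is written against any object with those attributes.

```python
from dataclasses import dataclass

# Threshold taken from the quickstart comment ("<12GB VRAM"); adjust as needed.
LOW_VRAM_THRESHOLD_GB = 12.0


@dataclass(frozen=True)
class InterrogatorPlan:
    """Hypothetical container for the settings chosen in the quickstart."""
    clip_model_name: str
    caption_model_name: str
    low_vram: bool


def plan_for_vram(vram_gb: float) -> InterrogatorPlan:
    """Pick settings from the available VRAM in GB (illustrative policy)."""
    if vram_gb < 0:
        raise ValueError("vram_gb must be non-negative")
    low_vram = vram_gb < LOW_VRAM_THRESHOLD_GB
    return InterrogatorPlan(
        clip_model_name="ViT-L-14/openai",
        # Assumption: prefer the smaller captioner when VRAM is limited.
        caption_model_name="blip-base" if low_vram else "blip-large",
        low_vram=low_vram,
    )


def apply_plan(config, plan: InterrogatorPlan):
    """Copy a plan onto a Config-like object and return it."""
    config.clip_model_name = plan.clip_model_name
    # Older releases may lack this method, so check before calling it.
    if plan.low_vram and hasattr(config, "apply_low_vram_defaults"):
        config.apply_low_vram_defaults()
    # Set the captioner last so the explicit choice is the one that sticks.
    config.caption_model_name = plan.caption_model_name
    return config
```

With the library installed, `apply_plan(Config(), plan_for_vram(8))` would yield a configuration that can be passed to `Interrogator` as in the quickstart. The captioner is assigned after the low-VRAM defaults so that any change those defaults make to the caption model does not silently replace the planned one.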
