LlamaIndex LLMs Vertex

Version 0.7.0 (verified Fri May 01)

LlamaIndex integration for Google Vertex AI LLMs (e.g., Gemini, PaLM). Lets you use Vertex AI models as LLMs within LlamaIndex. Current version: 0.7.0; release cadence is frequent (weekly or biweekly).

pip install llama-index-llms-vertex
error google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials.
cause Missing or misconfigured Google Cloud credentials (service account file or ADC).
fix
Set GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of your service account JSON, or run gcloud auth application-default login.
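A minimal sketch of the credentials fix above: point Application Default Credentials (ADC) at a service-account key before making any Vertex call. The key path is a placeholder, not a real file.

```python
import os

# Point Application Default Credentials (ADC) at a service-account key.
# The path below is a placeholder; substitute your own key file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service-account.json"

# Alternative (no key file): run `gcloud auth application-default login`
# once in a terminal; google.auth.default() will then find the credentials.
```

Set the variable before constructing the Vertex LLM, since the client resolves credentials when it first talks to the API.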
error google.api_core.exceptions.InvalidArgument: 400 Request contains an invalid argument.
cause Invalid model name or region. For example, using 'gemini-pro' instead of 'gemini-2.0-flash-001'.
fix
Double-check model name and location. Use Vertex.available_models() to see supported models for your project.
error ModuleNotFoundError: No module named 'llama_index.llms.vertex'
cause Wrong import path. Users often try `from llama_index.llms import Vertex` but that doesn't work.
fix
Use correct import: from llama_index.llms.vertex import Vertex.
error TypeError: __init__() got an unexpected keyword argument 'temperature'
cause Constructor parameters like temperature, max_tokens are deprecated/removed in recent versions.
fix
Pass temperature and other generation config as keyword arguments to llm.complete() or llm.chat() instead of constructor.
breaking Version 0.7.0 changed the default model from 'text-bison' to 'gemini-2.0-flash-001'. Existing code that relies on the default model will break if it depends on PaLM behavior.
fix Explicitly set model parameter to your desired model (e.g., 'text-bison@001').
deprecated The 'temperature', 'max_tokens', etc. kwargs in the constructor are deprecated in favor of passing them to the call (complete/chat) methods. Constructor params may be removed in future version.
fix Pass generation config via generation_config dict or kwargs at call time instead of constructor.
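As a sketch of the deprecation note above, collect generation settings in a plain dict and pass them at call time instead of to the constructor. The dict keys mirror the parameters named above; treat the exact call shape as an assumption to check against your installed version.

```python
# Generation settings belong with complete()/chat(), not the Vertex()
# constructor. Keys shown are common Vertex AI generation parameters.
generation_config = {
    "temperature": 0.2,
    "max_output_tokens": 256,
}

# Hypothetical call shape (assumes an existing `llm = Vertex(...)` instance):
#   resp = llm.complete("Summarize this.", **generation_config)
# rather than Vertex(model=..., temperature=0.2, ...)
```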
gotcha Authentication requires a service account JSON or ADC. Forgetting to set GOOGLE_APPLICATION_CREDENTIALS raises a vague google.auth.exceptions.DefaultCredentialsError.
fix Ensure credentials are set via environment variable or `google.auth.default()` is configured.
gotcha Model names must match exactly what Vertex AI expects (e.g., 'gemini-2.0-flash-001', not 'gemini-pro'). Using an invalid model raises a 404 or 400 error.
fix Refer to Vertex AI documentation for exact model names. Use `Vertex.available_models()` to list supported models.

Initialize Vertex LLM with a Gemini model and call complete().

import os
from llama_index.llms.vertex import Vertex

# Ensure credentials are set
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/service-account.json'

llm = Vertex(
    model='gemini-2.0-flash-001',
    project='your-project-id',
    location='us-central1',
)
resp = llm.complete('Hello, who are you?')
print(resp)