VerticaPy
Version 1.1.1 | verified Fri May 01 | auth: no | python
VerticaPy is a Python library for data exploration, data cleaning, and machine learning in Vertica. It simplifies the integration between Python and Vertica databases, providing a pandas-like interface and ML capabilities that run directly in-database. Current version is 1.1.1, released May 2025. The project follows a monthly release cadence.
pip install verticapy
Common errors
error No module named 'verticapy.core'
cause Importing from the old submodule path that was removed in 1.0.0.
fix Use `from verticapy import vDataFrame` instead of `from verticapy.core import vDataFrame`.
error AttributeError: 'vDataFrame' object has no attribute 'select'
cause The `select` method was renamed to `select_` (with trailing underscore) to avoid Python keyword conflict.
fix Use `vdf.select_('col1', 'col2')` instead of `vdf.select('col1', 'col2')`.
Warnings
breaking Import paths changed significantly in 1.0.0. Many functions moved from submodules to the top level or were renamed.
fix Run `verticapy.upgrade()` or consult the migration guide. For example, `from verticapy import vDataFrame` instead of `from verticapy.core import vDataFrame`.
gotcha The database connection must be passed explicitly when creating a vDataFrame. It is easy to forget this and assume an implicit connection.
fix Always provide a cursor or connection object: `vDataFrame('table', cursor)`.
gotcha vDataFrame methods mutate the object in place by default, unlike pandas, where most operations return a new object.
fix Be aware that operations like `drop()` modify the current vDataFrame; use `.copy()` if you need to preserve the original.
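The in-place behavior described above can be illustrated with a toy stand-in. `ToyFrame` below is hypothetical and is not part of VerticaPy; it only mimics the pattern of a method that mutates the object and returns it, to show why the copy has to be taken before the mutation.

```python
import copy


class ToyFrame:
    """Hypothetical stand-in for an object whose methods mutate in place."""

    def __init__(self, columns):
        self.columns = list(columns)

    def drop(self, name):
        # Mutates this object and returns it, mimicking in-place semantics.
        self.columns.remove(name)
        return self

    def copy(self):
        # Independent copy, so later mutations do not affect the original.
        return ToyFrame(copy.copy(self.columns))


original = ToyFrame(["id", "price", "notes"])
snapshot = original.copy()  # take the copy BEFORE mutating
original.drop("notes")

print(original.columns)  # ['id', 'price']
print(snapshot.columns)  # ['id', 'price', 'notes']
```

The same ordering applies to a real vDataFrame: call `.copy()` first, then run the mutating operation on one of the two objects.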
Imports
- vDataFrame
  wrong: from verticapy.core import vDataFrame
  correct: from verticapy import vDataFrame
- set_option
  wrong: from verticapy.options import set_option
  correct: from verticapy import set_option
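For code that has to run against installations on either side of an import-path change, one option is a small fallback import helper. The sketch below is a generic pattern rather than anything VerticaPy provides: `import_first` is a hypothetical helper name, and the demonstration uses only the standard library so it runs without a database or VerticaPy installed.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it."""
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, name)
        except (ImportError, AttributeError) as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{path}: {exc}")
    raise ImportError(f"could not import {name!r}; tried: " + "; ".join(errors))


# Demonstration with the standard library: the first path does not exist,
# so the helper falls back to the second.
dumps = import_first("dumps", ["not_a_real_module", "json"])
print(dumps({"ok": True}))  # {"ok": true}
```

Applied to the paths listed above, a call might look like `import_first("vDataFrame", ["verticapy", "verticapy.core"])`, which prefers the top-level import and only falls back to the older path if the first one fails.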
Quickstart
from vertica_python import connect
from verticapy import vDataFrame, set_option

# Optional: configure display
set_option('max_cellwidth', 50)

# Connection parameters (replace with actual credentials)
conn_info = {
    'host': 'localhost',
    'port': 5433,
    'user': 'dbadmin',
    'password': '',
    'database': 'vmart',
    'ssl': False,
}

# Open a connection and get a cursor
conn = connect(**conn_info)
cur = conn.cursor()

# Create a vDataFrame from a table
vdf = vDataFrame('public.my_table', cur)

# Quick exploration
print(vdf.shape())
print(vdf.describe())
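Rather than hardcoding credentials as in the quickstart, the connection dictionary could be built from environment variables. This is a minimal sketch: the function name `conn_info_from_env` and the `VERTICA_*` variable names are assumptions made for illustration, not names defined by VerticaPy or vertica_python. The defaults mirror the values used in the quickstart.

```python
import os


def conn_info_from_env(env=None):
    """Build a connection dict from environment variables (hypothetical names)."""
    env = os.environ if env is None else env
    return {
        "host": env.get("VERTICA_HOST", "localhost"),
        "port": int(env.get("VERTICA_PORT", "5433")),
        "user": env.get("VERTICA_USER", "dbadmin"),
        "password": env.get("VERTICA_PASSWORD", ""),
        "database": env.get("VERTICA_DATABASE", "vmart"),
        # Treat common truthy strings as True; anything else is False.
        "ssl": env.get("VERTICA_SSL", "false").lower() in ("1", "true", "yes"),
    }


# Passing a plain dict instead of os.environ keeps the example reproducible.
conn_info = conn_info_from_env(
    {"VERTICA_HOST": "db.example.internal", "VERTICA_PORT": "5444"}
)
print(conn_info["host"], conn_info["port"])  # db.example.internal 5444
```

The resulting dictionary has the same keys as the `conn_info` in the quickstart, so it can be passed to `connect(**conn_info)` in the same way.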