ODPS Python SDK and data analysis framework

0.12.6 · active · verified Thu Apr 09

PyODPS is the official Python SDK for Alibaba Cloud's MaxCompute (formerly ODPS). It supports basic operations on MaxCompute objects and includes a DataFrame framework for data analysis. The current version is 0.12.6, and the library is actively maintained, with regular releases that add features, enhancements, and bug fixes.

Warnings

Install

Imports

Quickstart

This quickstart initializes the `ODPS` object with credentials read from environment variables, then opens a MaxCompute table (`dual`) to inspect its schema and read a few records. Set `ODPS_PROJECT` and `ODPS_ENDPOINT`, or replace the placeholder values 'your-project' and 'your-endpoint', so that they point at your actual MaxCompute project and endpoint. Keep sensitive credentials in environment variables rather than in source code.

import os
from odps import ODPS

# Read credentials from the environment (ALIBABA_CLOUD_ACCESS_KEY_ID and
# ALIBABA_CLOUD_ACCESS_KEY_SECRET); the fallback strings are placeholders only
access_id = os.environ.get('ALIBABA_CLOUD_ACCESS_KEY_ID', 'your-access-id')
secret_access_key = os.environ.get('ALIBABA_CLOUD_ACCESS_KEY_SECRET', 'your-secret-access-key')
project = os.environ.get('ODPS_PROJECT', 'your-project')
endpoint = os.environ.get('ODPS_ENDPOINT', 'your-endpoint')

# Initialize ODPS object
o = ODPS(access_id, secret_access_key, project=project, endpoint=endpoint)

# Get a table object
table = o.get_table('dual')

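# Print the table name and schema
print(f"Table Name: {table.name}")
print(f"Table Schema: {table.table_schema}")

# Read up to 5 records; the first positional argument of read() is the
# start offset, so the record count is passed by keyword
print("First 5 records:")
with table.open_reader() as reader:
    for record in reader.read(count=5):
        print(record)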
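
The quickstart falls back to placeholder strings when a variable is unset, so a missing credential only shows up as an authentication error on the first request. A minimal sketch of a stricter alternative follows, assuming the same four variable names used in the quickstart. `load_odps_settings` and `REQUIRED_VARS` are hypothetical names introduced for illustration and are not part of PyODPS.

```python
import os

# Variable names mirror the quickstart; this helper is hypothetical
# and not part of PyODPS.
REQUIRED_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
    "ODPS_PROJECT",
    "ODPS_ENDPOINT",
)


def load_odps_settings(environ=None):
    """Return connection settings, raising if any variable is unset or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "access_id": environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
        "secret_access_key": environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
        "project": environ["ODPS_PROJECT"],
        "endpoint": environ["ODPS_ENDPOINT"],
    }
```

With the variables exported, the returned values can be passed in the same form the quickstart uses: `s = load_odps_settings()` followed by `ODPS(s["access_id"], s["secret_access_key"], project=s["project"], endpoint=s["endpoint"])`.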
