{"id":2227,"library":"pyodps","title":"ODPS Python SDK and data analysis framework","description":"PyODPS is the official Python SDK for Alibaba Cloud's MaxCompute (formerly ODPS), providing an elegant way to access MaxCompute APIs. It supports basic operations on MaxCompute objects and includes a DataFrame framework for streamlined data analysis. Currently at version 0.12.6, the library is actively maintained with a regular release cadence, adding new features, enhancements, and bug fixes.","status":"active","version":"0.12.6","language":"en","source_language":"en","source_url":"https://github.com/aliyun/aliyun-odps-python-sdk","tags":["MaxCompute","Alibaba Cloud","Big Data","SDK","Data Analysis","DataFrame"],"install":[{"cmd":"pip install pyodps","lang":"bash","label":"Basic installation"},{"cmd":"pip install pyodps[full]","lang":"bash","label":"Full installation with Jupyter support"}],"dependencies":[{"reason":"Required for package installation and metadata handling.","package":"setuptools","optional":false},{"reason":"HTTP client for API communication.","package":"requests","optional":false},{"reason":"Python 2 and 3 compatibility utilities.","package":"six","optional":false},{"reason":"Used for data serialization.","package":"protobuf","optional":false}],"imports":[{"note":"The core entry point for interacting with MaxCompute (ODPS) services.","symbol":"ODPS","correct":"from odps import ODPS"},{"note":"Used for the pandas-like DataFrame API for data analysis on MaxCompute.","symbol":"DataFrame","correct":"from odps.df import DataFrame"},{"note":"Global configuration options for PyODPS behavior, e.g., enabling interactive mode or schema support.","symbol":"options","correct":"from odps import options"}],"quickstart":{"code":"import os\nfrom odps import ODPS\n\n# Ensure environment variables are set for security and best practice\n# ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET\naccess_id = os.environ.get('ALIBABA_CLOUD_ACCESS_KEY_ID', 
'your-access-id')\nsecret_access_key = os.environ.get('ALIBABA_CLOUD_ACCESS_KEY_SECRET', 'your-secret-access-key')\nproject = os.environ.get('ODPS_PROJECT', 'your-project')\nendpoint = os.environ.get('ODPS_ENDPOINT', 'your-endpoint')\n\n# Initialize ODPS object\no = ODPS(access_id, secret_access_key, project=project, endpoint=endpoint)\n\n# Get a table object\ntable = o.get_table('dual')\n\n# Print the table name and schema\nprint(f\"Table Name: {table.name}\")\nprint(f\"Table Schema: {table.table_schema}\")\nprint(\"First 5 records:\")\nwith table.open_reader() as reader:\n    # Slice the reader: read()'s first positional argument is the start offset, not the count\n    for record in reader[:5]:\n        print(record)","lang":"python","description":"This quickstart initializes the `ODPS` object from environment variables, then opens a MaxCompute table (`dual`, assumed to exist in your project) to print its schema and read the first five records. Replace placeholder values such as 'your-project' and 'your-endpoint' with your actual MaxCompute project and endpoint. Supply credentials through environment variables rather than hard-coding them."},"warnings":[{"fix":"Update your imports from `from odps.accounts import AliyunAccount` to `from odps.accounts import CloudAccount`.","message":"The class `odps.accounts.AliyunAccount` was renamed to `odps.accounts.CloudAccount`. 
Code directly importing or referencing the old name will break.","severity":"breaking","affected_versions":">=0.12.4"},{"fix":"If your endpoint does not support V4 signatures, check whether your PyODPS version and MaxCompute instance allow falling back to the older signature scheme, and configure that explicitly.","message":"MaxCompute V4 signing is enabled by default, which may cause request failures against services or environments that do not support it.","severity":"breaking","affected_versions":">=0.12.4"},{"fix":"Review code that writes or validates decimal values and make sure each value fits the precision and scale declared for its column.","message":"Client-side decimal precision and scale checks have been tightened to match MaxCompute server-side behavior. Decimal values that previously passed client-side validation may now be rejected.","severity":"breaking","affected_versions":">=0.12.5"},{"fix":"Move `import` statements for third-party libraries into the `evaluate` or `process` method of your UDF class. Ensure the third-party package is uploaded as an archive resource to MaxCompute.","message":"When using third-party packages in Python UDFs for MaxCompute, import statements for these packages must be placed *inside* the UDF's `evaluate` method (or a similar processing method). Module-level imports lead to runtime errors because the package is only available within the execution context on the MaxCompute server.","severity":"gotcha","affected_versions":"All versions when using UDFs with third-party packages"},{"fix":"Use `o.run_security_query()` for `GRANT`/`REVOKE` statements and `o.run_xflow()` or `o.execute_xflow()` for PAI commands. Table-level DDL can also be performed through dedicated methods on the `ODPS` object, such as `o.create_table()` and `o.delete_table()`.","message":"`execute_sql()` and `run_sql()` accept SQL statements, including DDL and DML, but not every command that works in the MaxCompute console is SQL that these methods accept. Statements such as `GRANT`/`REVOKE` and PAI commands must be issued through other methods.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Run heavy data processing on the MaxCompute cluster through the PyODPS DataFrame API or MaxCompute SQL, and download only aggregated or sampled results to your local environment.","message":"Downloading an entire large dataset to a local machine with PyODPS can cause Out-Of-Memory (OOM) errors, because MaxCompute tables are often far larger than local memory.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-09T00:00:00.000Z","next_check":"2026-07-08T00:00:00.000Z"}