OpenDataLab Python SDK

0.0.10 · active · verified Sun Apr 12

The OpenDataLab Python SDK (version 0.0.10) provides programmatic access to the OpenDataLab platform and its open datasets. It offers a Pythonic interface to platform resources and ships a command-line tool, `odl`, for dataset operations. The SDK is still a work in progress (WIP): compatibility across releases is not guaranteed, so users are advised to run the latest version.
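Because compatibility across releases is not guaranteed, it can help to confirm which version is installed before relying on any behavior. The following is a minimal sketch using only the standard library; the distribution name `opendatalab` is an assumption and should be checked against your environment, and `installed_version` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# "opendatalab" is assumed to be the distribution name; adjust if yours differs.
print(installed_version("opendatalab"))
```

A result of `None` means the package was not found under that name in the current environment.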

Warnings

Install

Imports

Quickstart

This quickstart shows how to initialize `OdlClient` and outlines how a dataset download would work. An OpenDataLab account is required. The official documentation currently emphasizes the `odl` CLI for these operations, and the main README gives few direct Python examples for authentication and dataset download. Unless a client-side login method is documented for your version, run `odl login` to establish a session before using the SDK programmatically.

import os
from opendatalab.client import OdlClient

# An OpenDataLab account is required. Register at https://opendatalab.org.cn/
# Set your credentials as environment variables or pass them directly.
USERNAME = os.environ.get('OPEN_DATALAB_USERNAME', 'your_username')
PASSWORD = os.environ.get('OPEN_DATALAB_PASSWORD', 'your_password')

# Initialize the client
odl_client = OdlClient()

try:
    # Authentication: the public docs describe the `odl login` CLI command.
    # A login method on OdlClient is plausible but not documented in detail,
    # so this quickstart assumes a session was already created with `odl login`.
    print(f"Expecting an existing session for {USERNAME} (created with `odl login`)...")

    # Download a dataset (replace 'YOUR_DATASET_ID' with a real one).
    dataset_id = 'YOUR_DATASET_ID'  # e.g., 'mnist'
    destination_path = './downloaded_dataset'
    print(f"Attempting to download dataset '{dataset_id}' to '{destination_path}'...")

    # The call below is inferred from the CLI's `odl get` and common SDK patterns.
    # It stays commented out until verified against the SDK source or documentation.
    # odl_client.get(dataset_id, destination_path)
    print(f"Per the current documentation, run 'odl login' and then 'odl get {dataset_id} -o {destination_path}'.")
    print("The Python SDK offers underlying access, but the CLI is the primary documented interface for these actions.")

except Exception as e:
    print(f"An error occurred: {e}")
