sodapy: Python client for Socrata Open Data API

2.2.0 · deprecated · verified Wed Apr 15

sodapy is a Python client for the Socrata Open Data API (SODA), enabling programmatic access to datasets from Socrata-powered platforms. While the library is functional, it has been unmaintained since August 31, 2022, with no new features or bug fixes planned. The current version is 2.2.0, and it is compatible with Python 3.5-3.10.

Quickstart

This quickstart initializes a Socrata client, retrieves the first 5 records from a public dataset (NYC 311 Service Requests), and fetches that dataset's metadata. It shows how to set up the client with an optional application token and credentials, and how to perform a basic data retrieval. Credentials are read from environment variables so they stay out of the source code.

import os
from sodapy import Socrata

# Get credentials from environment variables or provide directly
APP_TOKEN = os.environ.get('SOCRATA_APP_TOKEN', None) # Recommended for higher rate limits
USERNAME = os.environ.get('SOCRATA_USERNAME', None) # Only required for creating/modifying data
PASSWORD = os.environ.get('SOCRATA_PASSWORD', None) # Only required for creating/modifying data

# Example: Connect to a public dataset (e.g., NYC Open Data - 311 Service Requests)
# Replace 'data.cityofnewyork.us' with your Socrata domain
# Replace 'erm2-nwe9' with your dataset identifier
domain = 'data.cityofnewyork.us'
dataset_identifier = 'erm2-nwe9'

with Socrata(domain, APP_TOKEN, username=USERNAME, password=PASSWORD) as client:
    # Increase timeout for large datasets if needed
    # client.timeout = 50

    # Example: Retrieve the first 5 records
    print(f"Retrieving the first 5 records from {dataset_identifier} on {domain}...")
    results = client.get(dataset_identifier, limit=5)

    # Results are returned as a list of dictionaries
    for item in results:
        print(item)

    print(f"\nRetrieved {len(results)} records.")

    # Example: Retrieve metadata for the dataset
    print(f"\nRetrieving metadata for {dataset_identifier}...")
    metadata = client.get_metadata(dataset_identifier)
    print(f"Dataset Name: {metadata.get('name')}")
    # Fall back to an empty string if the description is missing or null
    print(f"Description: {(metadata.get('description') or '')[:100]}...")
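
The quickstart fetches a single page of 5 records. For larger pulls, one common approach is to page through the dataset with the `limit` and `offset` keyword arguments that `client.get` forwards to the API. The sketch below is illustrative only: `iter_records` and `socrata_fetcher` are hypothetical helper names, not part of sodapy, and ordering by the `:id` system field is an assumption made to keep paging stable.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Row = Dict[str, Any]
FetchPage = Callable[[int, int], List[Row]]  # (limit, offset) -> one page of rows


def iter_records(fetch_page: FetchPage, page_size: int = 1000,
                 max_records: Optional[int] = None) -> Iterator[Row]:
    """Yield rows page by page until the source is exhausted or max_records is hit."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    yielded = 0
    while max_records is None or yielded < max_records:
        page = fetch_page(page_size, offset)
        for row in page:
            yield row
            yielded += 1
            if max_records is not None and yielded >= max_records:
                return
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        offset += page_size


def socrata_fetcher(client: Any, dataset_identifier: str, **soql: Any) -> FetchPage:
    """Adapt a sodapy client to the (limit, offset) callable used above (hypothetical helper)."""
    # Assumption: a fixed sort order keeps pages from overlapping between requests.
    soql.setdefault("order", ":id")

    def fetch(limit: int, offset: int) -> List[Row]:
        return client.get(dataset_identifier, limit=limit, offset=offset, **soql)

    return fetch


# Usage, inside the quickstart's `with Socrata(...) as client:` block:
#     fetch = socrata_fetcher(client, dataset_identifier)
#     for row in iter_records(fetch, page_size=500, max_records=2000):
#         print(row)
```

Keeping the paging loop separate from the client makes it easy to exercise with an in-memory stub before pointing it at a real domain, and `max_records` caps the number of rows pulled from very large datasets such as the 311 example.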
