SDMX: Statistical Data and Metadata eXchange
sdmx1 is a Python library for consuming and working with Statistical Data and Metadata eXchange (SDMX) web services and files. It supports various SDMX versions and data formats, allowing users to query, download, and parse statistical data from official sources like Eurostat, IMF, and OECD. The library is actively maintained with frequent minor releases, currently at version 2.26.0, and requires Python >=3.10.
Common errors
- AttributeError: module 'sdmx' has no attribute 'Client'
  cause: Most likely the installed package predates `sdmx1` 2.0.0, where the connection class was still named `sdmx.Request`. `sdmx.Client` is the current name; `sdmx.Request` is its deprecated predecessor.
  fix: Upgrade with `pip install --upgrade sdmx1` and create connections with `sdmx.Client()`. For example, `client = sdmx.Client('ESTAT')`.
- requests.exceptions.HTTPError: 404 Client Error: Not Found for url: ...
  cause: The data source or resource ID in your request does not exist, or the URL constructed by the library points to a non-existent endpoint. This usually comes down to a mistyped source ID or resource ID.
  fix: Double-check the source ID passed to `sdmx.Client()` and the `resource_id` (e.g., the dataflow ID) passed to `client.data()`. Verify both against the official SDMX documentation or the agency's portal.
- requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: ...
  cause: The parameters sent in your data request are invalid, incomplete, or not understood by the SDMX API. Common issues include incorrect dimension keys, omitted parameters that the provider requires (such as `startPeriod`/`endPeriod`), or invalid date formats.
  fix: Review the `key` and `params` arguments passed to `client.data()`. Ensure all required dimensions are specified and that their values are valid according to the agency's data structure definitions. Consult the agency's SDMX API documentation.
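The 404 and 400 cases above can be told apart programmatically. The following is a minimal sketch, assuming the raised exception exposes the HTTP response as a `.response` attribute with a `status_code`, as `requests.exceptions.HTTPError` does. The helper names `status_of` and `explain` are hypothetical and not part of `sdmx1`.

```python
from typing import Optional


def status_of(exc: Exception) -> Optional[int]:
    """Return the HTTP status code attached to an exception, if any."""
    # Assumption: HTTP errors carry the response object as `.response`
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None)


def explain(exc: Exception) -> str:
    """Map an HTTP status code to the first thing worth checking."""
    status = status_of(exc)
    if status == 404:
        return "404: check the source ID and the resource (dataflow) ID"
    if status == 400:
        return "400: check the key, dimension values, and period parameters"
    if status is None:
        return f"no HTTP status available: {exc}"
    return f"{status}: see the data provider's API documentation"
```

A caller would wrap the request in `try`/`except Exception as exc` and print `explain(exc)`.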
Warnings
- breaking In version 2.0.0, `sdmx.Request` was renamed to `sdmx.Client`. `sdmx.Request` is deprecated and may be absent in newer releases, so code written for earlier versions must change how it opens connections to SDMX data sources.
- breaking Many parameter names for data queries and metadata requests changed in version 2.0.0 (e.g., `agency_id` often became `agency`), and the structure of returned objects (e.g., `Message`, `DataSet`, `Series`, `Obs`) was refined.
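- gotcha SDMX data is highly structured and often nested. New users may find it challenging to navigate from a data message through its `DataSet` objects down to individual series and observations. `sdmx.to_pandas()` converts a message or data set to pandas objects, which is often easier to work with.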
- gotcha Many SDMX web services (APIs) have strict rate limits, size limits for requests, or require specific query parameters (e.g., `startPeriod`, `endPeriod`, `dimensionAtObservation`) to fetch data successfully.
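To illustrate the nesting described in the first gotcha, here is a minimal sketch that walks a data message down to its observation values. It relies only on attribute access and assumes the layout of the `sdmx1` information model: a message with a `data` list of data sets, each with a dict-like `series` mapping a series key to a list of observations that carry a `value`. The function name `iter_observations` is hypothetical.

```python
from typing import Any, Iterator, Tuple


def iter_observations(message: Any) -> Iterator[Tuple[Any, Any]]:
    """Yield (series_key, value) pairs from every data set in a data message."""
    for dataset in message.data:  # assumed: list of data sets
        # assumed: dict-like mapping of series key -> list of observations
        for series_key, observations in dataset.series.items():
            for obs in observations:
                yield series_key, obs.value
```

Data sets that are not organized into series would need a different traversal; this sketch does not cover them.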
Install
- pip install sdmx1
- pip install sdmx1[cache]
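Because behavior differs between releases, it can help to confirm which version is actually installed. A minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `sdmx1` is taken from the install commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "sdmx1") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "sdmx1 is not installed")
```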
Imports
- Client
  import sdmx
  client = sdmx.Client('ESTAT')
  (`sdmx.Request` is the deprecated pre-2.0.0 name for this class.)
- read_sdmx
  import sdmx
  with open('data.xml', 'rb') as f:
      msg = sdmx.read_sdmx(f)
Quickstart
import os

import requests
import sdmx

# Example source ID; replace with another supported source such as 'ESTAT' (Eurostat) or 'OECD'.
# Some sources may require authentication or enforce strict rate limits.
source_id = os.environ.get('SDMX_AGENCY_ID', 'IMF')  # IMF is used here as a common example

try:
    # Create a client for the chosen data source
    client = sdmx.Client(source_id)
    print(f"Attempting to connect to SDMX source: {source_id}")

    # Fetch the available dataflows.
    # This performs an HTTP GET request against the source's API endpoint.
    flow_msg = client.dataflow()
    print(f"Successfully retrieved dataflows from {source_id}.")

    # A structure message exposes its dataflows as a dict-like mapping of ID to object
    flows = list(flow_msg.dataflow.values())
    if flows:
        print(f"First 3 dataflows from {source_id} (ID: Name):")
        for flow in flows[:3]:
            print(f"  - {flow.id}: {flow.name}")
    else:
        print("  No dataflows found.")

    # To fetch actual data, pass a specific dataflow ID and key.
    # Example (commented out, as dataflow IDs and keys vary greatly between sources):
    # # For IMF, using the International Financial Statistics (IFS) dataflow, if available
    # if source_id == 'IMF':
    #     print("\nAttempting to fetch data from IMF (example).")
    #     data_msg = client.data(
    #         resource_id='IFS',
    #         key={'REF_AREA': ['US', 'CN'], 'INDICATOR': ['LP_CPI_IX']},
    #         params={'startPeriod': '2020', 'endPeriod': '2022'},
    #     )
    #     print(f"Fetched {len(data_msg.data[0].series)} series from IFS.")
except requests.exceptions.HTTPError as e:
    # HTTP-level failures (e.g., 400 or 404) surface as requests' HTTPError
    print(f"Error connecting to or fetching data from {source_id}: {e}")
    print("Possible causes: invalid resource ID, API rate limits, or specific API endpoint errors.")
except Exception as e:
    # Covers an unknown source ID, network failures, and parsing problems
    print(f"An unexpected error occurred: {e}")
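Since the quickstart lists rate limits among the possible failure causes, a retry wrapper is one way to make requests more tolerant. The following is a minimal sketch of retrying with exponential backoff; it is not a feature of `sdmx1`, and the name `with_retries` and its parameters are hypothetical. For brevity it retries on any exception, whereas production code would more plausibly retry only on responses that indicate throttling.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3, base_delay: float = 1.0) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            # Re-raise once the final attempt has failed
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Any zero-argument callable that performs the request can be passed in, for example a `lambda` wrapping the dataflow query from the quickstart.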