Hadoop YARN API Client

1.0.3 · active · verified Tue Apr 14

A Python client for interacting with the Hadoop® YARN API. It provides programmatic access to YARN components like ResourceManager, ApplicationMaster, HistoryServer, and NodeManager. The library is actively maintained, with a focus on supporting recent Python and Hadoop YARN versions, and has a steady release cadence for minor improvements and bug fixes.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to initialize the `ResourceManager` and fetch a list of applications currently running or finished on the YARN cluster. It highlights configuring YARN endpoints and includes basic error handling for connection issues.

import os
from yarn_api_client.resource_manager import ResourceManager

# Configure YARN ResourceManager endpoints
# These can also be discovered automatically if YARN_CONF_DIR or HADOOP_CONF_DIR
# environment variables are set and point to valid Hadoop configuration.
# For local testing, ensure a YARN ResourceManager is running or adjust endpoint.

# Using a dummy endpoint if not set, for demonstration purposes.
# In a real scenario, replace with your actual YARN ResourceManager URL(s).
# Example for HA: ['http://rm1.example.com:8088', 'http://rm2.example.com:8088']
# Split on commas, trim whitespace, and drop empty entries.
raw_endpoints = os.environ.get('YARN_RM_ENDPOINTS', '')
rm_endpoints = [e.strip() for e in raw_endpoints.split(',') if e.strip()]

if not rm_endpoints:
    print("Warning: YARN_RM_ENDPOINTS is not set or is empty. Using 'http://localhost:8088' as default.")
    rm_endpoints = ['http://localhost:8088']

print(f"Attempting to connect to YARN ResourceManager at: {rm_endpoints}")

try:
    resource_manager = ResourceManager(rm_endpoints)

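    # Fetch cluster applications. The call returns a response object whose
    # parsed JSON body is available as a dict on .data.
    applications_response = resource_manager.cluster_applications()

    # The YARN REST API nests the list as {"apps": {"app": [...]}} and may
    # return {"apps": null} when there are no applications.
    apps = (applications_response.data.get('apps') or {}).get('app') or []

    if apps:
        print(f"Found {len(apps)} applications.")
        for app in apps[:3]:  # Print first 3 apps
            print(f"  Application ID: {app['id']}, Name: {app['name']}, State: {app['state']}")
    else:
        print("No applications found on the YARN cluster.")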

except Exception as e:
    print(f"Error connecting to YARN ResourceManager or fetching applications: {e}")
    print("Please ensure a YARN ResourceManager is running and accessible at the configured endpoint(s).")
    print("You can set the YARN_RM_ENDPOINTS environment variable, e.g., export YARN_RM_ENDPOINTS='http://your-rm-host:8088'")

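
Once the application list is in hand, a common next step is to summarize it. The sketch below is not part of the library: `extract_apps` and `count_by_state` are hypothetical helper names, and the `sample` payload is made-up data shaped like the YARN REST API's Cluster Applications response (`{"apps": {"app": [...]}}`). It runs without a cluster, so it can be used to check the parsing logic in isolation.

```python
from collections import Counter


def extract_apps(payload):
    """Return the list of application dicts from a Cluster Applications payload.

    The YARN REST API nests the list as {"apps": {"app": [...]}} and may
    return {"apps": null} when nothing matches, so both levels are guarded.
    """
    apps = (payload or {}).get("apps") or {}
    return apps.get("app") or []


def count_by_state(app_list):
    """Tally applications by their reported state (e.g. RUNNING, FINISHED)."""
    return Counter(app.get("state", "UNKNOWN") for app in app_list)


# Hypothetical sample payload, shaped like a Cluster Applications response.
sample = {
    "apps": {
        "app": [
            {"id": "application_1_0001", "name": "etl", "state": "RUNNING"},
            {"id": "application_1_0002", "name": "report", "state": "FINISHED"},
            {"id": "application_1_0003", "name": "etl", "state": "FINISHED"},
        ]
    }
}

print(count_by_state(extract_apps(sample)))
print(extract_apps({"apps": None}))
```

Against a live cluster, the same helpers could be fed the parsed JSON body of the `cluster_applications()` response, assuming the client exposes it as a dict (for example via a `.data` attribute).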