{"id":6302,"library":"yarn-api-client","title":"Hadoop YARN API Client","description":"A Python client for interacting with the Hadoop® YARN API. It provides programmatic access to YARN components like ResourceManager, ApplicationMaster, HistoryServer, and NodeManager. The library is actively maintained, with a focus on supporting recent Python and Hadoop YARN versions, and has a steady release cadence for minor improvements and bug fixes.","status":"active","version":"1.0.3","language":"en","source_language":"en","source_url":"https://github.com/CODAIT/hadoop-yarn-api-python-client","tags":["hadoop","yarn","api-client","big-data","cluster-management"],"install":[{"cmd":"pip install yarn-api-client","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Used for making HTTP requests to the YARN API.","package":"requests","optional":false},{"reason":"Provides Kerberos/SPNEGO authentication support, optional since v0.3.2.","package":"requests-kerberos","optional":true}],"imports":[{"symbol":"ResourceManager","correct":"from yarn_api_client.resource_manager import ResourceManager"},{"symbol":"ApplicationMaster","correct":"from yarn_api_client.application_master import ApplicationMaster"},{"symbol":"HistoryServer","correct":"from yarn_api_client.history_server import HistoryServer"},{"symbol":"NodeManager","correct":"from yarn_api_client.node_manager import NodeManager"}],"quickstart":{"code":"import os\nfrom yarn_api_client.resource_manager import ResourceManager\n\n# Configure YARN ResourceManager endpoints\n# These can also be discovered automatically if YARN_CONF_DIR or HADOOP_CONF_DIR\n# environment variables are set and point to valid Hadoop configuration.\n# For local testing, ensure a YARN ResourceManager is running or adjust endpoint.\n\n# Using a dummy endpoint if not set, for demonstration purposes.\n# In a real scenario, replace with your actual YARN ResourceManager URL(s).\n# Example for HA: ['http://rm1.example.com:8088', 
'http://rm2.example.com:8088']\n# Split the comma-separated value and drop blank entries.\nrm_endpoints = [e.strip() for e in os.environ.get('YARN_RM_ENDPOINTS', '').split(',') if e.strip()]\n\nif not rm_endpoints:\n    print(\"Warning: YARN_RM_ENDPOINTS is not set or is empty. Using 'http://localhost:8088' as default.\")\n    rm_endpoints = ['http://localhost:8088']\n\nprint(f\"Attempting to connect to YARN ResourceManager at: {rm_endpoints}\")\n\ntry:\n    resource_manager = ResourceManager(rm_endpoints)\n\n    # Fetch cluster applications; the parsed JSON body is available as `.data`\n    applications_response = resource_manager.cluster_applications()\n\n    # YARN nests the list as {'apps': {'app': [...]}}; 'apps' is null when there are none\n    apps = ((applications_response.data or {}).get('apps') or {}).get('app') or []\n\n    if apps:\n        print(f\"Found {len(apps)} applications.\")\n        for app in apps[:3]:  # Print first 3 apps\n            print(f\"  Application ID: {app['id']}, Name: {app['name']}, State: {app['state']}\")\n    else:\n        print(\"No applications found on the YARN cluster.\")\n\nexcept Exception as e:\n    print(f\"Error connecting to YARN ResourceManager or fetching applications: {e}\")\n    print(\"Please ensure a YARN ResourceManager is running and accessible at the configured endpoint(s).\")\n    print(\"You can set the YARN_RM_ENDPOINTS environment variable, e.g., export YARN_RM_ENDPOINTS='http://your-rm-host:8088'\")\n","lang":"python","description":"This quickstart demonstrates how to initialize the `ResourceManager` and fetch a list of applications currently running or finished on the YARN cluster. It reads the endpoints from the `YARN_RM_ENDPOINTS` environment variable, reads the application list from the response's `data` attribute, and includes basic error handling for connection issues."},"warnings":[{"fix":"Migrate your codebase to Python 3.6+ to use versions 1.0.3 and later. If Python 2.7 is strictly required, pin the library version to <1.0.3.","message":"Python 2.7 support was officially dropped in version 1.0.3. 
Code written for Python 2.7 will likely fail with syntax errors or missing features.","severity":"breaking","affected_versions":"1.0.3 and later"},{"fix":"Update constructor calls to pass full endpoint URLs, including the scheme. `ResourceManager` takes a list, even for a single endpoint; the other classes take a single URL string. For example, `ResourceManager('localhost', 8088)` becomes `ResourceManager(['http://localhost:8088'])`.","message":"Version 1.0.0 introduced a major API cleanup. The `ResourceManager`, `ApplicationMaster`, `HistoryServer`, and `NodeManager` constructors no longer accept separate `address` and `port` parameters. Instead, they require complete endpoint URLs that include the scheme (e.g., `http://localhost:8088`). `ResourceManager` now takes a list of such endpoints for HA support.","severity":"breaking","affected_versions":"1.0.0 and later"},{"fix":"Pass a list of all ResourceManager URLs: `ResourceManager(['http://rm1.example.com:8088', 'http://rm2.example.com:8088'])`.","message":"When using YARN in High Availability (HA) mode, provide every ResourceManager endpoint, active and standby, to the `ResourceManager` constructor. The client will attempt to connect to the active RM from the provided list.","severity":"gotcha","affected_versions":"1.0.0 and later"},{"fix":"Be mindful of these environment variables. If you wish to explicitly control endpoints, ensure these variables are not set or that your explicit endpoint configuration takes precedence as expected. Consult the official documentation for precedence rules.","message":"The library can automatically discover Hadoop configuration by checking `YARN_CONF_DIR` or `HADOOP_CONF_DIR` environment variables. 
If these are set, explicit endpoints provided in the constructor might be overridden or interact unexpectedly with discovered configurations.","severity":"gotcha","affected_versions":"All versions with configuration discovery (from 0.3.4, enhanced in 1.0.3)"},{"fix":"Upgrade to version 1.0.2 or later to benefit from improved robustness when handling empty or malformed YARN responses. Implement robust error handling and check for empty content in your application logic.","message":"Older YARN deployments or certain API calls might return empty JSON responses, which could cause parsing errors in the client. Version 1.0.2 improved handling of such cases.","severity":"gotcha","affected_versions":"Prior to 1.0.2"}],"env_vars":null,"last_verified":"2026-04-14T00:00:00.000Z","next_check":"2026-07-13T00:00:00.000Z"}