NVIDIA One Logger Core
nv-one-logger-core is the foundational logging library of the NVIDIA one-logger ecosystem. It provides the core logging primitives (spans, events, and attributes), which are used to track the progress of GPU applications and to identify overhead. The library also integrates with OpenTelemetry (OTEL) for backend telemetry. The current version is 2.3.1, and recent releases indicate active development.
Common errors
-
OSError: [Errno 24] Too many open files: '/path/to/onelogger.err'
cause File descriptors for internal log files are not being correctly closed, leading to resource exhaustion, particularly in long-running applications or distributed training jobs.
fix This issue was reported to be fixed in versions 2.5.0+. Upgrade `nv-one-logger-core` and any dependent `nv-one-logger` components (e.g., `nv-one-logger-training-telemetry`) to their latest available versions that include the fix.
-
WARNING - Skipping execution of on_train_start because OneLogger is not enabled.
cause An `nv-one-logger` API or callback (e.g., `on_train_start` from an integration) is being invoked when the OneLogger system has not been fully initialized or is explicitly disabled.
fix Verify the logging configuration and ensure `configure_otel()` (or equivalent initialization for other backends) is called and `OneLogger` is globally enabled before calls to its APIs. In distributed setups, ensure all processes correctly initialize the logger.
-
ModuleNotFoundError: No module named 'nv_one_logger_core'
cause Attempting to import modules or classes directly from 'nv_one_logger_core' as if it were the top-level package. The correct top-level package is `nv_one_logger`.
fix Adjust import statements to use `from nv_one_logger.core.<submodule> import ...` or `from nv_one_logger.otel.<submodule> import ...` based on the specific component you need.
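The import error above can be diagnosed before any logger code runs. The following is a minimal sketch that uses only the Python standard library; the helper name `is_importable` is hypothetical and is not part of nv-one-logger. It reports whether a top-level package can be located without importing it.

```python
import importlib.util


def is_importable(top_level_name: str) -> bool:
    """Return True if a top-level package or module can be located.

    Hypothetical helper for illustration. Pass top-level names only
    (no dots), because find_spec imports parent packages for dotted names.
    """
    return importlib.util.find_spec(top_level_name) is not None


if __name__ == "__main__":
    # Per the entry above, `nv_one_logger` is the expected top-level package,
    # while `nv_one_logger_core` is the name that raises ModuleNotFoundError.
    for name in ("nv_one_logger", "nv_one_logger_core"):
        status = "found" if is_importable(name) else "not found"
        print(f"{name}: {status}")
```

If `nv_one_logger` is reported as not found, the package is likely missing from the active environment rather than imported under the wrong name.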
Warnings
- breaking Older versions (prior to 2.5.0) of the underlying `nv-one-logger` system, which `nv-one-logger-core` is part of, had a 'Too many open files' (`OSError: [Errno 24]`) issue, particularly when integrated with frameworks like NeMo, due to file handlers not being correctly closed.
- gotcha When `nv-one-logger` is disabled or not properly initialized in an environment (e.g., specific distributed training ranks), you might encounter 'Skipping execution of X because OneLogger is not enabled.' warnings. This indicates that instrumentation calls are being made without an active logger.
- gotcha The PyPI description 'Extensions to onelogger library' might be misleading. `nv-one-logger-core` is a core component of the `NVIDIA/nv-one-logger` project, which includes its own OpenTelemetry integration (`nv_one_logger.otel`), and is distinct from other Python 'onelogger' projects (e.g., realsdx/onelogger).
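The 'Too many open files' warning above describes a general failure mode: a file handler that is opened but never closed keeps its descriptor alive. The sketch below illustrates that pattern and its remedy with Python's standard `logging` module only. It is not nv-one-logger's internal code, and the function `log_once` and the logger name `fd_demo` are hypothetical.

```python
import logging
import os
import tempfile


def log_once(path: str, message: str) -> logging.FileHandler:
    """Write one record to `path`, then release the file descriptor."""
    logger = logging.getLogger("fd_demo")  # hypothetical logger name
    handler = logging.FileHandler(path)
    logger.addHandler(handler)
    try:
        logger.warning(message)
    finally:
        # Detach and close the handler. Skipping this step leaks one
        # descriptor per handler, which can eventually raise Errno 24.
        logger.removeHandler(handler)
        handler.close()
    return handler


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        closed = log_once(os.path.join(tmp, "demo.err"), "example record")
        # After close(), the handler no longer holds an open stream.
        print(closed.stream is None)
```

Pairing every `addHandler` with `removeHandler` and `close` in a `finally` block keeps the descriptor count flat even when logging raises.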
Install
-
pip install nv-one-logger-core
-
pip install nv-one-logger-core==2.3.1
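After installing, it can help to confirm which version is actually present, since the Common errors section reports the file descriptor fix in 2.5.0+. The following is a minimal sketch using only `importlib.metadata` from the standard library; the helper names `installed_version` and `numeric_prefix` are hypothetical, and the simple parser is an assumption that ignores pre-release suffixes.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def numeric_prefix(version: str) -> Tuple[int, ...]:
    """Parse leading numeric components, e.g. '2.3.1' -> (2, 3, 1).

    Parsing stops at the first non-numeric component, so '2.5.0rc1'
    yields (2, 5) and compares as older than (2, 5, 0).
    """
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


if __name__ == "__main__":
    found = installed_version("nv-one-logger-core")
    if found is None:
        print("nv-one-logger-core is not installed")
    elif numeric_prefix(found) < (2, 5, 0):
        print(f"{found}: older than the reported fix version 2.5.0")
    else:
        print(f"{found}: at or above the reported fix version 2.5.0")
```

For strict version semantics, a dedicated parser such as the third-party `packaging` library is the more robust choice; the tuple comparison here only keeps the sketch dependency-free.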
Imports
- Logger
wrong: from nv_one_logger_core import Logger
correct: from nv_one_logger.core.api import Logger
- Span
wrong: from nv_one_logger_core.Span import Span
correct: from nv_one_logger.core.api import Span
- OTELConfig
from nv_one_logger.otel.config import OTELConfig
- configure_otel
from nv_one_logger.otel.api import configure_otel
Quickstart
import os
from nv_one_logger.core.api import Logger, Span
from nv_one_logger.otel.api import configure_otel
from nv_one_logger.otel.config import OTELConfig
# Configure OpenTelemetry to export to console for demonstration
otel_config = OTELConfig(
    exporter_type='console',  # Options: 'console', 'otlp_grpc', 'otlp_http'
    service_name='my-application',
    endpoint=os.environ.get('OTEL_EXPORTER_OTLP_ENDPOINT', '')  # Set if using otlp_grpc or otlp_http
)
configure_otel(otel_config)
# Get a logger instance
logger = Logger.get_logger("my_app_logger")
# Log a simple message
logger.info("Application started.")
# Create a span to track an operation
with Span.create("process_data", logger=logger) as span:
    span.set_attribute("input_size", 100)
    logger.debug("Processing data...")
    # Simulate some work
    result = sum(range(100))
    span.set_attribute("output_result", result)
    logger.info("Data processed successfully.")
logger.info("Application finished.")