Neptune Query
Neptune Query is a Python library (current version 1.14.1) for retrieving logged metadata from the Neptune MLOps platform. It provides a read-only API for fetching experiments, runs, and their attributes programmatically, typically as pandas DataFrames. The library is actively maintained, with frequent minor and patch releases, and offers a stable interface for data retrieval and analysis.
Warnings
- gotcha This library (`neptune-query`) is for interacting with the neptune.ai MLOps platform, not Amazon Neptune (AWS's graph database service). There is common confusion due to the name overlap. Ensure you are using the correct library for your intended platform.
- breaking In version 1.10.0, the external dependency on `neptune-api` was dropped. A copy of `neptune-api` (version 0.26.0) was bundled directly within `neptune-query`. This simplifies `neptune-query`'s dependency management but could break environments where users explicitly managed `neptune-api` versions in conjunction with `neptune-query` or expected a specific external `neptune-api` version to be used.
- gotcha Authentication relies on the `NEPTUNE_API_TOKEN` and `NEPTUNE_PROJECT` environment variables, or on passing the values explicitly: the token via `set_api_token` and the project via the `project` argument of each function. If neither is provided, calls can raise authentication errors or fail silently. Also make sure the API token has permission to read the target project.
- deprecated The Neptune Fetcher API (`neptune-fetcher`) is deprecated in favor of `neptune-query`. If you are migrating from older Neptune clients, you should transition to using `neptune-query` for all data retrieval tasks.
- gotcha Some functions, such as `fetch_experiments_table_global()` and `fetch_runs_table_global()`, are explicitly marked as 'experimental'. Their API signatures or behavior might change in future releases, and they may not be as optimized or stable as core functions.
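Because missing credentials can fail silently (see the authentication gotcha above), it can help to check them before calling the library. The following is a minimal sketch, not part of `neptune-query`: the helper names `find_missing_credentials` and `resolve_credentials` are hypothetical, and only the two environment variable names come from the text above.

```python
import os

# Environment variable names taken from the authentication note above.
REQUIRED_ENV_VARS = ("NEPTUNE_API_TOKEN", "NEPTUNE_PROJECT")


def find_missing_credentials(api_token=None, project=None):
    """Return names of credentials that are neither passed explicitly nor set in the environment."""
    explicit = {"NEPTUNE_API_TOKEN": api_token, "NEPTUNE_PROJECT": project}
    return [
        name
        for name in REQUIRED_ENV_VARS
        if not (explicit[name] or os.environ.get(name))
    ]


def resolve_credentials(api_token=None, project=None):
    """Return (api_token, project), preferring explicit arguments over environment variables."""
    missing = find_missing_credentials(api_token, project)
    if missing:
        # Fail loudly instead of letting a later call fail silently.
        raise RuntimeError("Missing Neptune credentials: " + ", ".join(missing))
    return (
        api_token or os.environ["NEPTUNE_API_TOKEN"],
        project or os.environ["NEPTUNE_PROJECT"],
    )
```

The returned pair could then be passed on to `set_api_token` and to the `project` argument of the fetch functions.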
Install
pip install neptune-query
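Since `neptune-api` has been bundled since 1.10.0 and `neptune-fetcher` is deprecated, it may be worth checking which of these distributions are present in an environment after installing. A minimal sketch using only the standard library; the helper names are hypothetical, and the distribution names are the ones mentioned in the warnings above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


def report_neptune_packages():
    """Map each Neptune-related distribution name to its installed version (or None)."""
    names = ("neptune-query", "neptune-api", "neptune-fetcher")
    return {name: installed_version(name) for name in names}
```

A non-`None` entry for `neptune-api` or `neptune-fetcher` alongside `neptune-query` suggests a leftover from an older setup that may be worth reviewing.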
Imports
- neptune_query
import neptune_query as nq
- neptune_query.runs
import neptune_query.runs as nq_runs
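The `neptune_query.runs` module targets individual runs rather than experiments. A minimal sketch of how it might be used, assuming the module exposes `list_runs(project=...)` returning run identifiers (an assumption to verify against the installed version). The helper `first_run_ids` is hypothetical and takes the module as an argument, so it can be tried with a stand-in object and without network access.

```python
from typing import Any


def first_run_ids(runs_api: Any, project: str, limit: int = 5) -> list[str]:
    """Return up to `limit` run identifiers from `project`.

    `runs_api` is expected to be the `neptune_query.runs` module; `list_runs`
    is assumed to accept a `project` keyword and return an iterable of strings.
    """
    run_ids = list(runs_api.list_runs(project=project))
    return run_ids[:limit]


# Usage (requires neptune-query and valid credentials):
#   import neptune_query.runs as nq_runs
#   print(first_run_ids(nq_runs, "workspace-name/project-name"))
```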
Quickstart
import os
import neptune_query as nq
import pandas as pd
# Set your Neptune API token and project name as environment variables.
# Example: export NEPTUNE_API_TOKEN="YOUR_API_TOKEN"
# Example: export NEPTUNE_PROJECT="workspace-name/project-name"
# Ensure environment variables are set or pass them explicitly
neptune_api_token = os.environ.get('NEPTUNE_API_TOKEN', 'YOUR_API_TOKEN_HERE')
neptune_project = os.environ.get('NEPTUNE_PROJECT', 'your_workspace/your_project')
if neptune_api_token == 'YOUR_API_TOKEN_HERE' or neptune_project == 'your_workspace/your_project':
    print("Warning: Please set NEPTUNE_API_TOKEN and NEPTUNE_PROJECT environment variables or provide them explicitly.")
    # For demonstration, stop here if credentials aren't set.
    # In a real application, handle this appropriately (e.g., raise an error, prompt user).
    raise SystemExit(1)
# Set the API token for the session (optional if NEPTUNE_API_TOKEN env var is set)
nq.set_api_token(api_token=neptune_api_token)
# List experiments in a project
print(f"Listing experiments in project: {neptune_project}")
experiment_names = nq.list_experiments(project=neptune_project)
print(f"Found {len(experiment_names)} experiments: {experiment_names[:5]}...")
# Fetch a table of experiments with specific attributes
# Experiments are returned as rows and attributes as columns.
# A string passed to `attributes` is treated as a regex over attribute names.
table_df: pd.DataFrame = nq.fetch_experiments_table(
    project=neptune_project,
    attributes=r'^(sys/name|sys/creation_time|params/.*|metrics/loss)$',
)
print("\nFetched experiments table (first 5 rows):\n")
print(table_df.head())
# Fetch a specific metric series for an experiment
if not table_df.empty:
    first_experiment_name = table_df.iloc[0]['sys/name']
    print(f"\nFetching 'metrics/loss' for experiment: {first_experiment_name}")
    # Pass the project explicitly so the call does not depend on NEPTUNE_PROJECT.
    metric_series_df = nq.fetch_metrics(
        project=neptune_project,
        experiments=[first_experiment_name],
        attributes=['metrics/loss'],
    )
    print("\nFetched metric series (first 5 rows):\n")
    print(metric_series_df.head())