{"id":7800,"library":"torcheval","title":"TorchEval","description":"TorchEval is a PyTorch library providing a simple interface to create new metrics and an easy-to-use toolkit for metric computations and checkpointing. It offers a rich collection of high-performance metric calculations out-of-the-box, leveraging PyTorch's vectorization and GPU acceleration. Currently at version 0.0.7, it maintains an active release schedule with regular updates and new metric additions.","status":"active","version":"0.0.7","language":"en","source_language":"en","source_url":"https://github.com/pytorch/torcheval","tags":["pytorch","metrics","evaluation","machine-learning","deep-learning"],"install":[{"cmd":"pip install torcheval","lang":"bash","label":"Install stable version"}],"dependencies":[{"reason":"Core dependency for tensor operations, GPU acceleration, and distributed training.","package":"torch","optional":false}],"imports":[{"symbol":"Metric","correct":"from torcheval.metrics import Metric"},{"note":"Metrics are typically exposed directly under `torcheval.metrics`.","wrong":"from torcheval.metrics.classification import BinaryAccuracy","symbol":"BinaryAccuracy","correct":"from torcheval.metrics import BinaryAccuracy"},{"note":"TorchEval does not appear to provide a `MetricCollection` class (that name belongs to TorchMetrics). Keep metrics in a dict and use the collection helpers in `torcheval.metrics.toolkit` instead.","wrong":"from torcheval.metrics import MetricCollection","symbol":"sync_and_compute_collection","correct":"from torcheval.metrics.toolkit import sync_and_compute_collection"}],"quickstart":{"code":"import torch\nfrom torcheval.metrics import BinaryAccuracy\n\n# Initialize the metric\nmetric = BinaryAccuracy()\n\n# Simulate model predictions and ground truth labels\n# Ensure inputs are tensors and on the correct device\npredictions = torch.tensor([0.9, 0.1, 0.8, 0.2, 0.95])\ntargets = torch.tensor([1, 0, 1, 0, 1])\n\n# Update the metric with a batch of data\nmetric.update(predictions, targets)\n\n# Get the computed result\naccuracy = metric.compute()\nprint(f\"Binary Accuracy: {accuracy.item():.4f}\")\n\n# Example with another batch\npredictions2 = torch.tensor([0.4, 0.6, 0.7])\ntargets2 = torch.tensor([0, 1, 0])\nmetric.update(predictions2, 
targets2)\n\n# Compute cumulative accuracy\ncumulative_accuracy = metric.compute()\nprint(f\"Cumulative Binary Accuracy: {cumulative_accuracy.item():.4f}\")\n\n# Reset the metric's internal state\nmetric.reset()\nprint(f\"Accuracy after reset and recompute: {metric.compute().item():.4f}\")","lang":"python","description":"This example demonstrates how to initialize a `BinaryAccuracy` metric, update it with predictions and targets, compute the current accuracy, and reset its internal state. Metrics accumulate data, so remember to call `reset()` for new evaluation runs (e.g., per epoch)."},"warnings":[{"fix":"Import `sync_and_compute` from `torcheval.metrics.toolkit` and call `sync_and_compute(metric)` when operating in a distributed setting. Ensure `torch.distributed` is initialized before calling it.","message":"In distributed training environments (e.g., using `torch.distributed`), metrics accumulate local states independently on each process. To get the correct global metric value, call `sync_and_compute(metric)` from `torcheval.metrics.toolkit` instead of `metric.compute()`. It is a toolkit function, not a method on the metric.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Call `metric.reset()` at the beginning of each new evaluation period to clear previously accumulated data.","message":"Metrics accumulate their internal state across multiple calls to `update()`. If you need to calculate metrics for distinct evaluation periods (e.g., per epoch or per validation run), you must call `metric.reset()` before processing new data, or create a new metric instance.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Check the official documentation and release notes when upgrading for potential API adjustments. Pin an exact version (e.g., `torcheval==0.0.7`) if strict API stability is required.","message":"TorchEval is currently in a pre-1.0 state (0.0.x versions). While efforts are made to maintain stability, minor API changes might occur between releases. 
Always refer to the latest documentation for precise API details.","severity":"gotcha","affected_versions":"All 0.0.x versions"},{"fix":"Carefully read the documentation for each specific metric regarding expected input shapes, dtypes, and value ranges (e.g., logits vs. probabilities for classification). Reshape or cast tensors as needed, e.g., `predictions.squeeze(-1)`.","message":"Input tensors for `update()` must adhere to specific shapes and dtypes expected by each metric. For instance, binary metrics such as `BinaryAccuracy` typically expect 1D tensors of shape (N,) for predictions and targets, so an (N,1) model output needs to be squeezed first.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Use `metric.update(predictions, targets)` to feed data and `metric.compute()` to get the result. The metric instance itself is not a function.","cause":"Attempting to call the metric instance directly instead of using its `update()` or `compute()` methods.","error":"TypeError: 'BinaryAccuracy' object is not callable"},{"fix":"Ensure `predictions`, `targets`, and the metric are all on the same device. Use `.to(device)` to move tensors, e.g. `predictions.to('cuda')` and `targets.to('cuda')`, and `metric.to(device)` to move the metric.","cause":"Input tensors (predictions and targets) passed to `metric.update()` are on different devices (e.g., one on CPU, one on GPU).","error":"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!"},{"fix":"Reshape the `input` tensor to be 1-dimensional, i.e. shape `(N,)`, for binary classification. 
Use methods like `.squeeze()`, `.view()`, or appropriate indexing to get the correct shape.","cause":"The input prediction tensor provided to `BinaryAccuracy.update()` has an incorrect shape (e.g., `(N, C)` for a multiclass output, or `(N, 1, H, W)`).","error":"ValueError: The 'input' tensor must be 1D for BinaryAccuracy"},{"fix":"In a distributed environment, use `sync_and_compute(metric)` from `torcheval.metrics.toolkit` instead of `metric.compute()` so that all local states are gathered and aggregated before calculation.","cause":"Forgetting to synchronize metric states across distributed processes when computing the final result.","error":"Incorrect or unexpected metric values in distributed training."}]}