SWE-smith
SWE-smith is an open-source Python toolkit for generating large-scale software engineering training data. It lets users turn any GitHub repository into a 'SWE-gym' and create an unlimited number of task instances (e.g., file localization, program repair, SWE-bench) for training Software Engineering (SWE) agents. The current version is 0.0.9. The project is under active development with frequent updates, and it is listed as a NeurIPS 2025 Datasets & Benchmarks Track spotlight. [2, 4, 6]
Warnings
- breaking: SWE-smith relies on Docker to create and manage execution environments. Without Docker, or on an unsupported OS (for example, running directly on Windows or macOS), it may behave unexpectedly or not work at all. [4]
- gotcha: Core workflows such as bug generation, validation, and environment building are usually run as modules via `python -m swesmith.module.submodule`, not by instantiating classes and calling methods directly. [1, 10]
- gotcha: The core library is Python-focused, but SWE-smith is expanding to support other programming languages (Go, JavaScript, Rust, C, C++, C#, Java, PHP). For non-Python languages, functionality and bug generation strategies may differ or still be under active development. [6, 8]
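The first two warnings can be combined into a small pre-flight check before launching a module-style workflow. This is a minimal sketch, not part of SWE-smith: the helper names `docker_available` and `build_module_command` are hypothetical, and `swesmith.module.submodule` is the placeholder path from the warning above, not a real module. The script only prints the command it would run.

```python
import shutil
import sys


def docker_available() -> bool:
    """Return True if a `docker` executable is on PATH.

    This does not verify that the Docker daemon is actually running.
    """
    return shutil.which("docker") is not None


def build_module_command(module: str, *args: str) -> list[str]:
    """Build the argv list equivalent to `python -m <module> <args...>`."""
    return [sys.executable, "-m", module, *args]


if __name__ == "__main__":
    # Placeholder module path taken from the text; replace it with a real
    # SWE-smith module from the project's documentation.
    cmd = build_module_command("swesmith.module.submodule", "--help")
    print("Docker on PATH:", docker_available())
    print("Would run:", " ".join(cmd))
```

Using `sys.executable` rather than a bare `python` keeps the command tied to the interpreter (and virtual environment) in which swesmith is installed.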
Install
- pip install swesmith
- git clone git@github.com:SWE-bench/SWE-smith.git
  cd SWE-smith
  pip install -e '.[all]'
Imports
- registry
from swesmith.profiles import registry
Quickstart
# Example: load the SWE-smith dataset and look up the RepoProfile for a task.
# Requires the 'datasets' package (pip install datasets).
from itertools import islice

from datasets import load_dataset
from swesmith.profiles import registry

# NOTE: Docker must be running for environment creation (the container step
# below). streaming=True reads records lazily instead of downloading the full
# dataset up front. Authentication (e.g., a Hugging Face token) might be
# needed depending on dataset access.

try:
    ds = load_dataset("SWE-bench/SWE-smith", split="train", streaming=True)
    print("Dataset loaded successfully. Processing first few tasks...")

    # Process only the first 2 tasks for the quickstart
    for index, task in enumerate(islice(ds, 2), start=1):
        print(f"\n--- Processing Task {index} ---")
        print(f"Task ID: {task.get('instance_id', 'N/A')}")

        # Get the RepoProfile for the task
        rp = registry.get_from_inst(task)
        print(f"Repository Profile for task: {rp.repo_name}")

        # Get a pointer to a Docker container with the task initialized.
        # This creates or fetches a container (requires Docker), so it is
        # left commented out for a simple quickstart printout.
        # container = rp.get_container(task)
        # print(f"Container ID for task: {container.id}")
        print("To get the Docker container, uncomment 'container = rp.get_container(task)'")
except Exception as e:
    print(f"An error occurred during quickstart: {e}")
    print("Please ensure Docker is running and 'datasets' is installed. "
          "If using a private dataset, ensure you are logged in (e.g., huggingface-cli login).")