IDC Index
version 0.11.14, verified Fri May 01, auth: no, language: python
Python package to simplify access to the data available in the NCI Imaging Data Commons (IDC). Provides queryable Pandas/DuckDB-based indices for DICOM studies, series, and analysis results. Latest version: 0.11.14. Released approximately every 2-4 weeks.
pip install idc-index
Common errors
error AttributeError: 'IDCClient' object has no attribute 'get_collections'
cause The method was renamed or removed in v0.11.
fix Use `client.available_collections` instead.
error ModuleNotFoundError: No module named 'idc_index_data'
cause The `idc-index-data` package is not installed or is out of date.
fix Run `pip install idc-index-data` or `pip install idc-index --upgrade`.
error ValueError: The truth value of a DataFrame is ambiguous
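cause Evaluating the DataFrame returned by `client.get_series(...)` in a boolean context, e.g. `if series:`. Pandas raises this error for any DataFrame, empty or not.
fix Use `if not series.empty:` instead of `if series:`.
error RuntimeError: Cannot connect to IDC index. Make sure you have internet access.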
cause The first-time index download failed, the cached index is corrupt, or there is a network issue.
fix Ensure a stable internet connection, delete `~/.cache/idc-index` and retry, or set the environment variable `IDC_INDEX_CACHE_DIR` to a writable path.
Warnings
deprecated The method `get_collections()` was removed in v0.11. Use the `client.available_collections` property instead.
fix Replace `client.get_collections()` with `client.available_collections`
gotcha `IDCClient()` downloads the index data on first instantiation if it is not already cached. This can be slow (>1GB download). Use `IDCClient(lazy=True)` to defer the download until a query is made.
fix client = IDCClient(lazy=True)
gotcha Index data (Parquet files) is versioned with the `idc-index-data` package. With an older version of `idc-index-data`, new methods may fail or return empty results. Keep both packages up to date.
fix Run `pip install --upgrade idc-index idc-index-data`
breaking In v0.11.0, the return format of `get_patient_study_series()` changed from a dict to a namedtuple. Code that looks up dict keys will break.
fix Access elements by attribute (e.g., `result.patient_id`) instead of by key (`result['patient_id']`).
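The dict-to-namedtuple change can be absorbed with a small accessor while code is being migrated. The following is a minimal sketch using only the standard library: the `PatientStudySeries` type, its field names other than `patient_id`, and the sample values are hypothetical stand-ins, not the library's actual return type.

```python
from collections import namedtuple

# Hypothetical stand-in for the namedtuple result; only `patient_id` is
# taken from the text above, the other field names are illustrative.
PatientStudySeries = namedtuple(
    "PatientStudySeries", ["patient_id", "study_uid", "series_uid"]
)


def field(result, name):
    """Read a field from either the old dict format or the new namedtuple."""
    if isinstance(result, dict):
        return result[name]       # pre-0.11.0 style: key lookup
    return getattr(result, name)  # 0.11.0+ style: attribute access


# Made-up sample values in both formats.
old_style = {"patient_id": "P-001", "study_uid": "1.2.3", "series_uid": "1.2.3.4"}
new_style = PatientStudySeries("P-001", "1.2.3", "1.2.3.4")

print(field(old_style, "patient_id"))
print(field(new_style, "patient_id"))

# A namedtuple can also be converted back to a dict for legacy callers.
print(new_style._asdict()["study_uid"])
```

Once all call sites use attribute access, the helper can be dropped.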
Imports
- IDCClient
wrong from idc_index.client import IDCClient
correct from idc_index import IDCClient
- index
from idc_index import index
Quickstart
from idc_index import IDCClient
client = IDCClient()
# Get the CT series in the TCGA-LUAD collection
series = client.get_series(collection_id="TCGA-LUAD", modality="CT")
print(len(series))
# The result is a Pandas DataFrame; show the first rows
print(series.head())
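Because the query result is a Pandas DataFrame, an empty result has to be detected with `.empty` or `len()` rather than a plain truthiness test. The sketch below illustrates this with a stand-in DataFrame so it runs without network access; the column names and the `describe` helper are assumptions for illustration, not part of idc-index.

```python
import pandas as pd

# Stand-in for an empty query result; column names are illustrative only.
series = pd.DataFrame({"SeriesInstanceUID": [], "Modality": []})


def describe(df: pd.DataFrame) -> str:
    """Summarize a result, handling the empty case explicitly."""
    if df.empty:
        return "no matching series"
    return f"{len(df)} matching series"


# A boolean test on a DataFrame raises ValueError, even when it is empty.
try:
    bool(series)
    truthiness_ok = True
except ValueError:
    truthiness_ok = False

print(describe(series))
print(truthiness_ok)
```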