{"id":2382,"library":"anndata","title":"Annotated Data (anndata)","description":"anndata is a Python package for handling annotated data matrices, both in memory and on disk. Positioned between pandas and xarray, it offers sparse data support, lazy operations, and a PyTorch interface, and it is widely used in single-cell data analysis workflows. The library is actively developed and ships frequent patch releases on top of stable minor and major versions.","status":"active","version":"0.12.10","language":"en","source_language":"en","source_url":"https://github.com/scverse/anndata","tags":["bioinformatics","single-cell","data analysis","h5ad","zarr","sparse data"],"install":[{"cmd":"pip install anndata","lang":"bash","label":"Install latest stable version"}],"dependencies":[{"reason":"Requires Python 3.11 or higher.","package":"python","optional":false},{"reason":"Fundamental array operations.","package":"numpy","optional":false},{"reason":"Sparse matrix and scientific computing support.","package":"scipy","optional":false},{"reason":"DataFrame for annotations (`.obs`, `.var`).","package":"pandas","optional":false},{"reason":"Required for HDF5-backed storage (.h5ad files).","package":"h5py","optional":false},{"reason":"Supports Zarr-backed storage (optional, but highly recommended for cloud-native workflows).","package":"zarr","optional":true}],"imports":[{"symbol":"AnnData","correct":"import anndata as ad\nadata = ad.AnnData(...)"}],"quickstart":{"code":"import anndata as ad\nimport numpy as np\nimport pandas as pd\nfrom scipy.sparse import csr_matrix\n\n# Create a sparse data matrix\ncounts = csr_matrix(np.random.poisson(1, size=(10, 5)), dtype=np.float32)\n\n# Create observation (cell) and variable (gene) metadata\nobs_data = pd.DataFrame({\n    'cell_type': ['T cell', 'B cell', 'T cell', 'NK cell', 'B cell', 'T cell', 'NK cell', 'B cell', 'T cell', 'NK cell'],\n    'patient': ['P1', 'P1', 'P2', 'P1', 'P2', 'P1', 'P2', 'P1', 'P2', 'P2']\n}, index=[f'Cell_{i}' for i in range(10)])\n\nvar_data = pd.DataFrame({\n    'gene_name': [f'Gene_{i}' for i in range(5)],\n    'chromosome': ['chr1', 'chr2', 'chr1', 'chr3', 'chr2']\n}, index=[f'Gene_{i}' for i in range(5)])\n\n# Initialize an AnnData object\nadata = ad.AnnData(X=counts, obs=obs_data, var=var_data)\n\nprint(adata)\nprint(adata.obs.head())\nprint(adata.var.head())\nprint(adata.X.shape)","lang":"python","description":"This quickstart creates an AnnData object from a sparse matrix and annotates it with observation (cell-level) and variable (gene-level) metadata held in pandas DataFrames. It then prints a summary of the AnnData object, its annotations, and the shape of `.X`."},"warnings":[{"fix":"Use `adata_subset = adata[...].copy()` to ensure you're working with an independent copy. Be mindful when modifying `.X` directly on a view.","message":"Subsetting an AnnData object (e.g., `adata_subset = adata[:, list_of_vars]`) typically returns a 'view' of the original object, not a full copy. Modifying elements of this view (except for the main data matrix `.X`) triggers a copy-on-modify, converting the view into an independent AnnData object. However, direct modifications to `.X` on a view *can* modify the underlying original AnnData object. Always call `.copy()` explicitly on a subset (`adata_subset = adata[...].copy()`) if you intend to make independent changes.","severity":"gotcha","affected_versions":"All versions, `0.11.4` introduced `ImplicitModificationWarning` when setting `.X` on a view."},{"fix":"Upgrade your Python environment to version 3.10 or newer (3.11 or newer for the current release, per the dependency list in this entry).","message":"Starting with `anndata 0.11.0`, support for Python 3.9 has been dropped. If you are using an older Python version, you will need to upgrade to Python 3.10 or higher to use `anndata 0.11.0`. Note that the current release listed in this entry requires Python 3.11 or higher.","severity":"breaking","affected_versions":">=0.11.0 (specifically from 0.11.0rc3)"},{"fix":"Update calls from `import anndata as ad; ad.read_h5ad(...)` to `import anndata.io as aio; aio.read_h5ad(...)` or `from anndata.io import read_h5ad`.","message":"The top-level `anndata.read_*` functions (e.g., `anndata.read_h5ad`) have been moved to the `anndata.io` module. Direct imports like `from anndata import read_h5ad` still work, but the `anndata.io` module is recommended for all read/write operations.","severity":"breaking","affected_versions":">=0.11.0 (specifically from 0.11.0rc2)"},{"fix":"Replace `anndata.__version__` with `importlib.metadata.version('anndata')`.","message":"The `anndata.__version__` attribute is deprecated. For programmatic version checking, use `importlib.metadata.version('anndata')` instead.","severity":"deprecated","affected_versions":">=0.12.3"},{"fix":"Carefully consider the `join` strategy. If memory is an issue, consider alternative strategies for combining data or ensure you have sufficient resources.","message":"Using `anndata.concat()` with `join='outer'` on sparse datasets can significantly increase file size and memory consumption due to the explicit filling of missing variables with zeros. This can quickly lead to out-of-memory errors for large datasets.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Avoid forward slashes in keys for `.obs`, `.var`, `.uns`, etc. Set `anndata.settings.disallow_forward_slash_in_h5ad = True` to proactively enforce the future behavior and identify problematic keys.","message":"Writing keys that contain forward slashes to `.h5ad` files (e.g., `adata.uns['my/nested/key']`) was re-allowed in `0.12.3` but will be disallowed in future versions. Such keys can lead to corrupted file structures or errors in subsequent reads.","severity":"gotcha","affected_versions":"Future breaking (warned in >=0.12.3)"}],"env_vars":null,"last_verified":"2026-04-10T00:00:00.000Z","next_check":"2026-07-09T00:00:00.000Z"}