{"id":24531,"library":"rnanorm","title":"RNA-Norm","description":"The rnanorm package provides common RNA-seq normalization methods (TPM, CPM, FPKM, TMM, and others) through a scikit-learn-like API. The current version, 2.2.0, requires Python >=3.9, <3.14. The library is actively maintained with regular releases.","status":"active","version":"2.2.0","language":"python","source_language":"en","source_url":"https://github.com/czbiohub-sf/rnanorm","tags":["RNA-seq","normalization","TPM","CPM","FPKM","TMM","bioinformatics"],"install":[{"cmd":"pip install rnanorm","lang":"bash","label":"PyPI"}],"dependencies":[{"reason":"API design and base classes","package":"scikit-learn","optional":false},{"reason":"Data handling","package":"pandas","optional":false},{"reason":"Array operations","package":"numpy","optional":false}],"imports":[{"note":"CountData lives in the datasets submodule; earlier versions allowed a direct import, but it must now be imported by its full path.","wrong":"from rnanorm import CountData","symbol":"CountData","correct":"from rnanorm.datasets import CountData"},{"note":"","wrong":"","symbol":"TPM","correct":"from rnanorm import TPM"},{"note":"","wrong":"","symbol":"CPM","correct":"from rnanorm import CPM"},{"note":"","wrong":"","symbol":"FPKM","correct":"from rnanorm import FPKM"},{"note":"","wrong":"","symbol":"TMM","correct":"from rnanorm import TMM"},{"note":"","wrong":"","symbol":"UpperQuartile","correct":"from rnanorm import UpperQuartile"},{"note":"Filter classes are in the rnanorm.filters submodule.","wrong":"from rnanorm import RemoveUninformative","symbol":"RemoveUninformative","correct":"from rnanorm.filters import RemoveUninformative"},{"note":"","wrong":"","symbol":"CountFilter","correct":"from rnanorm.filters import CountFilter"}],"quickstart":{"code":"import pandas as pd\nfrom rnanorm import TPM\nfrom rnanorm.datasets import CountData\n\n# Load example dataset\ncounts = CountData()\nexp = counts.expression\n\n# TPM normalization (requires gene lengths)\n# For demo, use dummy lengths 
(1 for all genes)\nlengths = pd.Series(1.0, index=exp.columns)\ntpm = TPM().set_output(transform='pandas').fit_transform(exp, lengths)\nprint(tpm.iloc[:5, :5])","lang":"python","description":"Basic usage: load example count data, apply TPM normalization with dummy gene lengths."},"warnings":[{"fix":"Upgrade code to pass lengths to fit_transform. See migration guide: https://rnanorm.readthedocs.io/en/latest/migration.html","message":"In version 2.0.0, the API changed from fit/transform on raw counts to requiring explicit gene lengths for length-dependent methods (TPM, FPKM). The 'expression' attribute of CountData now returns a DataFrame, not an object with .counts.","severity":"breaking","affected_versions":"<2.0.0"},{"fix":"Change 'from rnanorm import CountData' to 'from rnanorm.datasets import CountData'","message":"CountData is no longer importable directly from rnanorm; it must be imported from rnanorm.datasets.","severity":"breaking","affected_versions":">=2.0.0"},{"fix":"Always provide correct gene lengths; verify by checking that the sum of each sample's TPM is approximately 1e6.","message":"Length-dependent methods (TPM, FPKM) require gene lengths as a pandas Series or array with the same index as the expression DataFrame columns. 
Using wrong or missing lengths will silently produce incorrect results.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"Chain .set_output(transform='pandas') on the estimator before calling fit_transform.","message":"The set_output(transform='pandas') method must be called before fit_transform to get a DataFrame output; otherwise the output is a NumPy array.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"Apply log transformation manually after normalization using numpy.log1p.","message":"The 'log1p' parameter in some normalizers is deprecated and will be removed in a future version.","severity":"deprecated","affected_versions":">=2.0.0"}],"env_vars":null,"last_verified":"2026-05-01T00:00:00.000Z","next_check":"2026-07-30T00:00:00.000Z","problems":[{"fix":"Use 'from rnanorm.datasets import CountData'","cause":"CountData moved to the rnanorm.datasets submodule in version 2.0.0.","error":"AttributeError: module 'rnanorm' has no attribute 'CountData'"},{"fix":"Provide a pandas Series or array of gene lengths as the second argument to fit_transform.","cause":"Length-dependent normalizers require gene lengths as the second argument. In version 1.x, lengths were optional or not needed.","error":"TypeError: TPM.fit_transform() missing 1 required positional argument: 'lengths'"},{"fix":"Ensure the same column names are used in both the fit and transform calls.","cause":"When using set_output(transform='pandas'), the feature names (gene IDs) must be consistent between fit and transform.","error":"ValueError: The feature names should match those that were passed during fit."}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}