{"id":6850,"library":"rdt","title":"RDT (Reversible Data Transforms)","description":"RDT (Reversible Data Transforms) is a Python library that transforms raw data into fully numerical data, ready for data science tasks. The transformations are reversible, so transformed data can be converted back to the original format. It is part of The Synthetic Data Vault Project and is actively maintained by DataCebo, with frequent releases. The current version is 1.21.0.","status":"active","version":"1.21.0","language":"en","source_language":"en","source_url":"https://github.com/sdv-dev/RDT","tags":["data transformation","reversible transforms","synthetic data","data science","machine learning"],"install":[{"cmd":"pip install rdt","lang":"bash","label":"Install with pip"}],"dependencies":[{"reason":"Required Python version range for rdt.","package":"python","version":">=3.9, <3.15"},{"reason":"Fundamental for data handling; HyperTransformer expects pandas DataFrames.","package":"pandas","optional":false},{"reason":"RDT is part of the SDV project; installing SDV automatically includes RDT.","package":"sdv","optional":true}],"imports":[{"note":"Used for transforming multi-column datasets.","symbol":"HyperTransformer","correct":"from rdt import HyperTransformer"},{"note":"Provides a demo dataset for quick experimentation.","symbol":"get_demo","correct":"from rdt import get_demo"}],"quickstart":{"code":"import pandas as pd\nfrom rdt import HyperTransformer, get_demo\n\n# Load a demo dataset\ncustomers = get_demo()\nprint(\"Original Data:\\n\", customers.head())\n\n# Initialize and detect config with HyperTransformer\nht = HyperTransformer()\nht.detect_initial_config(data=customers)\nprint(\"\\nDetected Config:\\n\", ht.get_config())\n\n# Fit the transformer, then transform the data\n# (transform() requires a fitted HyperTransformer)\nht.fit(customers)\ntransformed_data = ht.transform(customers)\nprint(\"\\nTransformed Data (first 5 rows):\\n\", transformed_data.head())\n\n# Reverse transform the data back to original 
format\nreversed_data = ht.reverse_transform(transformed_data)\nprint(\"\\nReversed Data (first 5 rows):\\n\", reversed_data.head())","lang":"python","description":"This quickstart loads a demo dataset, initializes a `HyperTransformer`, detects its configuration from the data, transforms the data into a numerical format, and then reverses the transformation to recover the original representation."},"warnings":[{"fix":"Upgrade to RDT 0.2.0 or newer and refactor code to use the new API, especially `HyperTransformer` and its configuration. Ensure data is in pandas DataFrame format.","message":"RDT versions prior to 0.2.0 had a significantly different API. Version 0.2.0 introduced a brand new API, removed the old metadata JSON from user arguments, and made transformers work exclusively with pandas Series.","severity":"breaking","affected_versions":"<0.2.0"},{"fix":"Update code to align with the `HyperTransformer` and `BaseTransformer` APIs introduced in 0.6.0, which support multi-column transformers and chained transformers.","message":"Version 0.6.0 brought major changes to the `HyperTransformer` and `BaseTransformer` APIs, enabling multi-column input for transformers and allowing sequences of transformers per column.","severity":"breaking","affected_versions":"0.5.x"},{"fix":"Migrate to the `UniformEncoder` transformer.","message":"The `FrequencyEncoder` transformer is deprecated and will not be supported in future RDT versions.","severity":"deprecated","affected_versions":"All versions up to 1.7.0"},{"fix":"Update distribution names in `GaussianNormalizer` configurations to the new `scipy`-consistent terms.","message":"The distribution option names for `GaussianNormalizer` have been updated to be consistent with `scipy`. 
`gaussian` is now `norm`, `student_t` is now `t`, and `truncated_gaussian` is now `truncnorm`.","severity":"deprecated","affected_versions":"All versions up to 1.7.0"},{"fix":"Replace the 'text' sdtype with 'id' for relevant columns.","message":"The 'text' sdtype was removed in RDT 1.13.0. Attempting to use 'text' as an sdtype in 1.13.0 or newer will lead to errors.","severity":"gotcha","affected_versions":">=1.13.0"},{"fix":"Ensure your Python environment meets the required version for your RDT installation. For version 1.21.0, Python 3.9 through 3.14 are supported.","message":"Python 3.6 support was dropped in RDT 1.0.0, and later versions have stricter Python requirements (currently >=3.9, <3.15).","severity":"gotcha","affected_versions":"<1.0.0 (for Python 3.6), all (for specific range)"}],"env_vars":null,"last_verified":"2026-04-15T00:00:00.000Z","next_check":"2026-07-14T00:00:00.000Z","problems":[]}