{"id":3858,"library":"ydata-profiling","title":"YData Profiling","description":"ydata-profiling is a Python library that automates the generation of comprehensive exploratory data analysis (EDA) reports for pandas DataFrames. It provides detailed statistics, visualizations, and interactive widgets to understand data quality and distributions. The library is actively maintained with frequent minor releases, typically monthly or bi-monthly.","status":"active","version":"4.18.1","language":"en","source_language":"en","source_url":"https://github.com/ydataai/ydata-profiling","tags":["data profiling","eda","pandas","data science","report generation"],"install":[{"cmd":"pip install ydata-profiling","lang":"bash","label":"Base installation"},{"cmd":"pip install ydata-profiling[notebook]","lang":"bash","label":"For Jupyter Notebook widgets"},{"cmd":"pip install ydata-profiling[spark]","lang":"bash","label":"For Spark DataFrame support"}],"dependencies":[{"reason":"Core data structure for profiling.","package":"pandas"},{"reason":"For report visualizations.","package":"matplotlib"},{"reason":"For report visualizations.","package":"seaborn"},{"reason":"Fundamental for numerical operations.","package":"numpy"},{"reason":"Templating engine for HTML reports.","package":"jinja2"},{"reason":"Required for Jupyter Notebook widgets.","package":"ipython","optional":true},{"reason":"Required for Jupyter Notebook widgets.","package":"ipywidgets","optional":true},{"reason":"Required for Spark DataFrame profiling.","package":"pyspark","optional":true}],"imports":[{"note":"The library was renamed from 'pandas-profiling' to 'ydata-profiling' in version 4.0.0. 
The import path changed accordingly.","wrong":"from pandas_profiling import ProfileReport","symbol":"ProfileReport","correct":"from ydata_profiling import ProfileReport"}],"quickstart":{"code":"import pandas as pd\nfrom ydata_profiling import ProfileReport\n\n# Sample DataFrame\ndata = {\n    'col1': [1, 2, 3, 4, 5],\n    'col2': ['A', 'B', 'A', 'C', 'B'],\n    'col3': [1.1, 2.2, None, 4.4, 5.5]\n}\ndf = pd.DataFrame(data)\n\n# Generate the profile report\nprofile = ProfileReport(df, title=\"My DataFrame Profile\", explorative=True)\n\n# Save report to an HTML file\nprofile.to_file(\"your_report.html\")\n\n# If running in a Jupyter Notebook, you can display widgets directly:\n# profile.to_widgets()\n\nprint(\"Profile report saved to your_report.html\")","lang":"python","description":"This quickstart demonstrates how to create a pandas DataFrame, generate a comprehensive profile report using `ProfileReport`, and save it to an HTML file. For interactive display in Jupyter Notebooks, use `profile.to_widgets()`."},"warnings":[{"fix":"Replace `pip install pandas-profiling` with `pip install ydata-profiling` and `from pandas_profiling import ...` with `from ydata_profiling import ...`.","message":"The library was renamed from `pandas-profiling` to `ydata-profiling` starting with version 4.0.0. This change requires updating import statements and package names in `requirements.txt` or `pyproject.toml`.","severity":"breaking","affected_versions":">=4.0.0"},{"fix":"For large datasets, use `df.sample()` or enable the Spark backend. Monitor memory and CPU usage during report generation. Pass `minimal=True` to `ProfileReport` to skip the most expensive computations and get a faster, less detailed report.","message":"Profiling large datasets can be computationally intensive and consume significant memory. 
For very large datasets, consider sampling or using the Spark integration (`ydata-profiling[spark]`).","severity":"gotcha","affected_versions":"All versions"},{"fix":"Install with extras: `pip install ydata-profiling[notebook]` for widgets or `pip install ydata-profiling[spark]` for Spark integration.","message":"Specific features like Jupyter Notebook widgets or Spark DataFrame profiling require installing optional dependencies. The base `pip install ydata-profiling` does not include these.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Check your Python version using `python --version` and upgrade or downgrade as necessary, or use a virtual environment with a compatible Python version.","message":"The library has specific Python version requirements (currently `Python >=3.10, <3.14`). Ensure your environment matches these requirements to avoid installation or runtime errors.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-11T00:00:00.000Z","next_check":"2026-07-10T00:00:00.000Z"}