{"id":8778,"library":"whylogs","title":"whylogs","description":"whylogs is an open-source Python library for logging, profiling, and monitoring ML data pipelines end-to-end. It generates lightweight, mergeable statistical summaries (profiles) of datasets, enabling data quality validation, drift detection, and exploratory data analysis. It integrates with the WhyLabs Platform for observability and alerting, but the core library is open source. The library is actively maintained with frequent patch releases.","status":"active","version":"1.6.4","language":"en","source_language":"en","source_url":"https://github.com/whylabs/whylogs","tags":["MLOps","data quality","data drift","ML monitoring","data profiling","AI telemetry"],"install":[{"cmd":"pip install whylogs","lang":"bash","label":"Base installation"},{"cmd":"pip install \"whylogs[viz]\"","lang":"bash","label":"For visualization capabilities"},{"cmd":"pip install \"whylogs[spark]\"","lang":"bash","label":"For PySpark integration"}],"dependencies":[{"reason":"Commonly used for data input (DataFrames) to whylogs.","package":"pandas","optional":false},{"reason":"Required for Spark integration, installed via 'whylogs[spark]'.","package":"pyspark","optional":true},{"reason":"Often used for profile visualization, included with 'whylogs[viz]'.","package":"matplotlib","optional":true}],"imports":[{"note":"`get_or_create_session` belongs to the v0.x API and is not available in whylogs v1; use the top-level `why.log` entry point instead.","wrong":"from whylogs import get_or_create_session","symbol":"whylogs as why","correct":"import whylogs as why"},{"note":"In whylogs v1 the notebook visualizer is `NotebookProfileVisualizer`; `ProfileVisualizer` and `profile_viewer` are v0.x names. It is part of the 'viz' submodule and requires the 'whylogs[viz]' extra installation.","wrong":"from whylogs.viz import ProfileVisualizer","symbol":"NotebookProfileVisualizer","correct":"from whylogs.viz import NotebookProfileVisualizer"}],"quickstart":{"code":"import pandas as pd\nimport whylogs as why\n\n# Create a sample DataFrame\ndata = {\n    'col_a': [1, 2, 3, 4, 5],\n  
  'col_b': ['apple', 'banana', 'cherry', 'apple', 'date']\n}\ndf = pd.DataFrame(data)\n\n# Log the DataFrame to generate a profile; why.log returns a ResultSet\nresults = why.log(df)\n\n# View the per-column summary metrics as a pandas DataFrame\nprint(results.view().to_pandas())","lang":"python","description":"This quickstart logs a pandas DataFrame with `why.log` to create a data profile, then prints the per-column summary statistics."},"warnings":[{"fix":"Review the official whylogs v1 Migration Guide and update your code and profile loading logic accordingly. Re-profile data where necessary.","message":"whylogs v1 introduced significant breaking changes from v0.x, including API alterations and potential incompatibility with profiles generated by older versions. Users migrating from v0.x should consult the migration guide.","severity":"breaking","affected_versions":"<1.0"},{"fix":"Install the necessary extras: `pip install \"whylogs[viz]\"` for visualization or `pip install \"whylogs[spark]\"` for PySpark integration.","message":"Visualization tools (such as `NotebookProfileVisualizer`) and PySpark integration require extra installations (`whylogs[viz]` and `whylogs[spark]`, respectively). A base `pip install whylogs` does not include this functionality.","severity":"gotcha","affected_versions":">=1.0"},{"fix":"If you rely on the WhyLabs Platform for monitoring, consider self-hosting the open-sourced WhyLabs platform or integrating whylogs profiles with an alternative monitoring solution.","message":"The hosted WhyLabs Platform, used for advanced monitoring and observability of whylogs profiles, is being discontinued. 
While the whylogs library remains open source and the WhyLabs platform's source code is publicly available for self-hosting, the managed SaaS offering is no longer accessible.","severity":"deprecated","affected_versions":"All versions that integrate with WhyLabs SaaS"},{"fix":"Avoid installing or using whylogs version 1.1.2. Upgrade to a later stable version (e.g., 1.1.3 or higher) or downgrade to an earlier stable version.","message":"whylogs version 1.1.2 was yanked from PyPI due to a bug that prevented it from correctly reading dataset profiles written with previous versions.","severity":"breaking","affected_versions":"1.1.2"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install whylogs with the visualization extra: `pip install \"whylogs[viz]\"`","cause":"The visualization utilities are part of an optional extra installation ('viz') and are not included in the base `whylogs` package.","error":"ModuleNotFoundError: No module named 'whylogs.viz'"},{"fix":"Import from `whylogs.api.pyspark.experimental` (for example, `from whylogs.api.pyspark.experimental import collect_dataset_profile_view`) and install whylogs with the PySpark extra: `pip install \"whylogs[spark]\"`","cause":"In whylogs v1 the PySpark integration is imported from `whylogs.api.pyspark.experimental`, not `whylogs.pyspark.experimental`. The integration is also an optional extra ('spark') and is not included in the base `whylogs` package.","error":"ModuleNotFoundError: No module named 'whylogs.pyspark.experimental'"},{"fix":"Ensure the whylogs library version used for reading profiles is compatible with the version used for generating them. If migrating from v0.x to v1.x, you may need to re-profile your historical data or use the appropriate migration tools if available.","cause":"Attempting to load a whylogs profile generated by a different major version of whylogs (e.g., a v0.x profile with a v1.x library); v1 introduced breaking changes in the profile format.","error":"Failed to deserialize profile: The profile was generated with an incompatible whylogs version."}]}