Document WhyLabs shutdown in whylogs integration (#3997)

strickvl · web-flow · commit 7cfc7fbfe4e5 · 2025-09-26T08:40:13.000+02:00
* Update whylogs integration docs after WhyLabs shutdown

* Update docs/book/component-guide/data-validators/README.md

* Update docs/book/component-guide/data-validators/whylogs.md
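
Besides the docs changes, the commit makes `zenml integration install` and `zenml integration export-requirements` print a notice whenever `whylogs` is among the selected integrations. The snippet below is a minimal standalone sketch of that membership check, not the code from `src/zenml/cli/integration.py`: the helper name `whylogs_warning_for` is hypothetical, the message is shortened, and the real commands pass the full message to the CLI's own `warning` helper instead of returning it.

```python
from typing import Iterable, Optional

# Shortened stand-in for the _WHYLOGS_INTEGRATION_WARNING constant in the diff.
WHYLOGS_INTEGRATION_WARNING = (
    "WhyLabs was acquired by Apple and the hosted WhyLabs platform is being "
    "discontinued. Plan accordingly before continuing with the whylogs "
    "integration."
)


def whylogs_warning_for(integrations: Iterable[str]) -> Optional[str]:
    """Return the notice if `whylogs` is among the selected integrations.

    Hypothetical helper for illustration only; the commit performs this
    check inline in the `install` and `export_requirements` commands.
    """
    if "whylogs" in set(integrations):
        return WHYLOGS_INTEGRATION_WARNING
    return None


if __name__ == "__main__":
    # Only the first selection contains whylogs, so only it prints a notice.
    for selection in (["sklearn", "whylogs"], ["evidently"]):
        message = whylogs_warning_for(selection)
        if message is not None:
            print(f"Warning: {message}")
```

The check is a plain membership test, so selections that do not include `whylogs` produce no extra output.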
diff --git a/docs/book/component-guide/data-validators/README.md b/docs/book/component-guide/data-validators/README.md
@@ -34,7 +34,7 @@ Data Validator are optional stack components provided by integrations. The follo
 | [Deepchecks](deepchecks.md)                 | <p>data quality<br>data drift<br>model drift<br>model performance</p> | <p>tabular: <code>pandas.DataFrame</code><br>CV: <code>torch.utils.data.dataloader.DataLoader</code></p> | <p>tabular: <code>sklearn.base.ClassifierMixin</code><br>CV: <code>torch.nn.Module</code></p> | Add Deepchecks data and model validation tests to your pipelines                                    | `deepchecks`         |
 | [Evidently](evidently.md)                   | <p>data quality<br>data drift<br>model drift<br>model performance</p> | tabular: `pandas.DataFrame`                                                                              | N/A                                                                                           | Use Evidently to generate a variety of data quality and data/model drift reports and visualizations | `evidently`          |
 | [Great Expectations](great-expectations.md) | <p>data profiling<br>data quality</p>                                 | tabular: `pandas.DataFrame`                                                                              | N/A                                                                                           | Perform data testing, documentation and profiling with Great Expectations                           | `great_expectations` |
-| [Whylogs/WhyLabs](whylogs.md)               | data drift                                                            | tabular: `pandas.DataFrame`                                                                              | N/A                                                                                           | Generate data profiles with whylogs and upload them to WhyLabs                                      | `whylogs`            |
+| [Whylogs/WhyLabs](whylogs.md)               | data drift                                                            | tabular: `pandas.DataFrame`                                                                              | N/A                                                                    | Generate data profiles with whylogs. The hosted WhyLabs platform is being discontinued following Apple's acquisition; see the integration page for OSS deployment options. | `whylogs`            |
 
 If you would like to see the available flavors of Data Validator, you can use the command:
 
diff --git a/docs/book/component-guide/data-validators/whylogs.md b/docs/book/component-guide/data-validators/whylogs.md
@@ -6,11 +6,13 @@ description: >-
 
 # Whylogs
 
-The whylogs/WhyLabs [Data Validator](./) flavor provided with the ZenML integration uses [whylogs](https://whylabs.ai/whylogs) and [WhyLabs](https://whylabs.ai) to generate and track data profiles, highly accurate descriptive representations of your data. The profiles can be used to implement automated corrective actions in your pipelines, or to render interactive representations for further visual interpretation, evaluation and documentation.
+The whylogs/WhyLabs [Data Validator](./) flavor provided with the ZenML integration uses the open-source [whylogs](https://github.com/whylabs/whylogs) library together with the now open-sourced [WhyLabs platform](https://github.com/whylabs/whylabs-oss) to generate and track data profiles, highly accurate descriptive representations of your data. The profiles can be used to implement automated corrective actions in your pipelines, or to render interactive representations for further visual interpretation, evaluation and documentation.
+
+> **Warning:** [WhyLabs was acquired by Apple](https://whylabs.ai/) and the hosted WhyLabs platform is being discontinued. While the whylogs library remains open source and the WhyLabs platform source code is publicly available, hosted deployments may no longer be accessible. Make sure to plan your usage of the integration accordingly and consider self-hosting the OSS platform if you still need WhyLabs features.
 
 ### When would you want to use it?
 
-[Whylogs](https://whylabs.ai/whylogs) is an open-source library that analyzes your data and creates statistical summaries called whylogs profiles. Whylogs profiles can be processed in your pipelines and visualized locally or uploaded to the [WhyLabs platform](https://whylabs.ai/), where more in depth analysis can be carried out. Even though [whylogs also supports other data types](https://github.com/whylabs/whylogs#data-types), the ZenML whylogs integration currently only works with tabular data in `pandas.DataFrame` format.
+[Whylogs](https://github.com/whylabs/whylogs) is an open-source library that analyzes your data and creates statistical summaries called whylogs profiles. Whylogs profiles can be processed in your pipelines and visualized locally or uploaded to a WhyLabs deployment for more in-depth analysis. The official hosted WhyLabs service is being discontinued, but you can continue to operate a WhyLabs instance yourself by using the open-source release at [https://github.com/whylabs/whylabs-oss](https://github.com/whylabs/whylabs-oss). Even though [whylogs also supports other data types](https://github.com/whylabs/whylogs#data-types), the ZenML whylogs integration currently only works with tabular data in `pandas.DataFrame` format.
 
 You should use the whylogs/WhyLabs Data Validator when you need the following data validation features that are possible with whylogs and WhyLabs:
 
@@ -28,7 +30,7 @@ The whylogs Data Validator flavor is included in the whylogs ZenML integration,
 zenml integration install whylogs -y
 ```
 
-If you don't need to connect to the WhyLabs platform to upload and store the generated whylogs data profiles, the Data Validator stack component does not require any configuration parameters. Adding it to a stack is as simple as running e.g.:
+If you don't need to connect to a WhyLabs deployment to upload and store the generated whylogs data profiles, the Data Validator stack component does not require any configuration parameters. Adding it to a stack is as simple as running e.g.:
 
 ```shell
 # Register the whylogs data validator
@@ -38,7 +40,7 @@ zenml data-validator register whylogs_data_validator --flavor=whylogs
 zenml stack register custom_stack -dv whylogs_data_validator ... --set
 ```
 
-Adding WhyLabs logging capabilities to your whylogs Data Validator is just slightly more complicated, as you also need to create a [ZenML Secret](https://docs.zenml.io/getting-started/deploying-zenml/secret-management) to store the sensitive WhyLabs authentication information in a secure location and then reference the secret in the Data Validator configuration. To generate a WhyLabs access token, you can follow [the official WhyLabs instructions documented here](https://docs.whylabs.ai/docs/whylabs-api/#creating-an-api-token) .
+Adding WhyLabs logging capabilities to your whylogs Data Validator is just slightly more complicated, as you also need to create a [ZenML Secret](https://docs.zenml.io/getting-started/deploying-zenml/secret-management) to store the sensitive WhyLabs authentication information in a secure location and then reference the secret in the Data Validator configuration. To generate a WhyLabs access token for a deployment that you host yourself, refer to the guidance in the [WhyLabs OSS repository](https://github.com/whylabs/whylabs-oss).
 
 Then, you can register the whylogs Data Validator with WhyLabs logging capabilities as follows:
 
@@ -53,7 +55,7 @@ zenml data-validator register whylogs_data_validator --flavor=whylogs \
     --authentication_secret=whylabs_secret
 ```
 
-You'll also need to enable whylabs logging for your custom pipeline steps if you want to upload the whylogs data profiles that they return as artifacts to the WhyLabs platform. This is enabled by default for the standard whylogs step. For custom steps, you can enable WhyLabs logging by setting the `upload_to_whylabs` parameter to `True` in the step configuration, e.g.:
+You'll also need to enable WhyLabs logging for your custom pipeline steps if you want to upload the whylogs data profiles that they return as artifacts to your WhyLabs deployment. This is enabled by default for the standard whylogs step. For custom steps, you can enable WhyLabs logging by setting the `upload_to_whylabs` parameter to `True` in the step configuration, e.g.:
 
 ```python
 from typing import Annotated
@@ -104,7 +106,7 @@ You can [visualize whylogs profiles](whylogs.md#visualizing-whylogs-profiles) in
 
 #### The whylogs standard step
 
-ZenML wraps the whylogs/WhyLabs functionality in the form of a standard `WhylogsProfilerStep` step. The only field in the step config is a `dataset_timestamp` attribute which is only relevant when you upload the profiles to WhyLabs that uses this field to group and merge together profiles belonging to the same dataset. The helper function `get_whylogs_profiler_step` used to create an instance of this standard step takes in an optional `dataset_id` parameter that is also used only in the context of WhyLabs upload to identify the model in the context of which the profile is uploaded, e.g.:
+ZenML wraps the whylogs/WhyLabs functionality in the form of a standard `WhylogsProfilerStep` step. The only field in the step config is a `dataset_timestamp` attribute. It is only relevant when you upload the profiles to a WhyLabs deployment, which uses this field to group and merge profiles belonging to the same dataset. The helper function `get_whylogs_profiler_step`, used to create an instance of this standard step, takes an optional `dataset_id` parameter. This parameter is also used only for WhyLabs uploads, where it identifies the model that the profile is uploaded for, e.g.:
 
 ```python
 from zenml.integrations.whylogs.steps import get_whylogs_profiler_step
@@ -149,7 +151,7 @@ You can view [the complete list of configuration parameters](https://sdkdocs.zen
 
 The whylogs Data Validator implements the same interface as do all Data Validators, so this method forces you to maintain some level of compatibility with the overall Data Validator abstraction, which guarantees an easier migration in case you decide to switch to another Data Validator.
 
-All you have to do is call the whylogs Data Validator methods when you need to interact with whylogs to generate data profiles. You may optionally enable whylabs logging to automatically upload the returned whylogs profile to WhyLabs, e.g.:
+All you have to do is call the whylogs Data Validator methods when you need to interact with whylogs to generate data profiles. You may optionally enable WhyLabs logging to automatically upload the returned whylogs profile to your WhyLabs deployment, e.g.:
 
 ```python
 
@@ -191,7 +193,7 @@ def data_profiler(
     profile = data_validator.data_profiling(
         dataset,
     )
-    # optionally upload the profile to WhyLabs, if WhyLabs credentials are configured
+    # optionally upload the profile to your WhyLabs deployment, if WhyLabs credentials are configured
     data_validator.upload_profile_view(profile)
 
     # validation post-processing (e.g. interpret results, take actions) can happen here
@@ -203,7 +205,7 @@ Have a look at [the complete list of methods and parameters available in the `Wh
 
 #### Call whylogs directly
 
-You can use the whylogs library directly in your custom pipeline steps, and only leverage ZenML's capability of serializing, versioning and storing the `DatasetProfileView` objects in its Artifact Store. You may optionally enable whylabs logging to automatically upload the returned whylogs profile to WhyLabs, e.g.:
+You can use the whylogs library directly in your custom pipeline steps, and only leverage ZenML's capability of serializing, versioning and storing the `DatasetProfileView` objects in its Artifact Store. You may optionally enable WhyLabs logging to automatically upload the returned whylogs profile to your WhyLabs deployment, e.g.:
 
 ```python
 
diff --git a/src/zenml/cli/integration.py b/src/zenml/cli/integration.py
@@ -37,6 +37,14 @@
 from zenml.enums import CliCategories
 from zenml.logger import get_logger
 
+_WHYLOGS_INTEGRATION_WARNING = (
+    "WhyLabs was acquired by Apple and the hosted WhyLabs platform is being "
+    "discontinued. The whylogs library remains open source and the WhyLabs "
+    "platform is now available as OSS at "
+    "https://github.com/whylabs/whylabs-oss, but hosted functionality may stop "
+    "working. Plan accordingly before continuing with the whylogs integration."
+)
+
 logger = get_logger(__name__)
 
 
@@ -198,6 +206,9 @@ def export_requirements(
                     f"{all_integrations}"
                 )
 
+    if "whylogs" in integrations_to_export:
+        warning(_WHYLOGS_INTEGRATION_WARNING)
+
     requirements = []
     for integration_name in integrations_to_export:
         try:
@@ -318,6 +329,9 @@ def install(
     else:
         integration_set = set(integrations)
 
+    if "whylogs" in integration_set:
+        warning(_WHYLOGS_INTEGRATION_WARNING)
+
     if sys.version_info.minor == 12 and "tensorflow" in integration_set:
         warning(
             "The TensorFlow integration is not yet compatible with Python "