Issue while loading demo data #448

Open
laresbernardo opened this issue Jan 29, 2025 · 2 comments

laresbernardo commented Jan 29, 2025

Really glad to start testing Meridian.

I've started testing Meridian with the demo and dummy data, following the provided Colab notebook, and hit the following breaking error while loading the data:

data = loader.load()
TypeError                                 Traceback (most recent call last)
<ipython-input-7-ca0a8a955fd6> in <cell line: 0>()
      6     media_spend_to_channel=correct_media_spend_to_channel,
      7 )
----> 8 data = loader.load()

2 frames
/usr/local/lib/python3.11/dist-packages/meridian/data/load.py in load(self)
   1396     """Reads data from a CSV file and returns an `InputData` object."""
   1397 
-> 1398     return self._df_loader.load()

/usr/local/lib/python3.11/dist-packages/meridian/data/load.py in load(self)
   1042     controls_xr = (
   1043         df_indexed[self.coord_to_columns.controls]
-> 1044         .stack()
   1045         .rename(constants.CONTROLS)
   1046         .rename_axis(

/usr/local/lib/python3.11/dist-packages/pandas/core/frame.py in stack(self, level, dropna, sort, future_stack)
   9698         Notes
   9699         -----
-> 9700         If a list of dict/series is passed and the keys are all contained in
   9701         the DataFrame's index, the order of the columns in the resulting
   9702         DataFrame will be unchanged.

TypeError: stack() got an unexpected keyword argument 'sort'

Since I'm more comfortable in RStudio/R, I also ran the code through reticulate and got a very similar error at the same step (I can share the code if it helps).

── Python Exception Message ─────────────────────────────────────────────────────────────────────────────────
Traceback (most recent call last):
  File "...\meridian\meridian\data\load.py", line 1398, in load
    return self._df_loader.load()
  File "...\meridian\meridian\data\load.py", line 1057, in load
    .stack()
  File "...\DOCUME~1\VIRTUA~1\R-RETI~1\lib\site-packages\pandas\core\generic.py", line 5902, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'stack'

── R Traceback ──────────────────────────────────────────────────────────────────────────────────────────────
    ▆
 1. └─loader$load()
 2.   └─reticulate:::py_call_impl(callable, call_args$unnamed, call_args$named)
See `reticulate::py_last_error()$r_trace$full_call` for more details.

I'm testing the latest dev version from GitHub, and I also tested the version installed with pip. pandas is updated to v2.2.3. Sorry if this turns out to be a dumb question rather than a bug.

laresbernardo (Author) commented

After some clumsy testing, replacing this chunk with the following fixed the issue.

    if self.coord_to_columns.non_media_treatments is not None:
      non_media_column = df_indexed[self.coord_to_columns.non_media_treatments]
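      # The selection can come back as a Series, which has no .stack();
      # promote it to a one-column DataFrame first.
      if isinstance(non_media_column, pd.Series):
        non_media_column = non_media_column.to_frame()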
      non_media_xr = (
          non_media_column
          .stack()
          .rename(constants.NON_MEDIA_TREATMENTS)
          .rename_axis(
              [constants.GEO, constants.TIME, constants.NON_MEDIA_CHANNEL]
          )
          .to_frame()
          .to_xarray()
      )
      dataset = xr.combine_by_coords([dataset, non_media_xr])
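
To show why the `to_frame()` guard helps, here is a minimal standalone sketch. It assumes the `AttributeError` comes from a column selection that returns a `Series` instead of a `DataFrame`; the toy frame, the `promo` column, and the `ensure_frame` helper are hypothetical and are not part of Meridian.

```python
import pandas as pd

# Toy stand-in for the indexed frame; column names and values are illustrative only.
df_indexed = pd.DataFrame(
    {
        "geo": ["geo_a", "geo_a", "geo_b", "geo_b"],
        "time": ["2024-01-01", "2024-01-08", "2024-01-01", "2024-01-08"],
        "promo": [1.0, 0.0, 0.5, 0.25],
    }
).set_index(["geo", "time"])

# Selecting with a plain string returns a Series; selecting with a list
# returns a DataFrame. Only the DataFrame has a .stack() method.
by_name = df_indexed["promo"]
by_list = df_indexed[["promo"]]


def ensure_frame(selection):
    """Promote a Series to a one-column DataFrame; pass DataFrames through."""
    if isinstance(selection, pd.Series):
        return selection.to_frame()
    return selection


# With the guard in place, the same stack/rename chain works for both cases.
stacked = (
    ensure_frame(by_name)
    .stack()
    .rename("non_media_treatments")
    .rename_axis(["geo", "time", "non_media_channel"])
)
print(stacked)
```

The guard is a no-op when the selection is already a `DataFrame`, so it should be safe to apply regardless of how the column mapping is specified.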

Happy to submit a PR.

cpulavarthi (Collaborator) commented

Hello @laresbernardo,

Thank you for contacting us!

We appreciate your enthusiasm for testing Meridian and welcome your feedback. It looks like the issues you are observing are related to pandas having been updated to v2.2.3. Please be advised that Meridian currently only supports pandas >= 1.5.3, < 2. For local installations, we recommend installing Meridian with pip in a fresh virtual environment so that the correct versions of all dependencies are installed, as defined in pyproject.toml.
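
As a rough sketch of how one might check an environment against that range before loading data, the snippet below compares a version string with the `>= 1.5.3, < 2` bounds quoted above. The helper names are hypothetical, the parsing ignores pre-release suffixes, and pyproject.toml remains the authoritative source for the constraint.

```python
import re
from importlib import metadata

# Bounds taken from the range quoted in this thread: pandas >= 1.5.3, < 2.
LOWER = (1, 5, 3)
UPPER = (2, 0, 0)


def parse_release(version):
    """Extract (major, minor, patch) from a version string such as '2.2.3'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_supported(version):
    """Return True if the version falls inside [LOWER, UPPER)."""
    return LOWER <= parse_release(version) < UPPER


def installed_pandas_is_supported():
    """Check the pandas installed in the current environment, if any."""
    try:
        return is_supported(metadata.version("pandas"))
    except metadata.PackageNotFoundError:
        return False


print(is_supported("1.5.3"))  # inside the range
print(is_supported("2.2.3"))  # the version reported in this issue
```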

We are not yet accepting PRs, but we will merge your suggestion on our end. We are working internally on supporting pandas v2+; once that is complete, your issue should be resolved too.

Thank you

Google Meridian Support Team
