Persistent cache with a polars dataframe #2661
Comments
No, this looks like a bug: marimo should detect whether the object is serializable in the way it expects, and this exception is thrown when there's a discrepancy. There's a bit of dataframe-checking logic under the hood, so I think this might be solved by moving that logic to narwhals. Thanks for the easily reproducible code. In the meantime, you may be able to get around this by defining df in a separate cell.
Also, a quick question: I notice the cached dataframe is saved as a pickle. Could it be saved as Parquet for better performance and memory usage? Thanks for your help!
Sure. I don't think any given file format should replace pickle, but maybe we'll expose a setting to choose a "loader" type. Here's the pickle loader for your reference; I don't think it'd be too tricky to implement one for any given storage type: https://github.com/marimo-team/marimo/blob/main/marimo/_save/loaders/pickle.py A couple of other thoughts were npz, dill, and a remote cache. If you did want to play with this, the undocumented keyword arg marimo/tests/_save/test_cache.py Line 49 in 45056be
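To make the "loader" idea above concrete, here is a minimal sketch of what a pluggable cache loader could look like. It is not marimo's actual loader API: the `Loader` protocol, `PickleFileLoader`, `cached`, and `demo` are hypothetical names invented for illustration, and the sketch uses only the standard library.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, Protocol


class Loader(Protocol):
    """Hypothetical storage interface (an assumption, not marimo's API)."""

    def has(self, key: str) -> bool: ...
    def save(self, key: str, value: Any) -> None: ...
    def load(self, key: str) -> Any: ...


class PickleFileLoader:
    """Stores each cached value as <directory>/<key>.pickle."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        return self.directory / f"{key}.pickle"

    def has(self, key: str) -> bool:
        return self._path(key).exists()

    def save(self, key: str, value: Any) -> None:
        with self._path(key).open("wb") as f:
            pickle.dump(value, f)

    def load(self, key: str) -> Any:
        with self._path(key).open("rb") as f:
            return pickle.load(f)


def cached(loader: Loader, key: str, compute: Callable[[], Any]) -> Any:
    """Return the stored value for key, computing and saving it on a miss."""
    if loader.has(key):
        return loader.load(key)
    value = compute()
    loader.save(key, value)
    return value


def demo() -> tuple:
    """Run the same computation twice; the second call is served from disk."""
    calls = 0

    def expensive() -> dict:
        nonlocal calls
        calls += 1
        return {"rows": 3}

    with tempfile.TemporaryDirectory() as tmp:
        loader = PickleFileLoader(tmp)
        first = cached(loader, "abc123", expensive)
        second = cached(loader, "abc123", expensive)
    return first, second, calls
```

Under this shape, a Parquet-backed variant would presumably only need different `save` and `load` bodies (for example, polars' `DataFrame.write_parquet` and `pl.read_parquet`), at the cost of only supporting dataframe values.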
This has fixes for:
- [x] Shadowed arguments
- [x] Formatting causing issues with context block: #2633
- [x] Improved df "object detection": #2661

Following PR changes:
- Detect when execution hash relies on another hash object (cache breaking) (#3270)
- Allow for pickle hash as fallback for "unhashable" variables (#3270)
- Expand `@persistent_cache` api (this shouldn't cache bust, so I might just follow up) (#2653)

---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Closed by #3480. Thanks for reporting these!
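As a rough illustration of the "pickle hash as fallback" item above, the sketch below hashes simple values from their `repr` and falls back to hashing the pickled bytes for everything else. This is an assumption about the general technique, not the code from #3270; `content_hash` is a hypothetical helper name.

```python
import hashlib
import pickle
from typing import Any


def content_hash(obj: Any) -> str:
    """Return a SHA-256 hex digest of obj's content (illustrative only)."""
    if isinstance(obj, (bytes, bytearray)):
        data = bytes(obj)
    elif isinstance(obj, (str, int, float, bool, type(None))):
        # Simple scalars: repr is stable across runs for these types.
        data = repr(obj).encode()
    else:
        # Fallback for values without a cheap stable hash (lists, dicts, ...).
        try:
            data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        except Exception as exc:
            raise TypeError(
                f"cannot hash object of type {type(obj).__name__}"
            ) from exc
    return hashlib.sha256(data).hexdigest()
```

One caveat with this approach: pickled bytes are not guaranteed to be canonical, so two equal objects (for instance dicts built in a different insertion order) can produce different digests and miss the cache.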
Describe the bug
Hi,
I'm trying to save a polars dataframe in cache using the following operation.
I get TypeError("Cannot change data-type for object array.") (sorry, I can't post the whole traceback; the issue is at line 217 in data_to_buffer in hash.py).
Is that expected?
A monkey patch that works is:
Thanks,
Adrien
Environment
Marimo 0.9.10
Code to reproduce
See above.