The in_set predicate raises the error unhashable type: 'Series' when used with make_batch_reader and make_petastorm_dataset. I am using pandas 1.3.5. See below for a minimal working example.
import pandas as pd
from petastorm.predicates import in_set
from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset
output_url='file:///tmp/hello_world_dataset'
hello_world = pd.DataFrame({'id': [i for i in range(100)]})
hello_world.to_parquet(output_url)
predicate_id = in_set([1,2,3,4,5],'id')
with make_batch_reader(output_url, num_epochs=1, workers_count=1, predicate=predicate_id) as reader:
    ds = make_petastorm_dataset(reader)
    train_values = list(ds.as_numpy_iterator())
For me, the issue is resolved by applying the in operator elementwise in the predicates.in_set function, instead of to the whole dataframe at once.
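The difference between the two forms can be reproduced in plain pandas, without petastorm. The sketch below is not petastorm's source code: `inclusion_values`, `batch`, `whole_column_check` and `elementwise_check` are hypothetical names used only for illustration, and it assumes the inclusion values are held in a set and that the predicate receives a whole batch as a DataFrame.

```python
import pandas as pd

# Hypothetical stand-ins for the predicate's state and for one batch of rows.
inclusion_values = {1, 2, 3, 4, 5}
batch = pd.DataFrame({'id': list(range(10))})


def whole_column_check(values):
    # Asks whether the Series itself is a member of the set. The set has to
    # hash the Series to answer that, and a Series is not hashable.
    return values['id'] in inclusion_values


def elementwise_check(values):
    # Applies `in` to each element and returns a boolean mask per row.
    return values['id'].apply(lambda x: x in inclusion_values)


try:
    whole_column_check(batch)
    whole_column_raised = False
except TypeError:
    whole_column_raised = True

mask = elementwise_check(batch)
selected_ids = batch.loc[mask, 'id'].tolist()
```

In this sketch, `batch['id'].isin(inclusion_values)` would yield the same mask and is usually the more idiomatic pandas spelling. Whether it suits in_set depends on how the predicate is called elsewhere, which I have not checked.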