Can't load Oxford Pets dataset due to NotImplementedError #5717

Closed
vfdev-5 opened this issue Nov 6, 2024 · 5 comments
Labels: bug (Something isn't working)


vfdev-5 commented Nov 6, 2024

Short description

The following code is not working:

dataset, info = tfds.load('oxford_iiit_pet:4.*.*', with_info=True, download=True, data_dir="/tmp/data")

Error message:

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/file_adapters.py:301, in ArrayRecordFileAdapter.make_tf_data(cls, filename, buffer_size)
    294 @classmethod
    295 def make_tf_data(
    296     cls,
    297     filename: epath.PathLike,
    298     buffer_size: int | None = None,
    299 ) -> tf.data.Dataset:
    300   """Returns TensorFlow Dataset comprising given array record file."""
--> 301   raise NotImplementedError(
    302       '`.as_dataset()` not implemented for ArrayRecord files. Please, use'
    303       ' `.as_data_source()`.'
    304   )

NotImplementedError: `.as_dataset()` not implemented for ArrayRecord files. Please, use `.as_data_source()`.
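
The failure comes from TFDS's file-adapter dispatch: `tfds.load` eventually calls `ADAPTER_FOR_FORMAT[file_format].make_tf_data(...)` (see the traceback below), and the ArrayRecord adapter deliberately raises. A minimal, simplified sketch of that dispatch — class and dict names mirror the traceback, but the bodies here are stand-ins, not the real TFDS implementation:

```python
# Simplified stand-in for tensorflow_datasets/core/file_adapters.py.
# Names mirror the traceback; bodies are illustrative only.

class TfRecordFileAdapter:
    @classmethod
    def make_tf_data(cls, filename, buffer_size=None):
        # The real adapter returns a tf.data.Dataset over the file.
        return f"tf.data.Dataset over {filename}"

class ArrayRecordFileAdapter:
    @classmethod
    def make_tf_data(cls, filename, buffer_size=None):
        # ArrayRecord files are random-access; there is no tf.data reader.
        raise NotImplementedError(
            "`.as_dataset()` not implemented for ArrayRecord files. "
            "Please, use `.as_data_source()`."
        )

ADAPTER_FOR_FORMAT = {
    "tfrecord": TfRecordFileAdapter,
    "array_record": ArrayRecordFileAdapter,
}

def get_dataset_from_filename(filepath, file_format):
    # `tfds.load` -> `as_dataset()` -> reader -> this dispatch; datasets
    # prepared in ArrayRecord format hit the raising adapter.
    return ADAPTER_FOR_FORMAT[file_format].make_tf_data(filepath)
```

So the error fires whenever the prepared files on disk are in ArrayRecord format, regardless of which dataset is being loaded.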

Environment information

  • Operating System: ubuntu

  • Python version: 3.11.9

  • tensorflow-datasets/tfds-nightly version: tfds-nightly 4.9.7+nightly

  • tensorflow/tf-nightly version: tf-nightly 2.19.0-dev20241105

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)?

Yes

Reproduction instructions

import tensorflow_datasets as tfds

dataset, info = tfds.load('oxford_iiit_pet:4.*.*', with_info=True, download=True, data_dir="/tmp/data")

Link to logs

Error message
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[3], line 2
      1 get_ipython().system('mkdir -p /tmp/data')
----> 2 dataset, info = tfds.load('oxford_iiit_pet:4.*.*', with_info=True, download=True, data_dir="/tmp/data")

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py:176, in _FunctionDecorator.__call__(self, function, instance, args, kwargs)
    174 metadata = self._start_call()
    175 try:
--> 176   return function(*args, **kwargs)
    177 except Exception:
    178   metadata.mark_error()

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/load.py:670, in load(name, split, data_dir, batch_size, shuffle_files, download, as_supervised, decoders, read_config, with_info, builder_kwargs, download_and_prepare_kwargs, as_dataset_kwargs, try_gcs)
    667 as_dataset_kwargs.setdefault('shuffle_files', shuffle_files)
    668 as_dataset_kwargs.setdefault('read_config', read_config)
--> 670 ds = dbuilder.as_dataset(**as_dataset_kwargs)
    671 if with_info:
    672   return ds, dbuilder.info

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/logging/__init__.py:176, in _FunctionDecorator.__call__(self, function, instance, args, kwargs)
    174 metadata = self._start_call()
    175 try:
--> 176   return function(*args, **kwargs)
    177 except Exception:
    178   metadata.mark_error()

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py:1025, in DatasetBuilder.as_dataset(self, split, batch_size, shuffle_files, decoders, read_config, as_supervised)
   1016 # Create a dataset for each of the given splits
   1017 build_single_dataset = functools.partial(
   1018     self._build_single_dataset,
   1019     shuffle_files=shuffle_files,
   (...)
   1023     as_supervised=as_supervised,
   1024 )
-> 1025 all_ds = tree.map_structure(build_single_dataset, split)
   1026 return all_ds

File /opt/conda/lib/python3.11/site-packages/tree/__init__.py:435, in map_structure(func, *structures, **kwargs)
    432 for other in structures[1:]:
    433   assert_same_structure(structures[0], other, check_types=check_types)
    434 return unflatten_as(structures[0],
--> 435                     [func(*args) for args in zip(*map(flatten, structures))])

File /opt/conda/lib/python3.11/site-packages/tree/__init__.py:435, in <listcomp>(.0)
    432 for other in structures[1:]:
    433   assert_same_structure(structures[0], other, check_types=check_types)
    434 return unflatten_as(structures[0],
--> 435                     [func(*args) for args in zip(*map(flatten, structures))])

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py:1043, in DatasetBuilder._build_single_dataset(self, split, batch_size, shuffle_files, decoders, read_config, as_supervised)
   1040   batch_size = self.info.splits.total_num_examples or sys.maxsize
   1042 # Build base dataset
-> 1043 ds = self._as_dataset(
   1044     split=split,
   1045     shuffle_files=shuffle_files,
   1046     decoders=decoders,
   1047     read_config=read_config,
   1048 )
   1049 # Auto-cache small datasets which are small enough to fit in memory.
   1050 if self._should_cache_ds(
   1051     split=split, shuffle_files=shuffle_files, read_config=read_config
   1052 ):

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/dataset_builder.py:1497, in FileReaderBuilder._as_dataset(self, split, decoders, read_config, shuffle_files)
   1491 reader = reader_lib.Reader(
   1492     self.data_dir,
   1493     example_specs=example_specs,
   1494     file_format=self.info.file_format,
   1495 )
   1496 decode_fn = functools.partial(features.decode_example, decoders=decoders)
-> 1497 return reader.read(
   1498     instructions=split,
   1499     split_infos=self.info.splits.values(),
   1500     decode_fn=decode_fn,
   1501     read_config=read_config,
   1502     shuffle_files=shuffle_files,
   1503     disable_shuffling=self.info.disable_shuffling,
   1504 )

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/reader.py:430, in Reader.read(self, instructions, split_infos, read_config, shuffle_files, disable_shuffling, decode_fn)
    421   file_instructions = splits_dict[instruction].file_instructions
    422   return self.read_files(
    423       file_instructions,
    424       read_config=read_config,
   (...)
    427       decode_fn=decode_fn,
    428   )
--> 430 return tree.map_structure(_read_instruction_to_ds, instructions)

File /opt/conda/lib/python3.11/site-packages/tree/__init__.py:435, in map_structure(func, *structures, **kwargs)
    432 for other in structures[1:]:
    433   assert_same_structure(structures[0], other, check_types=check_types)
    434 return unflatten_as(structures[0],
--> 435                     [func(*args) for args in zip(*map(flatten, structures))])

File /opt/conda/lib/python3.11/site-packages/tree/__init__.py:435, in <listcomp>(.0)
    432 for other in structures[1:]:
    433   assert_same_structure(structures[0], other, check_types=check_types)
    434 return unflatten_as(structures[0],
--> 435                     [func(*args) for args in zip(*map(flatten, structures))])

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/reader.py:422, in Reader.read.<locals>._read_instruction_to_ds(instruction)
    420 def _read_instruction_to_ds(instruction):
    421   file_instructions = splits_dict[instruction].file_instructions
--> 422   return self.read_files(
    423       file_instructions,
    424       read_config=read_config,
    425       shuffle_files=shuffle_files,
    426       disable_shuffling=disable_shuffling,
    427       decode_fn=decode_fn,
    428   )

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/reader.py:462, in Reader.read_files(self, file_instructions, read_config, shuffle_files, disable_shuffling, decode_fn)
    459   raise ValueError(msg)
    461 # Read serialized example (eventually with `tfds_id`)
--> 462 ds = _read_files(
    463     file_instructions=file_instructions,
    464     read_config=read_config,
    465     shuffle_files=shuffle_files,
    466     disable_shuffling=disable_shuffling,
    467     file_format=self._file_format,
    468 )
    470 # Parse and decode
    471 def parse_and_decode(ex: Tensor) -> TreeDict[Tensor]:
    472   # TODO(pierrot): `parse_example` uses
    473   # `tf.io.parse_single_example`. It might be faster to use `parse_example`,
    474   # after batching.
    475   # https://www.tensorflow.org/api_docs/python/tf/io/parse_example

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/reader.py:302, in _read_files(file_instructions, read_config, shuffle_files, disable_shuffling, file_format)
    295 if (
    296     shuffle_files
    297     and read_config.shuffle_seed is None
    298     and tf_compat.get_option_deterministic(read_config.options) is None
    299 ):
    300   deterministic = False
--> 302 ds = instruction_ds.interleave(
    303     functools.partial(
    304         _get_dataset_from_filename,
    305         do_skip=do_skip,
    306         do_take=do_take,
    307         file_format=file_format,
    308         add_tfds_id=read_config.add_tfds_id,
    309         override_buffer_size=read_config.override_buffer_size,
    310     ),
    311     cycle_length=cycle_length,
    312     block_length=block_length,
    313     num_parallel_calls=read_config.num_parallel_calls_for_interleave_files,
    314     deterministic=deterministic,
    315 )
    317 return assert_cardinality_and_apply_options(ds)

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/dataset_ops.py:2534, in DatasetV2.interleave(self, map_func, cycle_length, block_length, num_parallel_calls, deterministic, name)
   2530 # Loaded lazily due to a circular dependency (
   2531 # dataset_ops -> interleave_op -> dataset_ops).
   2532 # pylint: disable=g-import-not-at-top,protected-access
   2533 from tensorflow.python.data.ops import interleave_op
-> 2534 return interleave_op._interleave(self, map_func, cycle_length, block_length,
   2535                                  num_parallel_calls, deterministic, name)

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/interleave_op.py:49, in _interleave(input_dataset, map_func, cycle_length, block_length, num_parallel_calls, deterministic, name)
     46   return _InterleaveDataset(
     47       input_dataset, map_func, cycle_length, block_length, name=name)
     48 else:
---> 49   return _ParallelInterleaveDataset(
     50       input_dataset,
     51       map_func,
     52       cycle_length,
     53       block_length,
     54       num_parallel_calls,
     55       deterministic=deterministic,
     56       name=name)

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/interleave_op.py:119, in _ParallelInterleaveDataset.__init__(self, input_dataset, map_func, cycle_length, block_length, num_parallel_calls, buffer_output_elements, prefetch_input_elements, deterministic, name)
    117 """See `Dataset.interleave()` for details."""
    118 self._input_dataset = input_dataset
--> 119 self._map_func = structured_function.StructuredFunctionWrapper(
    120     map_func, self._transformation_name(), dataset=input_dataset)
    121 if not isinstance(self._map_func.output_structure, dataset_ops.DatasetSpec):
    122   raise TypeError(
    123       "The `map_func` argument must return a `Dataset` object. Got "
    124       f"{dataset_ops.get_type(self._map_func.output_structure)!r}.")

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/structured_function.py:265, in StructuredFunctionWrapper.__init__(self, func, transformation_name, dataset, input_classes, input_shapes, input_types, input_structure, add_to_graph, use_legacy_function, defun_kwargs)
    258       warnings.warn(
    259           "Even though the `tf.config.experimental_run_functions_eagerly` "
    260           "option is set, this option does not apply to tf.data functions. "
    261           "To force eager execution of tf.data functions, please use "
    262           "`tf.data.experimental.enable_debug_mode()`.")
    263     fn_factory = trace_tf_function(defun_kwargs)
--> 265 self._function = fn_factory()
    266 # There is no graph to add in eager mode.
    267 add_to_graph &= not context.executing_eagerly()

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py:1256, in Function.get_concrete_function(self, *args, **kwargs)
   1254 def get_concrete_function(self, *args, **kwargs):
   1255   # Implements PolymorphicFunction.get_concrete_function.
-> 1256   concrete = self._get_concrete_function_garbage_collected(*args, **kwargs)
   1257   concrete._garbage_collector.release()  # pylint: disable=protected-access
   1258   return concrete

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py:1226, in Function._get_concrete_function_garbage_collected(self, *args, **kwargs)
   1224   if self._variable_creation_config is None:
   1225     initializers = []
-> 1226     self._initialize(args, kwargs, add_initializers_to=initializers)
   1227     self._initialize_uninitialized_variables(initializers)
   1229 if self._created_variables:
   1230   # In this case we have created variables on the first call, so we run the
   1231   # version which is guaranteed to never create variables.

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py:696, in Function._initialize(self, args, kwds, add_initializers_to)
    691 self._variable_creation_config = self._generate_scoped_tracing_options(
    692     variable_capturing_scope,
    693     tracing_compilation.ScopeType.VARIABLE_CREATION,
    694 )
    695 # Force the definition of the function for these arguments
--> 696 self._concrete_variable_creation_fn = tracing_compilation.trace_function(
    697     args, kwds, self._variable_creation_config
    698 )
    700 def invalid_creator_scope(*unused_args, **unused_kwds):
    701   """Disables variable creation."""

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py:178, in trace_function(args, kwargs, tracing_options)
    175     args = tracing_options.input_signature
    176     kwargs = {}
--> 178   concrete_function = _maybe_define_function(
    179       args, kwargs, tracing_options
    180   )
    182 if not tracing_options.bind_graph_to_function:
    183   concrete_function._garbage_collector.release()  # pylint: disable=protected-access

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py:283, in _maybe_define_function(args, kwargs, tracing_options)
    281 else:
    282   target_func_type = lookup_func_type
--> 283 concrete_function = _create_concrete_function(
    284     target_func_type, lookup_func_context, func_graph, tracing_options
    285 )
    287 if tracing_options.function_cache is not None:
    288   tracing_options.function_cache.add(
    289       concrete_function, current_func_context
    290   )

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/tracing_compilation.py:310, in _create_concrete_function(function_type, type_context, func_graph, tracing_options)
    303   placeholder_bound_args = function_type.placeholder_arguments(
    304       placeholder_context
    305   )
    307 disable_acd = tracing_options.attributes and tracing_options.attributes.get(
    308     attributes_lib.DISABLE_ACD, False
    309 )
--> 310 traced_func_graph = func_graph_module.func_graph_from_py_func(
    311     tracing_options.name,
    312     tracing_options.python_function,
    313     placeholder_bound_args.args,
    314     placeholder_bound_args.kwargs,
    315     None,
    316     func_graph=func_graph,
    317     add_control_dependencies=not disable_acd,
    318     arg_names=function_type_utils.to_arg_names(function_type),
    319     create_placeholders=False,
    320 )
    322 transform.apply_func_graph_transforms(traced_func_graph)
    324 graph_capture_container = traced_func_graph.function_captures

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/framework/func_graph.py:1059, in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, create_placeholders)
   1056   return x
   1058 _, original_func = tf_decorator.unwrap(python_func)
-> 1059 func_outputs = python_func(*func_args, **func_kwargs)
   1061 # invariant: `func_outputs` contains only Tensors, CompositeTensors,
   1062 # TensorArrays and `None`s.
   1063 func_outputs = variable_utils.convert_variables_to_tensors(func_outputs)

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/eager/polymorphic_function/polymorphic_function.py:599, in Function._generate_scoped_tracing_options.<locals>.wrapped_fn(*args, **kwds)
    595 with default_graph._variable_creator_scope(scope, priority=50):  # pylint: disable=protected-access
    596   # __wrapped__ allows AutoGraph to swap in a converted function. We give
    597   # the function a weak reference to itself to avoid a reference cycle.
    598   with OptionalXlaContext(compile_with_xla):
--> 599     out = weak_wrapped_fn().__wrapped__(*args, **kwds)
    600   return out

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/structured_function.py:231, in StructuredFunctionWrapper.__init__.<locals>.trace_tf_function.<locals>.wrapped_fn(*args)
    230 def wrapped_fn(*args):  # pylint: disable=missing-docstring
--> 231   ret = wrapper_helper(*args)
    232   ret = structure.to_tensor_list(self._output_structure, ret)
    233   return [ops.convert_to_tensor(t) for t in ret]

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/data/ops/structured_function.py:161, in StructuredFunctionWrapper.__init__.<locals>.wrapper_helper(*args)
    159 if not _should_unpack(nested_args):
    160   nested_args = (nested_args,)
--> 161 ret = autograph.tf_convert(self._func, ag_ctx)(*nested_args)
    162 ret = variable_utils.convert_variables_to_tensors(ret)
    163 if _should_pack(ret):

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py:690, in convert.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
    688 try:
    689   with conversion_ctx:
--> 690     return converted_call(f, args, kwargs, options=options)
    691 except Exception as e:  # pylint:disable=broad-except
    692   if hasattr(e, 'ag_error_metadata'):

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py:352, in converted_call(f, args, kwargs, caller_fn_scope, options)
    349   new_args = f.args + args
    350   logging.log(3, 'Forwarding call of partial %s with\n%s\n%s\n', f, new_args,
    351               new_kwargs)
--> 352   return converted_call(
    353       f.func,
    354       new_args,
    355       new_kwargs,
    356       caller_fn_scope=caller_fn_scope,
    357       options=options)
    359 if inspect_utils.isbuiltin(f):
    360   if f is eval:

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py:377, in converted_call(f, args, kwargs, caller_fn_scope, options)
    374   return _call_unconverted(f, args, kwargs, options)
    376 if not options.user_requested and conversion.is_allowlisted(f):
--> 377   return _call_unconverted(f, args, kwargs, options)
    379 # internal_convert_user_code is for example turned off when issuing a dynamic
    380 # call conversion from generated code while in nonrecursive mode. In that
    381 # case we evidently don't want to recurse, but we still have to convert
    382 # things like builtins.
    383 if not options.internal_convert_user_code:

File /opt/conda/lib/python3.11/site-packages/tensorflow/python/autograph/impl/api.py:459, in _call_unconverted(f, args, kwargs, options, update_cache)
    456   return f.__self__.call(args, kwargs)
    458 if kwargs is not None:
--> 459   return f(*args, **kwargs)
    460 return f(*args)

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/reader.py:69, in _get_dataset_from_filename(instruction, do_skip, do_take, file_format, add_tfds_id, override_buffer_size)
     60 def _get_dataset_from_filename(
     61     instruction: _Instruction,
     62     do_skip: bool,
   (...)
     66     override_buffer_size: Optional[int] = None,
     67 ) -> tf.data.Dataset:
     68   """Returns a tf.data.Dataset instance from given instructions."""
---> 69   ds = file_adapters.ADAPTER_FOR_FORMAT[file_format].make_tf_data(
     70       instruction.filepath, buffer_size=override_buffer_size
     71   )
     72   if do_skip:
     73     ds = ds.skip(instruction.skip)

File /opt/conda/lib/python3.11/site-packages/tensorflow_datasets/core/file_adapters.py:301, in ArrayRecordFileAdapter.make_tf_data(cls, filename, buffer_size)
    294 @classmethod
    295 def make_tf_data(
    296     cls,
    297     filename: epath.PathLike,
    298     buffer_size: int | None = None,
    299 ) -> tf.data.Dataset:
    300   """Returns TensorFlow Dataset comprising given array record file."""
--> 301   raise NotImplementedError(
    302       '`.as_dataset()` not implemented for ArrayRecord files. Please, use'
    303       ' `.as_data_source()`.'
    304   )

NotImplementedError: `.as_dataset()` not implemented for ArrayRecord files. Please, use `.as_data_source()`.

vfdev-5 added the bug label on Nov 6, 2024

dsaha21 commented Nov 17, 2024

Hi Team,

Just like @vfdev-5, I am facing the same issue. Also linking the related issue thread #5416.


fineguy (Collaborator) commented Dec 10, 2024

@vfdev-5 @dsaha21 Sorry for the late reply, but could you try updating to the most recent version with pip install --upgrade tfds-nightly? I think this issue has been fixed.
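
For anyone hitting this before upgrading: the error message itself points at `as_data_source()`. A hedged sketch of a fallback (the `load_pets` helper is hypothetical, not part of TFDS; actually running it requires tensorflow-datasets and the downloaded data):

```python
# Hypothetical fallback helper, not an official TFDS recipe.
# Guard the import so the sketch is explicit about its dependency.
try:
    import tensorflow_datasets as tfds
except ImportError:  # running without tensorflow-datasets installed
    tfds = None

def load_pets(data_dir="/tmp/data"):
    """Try the tf.data path; fall back to random access for ArrayRecord data."""
    if tfds is None:
        raise RuntimeError("tensorflow-datasets is not installed")
    try:
        # Works when the prepared files are in tfrecord format.
        return tfds.load("oxford_iiit_pet", with_info=True, data_dir=data_dir)
    except NotImplementedError:
        # ArrayRecord files only support random access, so use the
        # data-source API suggested by the error message.
        return tfds.data_source("oxford_iiit_pet", data_dir=data_dir)
```

`tfds.data_source` returns mapping-like splits with index access rather than a `tf.data.Dataset`, so downstream pipeline code may need adjusting.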

fineguy self-assigned this on Dec 10, 2024
dsaha21 commented Dec 10, 2024

Hi @fineguy,

I tried exactly what you suggested above.

[Seg1: screenshot]

However, when I split the dataset into train and test, the train shape comes out as shown below. I don't know, maybe I am making a mistake. Could you please verify if possible? Thanks.

[Seg2: screenshot]


fineguy (Collaborator) commented Dec 10, 2024

@dsaha21 Did you expect a different result? It seems to be correct according to https://www.tensorflow.org/datasets/api_docs/python/tfds/load#args

dsaha21 commented Dec 10, 2024

Okay @fineguy, thanks a lot for the help 👍

fineguy closed this as completed on Dec 11, 2024