failing test: test_beam_hermetic_loading #800

Open
csbrown opened this issue Jan 27, 2025 · 1 comment

csbrown commented Jan 27, 2025

Running test_beam_hermetic_loading fails with the following error:

mlcroissant/_src/datasets_test.py:156: in load_records_with_beam_and_test_equality
    with test_pipeline.TestPipeline() as pipeline:
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:620: in __exit__
    self.result = self.run()
../../.venv/lib/python3.11/site-packages/apache_beam/testing/test_pipeline.py:115: in run
    result = super().run(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:567: in run
    return Pipeline.from_runner_api(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1015: in from_runner_api
    p.transforms_stack = [context.transforms.get_by_id(root_transform_id)]
../../.venv/lib/python3.11/site-packages/apache_beam/runners/pipeline_context.py:106: in get_by_id
    self._id_to_obj[id] = self._obj_type.from_runner_api(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1451: in from_runner_api
    part = context.transforms.get_by_id(transform_id)
../../.venv/lib/python3.11/site-packages/apache_beam/runners/pipeline_context.py:106: in get_by_id
    self._id_to_obj[id] = self._obj_type.from_runner_api(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1451: in from_runner_api
    part = context.transforms.get_by_id(transform_id)
../../.venv/lib/python3.11/site-packages/apache_beam/runners/pipeline_context.py:106: in get_by_id
    self._id_to_obj[id] = self._obj_type.from_runner_api(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1451: in from_runner_api
    part = context.transforms.get_by_id(transform_id)
../../.venv/lib/python3.11/site-packages/apache_beam/runners/pipeline_context.py:106: in get_by_id
    self._id_to_obj[id] = self._obj_type.from_runner_api(
../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1437: in from_runner_api
    result = AppliedPTransform(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = AppliedPTransform(Create|no filter persons: compute the global index./Create/MaybeReshuffle, MaybeReshuffle)
parent = None, transform = <apache_beam.transforms.core.Create.expand.<locals>.MaybeReshuffle object at 0x7f0ca4c2b790>
full_label = 'Create|no filter persons: compute the global index./Create/MaybeReshuffle'
main_inputs = {'None': <PCollection[Create|no filter persons: compute the global index./Create/FlatMap(<lambda at core.py:3970>).None] at 0x7f0ca4d84550>}
environment_id = None
annotations = {'python_type': b'apache_beam.transforms.core.Create.expand.<locals>.MaybeReshuffle'}

    def __init__(
        self,
        parent,  # type:  Optional[AppliedPTransform]
        transform,  # type: Optional[ptransform.PTransform]
        full_label,  # type: str
        main_inputs,  # type: Optional[Mapping[str, Union[pvalue.PBegin, pvalue.PCollection]]]
        environment_id=None,  # type: Optional[str]
        annotations=None, # type: Optional[Dict[str, bytes]]
    ):
      # type: (...) -> None
      self.parent = parent
      self.transform = transform
      # Note that we want the PipelineVisitor classes to use the full_label,
      # inputs, side_inputs, and outputs fields from this instance instead of the
      # ones of the PTransform instance associated with it. Doing this permits
      # reusing PTransform instances in different contexts (apply() calls) without
      # any interference. This is particularly useful for composite transforms.
      self.full_label = full_label
      self.main_inputs = dict(main_inputs or {})

>     self.side_inputs = tuple() if transform is None else transform.side_inputs
E     AttributeError: 'MaybeReshuffle' object has no attribute 'side_inputs'

../../.venv/lib/python3.11/site-packages/apache_beam/pipeline.py:1141: AttributeError
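
The traceback shows where the failure happens (reading `transform.side_inputs`) but not why the `MaybeReshuffle` instance lacks that attribute. As a hedged illustration only, and not a diagnosis of this issue, here is one general Python mechanism that produces the same kind of error: an object that is reconstructed without its base-class `__init__` ever running. The class names `FakeTransform` and `FakeMaybeReshuffle` are hypothetical stand-ins and are not Beam classes.

```python
class FakeTransform:
    """Hypothetical stand-in for a base transform; NOT the Beam class."""

    def __init__(self):
        # The attribute only exists if __init__ actually runs.
        self.side_inputs = ()


class FakeMaybeReshuffle(FakeTransform):
    """Hypothetical subclass that relies on the base __init__."""


def describe(transform):
    """Return side_inputs, or the AttributeError message if it is missing."""
    try:
        return transform.side_inputs
    except AttributeError as err:
        return str(err)


# Normal construction: __init__ runs, so the attribute is present.
normal = FakeMaybeReshuffle()

# Construction that bypasses __init__ (default unpickling also creates
# objects without calling __init__), so the attribute is never set.
bypassed = FakeMaybeReshuffle.__new__(FakeMaybeReshuffle)

print(describe(normal))
print(describe(bypassed))
```

Whether anything like this happens inside `Pipeline.from_runner_api` is an assumption that would need to be checked against the installed Beam version.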

I have only tried this with Python 3.11, and I don't see a recommended Python version in pyproject.toml, so this may or may not be a versioning issue.

Ubuntu 24.04 WSL
Python 3.11
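
Since a version mismatch is one of the open questions, a small sketch like the following could be used to report the interpreter and package versions alongside the environment details. The package list (`apache-beam`, `mlcroissant`, `dill`) is an assumption about what is relevant here; adjust it as needed.

```python
import platform
from importlib import metadata


def collect_versions(packages=("apache-beam", "mlcroissant", "dill")):
    """Map each name to its installed version, or None if it is not installed.

    The default package names are assumptions about what matters for this
    failure; they are not taken from the project's configuration.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the same virtual environment that runs the tests gives the versions that actually produced the traceback.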

csbrown commented Jan 27, 2025

@marcenacp do you have any thoughts on this?
