[BUG]: Error with Azure DFP examples when using multiprocess download method #1145

Closed
2 tasks done
efajardo-nv opened this issue Aug 23, 2023 · 2 comments · Fixed by #1189
Labels: bug (Something isn't working), dfp ([Workflow] Related to the Digital Fingerprinting (DFP) workflow)

Comments

@efajardo-nv
Contributor

efajardo-nv commented Aug 23, 2023

Version

23.11

Which installation method(s) does this occur on?

Docker

Describe the bug.

The Azure DFP example training and inference pipelines fail in DFPFileToDataFrameStage when the multiprocess download method is used. The error does not occur with the Duo examples.

Minimum reproducible example

  1. Follow instructions here to build DFP container.

  2. Create bash shell in container:

     docker compose run morpheus_pipeline bash

  3. Download Azure example data:

     python /workspace/examples/digital_fingerprinting/fetch_example_data.py azure

  4. Set to multiprocess download method:

     export MORPHEUS_FILE_DOWNLOAD_TYPE=multiprocess

  5. Ensure there is no cached download data:

     cd /workspace/examples/digital_fingerprinting/production/morpheus
     rm -rf .cache

  6. Run Azure DFP training example pipeline:

     python dfp_azure_pipeline.py --train_users generic --start_time "2022-08-01" --input_file="../../../data/dfp/azure-training-data/AZUREAD_2022*.json"

Relevant log output

Traceback (most recent call last):
  File "/workspace/examples/digital_fingerprinting/production/morpheus/dfp_azure_pipeline.py", line 354, in <module>
    run_pipeline(obj={}, auto_envvar_prefix='DFP', show_default=True, prog_name="dfp")
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/envs/morpheus/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/workspace/examples/digital_fingerprinting/production/morpheus/dfp_azure_pipeline.py", line 349, in run_pipeline
    pipeline.run()
  File "/workspace/morpheus/pipeline/pipeline.py", line 598, in run
    asyncio.run(self.run_async())
  File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/opt/conda/envs/morpheus/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/workspace/morpheus/pipeline/pipeline.py", line 576, in run_async
    await self.join()
  File "/workspace/morpheus/pipeline/pipeline.py", line 327, in join
    await self._mrc_executor.join_async()
  File "/workspace/examples/digital_fingerprinting/production/morpheus/dfp/stages/dfp_file_to_df.py", line 204, in convert_to_dataframe
    output_df, cache_hit = self._get_or_create_dataframe_from_batch(fsspec_batch)
  File "/workspace/examples/digital_fingerprinting/production/morpheus/dfp/stages/dfp_file_to_df.py", line 166, in _get_or_create_dataframe_from_batch
    dfs = self._downloader.download(download_buckets, download_method)
  File "/workspace/morpheus/utils/downloader.py", line 165, in download
    dfs = pool.map(download_fn, download_buckets)
  File "/opt/conda/envs/morpheus/lib/python3.10/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/conda/envs/morpheus/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/opt/conda/envs/morpheus/lib/python3.10/multiprocessing/pool.py", line 540, in _handle_tasks
    put(task)
  File "/opt/conda/envs/morpheus/lib/python3.10/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/opt/conda/envs/morpheus/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object '<lambda>.<locals>.<lambda>'
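
For readers unfamiliar with this class of error: the last frames of the traceback show multiprocessing pickling the task before sending it to a worker, and pickle cannot serialize a lambda that was created inside another function or lambda. The sketch below reproduces that general failure mode outside of Morpheus. The names make_prefixer, add_prefix, and can_pickle are hypothetical and are not taken from the Morpheus code base; this is an illustration of the pickling limitation, not a diagnosis of where the offending lambda lives.

```python
import functools
import pickle


def make_prefixer(prefix):
    # The lambda is defined inside a function, so it is a "local object".
    # pickle serializes functions by qualified name and cannot look this one up.
    return lambda path: prefix + path


def add_prefix(prefix, path):
    # A module-level function can be pickled by reference to its name.
    return prefix + path


def can_pickle(obj):
    # Depending on the Python version, the failure is reported as
    # AttributeError (as in the traceback above) or pickle.PicklingError.
    try:
        pickle.dumps(obj)
    except (AttributeError, pickle.PicklingError):
        return False
    return True


if __name__ == "__main__":
    # Local lambda: not picklable, so it cannot be handed to pool.map.
    print(can_pickle(make_prefixer("data/")))
    # functools.partial over a module-level function: a common replacement.
    print(can_pickle(functools.partial(add_prefix, "data/")))
```

A thread-based or single-process download path never pickles the callable, which would be consistent with the error only appearing when the multiprocess method is selected.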

Full env printout

No response

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@efajardo-nv efajardo-nv added the bug Something isn't working label Aug 23, 2023
@mdemoret-nv mdemoret-nv added this to the 23.11 - DFP Improvements milestone Aug 28, 2023
@mdemoret-nv mdemoret-nv added the dfp [Workflow] Related to the Digital Fingerprinting (DFP) workflow label Aug 28, 2023
@mdemoret-nv
Contributor

I believe that we want to deprecate the multiprocess download method in the future.

@mdemoret-nv
Contributor

To fix this issue, we should:

  • Remove the multiprocess download option from the available options.
  • Since this option currently exists, check whether the user has chosen it and raise an error guiding them to choose either the single-threaded or the Dask option.
  • Create a test that ensures the error is raised when the multiprocess option is chosen.
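
A minimal sketch of what the check described above could look like, assuming the method is read from the MORPHEUS_FILE_DOWNLOAD_TYPE environment variable used in the reproduction steps. The function name resolve_download_method, the option strings "single_thread" and "dask", and the choice of ValueError are assumptions for illustration only; the actual Morpheus option names and the change merged in #1189 may differ.

```python
import os

# Hypothetical option names; the real Morpheus values may differ.
SUPPORTED_METHODS = ("single_thread", "dask")
REMOVED_METHODS = ("multiprocess",)


def resolve_download_method(env=None, default="single_thread"):
    """Return the configured download method, rejecting removed options."""
    env = os.environ if env is None else env
    method = env.get("MORPHEUS_FILE_DOWNLOAD_TYPE", default).lower()

    if method in REMOVED_METHODS:
        # Guide the user toward an option that still works.
        raise ValueError(
            f"Download method '{method}' is no longer supported. "
            f"Choose one of: {', '.join(SUPPORTED_METHODS)}."
        )
    if method not in SUPPORTED_METHODS:
        raise ValueError(
            f"Unknown download method '{method}'. "
            f"Choose one of: {', '.join(SUPPORTED_METHODS)}."
        )
    return method


def test_multiprocess_is_rejected():
    # Sketch of the test from the third bullet: the removed option must raise.
    try:
        resolve_download_method({"MORPHEUS_FILE_DOWNLOAD_TYPE": "multiprocess"})
    except ValueError as err:
        assert "single_thread" in str(err)
    else:
        raise AssertionError("expected ValueError for 'multiprocess'")
```

Passing the environment as a mapping keeps the check easy to test without modifying os.environ; with pytest, the same test could use pytest.raises instead of the try/except.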
