ConnectionResetError: [Errno 104] Connection reset by peer #246

Open
hhvu0102 opened this issue Dec 16, 2021 · 0 comments

Hello,
I'm trying to run the example in tutorial 2. I downloaded the files and ran the exact denoising command listed in the tutorial, but it always fails with ConnectionResetError: [Errno 104] Connection reset by peer. This is the full error:

Traceback (most recent call last):
  File "/home/hhvu/.local/bin/atacworks", line 8, in <module>
    sys.exit(main())
  File "/home/hhvu/.local/lib/python3.7/site-packages/scripts/main.py", line 565, in main
    ngpus_per_node, args, res_queue), join=True)
  File "/home/hhvu/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
    while not spawn_context.join():
  File "/home/hhvu/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/hhvu/.local/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/hhvu/.local/lib/python3.7/site-packages/scripts/worker.py", line 290, in infer_worker
    pad=args.pad)
  File "/home/hhvu/.local/lib/python3.7/site-packages/atacworks/dl4atac/infer.py", line 80, in infer
    res_queue.put((idxes, batch_res))
  File "<string>", line 2, in put
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/managers.py", line 834, in _callmethod
    raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:
---------------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/managers.py", line 234, in serve_client
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "/home/hhvu/.local/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 284, in rebuild_storage_fd
    fd = df.detach()
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
    return reduction.recv_handle(conn)
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
    return recvfds(s, 1)[0]
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
    len(ancdata))
RuntimeError: received 0 items of ancdata
---------------------------------------------------------------------------

Process Process-2:
Traceback (most recent call last):
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/hhvu/.local/lib/python3.7/site-packages/scripts/main.py", line 217, in writer
    if not res_queue.empty():
  File "<string>", line 2, in empty
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/managers.py", line 819, in _callmethod
    kind, result = conn.recv()
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/opt/rit/spack-app/linux-rhel7-x86_64/gcc-4.8.5/python-3.7.7-b5s6jni4fu45wd4rns43cetmu4u6grxz/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

I'm on an NVIDIA GeForce GTX 1080 machine with 4 GPUs. I was able to run tutorial 1 successfully on this machine.
I appreciate any help. Thank you!
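
A possible diagnostic, not confirmed for this report: the first traceback ends in "RuntimeError: received 0 items of ancdata" while a tensor's file descriptor is being passed between processes, and that message is often associated with the per-process open-file limit being reached. The sketch below assumes Linux and uses only the Python standard library; `fd_limit_report` and `raise_soft_limit` are hypothetical helper names and are not part of atacworks or PyTorch.

```python
import os
import resource


def fd_limit_report():
    """Return the soft/hard open-file limits and the descriptors currently open."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # /proc/self/fd lists this process's open descriptors (Linux only).
    open_fds = len(os.listdir("/proc/self/fd"))
    return {"soft": soft, "hard": hard, "open": open_fds}


def raise_soft_limit(target=4096):
    """Raise the soft open-file limit toward `target`, capped by the hard limit.

    Returns the soft limit in effect afterwards. The default of 4096 is an
    arbitrary illustrative value, not a recommendation from the project.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    if soft >= target:
        return soft  # never lower an existing limit
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


if __name__ == "__main__":
    print("before:", fd_limit_report())
    print("soft limit now:", raise_soft_limit())
```

The limit change applies only to the calling process and its children, so it would have to run in the same process that launches the inference workers (or the equivalent `ulimit -n` in the launching shell). PyTorch also provides `torch.multiprocessing.set_sharing_strategy("file_system")` as an alternative to the default file-descriptor strategy on Linux; whether either change resolves this particular error is an assumption that would need testing.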
