
Platform 2.1.3 - No free channel ids error #953

Closed
sambles opened this issue Jan 19, 2024 · 0 comments · Fixed by #954
sambles (Contributor) commented Jan 19, 2024

Issue Description

Both the task_controller and worker_monitor are failing with the following error:

oasis-worker-monitor [2024-01-19 07:28:00,077: ERROR/ForkPoolWorker-7] Task send_queue_status_digest[857810ed-dd10-40d1-bf92-1ec16361cf08] raised unexpected: ResourceError(None, 'No free channel ids, current=2048, channel_max=2047', (20, 10), 'Channel.open')
oasis-worker-monitor Traceback (most recent call last):
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/amqp/connection.py", line 514, in channel
oasis-worker-monitor     return self.channels[channel_id]
oasis-worker-monitor KeyError: None
oasis-worker-monitor 
oasis-worker-monitor During handling of the above exception, another exception occurred:
oasis-worker-monitor 
oasis-worker-monitor Traceback (most recent call last):
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/celery/app/trace.py", line 477, in trace_task
oasis-worker-monitor     R = retval = fun(*args, **kwargs)
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/celery/app/trace.py", line 760, in __protected_call__
oasis-worker-monitor     return self.run(*args, **kwargs)
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/tasks.py", line 10, in send_queue_status_digest
oasis-worker-monitor     send_task_status_message(build_all_queue_status_message())
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/consumers.py", line 99, in build_all_queue_status_message
oasis-worker-monitor     all_queues = filter_queues_info(queue_names)
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/utils.py", line 134, in filter_queues_info
oasis-worker-monitor     info = info or get_queues_info()
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/utils.py", line 64, in get_queues_info
oasis-worker-monitor     res = [
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/utils.py", line 70, in <listcomp>
oasis-worker-monitor     'queue_message_count': _get_queue_message_count(q),
oasis-worker-monitor   File "/var/www/oasis/src/server/oasisapi/queues/utils.py", line 22, in _get_queue_message_count
oasis-worker-monitor     chan = conn.channel()
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/kombu/connection.py", line 303, in channel
oasis-worker-monitor     chan = self.transport.create_channel(self.connection)
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/kombu/transport/pyamqp.py", line 168, in create_channel
oasis-worker-monitor     return connection.channel()
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/amqp/connection.py", line 516, in channel
oasis-worker-monitor     channel = self.Channel(self, channel_id, on_open=callback)
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/amqp/channel.py", line 102, in __init__
oasis-worker-monitor     channel_id = connection._get_free_channel_id()
oasis-worker-monitor   File "/home/server/.local/lib/python3.10/site-packages/amqp/connection.py", line 493, in _get_free_channel_id
oasis-worker-monitor     raise ResourceError(

This happens when running 6-8 analyses in parallel, and did not occur before #943.
The top suspect is:

def _get_queue_consumers(queue_name):
    with celery_app.pool.acquire(block=True) as conn:
        chan = conn.channel()
        name, message_count, consumers = chan.queue_declare(queue=queue_name, passive=True)
        return consumers

def _get_queue_message_count(queue_name):
    with celery_app.pool.acquire(block=True) as conn:
        chan = conn.channel()
        name, message_count, consumers = chan.queue_declare(queue=queue_name, passive=True)
        return message_count

These functions open a channel on a pooled connection but never close it, so each call leaks one channel id. The pooled connection outlives the function, and once all 2047 ids (the negotiated channel_max) are in use, amqp raises the ResourceError above.
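The leak pattern can be reproduced with a toy model (the `ToyConnection`/`ToyChannel` classes below are illustrative stand-ins, not the real amqp/kombu API): channel ids come from a fixed pool and are only returned when the channel is closed, so a helper that opens a channel without closing it exhausts the pool after `channel_max` calls, while a `try/finally` close keeps the pool full.

```python
# Toy model of AMQP channel-id exhaustion. Names are illustrative only;
# the real behaviour lives in amqp.Connection._get_free_channel_id().
class ToyConnection:
    def __init__(self, channel_max=4):
        self.channel_max = channel_max
        self._free_ids = set(range(1, channel_max + 1))

    def channel(self):
        # Mirrors amqp's ResourceError when no ids are left.
        if not self._free_ids:
            raise RuntimeError(
                f"No free channel ids, channel_max={self.channel_max}")
        return ToyChannel(self, self._free_ids.pop())


class ToyChannel:
    def __init__(self, conn, channel_id):
        self._conn = conn
        self.channel_id = channel_id

    def close(self):
        # Returns the id to the connection's pool.
        self._conn._free_ids.add(self.channel_id)


def query_leaky(conn):
    chan = conn.channel()  # never closed: leaks one id per call
    return chan.channel_id


def query_fixed(conn):
    chan = conn.channel()
    try:
        return chan.channel_id
    finally:
        chan.close()  # id is released; the connection itself stays open


leaky = ToyConnection(channel_max=4)
for _ in range(4):
    query_leaky(leaky)
try:
    query_leaky(leaky)  # fifth call: pool exhausted
    exhausted = False
except RuntimeError:
    exhausted = True

fixed = ToyConnection(channel_max=4)
for _ in range(100):  # far more calls than channel_max
    query_fixed(fixed)

print(exhausted, len(fixed._free_ids))  # → True 4
```

The same shape of fix would apply to the helpers above: close the channel in a `finally` block (or wrap the queries in a context manager) so each call releases its channel id back to the pooled connection. Whether that is the exact change merged in #954 is not stated here.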

@sambles sambles self-assigned this Jan 19, 2024
@sambles sambles moved this to In Progress in Oasis Dev Team Tasks Jan 19, 2024
This was referenced Jan 19, 2024
@sambles sambles linked a pull request Jan 19, 2024 that will close this issue
@github-project-automation github-project-automation bot moved this from In Progress to Done in Oasis Dev Team Tasks Jan 19, 2024
@awsbuild awsbuild added this to the 2.3.0 milestone Feb 6, 2024