This is probably just me not understanding how things are supposed to work.
I have created a user-defined source, based on the async source example, that sets up a REST API to accept requests that execute database queries and generate Numaflow messages for a pipeline to work off.
I am not sure what the read_handler function should return when there aren't any results to pass on (this could be just because we are waiting for another REST request).
I tried just breaking out of the iterator, but that resulted in a "Readiness probe" failure, so K8s restarts the pod.
To Reproduce
Steps to reproduce the behavior:
Modify the async-source example.py so that the read_handler returns after some number of messages, rather than running forever.
Quick and dirty:
From:

```python
for x in range(datum.num_records):
```

To:

```python
for x in range(self.read_idx, datum.num_records):
```
Build the image
Deploy the pipeline
Monitor the deployment (k9s)
Expected behavior
I thought that the source would stop producing messages so the pipeline would flush all the queues and then wait for more work (which will never come in this test case, but could in the REST API scenario described above).
Environment
Kubernetes: v1.27.6+k3s1
Numaflow: quay.io/numaproj/numaflow:v1.1.1
Numalogic: unknown (please advise where I might find this information)
Numaflow-python: 0.6.0
Message from the maintainers:
Impacted by this bug? Give it a 👍. We often sort issues this way to know what to prioritize.
Is the expected behavior for the read_handler to run forever and just block while there is no data to pass along? I always worry about waiting for things indefinitely.
Hey @tolmanam
I was trying to replicate the issue with the steps you provided and I had a quick question,
Were you seeing a pipeline deletion due to pods autoscaling down to 0 because of no traffic or was a crash seen at your end?
I believe it was Kubernetes killing the pod because it failed the "Readiness probe".
Consider the use case that you want to run a database query that generates X number of messages every 10 minutes. You wouldn't want autoscaling to drop the vertex.
FWIW - I swapped out the user-defined source for the built-in HTTP source, and it runs happily without adding any messages to the pipeline until receiving a POST, so the behavior I would like is compatible with Numaflow; I just don't appear to know how to build a user-defined source.
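For reference, the built-in HTTP source comparison needs only a vertex spec like this (a minimal sketch based on the Numaflow Pipeline CRD; the pipeline and vertex names are placeholders):

```yaml
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: http-source-demo
spec:
  vertices:
    - name: in
      source:
        http: {}   # built-in HTTP source; idles until a POST arrives
    - name: out
      sink:
        log: {}
  edges:
    - from: in
      to: out
```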