Feat/qwen speech summarization #418
base: develop
Conversation
…g BECAUSE it tries to talk to a vllm container TODO: parameterize the URL
…ither needed at build OR plumbing
… I have tested their functionality
jrobble
left a comment
@jrobble made 5 comments.
Reviewable status: 0 of 21 files reviewed, 4 unresolved discussions (waiting on @eric-mccann-pro).
a discussion (no related file):
The component is trying to reach out to huggingface:
2026-01-22 17:52:06,962 DEBUG [connectionpool.py:544] - [Job 14:video.mp4] https://huggingface.co:443 "HEAD /Qwen/Qwen3-30B-A3B-Instruct-2507-FP8/resolve/main/added_tokens.json HTTP/1.1" 404 0
2026-01-22 17:52:06,981 DEBUG [connectionpool.py:544] - [Job 14:video.mp4] https://huggingface.co:443 "HEAD /Qwen/Qwen3-30B-A3B-Instruct-2507-FP8/resolve/main/special_tokens_map.json HTTP/1.1" 404 0
2026-01-22 17:52:07,001 DEBUG [connectionpool.py:544] - [Job 14:video.mp4] https://huggingface.co:443 "HEAD /Qwen/Qwen3-30B-A3B-Instruct-2507-FP8/resolve/main/chat_template.jinja HTTP/1.1" 404 0
2026-01-22 17:52:07,337 DEBUG [connectionpool.py:544] - [Job 14:video.mp4] https://huggingface.co:443 "GET /api/models/Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 HTTP/1.1" 200 4932
Please make sure this works in an environment with no Internet connectivity.
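One way to guarantee this (a sketch, not necessarily how the component should do it): huggingface_hub and transformers honor offline-mode environment variables, so setting them before the tokenizer is loaded makes any cache miss fail fast instead of hitting the network.

```python
import os

# Force Hugging Face libraries into offline mode so the component never
# reaches out to huggingface.co at job time. These variables must be set
# before the tokenizer/model is first loaded.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"


def hf_offline_mode_enabled() -> bool:
    """Return True when both offline flags are set."""
    return (os.environ.get("HF_HUB_OFFLINE") == "1"
            and os.environ.get("TRANSFORMERS_OFFLINE") == "1")
```

Setting these in the Dockerfile (via ENV) would make the offline behavior part of the image rather than the component code.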
a discussion (no related file):
Please poll to ensure the server is initialized before sending it a request. This is how our YOLO component does it with Triton:
// do some check on server and model
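A minimal polling sketch, analogous to the Triton check above. The `/health` path is an assumption about the vLLM OpenAI server; the probe is injectable so the retry logic can be tested without a network.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout: float = 120.0, interval: float = 2.0,
                    probe=None) -> bool:
    """Poll `url` until it answers, or raise TimeoutError after `timeout` seconds.

    By default the probe issues an HTTP GET and treats any response as "up".
    """
    if probe is None:
        def probe(u):
            try:
                urllib.request.urlopen(u, timeout=5)
                return True
            except (urllib.error.URLError, OSError):
                return False
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe(url):
            return True
        time.sleep(interval)
    raise TimeoutError(f"Server at {url} did not become ready in {timeout}s")
```

The component would call something like `wait_for_server("http://qwen-speech-summarization-server:11434/health")` once before sending its first request.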
python/QwenSpeechSummarization/Dockerfile.vllm line 57 at r1 (raw file):
CMD [ \
    "--host", "0.0.0.0", \
    "--port", "11434", \
This generates a warning:
1 warning found:
- JSONArgsRecommended: JSON arguments recommended for CMD to prevent unintended behavior related to OS signals (line 55)
JSON arguments recommended for ENTRYPOINT/CMD to prevent unintended behavior related to OS signals
More info: https://docs.docker.com/go/dockerfile/rule/json-args-recommended/
Dockerfile.vllm:55
--------------------
54 |
55 | >>> CMD [ \
56 | >>> "--host", "0.0.0.0",\
57 | >>> "--port", "11434",\
58 | >>> ]
59 |
--------------------
Remove the trailing comma after "11434", to get rid of the warning.
I fixed this in a recent commit.
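For reference, a version of the CMD without the trailing comma (host and port values taken from the snippet above) that satisfies the JSONArgsRecommended check:

```dockerfile
CMD [ \
    "--host", "0.0.0.0", \
    "--port", "11434" \
]
```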
python/QwenSpeechSummarization/qwen_speech_summarization_component/test_data/SOURCE line 1 at r1 (raw file):
test.json is PUBLIC DOMAIN text from the US Library of Congress.
Instead of SOURCE, call this NOTICE for consistency with other components. Update the formatting to match. For example: https://github.com/openmpf/openmpf-components/blob/master/cpp/OcvYoloDetection/test/data/NOTICE
python/QwenSpeechSummarization/README.md line 40 at r1 (raw file):
NOTE: if you have an internet connection at runtime, you may use the image `vllm/vllm-openai:latest` directly in lieu of building Dockerfile.vllm. We do not support this arrangement, but it is possible with the right command on the docker service.

# Environment variables
Define all of these as algorithm properties in descriptor.json. For future reference, any algorithm property can be set in the compose file as an env. var. with the MPF_PROP_ prefix like this:
qwen-speech-summarization:
  depends_on:
    - workflow-manager
  deploy:
    mode: replicated
    replicas: 1
  environment:
    WFM_PASSWORD: mpfadm
    WFM_USER: admin
    MPF_PROP_VLLM_URI: http://qwen-speech-summarization-server:11434/v1
  image: openmpf_qwen_speech_summarization:jrobble-video
  volumes:
    - shared_data:/opt/mpf/share:rw

Those env. vars. take precedence over everything, including incoming job properties.
Rename VLLM_URI to VLLM_SERVER for consistency with https://github.com/openmpf/openmpf-components/blob/master/cpp/OcvYoloDetection/plugin-files/descriptor/descriptor.json#L72C20-L72C33
Right now you're reading these in once and assuming they will never change. When using algorithm properties, they may change with any job, so your code needs to check for this and reinitialize the client_factory if necessary.
By default they should not need to be set as env. vars. The default algorithm property values should work.
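A sketch of the reinitialization pattern described above, with hypothetical names (`ClientFactoryCache`, `make_client`); the default URI mirrors the compose snippet, and the property name follows the suggested `VLLM_SERVER` rename:

```python
class ClientFactoryCache:
    """Rebuild the vLLM client only when the job property that configures
    it actually changes between jobs."""

    def __init__(self, make_client):
        self._make_client = make_client
        self._server = None
        self._client = None

    def get(self, job_properties: dict):
        server = job_properties.get(
            "VLLM_SERVER",
            "http://qwen-speech-summarization-server:11434/v1")
        if self._client is None or server != self._server:
            # Property changed (or first job): build a fresh client.
            self._server = server
            self._client = self._make_client(server)
        return self._client
```

The component would hold one `ClientFactoryCache` instance and call `get(job.job_properties)` at the start of each job.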
jrobble
left a comment
@jrobble made 1 comment.
Reviewable status: 0 of 21 files reviewed, 5 unresolved discussions (waiting on @eric-mccann-pro).
a discussion (no related file):
When running WHISPER SPEECH DETECTION WITH QWEN SUMMARIZATION PIPELINE I was getting:
2026-01-21 21:10:45,929 ERROR [Camel (camelContext) thread #119 - JmsConsumer[MPF.JOB_ROUTER]] o.m.m.w.c.JobCompleteProcessor - [Job 2] Failed to create the output object due to: java.lang.IllegalArgumentException: Invalid range: [0..-1]
java.lang.IllegalArgumentException: Invalid range: [0..-1]
I fixed that Whisper issue in a commit I made to this PR.
@jrobble made 1 comment.
Reviewable status: 0 of 22 files reviewed, 6 unresolved discussions (waiting on @eric-mccann-pro).
python/WhisperSpeechDetection/plugin-files/descriptor/descriptor.json line 24 at r3 (raw file):
"properties": [ { "name": "TARGET_SEGMENT_LENGTH",
Prior to adding this, Whisper was processing multiple video chunks as sub-jobs from the WFM. It would process the whole video per sub-job, ignoring the frame range. These settings disable segmentation.
…into feat/qwen-speech-summarization
* Change server service name.
I believe we ultimately decided to drop a classifier track if its confidence is too low. We should make dropping them a configurable option by adding a classifier confidence threshold; if it's set to -1, we don't drop any of them.
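The proposed behavior could be sketched like this (hypothetical track shape: objects with a `confidence` attribute; the -1 sentinel disables filtering as described above):

```python
def filter_classifier_tracks(tracks, threshold: float):
    """Drop classifier tracks whose confidence is below `threshold`.

    A threshold of -1 disables filtering entirely, so all tracks pass.
    """
    if threshold == -1:
        return list(tracks)
    return [t for t in tracks if t.confidence >= threshold]
```

The threshold itself would be exposed as an algorithm property in descriptor.json so it can be overridden per job.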
jrobble
left a comment
@jrobble made 1 comment.
Reviewable status: 0 of 22 files reviewed, 8 unresolved discussions (waiting on @eric-mccann-pro).
python/QwenSpeechSummarization/Dockerfile line 58 at r4 (raw file):
# make sure the tokenizer is available offline
/opt/mpf/plugin-venv/bin/python3 -c 'from qwen_speech_summarization_component.qwen_speech_summarization_component import QwenSpeechSummaryComponent; QwenSpeechSummaryComponent()'; \
if [ "${RUN_TESTS,,}" == true ]; then pytest qwen_speech_summarization_component; fi
Also run test_slapchop.py.
jrobble
left a comment
@jrobble made 1 comment.
Reviewable status: 0 of 22 files reviewed, 9 unresolved discussions (waiting on @eric-mccann-pro).
python/QwenSpeechSummarization/qwen_speech_summarization_component/llm_util/input_cleanup.py line 31 at r4 (raw file):
import mpf_component_api as mpf

def clean_input_json(input):
There's supporting code in this PR like clean_input_json(input) and convert_to_csv(input) (possibly other files as well) that's not used by the component.
We should either remove it, or provide a script that uses it and document how to use that script in the README if it's important enough to keep and you think it will have value in the future. Right now it's dead code.
Previously, jrobble (Jeff Robble) wrote…

Never mind. I see that it runs.
jrobble
left a comment
@jrobble made 2 comments and resolved 1 discussion.
Reviewable status: 0 of 22 files reviewed, 10 unresolved discussions (waiting on @eric-mccann-pro).
a discussion (no related file):
Please address this:
/opt/mpf/plugin-venv/lib/python3.12/site-packages/qwen_speech_summarization_component/qwen_speech_summarization_component.py:49: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
from pkg_resources import resource_filename
Here's an example in the EAST component of how we resolved it during the Python 3.12 upgrade: 9adca03#diff-e8aac52dfacf8355656a33a662170ab6a26c9243c82e0b57969824966b4df265
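A sketch of the standard-library replacement: `importlib.resources.files()` has been available since Python 3.9 and is not deprecated. The helper name below just mirrors the old `pkg_resources.resource_filename` call site for easy substitution.

```python
from importlib import resources


def resource_filename(package: str, name: str) -> str:
    """Locate a data file bundled inside `package`, like
    pkg_resources.resource_filename(package, name) did.

    Note: for zip-installed packages, resources.as_file() should be used
    instead to get a real on-disk path; this sketch assumes a normal
    directory install.
    """
    return str(resources.files(package).joinpath(name))
```

With this in place, the deprecated `from pkg_resources import resource_filename` import can be deleted and Setuptools no longer needs pinning.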
a discussion (no related file):
Please address this:
Failed to convert the "PROMPT_TEMPLATE" key with value "" to <class 'NoneType'> due to: NoneType takes no arguments
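The error suggests an empty-string property value is being coerced into `NoneType`. One way to address it (a sketch with a hypothetical helper name): treat a missing or blank property as "no template" explicitly, before any type conversion happens.

```python
from typing import Optional


def get_optional_str(job_properties: dict, key: str) -> Optional[str]:
    """Return the property value, or None when it is missing or blank.

    Avoids trying to construct a NoneType from "" (the failure in the
    log above) by handling the empty-string case up front.
    """
    value = job_properties.get(key, "")
    return value if value.strip() else None
```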
Do this: