Changes from 53 commits
Commits (73)
2689205
Add QwenSpeechSummarization
eric-mccann-pro Dec 12, 2025
f01498a
Runs main() in container... does not run main during build for testin…
eric-mccann-pro Dec 12, 2025
98bc842
Logger won't log. Deal with it later
eric-mccann-pro Dec 12, 2025
98bb794
Fix format strings
eric-mccann-pro Dec 12, 2025
e29c34a
Add primary_topic and other_topics to output
eric-mccann-pro Dec 12, 2025
0604c07
Make sure we download the tokenizer giblets during docker build
eric-mccann-pro Dec 12, 2025
97ae53f
Mock an LLM generator's events stream. Run pytest if RUN_TESTS is true
eric-mccann-pro Dec 15, 2025
f04d5b0
Use releasable descriptor
eric-mccann-pro Dec 16, 2025
86a7ab4
Readme
eric-mccann-pro Dec 16, 2025
50cb5f7
Change default RUN_TEST to false
eric-mccann-pro Dec 16, 2025
e006afd
Parameterize VLLM_MODEL and VLLM_URI at container scope, as they're e…
eric-mccann-pro Dec 16, 2025
b0b1c15
+x
eric-mccann-pro Dec 16, 2025
198f3ec
Include served-model-name param in the entrypoint, not the CMD
eric-mccann-pro Dec 16, 2025
0908932
Make sure tokenizer pull step has VLLM_MODEL defined in env if overriden
eric-mccann-pro Dec 16, 2025
c20c3d2
License blocks
eric-mccann-pro Dec 16, 2025
ebbecd7
Make exception text less useless when there are no FF tracks
eric-mccann-pro Dec 16, 2025
5aea1b7
Fix typo
eric-mccann-pro Dec 16, 2025
68c8456
Fix another typo
eric-mccann-pro Dec 16, 2025
ae4f6f0
Fix default in descriptor
eric-mccann-pro Dec 16, 2025
de6f2d3
Make speaker id optional
eric-mccann-pro Dec 16, 2025
cc151c6
input_cleanup: be cool
eric-mccann-pro Dec 16, 2025
16a367c
again
eric-mccann-pro Dec 16, 2025
7d231e5
Change summary and print the final summary after it comes back from t…
eric-mccann-pro Dec 16, 2025
3c04189
Print number of results from component video track func when called b…
eric-mccann-pro Dec 16, 2025
47ca541
Actually return results. duh
eric-mccann-pro Dec 16, 2025
dbed34c
Set an ImageLocation for video tracks
eric-mccann-pro Dec 17, 2025
bb5d333
Define CLASSIFIERS_FILE and ENABLED_CLASSIFIERS in the json, now that…
eric-mccann-pro Dec 17, 2025
6948569
Gate some of the output behind debug parameter
eric-mccann-pro Dec 17, 2025
82f37b6
Provide Items of Interest instruction
eric-mccann-pro Dec 17, 2025
9e47148
Remove businesses from entities list
eric-mccann-pro Dec 17, 2025
ed36524
Parameterization and documentation
eric-mccann-pro Dec 17, 2025
17e8c54
Switch propertiesKeys instead of defaultValues
eric-mccann-pro Dec 17, 2025
8f299b8
Remove partial word from README.md
eric-mccann-pro Dec 17, 2025
6fc5a37
PROMPT_TEMPLATE is a property
eric-mccann-pro Dec 17, 2025
f3500db
Fix a typo and mention VLLM_URI
eric-mccann-pro Dec 17, 2025
64d62bb
Don't mention VRAM
eric-mccann-pro Dec 17, 2025
247ca37
Make sample classifiers match readme AND put ticks around properties+…
eric-mccann-pro Dec 17, 2025
d18a7af
Switch to defaults for the properties that have a default
eric-mccann-pro Dec 17, 2025
c192dca
Output => tracks
eric-mccann-pro Dec 17, 2025
c32d4bb
justification => reasoning
eric-mccann-pro Dec 17, 2025
60e7fa3
Specific Items of Interest appendage is never empty if present
eric-mccann-pro Dec 17, 2025
9eb3a39
reasonining
eric-mccann-pro Dec 17, 2025
eb4fccc
Use classifier confidence for detection confidence
eric-mccann-pro Dec 18, 2025
fc5dc70
Use FakeClass for all of the manual openai-api client mock buildout
eric-mccann-pro Dec 18, 2025
d838d0b
Make sure the tracks are ordered in accordance with their index
eric-mccann-pro Dec 18, 2025
ceb6801
Validate schema, close clients between calls (prevents deadlock)
eric-mccann-pro Dec 18, 2025
67e680f
Fix path to vllm-entrypoint.sh.
jrobble Dec 24, 2025
3097d3f
Disable XET for hf download and fix deprecation warning
eric-mccann-pro Jan 2, 2026
7318613
Perform download in separate stage
eric-mccann-pro Jan 2, 2026
84e170c
Fix max-model-length parameter name
eric-mccann-pro Jan 15, 2026
8f5f61d
Merge branch 'develop' into feat/qwen-speech-summarization
jrobble Jan 15, 2026
177671b
Update versions to 10.0.
jrobble Jan 15, 2026
9202086
Fix JSONArgsRecommended warning.
jrobble Jan 22, 2026
04f7e1a
Fix how Whisper is returning duplicate tracks for videos.
jrobble Jan 22, 2026
541fac1
Wait up to two minutes for vllm to be healthy for each call to summarize
eric-mccann-pro Jan 22, 2026
e6506bd
Merge remote-tracking branch 'origin/feat/qwen-speech-summarization' …
eric-mccann-pro Jan 22, 2026
2a14fe1
Use algorithm prop.
jrobble Jan 22, 2026
e3d6327
Fix test.
jrobble Jan 23, 2026
08d5531
Fix test round 2.
jrobble Jan 23, 2026
f7fa93c
Fix bug.
jrobble Jan 23, 2026
e3e9c0d
Use local_files_only=True.
jrobble Jan 23, 2026
48acdaa
Download autotokenizer in Dockerfile.
jrobble Jan 23, 2026
f38fc8a
Fix syntax.
jrobble Jan 23, 2026
31f1b68
Proper quotes.
jrobble Jan 23, 2026
4710b35
Use import.
jrobble Jan 23, 2026
c5d9d52
Bug fix.
jrobble Jan 23, 2026
319d1a7
Use HF_HUB_OFFLINE.
jrobble Jan 23, 2026
073003b
Use HF_HUB_OFFLINE before import.
jrobble Jan 24, 2026
e0be4ec
Merge remote-tracking branch 'origin/jrobble/qwen-speech-summarizatio…
eric-mccann-pro Jan 26, 2026
15404ee
Filter out low confidence classifiers
eric-mccann-pro Jan 26, 2026
79ffed8
Add classifier_confidence_minimum to descriptor
eric-mccann-pro Jan 26, 2026
05e12ee
Add requests to setup.cfg
eric-mccann-pro Jan 26, 2026
b935b2e
descriptor: true ==> "TRUE"
eric-mccann-pro Jan 26, 2026
65 changes: 65 additions & 0 deletions python/QwenSpeechSummarization/Dockerfile
@@ -0,0 +1,65 @@
# syntax=docker/dockerfile:1.2

#############################################################################
# NOTICE #
# #
# This software (or technical data) was produced for the U.S. Government #
# under contract, and is subject to the Rights in Data-General Clause #
# 52.227-14, Alt. IV (DEC 2007). #
# #
# Copyright 2025 The MITRE Corporation. All Rights Reserved. #
#############################################################################

#############################################################################
# Copyright 2025 The MITRE Corporation #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
#############################################################################

ARG BUILD_REGISTRY
ARG BUILD_TAG=latest
FROM ${BUILD_REGISTRY}openmpf_python_executor_ssb:${BUILD_TAG}

ARG RUN_TESTS=false
RUN set -x; DEPS="transformers>=4.51.0 accelerate pydantic openai jinja2"; \
if [ "${RUN_TESTS,,}" == true ]; then DEPS="$DEPS pytest"; fi; \
pip3 install --no-cache-dir $DEPS

ARG VLLM_MODEL="Qwen/Qwen3-30B-A3B-Instruct-2507-FP8"
ENV VLLM_MODEL="${VLLM_MODEL}"

### Defaults for runtime container-wide tunables

# MAX_MODEL_LEN should match vllm container env
ENV MAX_MODEL_LEN=45000

# UPPER BOUND for splitting of input into chunks for summary of summaries agglomeration
ENV INPUT_TOKEN_CHUNK_SIZE=10000

# OVERLAP between chunks if the whole input does not fit into 1 chunk
ENV INPUT_CHUNK_TOKEN_OVERLAP=500

### END runtime container tunables

RUN --mount=target=.,readwrite \
install-component.sh; \
# make sure the tokenizer is available offline
/opt/mpf/plugin-venv/bin/python3 -c 'from qwen_speech_summarization_component.qwen_speech_summarization_component import QwenSpeechSummaryComponent; QwenSpeechSummaryComponent()'; \
if [ "${RUN_TESTS,,}" == true ]; then pytest qwen_speech_summarization_component; fi

LABEL org.label-schema.license="Apache 2.0" \
org.label-schema.name="OpenMPF Qwen Speech Summarization" \
org.label-schema.schema-version="1.0" \
org.label-schema.url="https://openmpf.github.io" \
org.label-schema.vcs-url="https://github.com/openmpf/openmpf-components" \
org.label-schema.vendor="MITRE"
58 changes: 58 additions & 0 deletions python/QwenSpeechSummarization/Dockerfile.vllm
@@ -0,0 +1,58 @@
#############################################################################
# NOTICE #
# #
# This software (or technical data) was produced for the U.S. Government #
# under contract, and is subject to the Rights in Data-General Clause #
# 52.227-14, Alt. IV (DEC 2007). #
# #
# Copyright 2025 The MITRE Corporation. All Rights Reserved. #
#############################################################################

#############################################################################
# Copyright 2025 The MITRE Corporation #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
#############################################################################

FROM ubuntu:20.04 AS download_model

RUN --mount=type=tmpfs,target=/var/cache/apt \
--mount=type=tmpfs,target=/var/lib/apt/lists \
--mount=type=tmpfs,target=/tmp \
apt-get update && apt-get install --no-install-recommends -y curl ca-certificates python3-venv python3-pip python3-certifi python3-urllib3 && \
pip install huggingface_hub[cli]

ARG VLLM_MODEL="Qwen/Qwen3-30B-A3B-Instruct-2507-FP8"
ENV VLLM_MODEL="${VLLM_MODEL}"
RUN HF_HUB_DISABLE_XET=1 hf download ${VLLM_MODEL}


FROM vllm/vllm-openai:latest
ARG VLLM_MODEL="Qwen/Qwen3-30B-A3B-Instruct-2507-FP8"
ENV VLLM_MODEL="${VLLM_MODEL}"

USER root
RUN mkdir -p /root/.cache
COPY --chown=root:root --from=download_model /root/.cache/huggingface /root/.cache/huggingface

# default value
ENV MAX_MODEL_LEN=45000

COPY --chown=root:root vllm-entrypoint.sh /usr/bin/

ENTRYPOINT ["/usr/bin/vllm-entrypoint.sh"]

CMD [ \
"--host", "0.0.0.0",\
"--port", "11434"\
]
54 changes: 54 additions & 0 deletions python/QwenSpeechSummarization/README.md
@@ -0,0 +1,54 @@
# Overview

This folder contains source code for the OpenMPF Qwen speech summarization component.

This component requires a base image with Python 3.10+ and an `mpf_component_api` that supports `mpf.AllVideoTracksJob`.

We have tested Qwen/Qwen3-30B-A3B-Instruct-2507 on an 80GB card and Qwen/Qwen3-30B-A3B-Instruct-2507-FP8 on a 40GB card. Both seem quite viable.

If you are daring, any OpenAI-compatible API could be substituted for vLLM and any model could replace Qwen3-30B, but these scenarios are untested and your mileage may vary.

In either case, the component assumes anonymous access to the OpenAI-API-compatible endpoint that performs the summarization.
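
To sanity-check that the endpoint is reachable, you can list the served models. This is just a quick sketch; the URL assumes the default port from `Dockerfile.vllm` and a host where the serving container is reachable as `vllm`:

```bash
# List the models served by the OpenAI-compatible endpoint (no API key required):
curl http://vllm:11434/v1/models
```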

# Inputs

- `classifiers.json`: contains definitions of subjects of interest. Each classifier is scored with a 0-1 confidence: low if the input does not include the defined subject, high if it does. For example:

```json
[
{
"Classifier": "Major League Baseball",
"Definition": "discussions regarding major league baseball teams, professional baseball players, and baseball stadiums",
"Items of Interest": "Baseball fields, baseball teams, baseball players, baseballs, baseball bats, baseball hats"
}
]
```

# Properties

- `CLASSIFIERS_FILE`: when set to an absolute path (with a valid `classifiers.json` mounted in a volume so that the file exists at that path), replaces the default `classifiers.json`. See the mount sketch after this list.
- `CLASSIFIERS_LIST`: either `ALL`, or a comma-separated list of the `Classifier` field values of the defined classifiers.
- `PROMPT_TEMPLATE`: if set, replaces the packaged `templates/prompt.jinja` with a template read from this location. It must include self-recursive summarization instructions and the Jinja placeholders `{{ classifiers }}` and `{{ input }}`.
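
As a rough sketch of how `CLASSIFIERS_FILE` fits together with a volume mount (the host path, mount point, and image tag below are placeholders; in a full OpenMPF deployment the component container is normally managed by Docker Compose rather than started by hand):

```bash
# Mount a custom classifiers file into the component container (hypothetical paths and tag):
docker run --rm \
    -v /host/path/my-classifiers.json:/opt/classifiers/classifiers.json:ro \
    openmpf_qwen_speech_summarization:latest

# Then submit the job with the property:
#   CLASSIFIERS_FILE=/opt/classifiers/classifiers.json
```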

# Docker build-args

- `VLLM_MODEL`: when building `Dockerfile.vllm` (which downloads the model during the Docker build), this is the only model that your `qwen_speech_summarization_component` will be able to use. See the build sketch below.

NOTE: if you have an internet connection at runtime, you may use the image `vllm/vllm-openai:latest` directly in lieu of building `Dockerfile.vllm`. We do not support this arrangement, but it is possible with the right command on the Docker service.
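
A minimal build sketch (image tags are placeholders; `BUILD_REGISTRY` and `BUILD_TAG` follow whatever your OpenMPF build normally uses):

```bash
# Build the component image (optionally running the unit tests during the build):
docker build . \
    --build-arg RUN_TESTS=true \
    -t openmpf_qwen_speech_summarization:latest

# Build the vLLM serving image with the model baked in at build time:
docker build . -f Dockerfile.vllm \
    --build-arg VLLM_MODEL="Qwen/Qwen3-30B-A3B-Instruct-2507-FP8" \
    -t openmpf_qwen_vllm:latest
```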

# Environment variables

- `VLLM_MODEL`: must match the model name being served by vLLM, or be available at whichever OpenAI-API-compatible API you choose to talk to.
- `VLLM_URI`: the `base_url` of the OpenAI-API-compatible API providing access to your model. If your vLLM service is named `vllm`, this would be `http://vllm:11434/v1` (see the runtime sketch after this list).
- `MAX_MODEL_LEN`: should be defined on both the Qwen container AND the vLLM container. It is the maximum combined input+output token count you can use without erroring. We have tried 45000 for the -FP8 model on a 40GB card and 120000 for the non-quantized model on an 80GB card.
- `INPUT_TOKEN_CHUNK_SIZE`: should be about 20%-30% of your `MAX_MODEL_LEN`. It is the token size that your input will be split into during chunking before making a series of calls to the LLM.
- `INPUT_CHUNK_TOKEN_OVERLAP`: should be small and constant. If it is too small, there will be little or no overlap between chunks, which could negatively impact results on very large input tracks.
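
A minimal runtime sketch, assuming both containers share a Docker network on which the serving container is reachable as `vllm`, and using placeholder image tags and network name:

```bash
# Serve the model; MAX_MODEL_LEN here must match the value on the component container:
docker run --gpus all --network mpf_net --name vllm \
    -e MAX_MODEL_LEN=45000 \
    openmpf_qwen_vllm:latest

# Run the component, pointing it at the vLLM endpoint:
docker run --network mpf_net \
    -e VLLM_MODEL="Qwen/Qwen3-30B-A3B-Instruct-2507-FP8" \
    -e VLLM_URI="http://vllm:11434/v1" \
    -e MAX_MODEL_LEN=45000 \
    -e INPUT_TOKEN_CHUNK_SIZE=10000 \
    -e INPUT_CHUNK_TOKEN_OVERLAP=500 \
    openmpf_qwen_speech_summarization:latest
```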

# Outputs

A list of `mpf.VideoTrack` objects (or `mpf.AudioTrack` objects, once supported).

`Track[0]` will always contain the overall summary of the input, including primary/other topics and entities.

Tracks 1 through n contain the confidence, reasoning, and name for each classifier in the intersection of the enabled classifiers and the classifiers defined in `classifiers.json`.
@@ -0,0 +1,80 @@
{
"componentName": "QwenSpeechSummarization",
"componentVersion": "10.0",
"middlewareVersion": "10.0",
"sourceLanguage": "python",
"batchLibrary": "QwenSpeechSummarization",
"environmentVariables": [],
"algorithm": {
"name": "QWENSPEECHSUMMARIZATION",
"description": "Uses Qwen3 to summarize speech",
"actionType": "DETECTION",
"trackType": "TEXT",
"requiresCollection": {
"states": []
},
"providesCollection": {
"states": [
"DETECTION",
"DETECTION_TEXT",
"DETECTION_TEXT_QWEN_SPEECH_SUMMARIZATION"
],
"properties": [
{
"name": "CLASSIFIERS_LIST",
"description": "Comma-separated list of classifiers to include in the summary output.",
"type": "STRING",
"defaultValue": "ALL"
},
{
"name": "CLASSIFIERS_FILE",
"description": "The package-relative OR absolute filename of the classifiers json file",
"type": "STRING",
"defaultValue": "classifiers.json"
},
{
"name": "ENABLE_DEBUG",
"description": "If true, each detection will include extra debug output.",
"type": "BOOLEAN",
"defaultValue": "FALSE"
},
{
"name": "PROMPT_TEMPLATE",
"description": "If set, will override the default, tested prompt template with one read from a different file",
"type": "STRING",
"defaultValue": ""
}
]
}
},
"actions": [
{
"name": "QWEN SPEECH SUMMARIZATION (WITH FF REGION) ACTION",
"description": "Performs Qwen summarization Video|Audio tracks.",
"algorithm": "QWENSPEECHSUMMARIZATION",
"properties": [
{"name": "FEED_FORWARD_ALL_TRACKS", "value": true},
{"name": "FEED_FORWARD_TYPE", "value": "REGION"}
]
}
],
"tasks": [
{
"name": "QWEN SPEECH SUMMARIZATION (WITH FF REGION) TASK",
"description": "Performs Qwen summarization Video|Audio tracks.",
"actions": [
"QWEN SPEECH SUMMARIZATION (WITH FF REGION) ACTION"
]
}
],
"pipelines": [
{
"name": "WHISPER SPEECH DETECTION WITH QWEN SUMMARIZATION PIPELINE",
"description": "Runs Whisper speech detection on audio or video and summarizes the transcript using QWEN.",
"tasks": [
"WHISPER SPEECH DETECTION TASK",
"QWEN SPEECH SUMMARIZATION (WITH FF REGION) TASK"
]
}
]
}
29 changes: 29 additions & 0 deletions python/QwenSpeechSummarization/pyproject.toml
@@ -0,0 +1,29 @@
#############################################################################
# NOTICE #
# #
# This software (or technical data) was produced for the U.S. Government #
# under contract, and is subject to the Rights in Data-General Clause #
# 52.227-14, Alt. IV (DEC 2007). #
# #
# Copyright 2025 The MITRE Corporation. All Rights Reserved. #
#############################################################################

#############################################################################
# Copyright 2025 The MITRE Corporation #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
#############################################################################

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
@@ -0,0 +1,25 @@
#############################################################################
# NOTICE #
# #
# This software (or technical data) was produced for the U.S. Government #
# under contract, and is subject to the Rights in Data-General Clause #
# 52.227-14, Alt. IV (DEC 2007). #
# #
# Copyright 2025 The MITRE Corporation. All Rights Reserved. #
#############################################################################

#############################################################################
# Copyright 2025 The MITRE Corporation #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
#############################################################################
@@ -0,0 +1,7 @@
[
{
"Classifier": "Major League Baseball",
"Definition": "discussions regarding major league baseball teams, professional baseball players, and baseball stadiums",
"Items of Interest": "Baseball fields, baseball teams, baseball players, baseballs, baseball bats, baseball hats"
}
]
@@ -0,0 +1,25 @@
#############################################################################
# NOTICE #
# #
# This software (or technical data) was produced for the U.S. Government #
# under contract, and is subject to the Rights in Data-General Clause #
# 52.227-14, Alt. IV (DEC 2007). #
# #
# Copyright 2025 The MITRE Corporation. All Rights Reserved. #
#############################################################################

#############################################################################
# Copyright 2025 The MITRE Corporation #
# #
# Licensed under the Apache License, Version 2.0 (the "License"); #
# you may not use this file except in compliance with the License. #
# You may obtain a copy of the License at #
# #
# http://www.apache.org/licenses/LICENSE-2.0 #
# #
# Unless required by applicable law or agreed to in writing, software #
# distributed under the License is distributed on an "AS IS" BASIS, #
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. #
# See the License for the specific language governing permissions and #
# limitations under the License. #
#############################################################################