Add Nemotron VLM support in video captioning #1160

ronjer30 · 2025-10-02T17:54:04Z

Description

Adds support for the Nemotron Nano 12B V2 VLM as a captioning model in the video captioning pipeline.

Usage

1. Download the model checkpoints.
2. Run the example script:

python tutorials/video/getting-started/video_split_clip_example.py \
  --video-dir </path/to/videos_directory/> \
  --output-clip-path ./outputs \
  --generate-captions \
  --captioning-algorithm nemotron \
  --nemotronh-vl-model-path </path/to/checkpoints/> \
  --captioning-batch-size 8 \
  --no-generate-embeddings
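For scripted or repeated runs, the same invocation can be assembled in Python. This is a minimal sketch, not part of the PR: `build_caption_command` is a hypothetical helper, and the only things taken from the PR are the script path and the flags shown in the command above. The directory arguments are placeholders.

```python
import shlex
import sys


def build_caption_command(video_dir, checkpoint_dir, output_dir="./outputs", batch_size=8):
    """Return the argv list for the example script (hypothetical helper, not in the PR)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [
        sys.executable,
        "tutorials/video/getting-started/video_split_clip_example.py",
        "--video-dir", str(video_dir),
        "--output-clip-path", str(output_dir),
        "--generate-captions",
        "--captioning-algorithm", "nemotron",
        "--nemotronh-vl-model-path", str(checkpoint_dir),
        "--captioning-batch-size", str(batch_size),
        "--no-generate-embeddings",
    ]


if __name__ == "__main__":
    # Placeholder paths; replace with real directories before running.
    cmd = build_caption_command("/path/to/videos_directory/", "/path/to/checkpoints/")
    # Print the shell-quoted command so it can be inspected before execution.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues with paths that contain spaces.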

Checklist

I am familiar with the Contributing Guide.
New or Existing tests cover these changes.
The documentation is up to date with these changes.

copy-pr-bot · 2025-10-02T17:54:08Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.



ronjer30 · 2025-10-07T16:49:55Z

Thanks @suiyoubi, I've addressed the issues, added tests and updated existing ones as well.


Add NemotronH VLM support for video captioning

c26027c

suiyoubi reviewed Oct 2, 2025

Review comments (now marked outdated) were left on:

nemo_curator/stages/video/caption/caption_preparation.py
nemo_curator/tasks/video.py
pyproject.toml
tutorials/video/getting-started/video_split_clip_example.py

Refactor caption generation and preparation to remove qwen_llm_input in favor of a unified llm_inputs structure. Update related tests and examples to reflect this change

6703649
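The idea behind this refactor can be illustrated with a small sketch. Everything in it is hypothetical: `WindowSketch`, its fields, `prepare_inputs`, and the model keys are illustrative stand-ins and are not taken from the NeMo Curator code. Only the names `qwen_llm_input` and `llm_inputs` come from the commit message, and the assumption that `llm_inputs` is a mapping keyed by model name is a guess at the design, not a confirmed detail.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class WindowSketch:
    """Hypothetical stand-in for a captioning window; not the real class."""

    start_frame: int
    end_frame: int
    # Assumed design: one mapping shared by all captioning models,
    # replacing a dedicated per-model field such as qwen_llm_input.
    llm_inputs: dict[str, dict[str, Any]] = field(default_factory=dict)


def prepare_inputs(window: WindowSketch, model: str, prompt: str) -> WindowSketch:
    """Store the prepared input for `model` under its own key (illustrative only)."""
    window.llm_inputs[model] = {"prompt": prompt}
    return window


if __name__ == "__main__":
    window = WindowSketch(start_frame=0, end_frame=256)
    prepare_inputs(window, "qwen", "Describe the video.")
    prepare_inputs(window, "nemotron", "Describe the video.")
    print(sorted(window.llm_inputs))
```

Under this assumption, adding another captioning model means adding a key rather than a new field, so downstream stages can look up inputs by model name without model-specific attributes.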

ronjer30 added 2 commits October 9, 2025 14:07

Updated to use vLLM V1 engine for NemotronH Nano V2 model

759ae38

Reverted flash-attn back to 2.8.3 to work with latest upstream NemotronH VL fixes

2d81c50
