Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
Awesome papers & datasets specifically focused on long-term videos.
End-to-End Dense Video Captioning with Parallel Decoding (ICCV 2021)
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
Official TensorFlow implementation of the paper "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning" (CVPR 2018), with code, model, and prediction results.
PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)
Second-place solution to dense video captioning task in ActivityNet Challenge (CVPR 2020 workshop)
[Preprint] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
[Preprint] TRACE: Temporal Grounding Video LLM via Causal Event Modeling
[CVPR 2024] Do you remember? Dense Video Captioning with Cross-Modal Memory Retrieval
Dense video captioning in PyTorch
Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
Semantic Metadata Extraction from Generated Video Captions (CD-MAKE 2023).