PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24.
Updated Jul 1, 2024 - Python
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
Video Foundation Models & Data for Multimodal Understanding
VELOCITI Benchmark Evaluation and Visualisation Code
[NAACL 2024] Official Implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Awesome video understanding toolkits based on PaddlePaddle. It supports video data annotation tools, lightweight RGB- and skeleton-based action recognition models, and practical applications for video tagging and sports action detection.
Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"
Official code for MiniGPT4-video
[ICCV 2023] How Much Temporal Long-Term Context is Needed for Action Segmentation?
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
VTC: Improving Video-Text Retrieval with User Comments
The official code of "CSTA: CNN-based Spatiotemporal Attention for Video Summarization"
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Official repo of the paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"
Official code for CVPR 2024 paper, "Audio-Visual Segmentation via Unlabeled Frame Exploitation"
FreeVA: Offline MLLM as Training-Free Video Assistant
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models