video-understanding

Awesome video understanding toolkits based on PaddlePaddle. They include video data annotation tools, lightweight RGB- and skeleton-based action recognition models, and practical applications for video tagging and sports action detection.

ava youtube-8m action-recognition video-understanding action-detection tsm video-recognition activitynet tsn bmn action-localization temporal-action-detection slowfast st-gcn kinetics400 actbert pp-tsm videotag t2vlad

Updated Jun 24, 2024
Python

ZijiaLewisLu / CVPR2024-FACT

Official Repo for CVPR 2024 Paper "FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Fully-Supervised Action Segmentation"

video-understanding action-segmentation cvpr2024

Updated Jun 24, 2024
Python

Vision-CAIR / MiniGPT4-video

Official code for MiniGPT4-video

video-understanding video-question-answering

Updated Jun 22, 2024
Python

LTContext / LTContext

[ICCV 2023] How Much Temporal Long-Term Context is Needed for Action Segmentation?

computer-vision deep-learning video-understanding iccv2023

Updated Jun 21, 2024
Python

sming256 / OpenTAD

OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.

video-understanding temporal-action-detection temporal-action-localization

Updated Jun 18, 2024
Python

unitaryai / VTC

VTC: Improving Video-Text Retrieval with User Comments

comments video-understanding multimodal-deep-learning video-text-retrieval vision-language-transformer vision-language-pretraining

Updated Jun 18, 2024
Python

thswodnjs3 / CSTA

The official code of "CSTA: CNN-based Spatiotemporal Attention for Video Summarization"

deep-neural-networks computer-vision deep-learning video-summarization cnn pytorch video-processing supervised-learning pretrained-models attention-mechanism cvpr video-understanding pytorch-implementation videosummarization cvpr2024

Updated Jun 15, 2024
Python

boheumd / MA-LMM

(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

video-understanding llm

Updated Jun 14, 2024
Python

eric-ai-lab / MMWorld

Official repo of the paper "MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos"

evaluation video-understanding video-dataset multi-disciplinary multimodal-large-language-models world-model

Updated Jun 13, 2024

jinxiang-liu / UFE-AVS

Official code for the CVPR 2024 paper "Audio-Visual Segmentation via Unlabeled Frame Exploitation"

semantic-segmentation video-understanding audio-visual-segmentation

Updated Jun 13, 2024
Python

whwu95 / FreeVA

FreeVA: Offline MLLM as Training-Free Video Assistant

chatbot video-understanding zero-shot-video-captioning video-question-answering chatgpt vision-language-model llava training-free multimodal-large-language-models

Updated Jun 9, 2024
Python

MCG-NJU / AMD

[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models

action-recognition video-understanding distillation self-supervised-learning temporal-action-detection foundation-models small-models cvpr2024

Updated Jun 4, 2024
Python

186 public repositories match this topic.
