This notebook is based on the https://github.com/lucidrains/TimeSformer-pytorch repository.
That repository is an implementation of TimeSformer, a purely attention-based solution for reaching state-of-the-art (SOTA) results on video classification. It houses only the best-performing variant, 'Divided Space-Time Attention', which applies attention along the time axis first and along the spatial axes second.
The notebook explains the steps needed to obtain the results of the publication "Is Space-Time Attention All You Need for Video Understanding?", available at https://arxiv.org/abs/2102.05095
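To make the 'Divided Space-Time Attention' idea more concrete, here is a minimal sketch in plain Python of which tokens a single patch is compared with in each of the two steps. It is not code from the repository and performs no actual attention. The helper names (`time_neighbors`, `space_neighbors`, `joint_neighbors`) and the clip size (8 frames, 14 x 14 patches per frame) are assumptions chosen for illustration, and the classification token is ignored.

```python
from itertools import product


def time_neighbors(token, num_frames):
    """Temporal step: same patch position, every frame."""
    _, patch = token
    return [(frame, patch) for frame in range(num_frames)]


def space_neighbors(token, num_patches):
    """Spatial step: same frame, every patch position."""
    frame, _ = token
    return [(frame, patch) for patch in range(num_patches)]


def joint_neighbors(num_frames, num_patches):
    """Joint space-time attention: every patch of every frame."""
    return list(product(range(num_frames), range(num_patches)))


# Hypothetical clip: 8 frames, 14 x 14 = 196 patches per frame.
num_frames, num_patches = 8, 14 * 14
token = (3, 42)  # (frame index, patch index), chosen arbitrarily

# Divided attention runs the temporal step first, then the spatial step.
divided = len(time_neighbors(token, num_frames)) + len(space_neighbors(token, num_patches))
joint = len(joint_neighbors(num_frames, num_patches))

print(divided, joint)  # 204 1568
```

Under these assumptions, each token takes part in 8 + 196 = 204 comparisons with the divided scheme, against 8 * 196 = 1568 if attention were applied jointly over space and time.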