A Description of the Proposal Files
yjxiong edited this page Oct 23, 2017 · 1 revision
A proposal file in the SSN codebase is similar to those used in RCNN.
Here is a snippet from a real proposal file:
```
# 1
3HHAEmr0Q34
1
1
1
86 0.0718 0.9654
11
86 0.9045 0.9625 0.1277 1.0000
86 0.8943 0.8943 0.0000 1.0000
86 0.8349 0.9595 0.1915 1.0000
86 0.3650 0.9121 0.6277 1.0000
86 0.3302 0.9037 0.6596 1.0000
86 0.2954 0.8936 0.6915 1.0000
86 0.2606 0.8810 0.7234 1.0000
86 0.2886 1.0000 0.1277 0.3856
86 0.2345 1.0000 0.1436 0.3537
86 0.1804 1.0000 0.1596 0.3218
86 0.1263 1.0000 0.1915 0.3059
```
This file can be described using the following template:
```
# INDEX
VIDEO_ID
NUM_UNITS
FPS
NUM_GT
(CLASS START END) x NUM_GT
NUM_PROP
(CLASS MAX_IOU MAX_OVERLAP START END) x NUM_PROP
```
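The template maps onto a short sequential reader. The following is a minimal sketch in Python, not the loader used in the SSN codebase. `VideoRecord` and `parse_proposal_text` are hypothetical names introduced for illustration, and the sketch assumes that fields are separated by whitespace and that blank lines can be ignored.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (CLASS, START, END)
GroundTruth = Tuple[int, float, float]
# (CLASS, MAX_IOU, MAX_OVERLAP, START, END)
Proposal = Tuple[int, float, float, float, float]


@dataclass
class VideoRecord:
    """One video entry of a proposal file (hypothetical container)."""
    index: int
    video_id: str
    num_units: float
    fps: float
    gt: List[GroundTruth] = field(default_factory=list)
    proposals: List[Proposal] = field(default_factory=list)


def parse_proposal_text(text: str) -> List[VideoRecord]:
    """Parse the contents of a proposal file into a list of records."""
    # Assumption: blank lines carry no meaning and can be dropped.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    pos = 0
    while pos < len(lines):
        if not lines[pos].startswith("#"):
            raise ValueError(f"expected '# INDEX', got {lines[pos]!r}")
        record = VideoRecord(
            index=int(lines[pos].lstrip("#").strip()),
            video_id=lines[pos + 1],
            num_units=float(lines[pos + 2]),
            fps=float(lines[pos + 3]),
        )
        num_gt = int(lines[pos + 4])
        pos += 5
        # NUM_GT lines of: CLASS START END
        for _ in range(num_gt):
            cls, start, end = lines[pos].split()
            record.gt.append((int(cls), float(start), float(end)))
            pos += 1
        num_prop = int(lines[pos])
        pos += 1
        # NUM_PROP lines of: CLASS MAX_IOU MAX_OVERLAP START END
        for _ in range(num_prop):
            cls, iou, overlap, start, end = lines[pos].split()
            record.proposals.append(
                (int(cls), float(iou), float(overlap), float(start), float(end))
            )
            pos += 1
        records.append(record)
    return records


# The snippet shown at the top of this page.
SAMPLE = """\
# 1
3HHAEmr0Q34
1
1
1
86 0.0718 0.9654
11
86 0.9045 0.9625 0.1277 1.0000
86 0.8943 0.8943 0.0000 1.0000
86 0.8349 0.9595 0.1915 1.0000
86 0.3650 0.9121 0.6277 1.0000
86 0.3302 0.9037 0.6596 1.0000
86 0.2954 0.8936 0.6915 1.0000
86 0.2606 0.8810 0.7234 1.0000
86 0.2886 1.0000 0.1277 0.3856
86 0.2345 1.0000 0.1436 0.3537
86 0.1804 1.0000 0.1596 0.3218
86 0.1263 1.0000 0.1915 0.3059
"""

if __name__ == "__main__":
    for rec in parse_proposal_text(SAMPLE):
        print(rec.video_id, len(rec.gt), len(rec.proposals))
```

Because every count (`NUM_GT`, `NUM_PROP`) precedes the lines it describes, the file can be read in a single pass with no lookahead. `NUM_UNITS` and `FPS` are read as floats here only so that the same sketch accepts frame, second, and normalized files.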
In plain language, this file has a list of videos. Each video entry contains:
- `INDEX`: the index of this video, starting from `1`, on the first line after a leading `#`.
- `VIDEO_ID`: the ID of the video, on the second line.
- `NUM_UNITS`: the next line gives the total number of time units in this video. The unit can be a frame, a second, or `1` for the normalized proposal files.
- `FPS`: the next line gives the frames per second (`FPS`) of this video. If the unit is `frame`, this line will be 1. If the unit is
- `NUM_GT`: the number of groundtruth action instances in this video. This number can be set to 0 for testing videos where we do not have annotations.
- `(CLASS START END) x NUM_GT`: then come `NUM_GT` lines of groundtruth action instances. Each instance has a `CLASS` id and a `START` and `END` in the unit used by this proposal file. For example, in SSN we use the frame unit, so here `START` and `END` denote the starting and ending frame of the instance. In the provided normalized proposal files, the unit is `1`, so `START` and `END` are decimal numbers from 0 to 1. The [gen_proposal_list.py](https://github.com/yjxiong/action-detection/blob/master/gen_proposal_list.py) script translates between these two units based on the actual number of frames extracted for each video on your machine.
- `NUM_PROP`: after the groundtruth instances come the proposals, led by the total number of proposals for this video.
- `(CLASS MAX_IOU MAX_OVERLAP START END) x NUM_PROP`: similarly, proposals are recorded one per line for `NUM_PROP` lines. Compared with a groundtruth instance, a proposal has two more fields. `MAX_IOU` is the maximal intersection over union (IoU) of this proposal w.r.t. all groundtruth instances. `MAX_OVERLAP` is the maximal overlap with groundtruth, as a proportion of the length of this proposal. The `CLASS` of a proposal is taken from the groundtruth instance with the max IoU.
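
The two proposal-only fields can be illustrated with a small Python sketch. It follows the definitions given in the list and is not taken from the SSN code. The function names are hypothetical, segments are assumed to be `(start, end)` pairs in whatever unit the file uses, and the case of a video with no groundtruth is left out because this page does not say what is written for it.

```python
from typing import List, Tuple

Segment = Tuple[float, float]             # (START, END)
GroundTruth = Tuple[int, float, float]    # (CLASS, START, END)


def temporal_intersection(a: Segment, b: Segment) -> float:
    """Length of the overlap between two segments, 0 if they are disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def temporal_iou(prop: Segment, gt: Segment) -> float:
    """Intersection over union of a proposal and a groundtruth segment."""
    inter = temporal_intersection(prop, gt)
    union = (prop[1] - prop[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def overlap_ratio(prop: Segment, gt: Segment) -> float:
    """Intersection divided by the length of the proposal itself."""
    length = prop[1] - prop[0]
    return temporal_intersection(prop, gt) / length if length > 0 else 0.0


def label_proposal(prop: Segment, gts: List[GroundTruth]) -> Tuple[int, float, float]:
    """Return (CLASS, MAX_IOU, MAX_OVERLAP) for one proposal.

    CLASS comes from the groundtruth instance with the highest IoU;
    MAX_OVERLAP is maximized independently over all groundtruth instances.
    """
    if not gts:
        # Not specified on this page, so the sketch refuses to guess.
        raise ValueError("no groundtruth instances for this video")
    best_cls, best_iou = max(
        ((cls, temporal_iou(prop, (start, end))) for cls, start, end in gts),
        key=lambda pair: pair[1],
    )
    max_overlap = max(overlap_ratio(prop, (start, end)) for _, start, end in gts)
    return best_cls, best_iou, max_overlap


if __name__ == "__main__":
    # Groundtruth and first proposal from the snippet at the top of this page.
    gts = [(86, 0.0718, 0.9654)]
    print(label_proposal((0.1277, 1.0), gts))
```

Applied to the first proposal in the snippet, this gives an IoU of roughly 0.90 and an overlap of roughly 0.96, within about 0.01 of the stored `0.9045` and `0.9625`. The sketch does not try to reproduce the stored digits exactly.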