no_action label in test mode #11

Open
asdfqwer2015 opened this issue Feb 18, 2019 · 8 comments

Comments

@asdfqwer2015

Hi, in ActionRecognition.py I tested some videos from the nvgesture dataset (untrimmed videos), and it outputs a class number in 0–24 for every frame, i.e. it never outputs a blank or no_action label.
If processing untrimmed video counts as online detection, and trimmed video as offline detection, how do I do online detection? Did I miss something?
Thanks.

@breadbread1984
Owner

You should maintain a buffer of the time sequence and feed it into the model; the buffer serves as a FIFO of the online frames.
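
A minimal sketch of that FIFO idea, assuming the 80-frame window used by ActionRecognition.py; the frame source and classifier call are hypothetical placeholders, not the repo's actual API:

```python
from collections import deque

BUFFER_LEN = 80  # fixed window length, matching ActionRecognition.py

class FrameFIFO:
    """Sliding window over a live stream: newest frame in, oldest out."""
    def __init__(self, maxlen=BUFFER_LEN):
        self.frames = deque(maxlen=maxlen)  # deque drops old frames itself

    def push(self, frame):
        self.frames.append(frame)

    def full(self):
        return len(self.frames) == self.frames.maxlen

    def window(self):
        return list(self.frames)  # snapshot, oldest frame first

# usage sketch (capture_frames and classify are hypothetical):
# fifo = FrameFIFO()
# for frame in capture_frames():
#     fifo.push(frame)
#     if fifo.full():
#         prediction = classify(fifo.window())
```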

@asdfqwer2015
Author

Thanks for your quick reply.
I might not have described the issue clearly.
Regarding the buffer mechanism: I had already noticed it in your ActionRecognition.py script, and I only changed the video source (webcam => video file, as sketched below). It still processes the video through the fixed-length (80-frame) FIFO buffer.
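
(For reference, that source swap is a one-line change with OpenCV, since cv2.VideoCapture accepts either a device index or a file path; the file name here is illustrative:)

```python
import cv2

# cv2.VideoCapture(0) opens the webcam; passing a path opens a video file.
cap = cv2.VideoCapture('gesture_clip.mp4')  # illustrative file name
while True:
    ok, frame = cap.read()
    if not ok:  # end of file (or read error)
        break
    # ...push `frame` into the 80-frame FIFO as before...
cap.release()
```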

The issue is how to handle the frames without any gesture in unsegmented input video. I found that the model can only output the 25 gesture classes; it cannot output a "no gesture found" class. In online mode, the model should not only classify the gesture class but also handle frames without a gesture (e.g. output a "no action found" class).
Thanks.
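
(One common workaround, sketched here for reference rather than taken from this repo: gate the prediction on softmax confidence and report "no action" for low-confidence windows. The threshold and sentinel value are illustrative and would need tuning on held-out data.)

```python
import numpy as np

NO_ACTION = -1   # sentinel for "no gesture found" (illustrative)
THRESHOLD = 0.6  # confidence cutoff; needs tuning on a validation set

def classify_window(probs, threshold=THRESHOLD):
    """probs: softmax output of shape (num_classes,) for one window."""
    top = int(np.argmax(probs))
    return top if probs[top] >= threshold else NO_ACTION
```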

@breadbread1984
Owner

breadbread1984 commented Feb 20, 2019

You can feed a data sequence of any length, as long as you assign the input parameter 'sequence_lengths' in the input dictionary correctly.
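
A minimal sketch of what that looks like in TF 1.x (the reporter is on 1.12); the placeholder names and shapes are assumptions, not the repo's actual tensors:

```python
import numpy as np
import tensorflow as tf

# Illustrative placeholders: a batch of variable-length frame sequences.
inputs = tf.placeholder(tf.float32, [None, None, 112, 112, 3], name='inputs')
sequence_lengths = tf.placeholder(tf.int32, [None], name='sequence_lengths')

# One clip of 200 frames: pass its true (unpadded) length with the data.
clip = np.zeros((1, 200, 112, 112, 3), dtype=np.float32)
feed_dict = {inputs: clip,
             sequence_lengths: np.array([200], dtype=np.int32)}
# outputs = sess.run(logits, feed_dict=feed_dict)  # model graph omitted
```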

@asdfqwer2015
Author

Thanks again for your reply.

But I'm still confused. Could you please help with these two points?

a. The no_gesture_found class, i.e. the negative class:
I'm not sure, but maybe the model needs some training samples without any gesture in order to learn the negative class (i.e. a 26th "no gesture found" class)?

b. Should the class count for CTC be len(classes) or len(classes) + 1 (an extra slot for the CTC blank)?
I tested the trained model on training samples, and it has high classification accuracy, but it outputs no label at all for samples of the 24th (0-based) class. I suspect this is because the last class index is also occupied by the CTC blank. So, should the model's output size equal len(classes) + 1, with the extra slot for the CTC blank?
BTW, my TensorFlow version is 1.12.0.

@breadbread1984
Owner

For the nv hand gesture dataset, every video clip is guaranteed to contain exactly one continuously occurring gesture, so the label sequence contains only one class label (>= 0); there is no label for "no gesture".

The CTC output length equals the number of continuously occurring gestures found in a video clip: one label in the output sequence represents one gesture.
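
On question (b) above: TensorFlow's tf.nn.ctc_loss reserves the last index, num_classes - 1, as the blank, so the logits need len(classes) + 1 channels. A minimal sketch in the TF 1.x API (placeholder shapes are illustrative):

```python
import tensorflow as tf

NUM_GESTURES = 25                # gesture classes 0..24
NUM_CLASSES = NUM_GESTURES + 1   # +1: tf.nn.ctc_loss reserves index
                                 # num_classes - 1 as the CTC blank

# logits: (max_time, batch, NUM_CLASSES); labels hold values in 0..24 only.
logits = tf.placeholder(tf.float32, [None, None, NUM_CLASSES])
labels = tf.sparse_placeholder(tf.int32)
seq_len = tf.placeholder(tf.int32, [None])

loss = tf.reduce_mean(tf.nn.ctc_loss(labels, logits, seq_len))

# Greedy decoding collapses repeats and drops blanks, so a clip containing
# one gesture decodes to a label sequence of length one.
decoded, _ = tf.nn.ctc_greedy_decoder(logits, seq_len)
```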

@asdfqwer2015
Author

Understood now, thanks. But I've run into a new issue: overfitting. Could you please take a look? I'll create a new issue.
:)

@buaa-luzhi

@asdfqwer2015
Hello, can you explain why the 'no_gesture' class is not printed when testing online?
Thanks.

@buaa-luzhi

@asdfqwer2015 @breadbread1984
Hello, I get random numbers between 0 and 24 when using the ActionRecognition.py script for classification. Do you know why?
Looking forward to your reply!
