This repository has been archived by the owner on Apr 11, 2021. It is now read-only.
I am having a bit of trouble understanding how to incorporate the AttentionLSTM layer into my code. In your blog post you said, "The attentional component can be tacked onto the LSTM code that already exists." But unlike a standard LSTM, this custom layer requires a second parameter: the attention vector. As such, I tried the following code to build my model.
However, I get the following error: ValueError: Dimensions 4096 and 200 are not compatible.
My main trouble is understanding what the attention vector passed to the layer should be, according to your class specification. Conceptually, from the Show, Attend and Tell paper, I know that the attention vector should be each of the 1x4096 feature vectors. But I can't figure out how to pass that into the AttentionLSTM layer.
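My best guess is that the error is a shape mismatch between my 4096-dimensional feature vectors and the 200-unit LSTM, and that the features need to be projected down to the LSTM's hidden size before being used as the attention vector. Here is a minimal NumPy sketch of the shapes as I understand them (all names here are my own, not from your class, and the projection matrix stands in for whatever learned projection the layer would use):

```python
import numpy as np

# Shapes from my setup: one 1x4096 image feature vector per sample,
# and an LSTM with 200 hidden units.
feature_dim = 4096   # size of each image feature vector
hidden_dim = 200     # LSTM output dimension

rng = np.random.default_rng(0)
features = rng.standard_normal((1, feature_dim))         # candidate attention vector
W_proj = rng.standard_normal((feature_dim, hidden_dim))  # hypothetical learned projection

# Projecting the 4096-d features down to the 200-d hidden size is,
# I assume, what makes the two tensors dimension-compatible --
# otherwise I get "Dimensions 4096 and 200 are not compatible".
attention_vector = features @ W_proj
print(attention_vector.shape)  # (1, 200)
```

Is something like this projection supposed to happen inside the AttentionLSTM layer, or am I expected to do it myself (e.g. with a Dense layer) before passing the vector in?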
It would be very helpful if you could provide a gist or example script demonstrating how to use the AttentionLSTM layer, just like you did with the different RNNs in your blog post!