I am getting an error caused by a shape mismatch between the state and the output, but I am unable to figure out the cause.
I would really appreciate it if someone could guide me. Thanks in advance.
I am using tensorflow-gpu==1.2.1 with a 1080 Ti graphics card.
Error is as below:
ValueError: Shapes (8, 522) and (8, 512) are incompatible
The error occurs in the file "attention_wrapper.py", in the method named "call", at line 708:
cell_output, next_cell_state = self._cell(cell_inputs, cell_state)
I was able to figure out that attn_size (10) is being added to the last dimension of the shape (512 + 10 = 522), which causes the mismatch.
But I have no idea how to fix it.
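To make the arithmetic concrete, here is a minimal sketch of where 522 can come from. It assumes the wrapper's default input function is in effect (the `cell_input_fn` argument is commented out in the code below), which concatenates the decoder inputs with the attention vector along the last axis. Plain Python lists stand in for tensors, and `concat_last_axis` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def concat_last_axis(a, b):
    """Concatenate two 2-D lists along the last axis (stand-in for concat(..., -1))."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


batch_size, num_units, attn_size = 8, 512, 10

# Stand-ins for the decoder inputs and the attention vector.
inputs = [[0.0] * num_units for _ in range(batch_size)]
attention = [[0.0] * attn_size for _ in range(batch_size)]

cell_inputs = concat_last_axis(inputs, attention)
cell_inputs_shape = (len(cell_inputs), len(cell_inputs[0]))
print(cell_inputs_shape)  # (8, 522)
```

So any component that expects the wrapped cell's input to be 512 wide would see 522 instead.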
The code is below; the hyper-parameters are set to small values for testing.
```python
# Imports assumed for TensorFlow 1.2 (not shown in the original post)
import tensorflow as tf
from tensorflow.contrib.seq2seq.python.ops import attention_wrapper
from tensorflow.python.layers.core import Dense
from tensorflow.python.ops import array_ops

batch_size = 8
number_of_units_per_layer = 512
number_of_layers = 3
attn_size = 10


def build_decoder_cell(enc_output, enc_state, source_sequence_length, attn_size, batch_size):
    encoder_outputs = enc_output
    encoder_last_state = enc_state
    encoder_inputs_length = source_sequence_length

    attention_mechanism = attention_wrapper.LuongAttention(
        num_units=attn_size, memory=encoder_outputs,
        memory_sequence_length=encoder_inputs_length,
        scale=True,
        name='LuongAttention')

    # Building decoder_cell
    # (build_single_cell and num_layers are defined elsewhere, not shown here)
    decoder_cell_list = [
        build_single_cell() for i in range(num_layers)]
    decoder_initial_state = encoder_last_state

    def attn_decoder_input_fn(inputs, attention):
        #if not self.attn_input_feeding:
        #    return inputs

        # Essential when use_residual=True
        # (size is not defined in this snippet)
        _input_layer = Dense(size, dtype=tf.float32,
                             name='attn_input_feeding')
        return _input_layer(array_ops.concat([inputs, attention], -1))

    # AttentionWrapper wraps RNNCell with the attention_mechanism
    # Note: We implement Attention mechanism only on the top decoder layer
    decoder_cell_list[-1] = attention_wrapper.AttentionWrapper(
        cell=decoder_cell_list[-1],
        attention_mechanism=attention_mechanism,
        attention_layer_size=attn_size,
        #cell_input_fn=attn_decoder_input_fn,
        initial_cell_state=encoder_last_state[-1],
        alignment_history=False,
        name='Attention_Wrapper')

    # To be compatible with AttentionWrapper, the encoder last state
    # of the top layer should be converted into the AttentionWrapperState form
    # We can easily do this by calling AttentionWrapper.zero_state

    # Also if beamsearch decoding is used, the batch_size argument in .zero_state
    # should be ${decoder_beam_width} times the original batch_size
    #batch_size = self.batch_size if not self.use_beamsearch_decode \
    #    else self.batch_size * self.beam_width
    initial_state = [state for state in encoder_last_state]
    initial_state[-1] = decoder_cell_list[-1].zero_state(
        batch_size=batch_size, dtype=tf.float32)
    decoder_initial_state = tuple(initial_state)

    return tf.contrib.rnn.MultiRNNCell(decoder_cell_list), decoder_initial_state
```
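For reference, the commented-out `attn_decoder_input_fn` above would change the width seen by the wrapped cell: it concatenates `inputs` and `attention` and then passes the result through a dense layer. The sketch below shows only that shape effect, using plain Python lists in place of tensors. It assumes the projection size equals `number_of_units_per_layer` (512), which the snippet does not confirm since `size` is undefined there; `concat_last_axis` and `dense` are hypothetical helpers for this illustration, and whether this resolves the error is not verified here.

```python
import random


def concat_last_axis(a, b):
    """Concatenate two 2-D lists along the last axis."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


def dense(rows, weights):
    """Bias-free linear projection: (batch, in_dim) times (in_dim, out_dim)."""
    columns = list(zip(*weights))  # out_dim columns, each of length in_dim
    return [[sum(x * w for x, w in zip(row, col)) for col in columns]
            for row in rows]


batch_size, num_units, attn_size = 8, 512, 10

inputs = [[1.0] * num_units for _ in range(batch_size)]
attention = [[1.0] * attn_size for _ in range(batch_size)]

# Random weights stand in for the trainable kernel of the dense layer.
random.seed(0)
weights = [[random.uniform(-0.1, 0.1) for _ in range(num_units)]
           for _ in range(num_units + attn_size)]

concatenated = concat_last_axis(inputs, attention)  # width 512 + 10 = 522
projected = dense(concatenated, weights)            # width back to 512

concatenated_shape = (len(concatenated), len(concatenated[0]))
projected_shape = (len(projected), len(projected[0]))
print(concatenated_shape, projected_shape)
```

With this kind of projection the cell input has the same width as the cell output again, which appears to be what the "Essential when use_residual=True" comment refers to.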
Thank you once again.