Wrong Loss Function?

Hello 

Did anybody successfully train using this code? We can't get pinball (VideoPinball-v0) to do anything useful.

There seems to be a subtle bug in the calculation of the loss function. According to the [Nature paper](https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf) (see Algorithm 1), the target should use the maximum Q-value of the target network over all actions. However, in the code ([dqn.py](https://github.com/llSourcell/Game-AI/blob/master/dqn.py)), in the function doMinibatch (line 122),

it's
```
q_target_max = np.argmax(q_target, axis=1)
```
which returns the index of the maximum along that axis and thus not the maximum itself. Shouldn't that be
```
q_target_max = np.amax(q_target, axis=1)
```
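
To illustrate the difference, here is a minimal sketch with made-up numbers. The arrays `q_target`, `rewards` and `terminal` and the value of `gamma` are assumptions for the example, not taken from dqn.py; the target follows the form in Algorithm 1 of the paper (reward only for terminal transitions, reward plus discounted maximum otherwise).

```python
import numpy as np

# Hypothetical target-network outputs: 3 next states, 4 actions each.
q_target = np.array([
    [0.1, 2.5, 0.3, 1.0],
    [4.0, 0.2, 0.1, 0.0],
    [0.5, 0.5, 0.9, 0.7],
])

# argmax gives the *index* of the best action per row ...
q_index = np.argmax(q_target, axis=1)
# ... while amax gives the *value* of the best action per row.
q_max = np.amax(q_target, axis=1)

# Hypothetical batch data for the target computation.
rewards = np.array([1.0, 0.0, -1.0])
terminal = np.array([False, False, True])
gamma = 0.99  # assumed discount factor

# y = r for terminal transitions, r + gamma * max_a' Q(s', a') otherwise.
y = rewards + gamma * q_max * (~terminal)

print(q_index)  # action indices
print(q_max)    # maximum Q-values
print(y)        # targets for the loss
```

With `argmax`, the targets would be built from action indices (small integers) instead of Q-values, so the network would be regressing toward numbers that have nothing to do with the expected return.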

Cheers,
Oliver
