CS547 project: implementing image captioning (Show and Tell approach)
Things to try
- Try pretrained GloVe embeddings instead of a learned nn.Embedding
- Increase encoder capacity: resnet50 -> resnet152
- Try GRU instead of LSTM
- Beam search with variable k
- DataParallel + Distributed training
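
For the beam search item, here is a minimal sketch of decoding with a configurable beam width k. It is not the project's decoder: `beam_search`, `toy_step`, and the `TOY` probability table are hypothetical names made up for illustration. In the real model, the step function would presumably wrap the LSTM/GRU decoder and return log-probabilities over the vocabulary. The sketch uses no length normalization, so it tends to favor shorter captions.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (token sequence, summed log-probability)


def beam_search(
    step_fn: Callable[[List[str]], Dict[str, float]],
    start_token: str,
    end_token: str,
    k: int = 3,
    max_len: int = 20,
) -> Hypothesis:
    """Return the highest-scoring sequence found with beam width k.

    step_fn(seq) is assumed to return {next_token: log_prob} for a prefix.
    """
    beams: List[Hypothesis] = [([start_token], 0.0)]
    finished: List[Hypothesis] = []

    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for seq, score in beams:
            for tok, logp in step_fn(seq).items():
                candidates.append((seq + [tok], score + logp))
        if not candidates:
            break

        # Keep the k best expansions; set aside those that just ended.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:k]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break

    # Hypotheses cut off at max_len still compete with finished ones.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])


# Hypothetical toy "model": next-token probabilities keyed by the last token.
TOY = {
    "<start>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.55, "cat": 0.45},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"<end>": 1.0},
    "cat": {"<end>": 1.0},
}


def toy_step(seq: List[str]) -> Dict[str, float]:
    return {tok: math.log(p) for tok, p in TOY[seq[-1]].items()}


if __name__ == "__main__":
    for k in (1, 2, 3):
        seq, score = beam_search(toy_step, "<start>", "<end>", k=k)
        print(k, " ".join(seq), round(math.exp(score), 3))
```

With k=1 this reduces to greedy decoding, which commits to "a" at the first step. With k=2 the search keeps "the" alive and finds the higher-probability caption, which is the effect worth measuring when varying k on real captions.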