This repository was archived by the owner on May 1, 2025. It is now read-only.

Reproducing the VQA candidate answers from the dataset and paper #135

@MagnusOstertag

Hi,
first of all, thanks for the amazing work!

You wrote in the paper: "For a fair comparison with existing methods, we constrain the decoder to only generate from the $3,192$ candidate answers". In the downloadable data, however, answer_list contains only $3,128$ elements. I first suspected a typo in the paper ($3,192$ instead of $3,129$) plus an off-by-one in the data, because the paper you cite says: "The number of outputs is determined by the minimum occurrence of the answer in unique questions as nine times in the dataset, which is $3,129$."

When I try to reproduce answer_list from the provided answers or directly from VQAv2, I get a different number of answers, and nearly 300 of the answers differ. So how was the answer list actually created?
(I count occurrences of unique answers rather than questions, because counting by question is ambiguous. I use a threshold of at least 9 occurrences per answer and normalize each answer as in VQAEval. When I include the VisualGenome answers in addition to the VQAv2.0 answers, I get a much higher number of candidate answers.)
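
A minimal sketch of the counting described above, under stated assumptions: the input follows the VQAv2 annotation layout (each annotation has an `answers` list whose entries carry an `answer` string), every individual answer occurrence is counted, and `normalize_answer` is a simplified, hypothetical stand-in for the VQAEval normalization (lowercasing, punctuation removal, article removal), not the official rules. The function name `build_answer_list` is also illustrative and may not match how the released answer_list was produced.

```python
import re
from collections import Counter

# Hypothetical, simplified normalization; the real VQAEval rules also
# handle contractions and number words, which are omitted here.
_ARTICLES = {"a", "an", "the"}


def normalize_answer(answer: str) -> str:
    answer = answer.lower().strip()
    answer = re.sub(r"[^\w\s]", "", answer)  # drop punctuation
    words = [w for w in answer.split() if w not in _ARTICLES]
    return " ".join(words)


def build_answer_list(annotations, min_count=9):
    """Return the sorted answers that occur at least `min_count` times."""
    counts = Counter()
    for ann in annotations:
        for entry in ann["answers"]:
            counts[normalize_answer(entry["answer"])] += 1
    # Skip answers that normalize to the empty string.
    return sorted(a for a, c in counts.items() if a and c >= min_count)


# Toy data in the assumed layout, with a lowered threshold for illustration.
toy = [
    {"answers": [{"answer": "Yes"}, {"answer": "yes."}, {"answer": "a dog"}]},
    {"answers": [{"answer": "The dog"}, {"answer": "cat"}]},
]
candidates = build_answer_list(toy, min_count=2)
```

The size of the resulting list is sensitive to whether occurrences are counted per answer or per unique question, and to the exact normalization, which may account for part of the mismatch.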

I also noticed that 7 questions from VQAv2.0 seem to have been excluded from vqa_train/val, namely the questions with IDs 268735002, 293514000, 147314003, 68003002, 451818000, 362391000, and 196280004. Why is that?
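
A check of this kind can be sketched as a set difference over question IDs. The helper name `missing_question_ids` and the toy ID lists are illustrative assumptions; in practice the two ID collections would be read from the official VQAv2.0 question files and from the released vqa_train/val files.

```python
def missing_question_ids(official_ids, released_ids):
    """Return the IDs present in the official set but absent from the release."""
    return sorted(set(official_ids) - set(released_ids))


# Toy example: two of the IDs listed above plus one made-up ID (1) that is kept.
official = [268735002, 293514000, 1]
released = [1]
missing = missing_question_ids(official, released)
```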

Best,
Magnus
