Model description

Atlas is a retrieval-augmented seq2seq language model that combines a Contriever retriever with a Fusion-in-Decoder (FiD) reader built on T5. It was introduced in the paper "Atlas: Few-shot Learning with Retrieval Augmented Language Models".

From the paper's abstract:
Large language models have shown impressive few-shot results on a wide range of tasks.
However, when knowledge is key for such results, as is the case for tasks such as question
answering and fact checking, massive parameter counts to store knowledge seem to be needed.
Retrieval augmented models are known to excel at knowledge intensive tasks without the
need for as many parameters, but it is unclear whether they work in few-shot settings. In this
work we present Atlas, a carefully designed and pre-trained retrieval augmented language
model able to learn knowledge intensive tasks with very few training examples. We perform
evaluations on a wide range of tasks, including MMLU, KILT and NaturalQuestions, and
study the impact of the content of the document index, showing that it can easily be updated.
Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples,
outperforming a 540B-parameter model by 3% despite having 50x fewer parameters.
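To make the architecture described above concrete, here is a minimal sketch of FiD-style inference using a vanilla `t5-base` checkpoint from `transformers`: each retrieved passage is encoded independently alongside the question, the encoder outputs are concatenated, and the decoder attends over all of them at once. The prompt format, passages, and checkpoint are illustrative placeholders, not Atlas's actual interface or weights.

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

question = "where is the Eiffel Tower?"
# In Atlas these come from the Contriever retriever; hard-coded here.
passages = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Paris is the capital and most populous city of France.",
]

# 1) Encode each (question, passage) pair independently.
batch = tokenizer(
    [f"question: {question} context: {p}" for p in passages],
    return_tensors="pt", padding=True, truncation=True,
)
with torch.no_grad():
    enc = model.encoder(input_ids=batch.input_ids, attention_mask=batch.attention_mask)

# 2) Fuse: flatten the per-passage encodings into one long sequence so the
#    decoder can attend over all retrieved evidence jointly.
dim = enc.last_hidden_state.size(-1)
fused = BaseModelOutput(last_hidden_state=enc.last_hidden_state.reshape(1, -1, dim))
fused_mask = batch.attention_mask.reshape(1, -1)

# 3) Decode over the fused representation.
out = model.generate(encoder_outputs=fused, attention_mask=fused_mask, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```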
Open source status
Provide useful links for the implementation
Open-sourced implementation from Meta https://github.com/facebookresearch/atlas, with weights available.
Authored by @patrick-s-h-lewis and @gizacard
Super appreciative of the authors for open-sourcing this model, really exciting stuff.

I'm planning on having a go at implementing this model here. I'm aware that others have been looking at similar models in the past (#15387), so I thought it would be good to get this ticket in early in case you are also interested in working on this!

go for it! it shouldn't be too hard to get inference working - training may be more involved - the way we do the distributed index might be a little painful to integrate gracefully.

good luck!

Please make sure that you provide links to the original repo prominently, and try to make sure the models are 1) capable of achieving the same accuracy that they do in our repo, and 2) mathematically perform the same computations.
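For anyone picking this up, here is a toy, single-process stand-in for the retrieval side mentioned above (the Atlas repo shards its passage index across workers, which is the painful part to integrate). It uses the public facebook/contriever checkpoint with mean pooling as described on its model card; the corpus and query are made-up examples.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/contriever")
encoder = AutoModel.from_pretrained("facebook/contriever")

def embed(texts):
    # Mean pooling over token embeddings, as on the Contriever model card.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state       # (batch, seq, dim)
    mask = batch.attention_mask.unsqueeze(-1)             # (batch, seq, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # (batch, dim)

# Toy corpus; Atlas indexes millions of passages, sharded across workers.
passages = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "Atlas is a retrieval-augmented language model.",
    "Paris is the capital and most populous city of France.",
]
index = embed(passages)                                   # (n_passages, dim)

query = embed(["where is the Eiffel Tower?"])             # (1, dim)
scores = (query @ index.T).squeeze(0)                     # inner-product relevance
top = scores.topk(k=2).indices.tolist()
print([passages[i] for i in top])
```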