
Atlas: Few-shot Learning with Retrieval Augmented Language Model #20503

ae99 opened this issue Nov 30, 2022 · 3 comments

Model description

Atlas is a retrieval-augmented seq2seq language model that combines a Contriever retriever with a Fusion-in-Decoder (FiD) reader built on T5. It was introduced in the paper Atlas: Few-shot Learning with Retrieval Augmented Language Models.

From the paper's abstract:

Large language models have shown impressive few-shot results on a wide range of tasks.
However, when knowledge is key for such results, as is the case for tasks such as question
answering and fact checking, massive parameter counts to store knowledge seem to be needed.
Retrieval augmented models are known to excel at knowledge intensive tasks without the
need for as many parameters, but it is unclear whether they work in few-shot settings. In this
work we present Atlas, a carefully designed and pre-trained retrieval augmented language
model able to learn knowledge intensive tasks with very few training examples. We perform
evaluations on a wide range of tasks, including MMLU, KILT and NaturalQuestions, and
study the impact of the content of the document index, showing that it can easily be updated.
Notably, Atlas reaches over 42% accuracy on Natural Questions using only 64 examples,
outperforming a 540B parameters model by 3% despite having 50x fewer parameters.
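To make the architecture concrete, here is a minimal, illustrative sketch of the retrieve-then-Fusion-in-Decoder flow using stock Hugging Face checkpoints ("facebook/contriever" for the retriever, "t5-base" for the reader) and a tiny in-memory passage list standing in for a real document index. This is not the official Atlas implementation or its weights; it only shows the shape of the computation (dense retrieval, per-passage encoding, decoding over the concatenated encoder states):

```python
# Illustrative Atlas-style retrieve-then-FiD sketch; NOT the official Atlas code or weights.
import torch
from transformers import AutoTokenizer, AutoModel, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

retriever_tok = AutoTokenizer.from_pretrained("facebook/contriever")
retriever = AutoModel.from_pretrained("facebook/contriever")
reader_tok = AutoTokenizer.from_pretrained("t5-base")
reader = T5ForConditionalGeneration.from_pretrained("t5-base")

def embed(texts):
    """Contriever-style embedding: mean-pool token states over the attention mask."""
    batch = retriever_tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = retriever(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

question = "who wrote the opera carmen?"
passages = [  # stand-in for a real index (e.g. Wikipedia passages in FAISS)
    "Carmen is an opera in four acts by the French composer Georges Bizet.",
    "The Marriage of Figaro is an opera buffa composed in 1786 by Mozart.",
]

# 1) Retrieve: rank passages by dot product with the question embedding.
scores = embed([question]) @ embed(passages).T
top_k = scores.squeeze(0).topk(k=2).indices.tolist()

# 2) Fusion-in-Decoder: encode each (question, passage) pair independently,
#    concatenate the encoder states, and let the decoder attend over all of them.
inputs = [f"question: {question} context: {passages[i]}" for i in top_k]
enc = reader_tok(inputs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    enc_out = reader.encoder(**enc).last_hidden_state        # (k, seq, dim)
fused = enc_out.reshape(1, -1, enc_out.size(-1))              # (1, k*seq, dim)
fused_mask = enc["attention_mask"].reshape(1, -1)

answer_ids = reader.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=fused),
    attention_mask=fused_mask,
    max_new_tokens=16,
)
print(reader_tok.decode(answer_ids[0], skip_special_tokens=True))
```

The untrained "t5-base" reader will not produce Atlas-quality answers; the point is only to show how the per-passage encoder outputs are fused before decoding, which is the part that would need careful porting into transformers.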

Open source status

Provide useful links for the implementation

Open-sourced implementation from Meta https://github.com/facebookresearch/atlas, with weights available.

Authored by @patrick-s-h-lewis and @gizacard

ae99 added the New model label Nov 30, 2022

ae99 commented Nov 30, 2022

Hi all!

Super appreciative of the authors for open-sourcing this model, really exciting stuff.

I'm planning on having a go at implementing this model here. I'm aware that others have looked at similar models in the past (#15387), so I thought it would be good to get this ticket in early in case you're also interested in working on this!


patrick-s-h-lewis commented Dec 1, 2022

Go for it! It shouldn't be too hard to get inference working - training may be more involved - the way we do the distributed index might be a little painful to integrate gracefully.
Good luck!

Please make sure that you provide links to the original repo prominently, and try to make sure the models are 1) capable of achieving the same accuracy that they do in our repo and 2) mathematically perform the same computations.


vahuja4 commented Apr 7, 2023

Hello, is ATLAS a part of huggingface now?
