Awesome series for Large Language Models (LLMs)
Name | Parameter size | Announcement date |
---|---|---|
BERT-Large (336M) | 336 million | 2018 |
T5 (11B) | 11 billion | 2020 |
Gopher (280B) | 280 billion | 2021 |
GPT-J (6B) | 6 billion | 2021 |
LaMDA (137B) | 137 billion | 2021 |
Megatron-Turing NLG (530B) | 530 billion | 2021 |
T0 (11B) | 11 billion | 2021 |
Macaw (11B) | 11 billion | 2021 |
GLaM (1.2T) | 1.2 trillion | 2021 |
T5 FLAN (540B) | 540 billion | 2022 |
OPT-175B (175B) | 175 billion | 2022 |
ChatGPT (175B) | 175 billion | 2022 |
GPT-3.5 (175B) | 175 billion | 2022 |
AlexaTM (20B) | 20 billion | 2022 |
Bloom (176B) | 176 billion | 2022 |
Bard | Not yet announced | 2023 |
GPT-4 | Not yet announced | 2023 |
AlphaCode (41.4B) | 41.4 billion | 2022 |
Chinchilla (70B) | 70 billion | 2022 |
Sparrow (70B) | 70 billion | 2022 |
PaLM (540B) | 540 billion | 2022 |
NLLB (54.5B) | 54.5 billion | 2022 |
Galactica (120B) | 120 billion | 2022 |
UL2 (20B) | 20 billion | 2022 |
Jurassic-1 (178B) | 178 billion | 2021 |
LLaMA (65B) | 65 billion | 2023 |
Stanford Alpaca (7B) | 7 billion | 2023 |
GPT-NeoX 2.0 (20B) | 20 billion | 2023 |
BloombergGPT | 50 billion | 2023 |
Dolly | 6 billion | 2023 |
Jurassic-2 | Not yet announced | 2023 |
OpenAssistant LLaMA | 30 billion | 2023 |
Koala | 13 billion | 2023 |
Vicuna | 13 billion | 2023 |
PaLM 2 | Not yet announced (smaller than PaLM) | 2023 |
LIMA | 65 billion | 2023 |
MPT | 7 billion | 2023 |
Falcon | 40 billion | 2023 |
Llama 2 | 70 billion | 2023 |
Google Gemini | Not yet announced | 2023 |
Microsoft Phi-2 | 2.7 billion | 2023 |
Grok-0 | 33 billion | 2023 |
Grok-1 | 314 billion | 2023 |
Solar | 10.7 billion | 2024 |
Gemma | 7 billion | 2024 |
Grok-1.5 | Not yet announced | 2024 |
DBRX | 132 billion | 2024 |
Claude 3 | Not yet announced | 2024 |
Gemma 1.1 | 7 billion | 2024 |
Llama 3 | 70 billion | 2024 |
- T5 (11B) - Announced by Google / 2020
- T5 FLAN (540B) - Announced by Google / 2022
- T0 (11B) - Announced by BigScience (HuggingFace) / 2021
- OPT-175B (175B) - Announced by Meta / 2022
- UL2 (20B) - Announced by Google / 2022
- Bloom (176B) - Announced by BigScience (HuggingFace) / 2022
- BERT-Large (336M) - Announced by Google / 2018
- GPT-NeoX 2.0 (20B) - Announced by EleutherAI / 2023
- GPT-J (6B) - Announced by EleutherAI / 2021
- Macaw (11B) - Announced by AI2 / 2021
- Stanford Alpaca (7B) - Announced by Stanford University / 2023
- Visual ChatGPT - Announced by Microsoft / 2023
- LMOps - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.
- GPT-4 (Parameter size unannounced, gpt-4-32k) - Announced by OpenAI / 2023
- ChatGPT (175B) - Announced by OpenAI / 2022
- ChatGPT Plus (175B) - Announced by OpenAI / 2023
- GPT-3.5 (175B, text-davinci-003) - Announced by OpenAI / 2022
- Gemini - Announced by Google DeepMind / 2023
- Bard - Announced by Google / 2023
- Codex (11B) - Announced by OpenAI / 2021
- Sphere - Announced by Meta / 2022 - 134M documents split into 906M passages as the web corpus.
- Common Crawl - 3.15B pages and over 380 TiB in size; public and free to use.
- SQuAD 2.0 - 100,000+ question dataset for QA.
- Pile - 825 GiB diverse, open-source language modelling dataset.
- RACE - A large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions.
- Wikipedia - Wikipedia dataset containing cleaned articles of all languages.
- Megatron-Turing NLG (530B) - Announced by NVIDIA and Microsoft / 2021
- LaMDA (137B) - Announced by Google / 2021
- GLaM (1.2T) - Announced by Google / 2021
- PaLM (540B) - Announced by Google / 2022
- AlphaCode (41.4B) - Announced by DeepMind / 2022
- Chinchilla (70B) - Announced by DeepMind / 2022
- Sparrow (70B) - Announced by DeepMind / 2022
- NLLB (54.5B) - Announced by Meta / 2022
- LLaMA (65B) - Announced by Meta / 2023
- AlexaTM (20B) - Announced by Amazon / 2022
- Gopher (280B) - Announced by DeepMind / 2021
- Galactica (120B) - Announced by Meta / 2022
- PaLM 2 Technical Report - Announced by Google / 2023
- LIMA - Announced by Meta / 2023
- Llama 2 (70B) - Announced by Meta / 2023
- Luminous (13B) - Announced by Aleph Alpha / 2021
- Turing NLG (17B) - Announced by Microsoft / 2020
- Claude (52B) - Announced by Anthropic / 2021
- Minerva (Parameter size unannounced) - Announced by Google / 2022
- BloombergGPT (50B) - Announced by Bloomberg / 2023
- Dolly (6B) - Announced by Databricks / 2023
- Jurassic-1 (178B) - Announced by AI21 / 2021
- Jurassic-2 - Announced by AI21 / 2023
- Koala - Announced by Berkeley Artificial Intelligence Research (BAIR) / 2023
- Gemma - Announced by Google / 2024 (Gemma: Introducing new state-of-the-art open models)
- Grok-1 - Announced by xAI / 2023 (Open Release of Grok-1)
- Grok-1.5 - Announced by xAI / 2024
- DBRX - Announced by Databricks / 2024
- BigScience - Maintained by HuggingFace
- HuggingChat - Maintained by HuggingFace / 2023
- OpenAssistant - Maintained by Open Assistant / 2023
- StableLM - Maintained by Stability AI / 2023
- EleutherAI Language Model - Maintained by EleutherAI / 2023
- Falcon LLM - Maintained by Technology Innovation Institute / 2023
- Gemma - Maintained by Google / 2024
- Stanford Alpaca - A repository for the Stanford Alpaca project, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations.
- Dolly - A large language model trained on the Databricks Machine Learning Platform.
- AutoGPT - An experimental open-source attempt to make GPT-4 fully autonomous.
- dalai - The CLI tool to run LLaMA on the local machine.
- LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 hour and 1.2M parameters.
- alpaca-lora - Instruct-tune LLaMA on consumer hardware.
- llama_index - A project that provides a central interface to connect your LLMs with external data (see the retrieval sketch after this list).
- openai/evals - A framework for evaluating LLMs and LLM-based systems, along with an open-source registry of benchmarks.
- trlx - A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF).
- pythia - A suite of 16 LLMs, all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters.
- Embedchain - Framework to create ChatGPT-like bots over your dataset.
- OpenAssistant SFT-6 - A 30-billion-parameter LLaMA-based model released on HuggingFace for conversational chat.
- Vicuna Delta v0 - An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
- MPT 7B - A decoder-style transformer pre-trained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.
- Falcon 7B - A 7B-parameter causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora (see the loading sketch after this list for running open-weight checkpoints like this locally).
- Phi-2: The surprising power of small language models
- StackLLaMA: A hands-on guide to train LLaMA with RLHF
- PaLM 2
- PaLM 2 and Future work: Gemini model
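
Many of the open-weight models listed above (for example Pythia, Dolly, MPT 7B, and Falcon 7B) can be tried locally through the Hugging Face transformers library. The snippet below is a minimal sketch rather than an official recipe: `databricks/dolly-v2-3b` is one published Dolly checkpoint used purely as an example, and the generation settings are illustrative.

```python
# Minimal sketch: load an open-weight model from the list with Hugging Face transformers.
# Assumes `pip install transformers torch` and enough memory for the chosen checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dolly-v2-3b"  # example open checkpoint; swap in another model from the list
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain what a large language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # illustrative settings, not tuned
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For connecting a model to your own data, the llama_index entry above describes a retrieval-style workflow. The sketch below assumes the llama_index 0.x Python API (`SimpleDirectoryReader`, `VectorStoreIndex`) and an OpenAI API key for the default LLM and embedding backends; class names and import paths have shifted across versions, so treat it as illustrative rather than definitive.

```python
# Minimal sketch of the llama_index pattern: index local documents, then query them with an LLM.
# Assumes `pip install llama-index` (0.x API) and the OPENAI_API_KEY environment variable set.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./data").load_data()  # read files from a local folder
index = VectorStoreIndex.from_documents(documents)       # build an embedding-based index
query_engine = index.as_query_engine()                   # wrap the index in a query interface
print(query_engine.query("Summarize these documents in two sentences."))
```
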
We welcome contributions to the Awesome LLM list! If you'd like to suggest an addition or make a correction, please follow these guidelines:
- Fork the repository and create a new branch for your contribution.
- Make your changes to the README.md file.
- Ensure that your contribution is relevant to the topic of LLMs.
- Use the following format to add your contribution (a sample entry is shown after these guidelines):
[Name of Resource](Link to Resource) - Description of resource
- Add your contribution in alphabetical order within its category.
- Make sure that your contribution is not already listed.
- Provide a brief description of the resource and explain why it is relevant to LLMs.
- Create a pull request with a clear title and description of your changes.
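
For illustration, an entry following this format might look like the line below; the name and link are placeholders used only to show the expected shape, not a real suggestion:

[Example LLM Toolkit](https://example.com/llm-toolkit) - A hypothetical library for serving open-weight LLMs, described in one sentence that explains its relevance to LLMs.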
We appreciate your contributions and thank you for helping to make the Awesome LLM list even more awesome!