Skip to content

Latest commit

 

History

History
83 lines (59 loc) · 6.79 KB

README.md

File metadata and controls

83 lines (59 loc) · 6.79 KB

🤖 Large Language Models (LLMs)

Welcome to the Large Language Models section of the AI Engineering Academy! This module provides a comprehensive understanding of LLMs and their practical applications in AI engineering.

📚 Repository Structure

Model/Directory Description & Contents
Axolotl Framework for fine-tuning language models
Gemma Google's latest LLM implementation
- finetune-gemma.ipynb
- gemma-sft.py
- Gemma_finetuning_notebook.ipynb
Fine-tuning notebooks and scripts
LLama2 Meta's open-source LLM
- generate_response_stream.py
- Llama2_finetuning_notebook.ipynb
- Llama_2_Fine_Tuning_using_QLora.ipynb
Implementation and fine-tuning guides
Llama3 Upcoming Meta LLM experiments
- Llama3_finetuning_notebook.ipynb Initial fine-tuning experiments
LlamaFactory LLM training and deployment framework
LLMArchitecture/ParameterCount Technical details of model architectures
Mistral-7b Mistral AI's 7B parameter model
- LLM_evaluation_harness_for_Arc_Easy_and_SST.ipynb
- Mistral_Colab_Finetune_ipynb_Colab_Final.ipynb
- notebooks_chatml_inference.ipynb
- notebooks_DPO_fine_tuning.ipynb
- notebooks_SFTTrainer TRL.ipynb
- SFT.py
Comprehensive notebooks for evaluation, fine-tuning, and inference
Mixtral Mixtral's mixture-of-experts model
- Mixtral_fine_tuning.ipynb Fine-tuning implementation
VLM Visual Language Models
- Florence2_finetuning_notebook.ipynb
- PaliGemma_finetuning_notebook.ipynb
Implementations for vision-language models

🎯 Module Overview

1. LLM Architectures

  • Explore implementations of:
    • Llama2 (Meta's open-source model)
    • Mistral-7b (Efficient 7B parameter model)
    • Mixtral (Mixture-of-experts architecture)
    • Gemma (Google's latest contribution)
    • Llama3 (Upcoming experiments)

2. 🛠️ Fine-tuning Techniques

  • Implementation strategies
  • LoRA (Low-Rank Adaptation) approaches
  • Advanced optimization methods

3. 🏗️ Model Architecture Analysis

  • Deep dives into model structures
  • Parameter counting methodologies
  • Scaling considerations

4. 🔧 Specialized Implementations

  • Code LLama for programming tasks
  • Visual Language Models:
    • Florence2
    • PaliGemma

5. 💻 Practical Applications

  • Comprehensive Jupyter notebooks
  • Response generation pipelines
  • Inference implementation guides

6. 🚀 Advanced Topics

  • DPO (Direct Preference Optimization)
  • SFT (Supervised Fine-Tuning)
  • Evaluation methodologies

🤝 Contributing

We welcome contributions! See our contributing guidelines for more information.

📚 Resources

Each subdirectory contains detailed documentation and implementation guides. Check individual README files for specific instructions.

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ for the AI community