Commit 112a535: readme

Kye committed May 25, 2023
1 parent d8660dc commit 112a535
Showing 2 changed files with 60 additions and 55 deletions.
2 changes: 1 addition & 1 deletion Andromeda/build_dataset.py
```diff
@@ -9,7 +9,7 @@ class CFG:
     SEED: int = 42
     SEQ_LEN: int = 8192
     NUM_CPU: int = multiprocessing.cpu_count()
-    HF_ACCOUNT_REPO: str = "huggingface account"
+    HF_ACCOUNT_REPO: str = "YOUR HUGGINGFACE API KEY"
     TOKENIZER: str = "EleutherAI/gpt-neox-20b"
     DATASET_NAME: str = "EleutherAI/the_pile_deduplicated"
```

113 changes: 59 additions & 54 deletions README.md

Andromeda is a state-of-the-art language model that pushes the boundaries of natural language understanding and generation. Designed for high performance and efficiency, Andromeda is built upon advanced techniques that make it a strong contender against the likes of OpenAI's GPT-4 and PaLM.



# Usage

Get started:

1. Clone the repository and install the required packages.


```bash
git clone -b e2 https://github.com/kyegomez/Andromeda.git
cd Andromeda
pip3 install -r requirements.txt
```

or:
```bash
pip install andromeda
```



## Dataset Building

You can preprocess a different dataset, similar to the C4 dataset used during training, by running the `build_dataset.py` script. This will pre-tokenize the data, chunk it into blocks of a specified sequence length, and upload it to the Hugging Face Hub. For example:

```bash
python3 Andromeda/build_dataset.py --seed 42 --seq_len 8192 --hf_account "your_hf_account" --tokenizer "EleutherAI/gpt-neox-20b" --dataset_name "EleutherAI/the_pile_deduplicated"
```
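
Under the hood, the script presumably follows the standard Hugging Face tokenize-and-chunk recipe; here is a minimal sketch of that flow (function names and the target repo string are illustrative, not the script's actual code):

```python
import multiprocessing
from datasets import load_dataset
from transformers import AutoTokenizer

SEQ_LEN = 8192
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
dataset = load_dataset("EleutherAI/the_pile_deduplicated", split="train")

def tokenize(batch):
    # Pre-tokenize raw text into token ids.
    return tokenizer(batch["text"])

def chunk(batch):
    # Concatenate all token ids, then split into fixed-size blocks.
    ids = [tok for seq in batch["input_ids"] for tok in seq]
    total = (len(ids) // SEQ_LEN) * SEQ_LEN
    return {"input_ids": [ids[i:i + SEQ_LEN] for i in range(0, total, SEQ_LEN)]}

tokenized = dataset.map(tokenize, batched=True,
                        num_proc=multiprocessing.cpu_count(),
                        remove_columns=dataset.column_names)
chunked = tokenized.map(chunk, batched=True,
                        remove_columns=tokenized.column_names)
chunked.push_to_hub("your_hf_account/the_pile_tokenized")  # repo name assumed
```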


# Training

2. Run the training script:

```python
from andromeda import TrainAndromeda

if __name__ == "__main__":
    TrainAndromeda()
```

Then run the file:

```bash
python3 trainandromeda.py
```

# Inference

```bash
python3 inference.py "My dog is very cute" --seq_len 256 --temperature 0.8 --filter_thres 0.9 --model "andromeda"
```

Note: inference is not yet available; the model still needs to be submitted to the PyTorch Hub.
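
For reference, the sampling flags correspond to ordinary filtered autoregressive decoding; below is a minimal sketch of one decoding step (the helper and the exact semantics of `filter_thres` are assumptions, not Andromeda's actual code):

```python
import torch

def sample_next_token(logits: torch.Tensor,
                      temperature: float = 0.8,
                      filter_thres: float = 0.9) -> torch.Tensor:
    # Keep roughly the top (1 - filter_thres) fraction of the vocabulary,
    # then sample from the temperature-scaled softmax over what remains.
    k = max(1, int((1 - filter_thres) * logits.shape[-1]))
    top_vals, top_idx = torch.topk(logits, k, dim=-1)
    probs = torch.softmax(top_vals / temperature, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return top_idx.gather(-1, choice)
```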


The training script will train the Andromeda model on the enwik8 dataset, leveraging the advanced techniques discussed below. Progress is displayed during training, and the model is saved periodically.

By incorporating these cutting-edge techniques, Andromeda is designed to outperform other language models like OpenAI's GPT-4 and PaLM in terms of efficiency, flexibility, and scalability.

## Model Architecture 🧠🔧

```python
attn_layers = Decoder(
    # ...
)
```
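
The configuration is truncated in this diff view. For orientation, the `attn_layers = Decoder(...)` idiom matches the x-transformers style; a sketch of a typical setup follows (package usage and all values are assumptions, not Andromeda's actual configuration):

```python
from x_transformers import TransformerWrapper, Decoder

# Illustrative x-transformers-style model; every value here is assumed.
model = TransformerWrapper(
    num_tokens=50304,         # vocabulary size (assumed)
    max_seq_len=8192,         # matches CFG.SEQ_LEN above
    attn_layers=Decoder(
        dim=512,
        depth=6,
        heads=8,
        rotary_pos_emb=True,  # rotary position embeddings
    ),
)
```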

### Deep Normalization (deepnorm)

Deep normalization is a technique that normalizes the activations within a layer, helping with training stability and convergence. It allows the model to better learn complex patterns and generalize to unseen data.

Usage example:

```python
attn_layers = Decoder(
    # ...
)
```
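
The usage example is also truncated in the diff. In x-transformers-style code, deepnorm is usually enabled with a single flag such as `deepnorm=True` (parameter name assumed). Mechanically, it amounts to the following residual wrapper, a minimal PyTorch sketch following the DeepNet formulation (the class below is illustrative, not Andromeda's actual code):

```python
import torch
import torch.nn as nn

class DeepNormResidual(nn.Module):
    """Illustrative DeepNorm residual: LayerNorm(alpha * x + sublayer(x)).

    The DeepNet paper sets alpha = (2 * num_layers) ** 0.25 for
    decoder-only models, which keeps very deep stacks stable.
    """

    def __init__(self, sublayer: nn.Module, dim: int, num_layers: int):
        super().__init__()
        self.sublayer = sublayer
        self.alpha = (2 * num_layers) ** 0.25
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.norm(self.alpha * x + self.sublayer(x))
```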


# Andromeda Principles
- **Efficiency**: Andromeda incorporates cutting-edge optimization techniques, such as Flash Attention, rotary position embeddings, and deep normalization, resulting in efficient training and inference.
