- A collection of transformer models for various tasks, built with Hugging Face and trained with PyTorch Lightning.
- Datasets, models and tokenizers come from Hugging Face.
- Goal: get familiar with the Hugging Face and PyTorch Lightning ecosystems.
- To train models, install using pip:
pip install transformers-collection
- Check the installation:
transformers-collection version
To play around with the code, clone the repo:
git clone git@github.com:aadhithya/transformers-collection.git
- Install poetry:
pip install poetry
- Install dependencies:
poetry install
Note: `poetry install` will create a new venv.
Note: `poetry install` / `pip install` installs the CPU version of PyTorch if PyTorch is not already installed. If you need GPU support, make sure to install the CUDA version yourself.
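To confirm which PyTorch build ended up in the environment, a quick check along these lines may help. This is a minimal sketch, not part of the package: the helper name `describe_torch_backend` is hypothetical, and it relies only on `torch.cuda.is_available()` and `torch.version.cuda`.

```python
import importlib.util


def describe_torch_backend() -> str:
    """Return a one-line summary of the installed PyTorch build.

    Hypothetical helper for illustration; not part of transformers-collection.
    """
    # Avoid an ImportError if PyTorch is missing from this environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    # is_available() is False for CPU-only builds and for CUDA builds
    # running on a machine without a usable GPU or driver.
    if torch.cuda.is_available():
        return f"PyTorch {torch.__version__} with CUDA {torch.version.cuda}"
    return f"PyTorch {torch.__version__} without a usable CUDA device"


if __name__ == "__main__":
    print(describe_torch_backend())
```

If the output reports no usable CUDA device on a GPU machine, reinstalling PyTorch with a CUDA build is the likely fix.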
- Create the YAML config file for the model (see configs/sentiment-clf.yml for an example).
- Train the model using:
transformers-collection train /path/to/config.yml
- For a list of supported models, see the Supported Models section.
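The train command can also be launched from a Python script, for example to run several configs in a row. The sketch below is an assumption-laden illustration: `build_train_command` and `run_training` are hypothetical helpers, and the only thing taken from this README is the `transformers-collection train <config>` invocation.

```python
import os
import shutil
import subprocess


def build_train_command(config_path):
    """Build the argv list for the documented `train` subcommand."""
    return ["transformers-collection", "train", os.fspath(config_path)]


def run_training(config_path):
    """Run training for one config; raises if the CLI is not on PATH.

    Hypothetical wrapper for illustration; not part of the package.
    """
    if shutil.which("transformers-collection") is None:
        raise RuntimeError("transformers-collection CLI not found on PATH")
    # check=True raises CalledProcessError if training exits non-zero.
    return subprocess.run(build_train_command(config_path), check=True)


if __name__ == "__main__":
    run_training("configs/sentiment-clf.yml")
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a config path contains spaces.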
The following models are supported or planned:
Task | Model | Default Dataset | Status | Checkpoint |
---|---|---|---|---|
Text Classification | SentimentClassification | emotion | ✅ | TBD |
Text Summarization | - | | 🗓️ Planned | TBD |
- Automatically push models to the Hugging Face Hub with the commit ref.
- Make models available via transformers pipelines.
- Add more models.