TartuNLP/smugri-llm

LLMs for Extremely Low-Resource Finno-Ugric Languages

This repository contains the implementation used for training and evaluating language models for extremely low-resource Finno-Ugric languages.

Models

Pre-trained:

Instruction-tuned:

Evaluation

Belebele-SMUGRI:

SIB-SMUGRI:

Usage

Scripts for launching training are provided in:

LM-eval-harness configurations:

Citation

@misc{purason2024llmsextremelylowresourcefinnougric,
      title={LLMs for Extremely Low-Resource Finno-Ugric Languages}, 
      author={Taido Purason and Hele-Andra Kuulmets and Mark Fishel},
      year={2024},
      eprint={2410.18902},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.18902}, 
}

Acknowledgements

The implementation is built on github.com/TartuNLP/llammas.
