Releases · lassl/lassl · GitHub

02 Nov 12:16

seopbo

v1.0.0 (221102) Latest

What's Changed

[mixed] refactor: Refactor for v1.0.0 by @seopbo in #102
Currently, lassl supports training bert, albert, roberta, gpt2, bart, t5, and ul2.
Next, lassl will support training electra. In addition, train_universal_tokenizer.py will be added to lassl.
- train_universal_tokenizer.py will train a tokenizer that can be used to train every model type supported by lassl.

Full Changelog: v0.2.0...v1.0.0

Contributors

seopbo


22 Sep 09:28

seopbo

v0.2.0 (220922)

What's Changed

Support training BART by @seopbo in #81
Support training T5 model by @DaehanKim in #87
Add config files #82 by @Doohae in #88
Support Electra pretrain by @Doohae in #91
Add UL2 Language Modeling by @DaehanKim in #98

New Contributors

@DaehanKim made their first contribution in #87

Full Changelog: v0.1.4...v0.2.0

Contributors

seopbo, DaehanKim, and wavy-jung


18 Mar 08:06

seopbo

v0.1.3

Summary

Refactor lassl to package its modules as a library
Add a dataset blending function

What's Changed

Add dataset blender by @hyunwoongko in #73
Remove poetry dependencies by @seopbo in #76

New Contributors

@hyunwoongko made their first contribution in #73

Full Changelog: v0.1.2...v0.1.3

Contributors

seopbo and hyunwoongko


30 Dec 01:46

seopbo

v0.1.2

Summary

Fix bugs in src/collators.py

What's Changed

[python] fix: Fix importing an invalid module by @seopbo in #72

Full Changelog: v0.1.1...v0.1.2

Contributors

seopbo


20 Dec 01:01

seopbo

v0.1.1

Summary

Update README.md
- Support README.md in English.
- Support README_ko.md in Korean.
Fix bugs in GPT2 training
Add example configs for GPU and TPU environments.

What's Changed

[docs] fix: Change a license by @seopbo in #64
[etc] docs: Add English version of README by @bzantium in #66
Add example configs for gpu, tpu by @seopbo in #65
[python] fix: debug GPT2 processor and collator by @bzantium in #69
Update README.md by @bzantium in #70

Full Changelog: v0.1.0...v0.1.1

Contributors

bzantium and seopbo


15 Dec 16:00

seopbo

v0.1.0

Summary

First release

What's Changed

Feature/#2 by @seopbo in #4
feat: TPU compatibility by @monologg in #8
Feature/#3 Add GPT2Preprocessor by @iron-ij in #10
[docs] chore: Add authors by @seopbo in #13
Feature/#9 Add Processor and Collator for ALBERT by @bzantium in #14
[python] feat: Save tokenizer by @seopbo in #19
[python] mixed: Support sentence per line type doc by @seopbo in #20
Support setting arguments of pretraining by a config file by @seopbo in #22
Support corpus_type by @seopbo in #25
Support adding additional special tokens by @seopbo in #26
[python] feat: Add bert processor by @bzantium in #29
Refactor code related to pretraining by @seopbo in #31
Update issue templates by @seopbo in #34
[python] fix: Add a sampling_ratio condition by @bzantium in #36
[python] chore: Update dependencies by @seopbo in #38
[python] fix: Fix a buffer in processing.py by @seopbo in #41
[mixed] fix: Change xla_spawn, add config and comments by @bzantium in #44
[python] feat: add keep_in_memory option in serialize_corpora by @iron-ij in #43
[chore] fix: Fix requirements.txt by @seopbo in #46
[python] fix: Remove the duplicate-sampling option when sampling by @bzantium in #48
[etc] docs: Add README by @bzantium in #39
[etc] docs: Add an explanation of the LASSL acronym to README by @bzantium in #52
[python] chore: Update dependencies by @seopbo in #54
[python] fix: Make GPT2 Collator inherit from CollatorForLM by @bzantium in #57
[etc] docs: Add additional information to doc by @seopbo in #59

New Contributors

@seopbo made their first contribution in #4
@monologg made their first contribution in #8
@iron-ij made their first contribution in #10
@bzantium made their first contribution in #14

Full Changelog: https://github.com/lassl/lassl/commits/v0.1.0

Contributors

bzantium, seopbo, and 2 other contributors
