
TF-NLP Model Garden

Introduction

The TF-NLP library provides a collection of scripts for training and evaluating transformer-based models on various tasks such as sentence classification, question answering, and translation. Additionally, we provide checkpoints of pretrained models which can be fine-tuned on downstream tasks.

⚠️ Disclaimer: Checkpoints are based on training with publicly available datasets. Some datasets are subject to limitations, including non-commercial use restrictions. Please review the terms and conditions made available by third parties before using the datasets provided. Checkpoints are licensed under Apache 2.0.

⚠️ Disclaimer: Datasets hyperlinked from this page are not owned or distributed by Google. Such datasets are made available by third parties. Please review the terms and conditions made available by the third parties before using the data.

How to Train Models

Model Garden can be installed with `pip install tf-models-nightly`. After installation, check out these instructions on how to train models with this codebase.
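
For example, a minimal setup sketch; the import below is only a sanity check and assumes the nightly wheel exposes the `official` namespace, which recent releases do:

```shell
# Install the nightly Model Garden package.
pip install tf-models-nightly
# Sanity check that the package imports (assumes the `official` namespace).
python3 -c "import official.nlp"
```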

By default, experiments run on GPUs. To run on TPUs, override `runtime.distribution_strategy` and set the TPU address. See `RuntimeConfig` for details.
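
For instance, a hedged sketch of a TPU run via `--params_override`, using the `train.py` command described below; the grpc address is a placeholder for your own TPU worker, and `runtime.tpu` is the address field in `RuntimeConfig`:

```shell
# Sketch: switch the experiment to TPUs. The address below is a
# placeholder for your own TPU worker or Cloud TPU name.
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --params_override="runtime.distribution_strategy=tpu,runtime.tpu=grpc://10.0.0.2:8470"
```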

In general, experiments can be run with the following command by setting the corresponding ${EXPERIMENT}, ${TASK_CONFIG}, ${MODEL_CONFIG}, and ${EXTRA_PARAMS}.

```shell
EXPERIMENT=???
TASK_CONFIG=???
MODEL_CONFIG=???
EXTRA_PARAMS=???
MODEL_DIR=???  # a folder to hold checkpoints and logs
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --config_file=${TASK_CONFIG} \
  --config_file=${MODEL_CONFIG} \
  --params_override=${EXTRA_PARAMS}
```

  • EXPERIMENT can be found under configs/
  • TASK_CONFIG can be found under configs/experiments/
  • MODEL_CONFIG can be found under configs/models/

Order of params override:

  1. train.py looks up the registered ExperimentConfig with ${EXPERIMENT}
  2. Overrides params in TaskConfig with ${TASK_CONFIG}
  3. Overrides the model params in TaskConfig with ${MODEL_CONFIG}
  4. Overrides any params in ExperimentConfig with ${EXTRA_PARAMS}

Note that

  1. ${TASK_CONFIG}, ${MODEL_CONFIG}, and ${EXTRA_PARAMS} are optional when the ${EXPERIMENT} defaults are sufficient.
  2. ${TASK_CONFIG}, ${MODEL_CONFIG}, and ${EXTRA_PARAMS} are only guaranteed to be compatible with the ${EXPERIMENT} that defines them.
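
Putting the pieces together, here is a sketch of a complete invocation for the BERT-base GLUE/MNLI-matched row in the table below; the config paths assume the YAML files sit under configs/ as listed above, and the data paths and MODEL_DIR are placeholders for your own files:

```shell
# Fine-tune BERT-base on GLUE/MNLI-matched from a TF-Hub module.
# Data paths and MODEL_DIR are placeholders; replace with your own.
EXPERIMENT=bert/sentence_prediction
TASK_CONFIG=configs/experiments/glue_mnli_matched.yaml
MODEL_CONFIG=configs/models/bert_en_uncased_base.yaml
EXTRA_PARAMS="task.train_data.input_path=/path-to-your-training-data,\
task.validation_data.input_path=/path-to-your-val-data,\
task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"
MODEL_DIR=/tmp/mnli_bert_base
python3 train.py \
  --experiment=${EXPERIMENT} \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --config_file=${TASK_CONFIG} \
  --config_file=${MODEL_CONFIG} \
  --params_override=${EXTRA_PARAMS}
```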

Experiments

| NAME | EXPERIMENT | TASK_CONFIG | MODEL_CONFIG | EXTRA_PARAMS |
| ---- | ---------- | ----------- | ------------ | ------------ |
| BERT-base GLUE/MNLI-matched finetune | bert/sentence_prediction | glue_mnli_matched.yaml | bert_en_uncased_base.yaml | data and bert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4` |
| BERT-base GLUE/MNLI-matched finetune | bert/sentence_prediction | glue_mnli_matched.yaml | bert_en_uncased_base.yaml | data and bert-base ckpt init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.init_checkpoint=gs://tf_model_garden/nlp/bert/uncased_L-12_H-768_A-12/bert_model.ckpt` |
| BERT-base SQuAD v1.1 finetune | bert/squad | squad_v1.yaml | bert_en_uncased_base.yaml | data and bert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4` |
| ALBERT-base SQuAD v1.1 finetune | bert/squad | squad_v1.yaml | albert_base.yaml | data and albert-base hub init: `task.train_data.input_path=/path-to-your-training-data,task.validation_data.input_path=/path-to-your-val-data,task.hub_module_url=https://tfhub.dev/tensorflow/albert_en_base/3` |
| Transformer-large WMT14/en-de scratch | wmt_transformer/large | | | ende-32k sentencepiece: `task.sentencepiece_model_path='gs://tf_model_garden/nlp/transformer_wmt/ende_bpe_32k.model'` |
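
Per note 1 above, the Transformer-large row needs no TASK_CONFIG or MODEL_CONFIG, since the experiment defaults suffice. A sketch, with MODEL_DIR as a placeholder:

```shell
# Train Transformer-large on WMT14 en-de from scratch using the
# experiment defaults; only the sentencepiece model path is overridden.
python3 train.py \
  --experiment=wmt_transformer/large \
  --mode=train_and_eval \
  --model_dir=${MODEL_DIR} \
  --params_override="task.sentencepiece_model_path='gs://tf_model_garden/nlp/transformer_wmt/ende_bpe_32k.model'"
```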

Useful links

How to Train Models

List of Pre-trained Models for fine-tuning

How to Publish Models

TensorFlow blog on Model Garden.