Learning about Configs

AdaSeq uses a configuration file to control model assembly, training, and evaluation. The configuration file can be written in yaml, json, or jsonline format.

1. Configuration File Organization

Let's take resume.yaml as an example. A configuration file usually consists of the following fields:

experiment: ...
task: ...
dataset: ...
preprocessor: ...
data_collator: ...
model: ...
train: ...
evaluation: ...

2. Introduction to Global Parameters

Note: A Default of / means the parameter is required.

2.1 experiment

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| exp_dir | experiment directory | str | experiments |
| exp_name | experiment name; all outputs are saved to ./${exp_dir}/${exp_name}/${datetime}/ | str | unknown |
| seed | random seed | int | 42 |
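
As a minimal sketch of this block, the fragment below sets all three parameters. The value `resume` for `exp_name` is only an illustrative assumption; the other two values repeat the defaults from the table.

```yaml
experiment:
  exp_dir: experiments   # default value from the table
  exp_name: resume       # illustrative name (assumption), default is "unknown"
  seed: 42               # default value from the table
```

With these values, outputs would be written under ./experiments/resume/${datetime}/, following the path pattern described for exp_name.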

2.2 task

task supports the following values (see metainfo):

  • word-segmentation
  • part-of-speech
  • named-entity-recognition
  • relation-extraction
  • entity-typing

2.3 dataset

The dataset parameters can be combined in several ways, so please refer to Customizing Dataset for details.

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| task | task of the dataset | str | None |
| name | modelscope dataset name, for example damo/resume_ner | str | None |
| path | huggingface dataset name, for example conll2003 | str | None |
| data_file | data files; can be a URL, a local directory, or an archive, or a dict containing the keys train, valid, and test | str/dict | None |
| data_type | specifies the data loading method | str | None |
| transform | dataset post-processing, usually containing the keys name, key, and scheme | dict | None |
| labels | label set; can be a list (labels: ['O', 'B-ORG', ...]), a file or URL (labels: PATH_OR_URL), or a function that counts labels from the dataset | str/list/dict | None |
| access_token | used to access private repos from modelscope or huggingface | str | None |
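
The following sketch shows three alternative ways the dataset block might be written, separated by `---`; a real configuration would contain only one of them. The dataset names come from the table above, while the file paths and the `data_type` value `conll` are assumptions made for illustration.

```yaml
# Alternative 1: a modelscope dataset referenced by name
dataset:
  name: damo/resume_ner
---
# Alternative 2: a huggingface dataset referenced by path
dataset:
  path: conll2003
---
# Alternative 3: local files given as a dict with train/valid/test keys
dataset:
  data_file:
    train: ./data/train.txt   # hypothetical path
    valid: ./data/dev.txt     # hypothetical path
    test: ./data/test.txt     # hypothetical path
  data_type: conll            # assumed loader name, check Customizing Dataset
  labels: ['O', 'B-ORG', 'I-ORG']   # explicit label list (illustrative)
```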

2.4 preprocessor

| Parameter | Description | Type | Default |
| --- | --- | --- | --- |
| type | preprocessor type | str | / |
| model_dir | tokenizer name or directory | str | / |
| is_word2vec | whether to use a lookup table | bool | False |
| tokenizer_kwargs | other parameters for the tokenizer | dict | None |
| max_length | maximum sentence length (subtoken-level) | int | 512 |
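
A preprocessor block might look like the sketch below. The type name `sequence-labeling-preprocessor` and the tokenizer name `bert-base-cased` are assumptions chosen for illustration, not values confirmed by this page.

```yaml
preprocessor:
  type: sequence-labeling-preprocessor   # assumed type name
  model_dir: bert-base-cased             # assumed tokenizer name or directory
  is_word2vec: false                     # default
  max_length: 150                        # overrides the default of 512
```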

2.5 data_collator

data_collator supports the following values (see metainfo):

  • DataCollatorWithPadding
  • SequenceLabelingDataCollatorWithPadding
  • SpanExtractionDataCollatorWithPadding
  • MultiLabelSpanTypingDataCollatorWithPadding
  • MultiLabelConcatTypingDataCollatorWithPadding
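
Since data_collator takes one of the names above as a plain string, a minimal sketch is a single line. Pairing the sequence-labeling collator with the named-entity-recognition task is an assumption made for illustration.

```yaml
task: named-entity-recognition
data_collator: SequenceLabelingDataCollatorWithPadding   # assumed pairing with the task above
```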

2.6 model

| Parameter | Child-Parameter | Description | Type | Default |
| --- | --- | --- | --- | --- |
| type | | model type | str | / |
| embedder | | embeds input ids into vectors, usually a pretrained model | dict | None |
| | type | embedder type, optional when using a modelscope or huggingface model | str | None |
| | model_name_or_path | pretrained model name or path, supporting both modelscope and huggingface models | str | / |
| encoder | | encodes the sentence vectors, for example an LSTM | dict | None |
| | type | encoder type | str | / |
| decoder | | not available, under construction | dict | None |
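
The nesting of parameters and child-parameters can be sketched as follows. The type names `sequence-labeling-model` and `lstm` and the model name `bert-base-cased` are assumptions for illustration; a real encoder would likely need further keys that this page does not list.

```yaml
model:
  type: sequence-labeling-model            # assumed model type name
  embedder:
    # embedder.type is omitted, which the table allows for
    # modelscope or huggingface models
    model_name_or_path: bert-base-cased    # assumed pretrained model
  encoder:
    type: lstm                             # assumed encoder type name
```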

2.7 train

| Parameter | Child-Parameter | Description | Type | Default |
| --- | --- | --- | --- | --- |
| trainer | | trainer type | str | None |
| max_epochs | | maximum number of training epochs | int | / |
| dataloader | | used to load data | dict | / |
| | batch_size_per_gpu | batch size per gpu | int | / |
| | workers_per_gpu | data loading workers per gpu | int | 0 |
| optimizer | | optimizer | dict | None |
| | type | optimizer type | str | / |
| | lr | learning rate for all parameters except those in specific param_groups | float | / |
| | options | options used in the optimizer, for example grad_clip: max_norm: 2.0 | dict | None |
| | param_groups | parameter groups that can have different learning rates | list | None |
| | └ regex | regular expression that selects the parameter group | str | / |
| | └ lr | learning rate for the specific parameter group | float | / |
| lr_scheduler | | used to adjust the learning rate during training | dict | None |
| | type | supports all lr_scheduler classes from pytorch (check that your pytorch version includes them) | str | / |
| | options | options used in the lr_scheduler | dict | None |
| hooks | | hooks (also called callbacks); see the ModelScope documentation | list | None |
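
To show how the nested train parameters fit together, here is a hedged sketch that follows the table's structure. All numeric values, the optimizer `AdamW`, the scheduler `LinearLR`, and the regex `crf` are illustrative assumptions; only the `grad_clip` example is taken from the table.

```yaml
train:
  max_epochs: 20
  dataloader:
    batch_size_per_gpu: 16
    workers_per_gpu: 0          # default
  optimizer:
    type: AdamW                 # assumed optimizer name
    lr: 5.0e-5                  # applies to all parameters not matched below
    options:
      grad_clip:
        max_norm: 2.0           # example given in the table
    param_groups:
      - regex: crf              # assumed pattern selecting a parameter group
        lr: 5.0e-1              # separate learning rate for that group
  lr_scheduler:
    type: LinearLR              # a pytorch scheduler; check your pytorch version
```

Parameters whose names match a param_groups regex use that group's lr, and every other parameter falls back to the top-level optimizer lr.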

2.8 evaluation

| Parameter | Child-Parameter | Description | Type | Default |
| --- | --- | --- | --- | --- |
| dataloader | | used to load data | dict | / |
| | batch_size_per_gpu | batch size per gpu | int | / |
| | workers_per_gpu | data loading workers per gpu | int | 0 |
| metrics | | evaluation metrics | list | None |
| | type | metric type | str | / |
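
An evaluation block following this table might be written as below. The metric type `ner-metric` and the batch size are assumptions for illustration.

```yaml
evaluation:
  dataloader:
    batch_size_per_gpu: 128
    workers_per_gpu: 0      # default
  metrics:
    - type: ner-metric      # assumed metric type name
```

Because metrics is a list, additional entries can be appended as further `- type: ...` items.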