This note explains how to train the models. See also `winning_solution.md`.
- `SETTINGS.json` exists in this directory.
- A different settings file can be specified with `--settings=filename` for the Python scripts.
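The `--settings=filename` handling might look like the following sketch (hypothetical; the actual scripts may parse the option differently, and the helper name `load_settings` is an assumption):

```python
import argparse
import json

def load_settings(argv=None):
    """Parse --settings=filename and return the settings as a dict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--settings", default="SETTINGS.json",
                        help="path to the settings JSON file")
    # parse_known_args lets each script keep its own extra options
    args, _ = parser.parse_known_args(argv)
    with open(args.settings) as f:
        return json.load(f)
```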
- `INPUT_DIR`, `DATA_DIR`, and `OUTPUT_DIR` specified in `SETTINGS.json` must exist; the scripts do not create these directories automatically.
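Since the scripts do not create the directories, they can be created up front; a minimal sketch, assuming `SETTINGS.json` maps the three keys directly to directory paths (the actual file layout may differ):

```python
import json
import os

def ensure_dirs(settings_path="SETTINGS.json"):
    """Create INPUT_DIR, DATA_DIR, and OUTPUT_DIR from the settings file."""
    with open(settings_path) as f:
        settings = json.load(f)
    for key in ("INPUT_DIR", "DATA_DIR", "OUTPUT_DIR"):
        # exist_ok avoids an error when the directory is already there
        os.makedirs(settings[key], exist_ok=True)
    return settings
```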
- The Kaggle data are in `INPUT_DIR/google-research-identify-contrails-reduce-global-warming`.
  - Put the data (or a symbolic link to the data) in the `./input` directory, or set `INPUT_DIR` in `SETTINGS.json`.
- About 70 GB of free disk space is required for `DATA_DIR`.
- About 128 MB per model weight; 512 MB for 2 folds + 2 folds.
The training and validation data need to be converted for efficient data loading:

```sh
$ python3 src/script/convert_data_compact4.py train
$ python3 src/script/convert_data_compact4.py validation
```

- Reads data from `INPUT_DIR/google-research-identify-contrails-reduce-global-warming`.
- Outputs HDF5 files to `DATA_DIR/compact4`.
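After conversion, a quick sanity check that the output exists; a sketch, assuming `DATA_DIR` is read from `SETTINGS.json` and that the converted files under `compact4/` carry an `.h5` extension (both are assumptions):

```python
import json
from pathlib import Path

def converted_files(settings_path="SETTINGS.json"):
    """Return the HDF5 files found under DATA_DIR/compact4, sorted by name."""
    with open(settings_path) as f:
        data_dir = json.load(f)["DATA_DIR"]
    # .h5 extension is an assumption; adjust the pattern if the files differ
    return sorted(Path(data_dir, "compact4").glob("*.h5"))
```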
Test run:

```sh
$ sh test_run.sh
```

The run was checked on an RTX 3090 GPU (24 GB RAM); 16 GB is insufficient.
Training the final models requires about 40 GB of RAM (24 GB is insufficient). The required RAM can be reduced with a smaller batch size, but the model performance may differ.
```sh
$ python3 src/unet1024/evaluate.py src/unet1024/unet1024.yml
$ python3 src/vit4/evaluate.py src/vit4/vit4_1024.yml
```
- Reads data in `DATA_DIR/compact4`.
- Outputs to `OUTPUT_DIR/<config name>/`, where `<config name>.yml` determines the subdirectory name, i.e., `unet1024/` and `vit4_1024/`.
- Model weights are `model<ifold>.pytorch`.
- The scripts accept `--settings=SETTINGS.json` for other settings files.
- The output directory must not contain model weight files; use the `--overwrite` option to overwrite them.
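The `model<ifold>.pytorch` weight files can be located programmatically; a minimal sketch (the helper name and the fold count passed to it are hypothetical):

```python
from pathlib import Path

def weight_paths(output_dir, config_name, nfolds):
    """Expected weight-file paths, following the model<ifold>.pytorch naming."""
    return [Path(output_dir) / config_name / f"model{ifold}.pytorch"
            for ifold in range(nfolds)]
```

For example, `weight_paths("output", "unet1024", 2)` lists the two fold weights under `output/unet1024/`.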