Add parametric UMAP model family #688

Merged
184 commits merged on Aug 14, 2023

Commits
ec9d5eb
Add src/vak/prep/unit_dataset/ with unit_dataset.py
NickleDave Jul 7, 2023
7b33756
Add vak/prep/dimensionality_reduction/ with prep_dimensionality_reduc…
NickleDave Jul 7, 2023
4452620
Import new modules in vak/prep/__init__.py
NickleDave Jul 7, 2023
ece533d
Remove parameter from prep_frame_classification dataset docstring, no…
NickleDave Jul 7, 2023
7b2ef81
Rename 'vak.prep.split.dataframe' -> 'vak.prep.split.frame_classifica…
NickleDave Jul 7, 2023
79f5244
Use renamed 'split.frame_classification_dataframe' in vak.prep.frame_…
NickleDave Jul 7, 2023
41776a4
Fix typo in src/vak/prep/split/split.py
NickleDave Jul 7, 2023
2c7967e
Remove wrong type hint in src/vak/prep/unit_dataset/unit_dataset.py
NickleDave Jul 7, 2023
a918181
Add vak/prep/dimensionality_reduction/dataset_arrays.py with function…
NickleDave Jul 7, 2023
4ae6fb0
Add src/vak/datasets/dimensionality_reduction/ with unit_dataset.py a…
NickleDave Jul 7, 2023
702a9e2
Add initial UnitDataset class, fix imports in datasets/dimensionality…
NickleDave Jul 7, 2023
1ed1303
Import dataset_arrays module in src/vak/prep/dimensionality_reduction…
NickleDave Jul 7, 2023
a733332
Import dimensionality_reduction in vak/datasets/__init__.py
NickleDave Jul 7, 2023
77a7cc4
Fix typo in src/vak/datasets/dimensionality_reduction/__init__.py
NickleDave Jul 7, 2023
7e5d6b9
Fix `vak.prep.dimensionality_reduction.prep_dimensionality_reduction_…
NickleDave Jul 7, 2023
32759b3
Remove wrong import from src/vak/datasets/dimensionality_reduction/un…
NickleDave Jul 7, 2023
d732fd5
Fix pad_spectrogram to re-save file after padding
NickleDave Jul 8, 2023
dceb44a
Add vak/nn/loss/umap.py
NickleDave Jul 8, 2023
d87cfad
Add src/vak/datasets/dimensionality_reduction/parametric_umap/
NickleDave Jul 8, 2023
3fdd573
Add vak/nets/conv_encoder.py
NickleDave Jul 8, 2023
44a5094
Fix dataset class in vak/datasets/dimensionality_reduction/parametric…
NickleDave Jul 8, 2023
e796a8d
Import ParametricUMAPDataset in vak.datasets
NickleDave Jul 8, 2023
0b1de15
Import umap_loss in src/vak/nn/loss/__init__.py
NickleDave Jul 8, 2023
f31864f
Add src/vak/models/parametric_umap_model.py
NickleDave Jul 8, 2023
e9a4251
Add src/vak/models/convencoder_parametric_umap.py
NickleDave Jul 8, 2023
7e81e14
Import ParametricUMAPModel and ConvEncoderParametricUMAP in src/vak/m…
NickleDave Jul 8, 2023
8b25e8e
Import conv_encoder and ConvEncoder in vak/nets/__init__.py
NickleDave Jul 8, 2023
0256d0d
Add shape property to ParametricUMAPDataset
NickleDave Jul 8, 2023
291cef8
Add UmapLoss class to vak/nn/loss/umap.py
NickleDave Jul 8, 2023
d8f17e7
Import functional and import * from .loss and .modules in vak/nn/__in…
NickleDave Jul 8, 2023
72438e8
Fix how we get decoder from network dict in ParametricUMAPModel: use …
NickleDave Jul 8, 2023
eeef543
Fix ConvEncoderParametricUMAP to only specify 'encoder' in network di…
NickleDave Jul 8, 2023
abb51e8
Add umap-learn and pynndescent as dependencies
NickleDave Jul 8, 2023
ff8e8ee
Remove out-dated parameter from docstring in train/frame_classificati…
NickleDave Jul 8, 2023
1d4d17a
Add vak/train/parametric_umap.py
NickleDave Jul 8, 2023
019d5a5
Fix vak/train/train.py to call train_parametric_umap_model when model…
NickleDave Jul 8, 2023
7d7c922
Add 'dimensionality reduction' to DATASET_TYPE_FUNCTION_MAP in vak.pr…
NickleDave Jul 8, 2023
d5a8da5
Fix vak/prep/prep.py so it will call prep_dimensionality_reduction_da…
NickleDave Jul 8, 2023
3f99fab
Fix parameter of parametric umap dataset: 'Euclidean' -> 'euclidean'
NickleDave Jul 8, 2023
2bacd21
Add duration property to ParametricUMAPDataset
NickleDave Jul 8, 2023
a7585c2
Rename vak/transforms/defaults.py -> frame_classification.py, add 'ge…
NickleDave Jul 9, 2023
347503e
Import registry in models/__init__.py, add to __all__ there
NickleDave Jul 9, 2023
635f0f7
Add transforms/defaults/parametric_umap.py
NickleDave Jul 9, 2023
89d033e
Add valid transform_kwarg key-value pairs to docstring of get_default…
NickleDave Jul 9, 2023
8c31314
Add transforms/defaults/get.py
NickleDave Jul 9, 2023
698a7fa
Rename transforms.defaults.get.get_defaults -> get_default_transform
NickleDave Jul 9, 2023
44de249
Add transforms/defaults/__init__.py with imports
NickleDave Jul 9, 2023
1feb336
Fix train/frame_classification.py to use transforms.defaults.get_defa…
NickleDave Jul 9, 2023
9ea20da
Fix eval/frame_classification.py to use transforms.defaults.get_defau…
NickleDave Jul 9, 2023
5ce8e47
Fix predict/frame_classification.py to use transforms.defaults.get_de…
NickleDave Jul 9, 2023
6241e47
Fix imports in transforms/__init__.py, just import defaults not get_d…
NickleDave Jul 9, 2023
8af11ec
Fixup src/vak/eval/frame_classification.py
NickleDave Jul 9, 2023
23ed923
Fixup src/vak/predict/frame_classification.py
NickleDave Jul 9, 2023
3246c48
Fixup src/vak/train/frame_classification.py
NickleDave Jul 9, 2023
b724eb5
Fix 'get_default_frame_classification_transform' to access transform_…
NickleDave Jul 9, 2023
bb8f649
Fix function name in 'make_learncurve_splits_from_dataset_df': 'split…
NickleDave Jul 9, 2023
f3c4932
Remove argument 'spect_key' in cli/predict that was removed from vak.…
NickleDave Jul 9, 2023
eee7896
Refactor/fix script that generates the 'generated' test data
NickleDave Jul 16, 2023
1b99d8d
Fix name in predict/frame_classificaton.py: -> 'datasets.frame_classi…
NickleDave Jul 16, 2023
fd2d5f4
Fix frames dataset so frames_labels_paths is None when split is 'pred…
NickleDave Jul 16, 2023
5a3b46e
Use dict get method with transform_kwargs for PredictItemTransform in…
NickleDave Jul 16, 2023
6d99d55
Fix how we add spect_format to metadata in prep/frame_classification/…
NickleDave Jul 16, 2023
22777de
Fix how we determine source_paths for input_type 'spect' in FramesDat…
NickleDave Jul 16, 2023
b9319c0
Fix arg name in predict/frame_classification.py
NickleDave Jul 16, 2023
2b64aa5
Add print statements to tests/scripts/generate_data_for_tests.py so w…
NickleDave Jul 16, 2023
0df0bf9
Modify generate_data_for_tests script so it only preps datasets once
NickleDave Jul 17, 2023
e7b6875
Rename configs in test data so model name is capitalized
NickleDave Jul 17, 2023
9ce324f
Update tests/data_for_tests/configs/configs.json
NickleDave Jul 17, 2023
4ab7a33
Add package tests/scripts/vaktestdata, refactor giant script
NickleDave Jul 17, 2023
25adc7c
In configs used for test data, capitalize model names in directory paths
NickleDave Jul 17, 2023
c2fcfaa
Rename top-level field -> 'config_metadata' in metadata.json, add mis…
NickleDave Jul 17, 2023
af6e7d3
Add ConfigMetadata dataclass in vaktestdata/config_metadata.py
NickleDave Jul 17, 2023
6a81273
Add log message to tests/scripts/vaktestdata/dirs.py
NickleDave Jul 17, 2023
16e6aa4
Modify vaktestdata.configs.copy_config_files to make GENERATED_TEST_C…
NickleDave Jul 17, 2023
12c9c9d
Modify constants so it has a list of ConfigMetadata instances made fr…
NickleDave Jul 17, 2023
3233005
Modify vaktestdata.prep.run_prep to use CONFIG_METADATA to only run p…
NickleDave Jul 17, 2023
35be421
Fix formatting errors in tests/data_for_tests/configs/configs.json
NickleDave Jul 17, 2023
46f328c
Add missing field/attribute 'model' to ConfigMetadata
NickleDave Jul 17, 2023
9144c1e
Rewrite `vaktestdata.configs.add_dataset_path_from_prepped_configs` t…
NickleDave Jul 17, 2023
5fbc0c5
Use logger in tests/scripts/vaktestdata/prep.py
NickleDave Jul 17, 2023
04d6ef1
Use logger in tests/scripts/vaktestdata/dirs.py
NickleDave Jul 17, 2023
5117376
Use logger, get name of config section correctly, and import/use path…
NickleDave Jul 17, 2023
68a7814
Add default for parser arg '--commands' in tests/scripts/vaktestdata/…
NickleDave Jul 17, 2023
a61c77b
Rewrite tests/scripts/generate_data_for_tests.py to use vaktestdata p…
NickleDave Jul 17, 2023
880b646
Rename ConvEnconderParametricUMAP -> ConvEncoderUMAP (the fact that i…
NickleDave Jul 17, 2023
5331982
Add: tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_an…
NickleDave Jul 17, 2023
c0667e6
Add ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml to configs.json
NickleDave Jul 17, 2023
ebd25c6
Add args + attributes to UMAPLoss class
NickleDave Jul 18, 2023
a52b9c5
Revise vak.models.parametric_umap_model
NickleDave Jul 18, 2023
de42b79
Move parametric UMAP dataset up into vak.datasets, get rid of dimensi…
NickleDave Jul 19, 2023
b69705f
Make minor fixes to ParametricUMAP class
NickleDave Jul 19, 2023
67d979d
WIP: Fix vak/train/parametric_umap.py so it actually works
NickleDave Jul 19, 2023
eea3814
Add 'train_dataset_params' and 'val_dataset_params' as attributes to …
NickleDave Jul 19, 2023
96adb56
Add 'train_transform_params' and 'val_transform_params' as attributes…
NickleDave Jul 19, 2023
d15aa44
Remove `root_results_path` arg from vak.train.train, no longer used
NickleDave Jul 19, 2023
3bb75d1
Add args for train/val transform params and train/val dataset params …
NickleDave Jul 19, 2023
43caa36
Pass args for train/val transform params and train/val dataset params…
NickleDave Jul 19, 2023
6f4781f
Add and use args train_transform_params and val_transform_params in v…
NickleDave Jul 19, 2023
af851cc
Put args with defaults in correct place in vak.train.train
NickleDave Jul 19, 2023
23e74b5
Fix TrainConfig attribute names: train/val transform kwargs -> transf…
NickleDave Jul 19, 2023
9249d22
Change type annotation for ParametricUMAPModel parameter network to i…
NickleDave Jul 19, 2023
ef30343
Fix where we get Metadata from in ParametricUMAPDatasets.from_dataset…
NickleDave Jul 19, 2023
7a12882
Fix how we get default kwargs for a network definition that's a dict …
NickleDave Jul 19, 2023
e699799
Fix how we get ParametricUMAPModel in vak.models.get
NickleDave Jul 19, 2023
07e1b15
Remove ckpt_step and patience args from call to train_parametric_umap…
NickleDave Jul 19, 2023
fc49d47
Make more fixes to train_parametric_umap_model
NickleDave Jul 19, 2023
b31ef6a
Rewrite train_frame_classification_model to use train/val_transform_p…
NickleDave Jul 21, 2023
37e8923
Fix train_parametric_umap_model to use train/val_transform_params
NickleDave Jul 21, 2023
4ba566b
In vak.train.train, pass train/val_transform/dataset_params into trai…
NickleDave Jul 21, 2023
80b0a25
Add train_dataset_params and val_transform_params to frame classifica…
NickleDave Jul 21, 2023
32b719b
Add train/val_transform_params to tests/data_for_tests/configs/ConvEn…
NickleDave Jul 21, 2023
b6b05ec
Add train/val_transform_params and train/val_dataset_params to vak.tr…
NickleDave Jul 21, 2023
454b75c
fixup Add train/val_transform_params and train/val_dataset_params to …
NickleDave Jul 21, 2023
0019b5f
Fix definition in train_frame_classification_model docstring
NickleDave Jul 21, 2023
0e3fa95
Add transform/dataset_param options to EVAL and PREDICT sections in v…
NickleDave Jul 21, 2023
5c44de2
Add transform/dataset_params to EvalConfig
NickleDave Jul 21, 2023
b1aae25
Add transform/dataset_params to PredictConfig
NickleDave Jul 21, 2023
8764b1c
Add/use transform/dataset_params in eval_frame_classification_model f…
NickleDave Jul 21, 2023
63e7e3a
Add/use transform/dataset_params in vak.eval.eval function -- pass in…
NickleDave Jul 21, 2023
d665154
Add/use transform/dataset_params in predict_frame_classification_mode…
NickleDave Jul 21, 2023
a76a229
Add/use transform/dataset_params in vak.predict.predict function -- p…
NickleDave Jul 21, 2023
322c2ff
Fix definition in vak/train/train.py docstring
NickleDave Jul 21, 2023
4791e7f
Remove src/vak/config/dataloader.py
NickleDave Jul 21, 2023
61d9403
Remove use of Dataloader in vak/config
NickleDave Jul 23, 2023
fe3370d
Add transform_params options and remove DATALOADER sections in eval/p…
NickleDave Jul 23, 2023
a0bf643
Add train/val_transform_params and train/val_dataset_params to vak.le…
NickleDave Jul 23, 2023
1c02bee
Add train/val_transform_params and train/val_dataset_params to vak.le…
NickleDave Jul 23, 2023
bfa7792
Remove import of dataloader in config/config.py
NickleDave Jul 23, 2023
dcddbe2
Finish remove dataloader imports from config sub-package
NickleDave Jul 24, 2023
bc1bde9
Filter out NumbaDeprecationWarnings triggered by umap
NickleDave Jul 24, 2023
2aca829
Remove DATALOADER section from two learncurve configs
NickleDave Jul 24, 2023
46d5eca
Remove DATALOADER section from 4 other configs in data_for_tests/configs
NickleDave Jul 24, 2023
0773694
Add name 'use_result_from_config' in configs.json
NickleDave Jul 26, 2023
5ae5a29
Add attribute `use_result_from_config` to ConfigMetadata
NickleDave Jul 26, 2023
824ad80
Remove constants from tests/scripts/vaktestdata/constants.py that are…
NickleDave Jul 26, 2023
5071049
Rewrite `vaktestdata.configs.fix_options_in_configs` to use declarati…
NickleDave Jul 26, 2023
c4b6d73
Refactor main loop in tests/scripts/generate_data_for_tests.py
NickleDave Jul 26, 2023
502f9d5
Remove other constants from tests/scripts/vaktestdata/constants.py
NickleDave Jul 26, 2023
fc9d257
Fix how dirs get made in tests/scripts/vaktestdata/dirs.py
NickleDave Jul 26, 2023
71e7414
Change birdsongrec -> birdsong-recognition-dataset in dir names in te…
NickleDave Jul 26, 2023
615cfe8
Rename prep/dimensionality_reduction -> prep/parametric_umap
NickleDave Jul 26, 2023
af7765b
Add missing train/val_dataset/transform_params options to LEARNCURVE …
NickleDave Jul 26, 2023
52bc9f0
Add missing train/val_dataset/transform_params options in call to lea…
NickleDave Jul 26, 2023
c7dc6b0
Remove window size, add transform/dataset_params in call to eval in c…
NickleDave Jul 26, 2023
02580c7
Remove window size, add transform/dataset_params in call to predict i…
NickleDave Jul 26, 2023
7f23952
Remove window_size arg in call to eval_frame_classification_model
NickleDave Jul 26, 2023
9ec72a3
fixup Remove window size, add transform/dataset_params in call to pre…
NickleDave Jul 26, 2023
46a6d06
Add missing fields to some entries in tests/data_for_tests/configs/co…
NickleDave Jul 28, 2023
4288ce2
Fix how we handle 'train_continue' command in generate_data_for_tests.py
NickleDave Jul 28, 2023
4878924
Make minor fixes to docstring of eval_frame_classification_model
NickleDave Jul 28, 2023
1bde4c6
Remove 'batch_size' from eval_frame_classification_model docstring, n…
NickleDave Jul 28, 2023
b80d01b
Remove unused import from train_parametric_umap_model
NickleDave Jul 28, 2023
f47ef5d
Add vak/eval/parametric_umap.py
NickleDave Jul 28, 2023
c4fa22d
Modify vak/eval/eval.py to call eval_parametric_umap_model when appro…
NickleDave Jul 28, 2023
774a2e8
Remove 'labelmap_path' from REQUIRED_OPTIONS IN vak/config/parse.py, …
NickleDave Jul 28, 2023
103b7c6
Fix how we train parametric umap so it saves checkpoints, and in the …
NickleDave Jul 28, 2023
231b538
Pass 'ckpt_step' into train_parametric_umap inside vak.train.train
NickleDave Jul 28, 2023
c33cd83
Add tests/data_for_tests/configs/ConvEncoderUMAP_eval_audio_cbin_anno…
NickleDave Jul 28, 2023
584f795
Add ConvEncoderUMAP_eval_audio_cbin_annot_notmat.toml to tests/data_f…
NickleDave Jul 28, 2023
8cb57e4
Make labelmap_path optional for EvalConfig, so Parametric UMAP models…
NickleDave Aug 9, 2023
9274de8
Rewrite definition for batch_size in docstring of src/vak/eval/parame…
NickleDave Aug 9, 2023
787f738
Add batch_size parameter to vak.eval.eval, make labelmap_path paramet…
NickleDave Aug 9, 2023
72822d3
Pass batch size from config into vak.eval.eval inside vak.cli.eval
NickleDave Aug 10, 2023
8c375f7
Fix prep section of config so it makes a test split: tests/data_for_t…
NickleDave Aug 10, 2023
f0cd19e
Fix resize option in ConvEncoderUMAP configs so that unit images are …
NickleDave Aug 10, 2023
de61aac
Add shape attribute to parametric_umap.Metadata
NickleDave Aug 10, 2023
d8c95b9
Revise src/vak/prep/unit_dataset/unit_dataset.py for readability, and…
NickleDave Aug 10, 2023
ec67027
Get shape returned by prep_unit_dataset inside src/vak/prep/parametri…
NickleDave Aug 10, 2023
c0a1586
Fix parametric_umap.Metadata -- shape attribute is mandatory, needs t…
NickleDave Aug 10, 2023
071ac0c
Fix prep_unit_dataset so we actually get shape of spectrograms
NickleDave Aug 10, 2023
6ae3b92
Add converter to parametric_umap.Metadata.shape attribute to cast lis…
NickleDave Aug 10, 2023
c3c9945
Add functions for default padding to src/vak/models/convencoder_umap.…
NickleDave Aug 10, 2023
6a92e31
Modify default parametric_umap transform so that it only adds padding…
NickleDave Aug 10, 2023
a336373
Modify train/parametric_umap to use default padding for ConvEncoderUM…
NickleDave Aug 10, 2023
d6e4590
Rewrite default padding for convencoder_umap to round to nearest tens…
NickleDave Aug 10, 2023
b6e08f2
Move code block that gets default padding for ConvEncoderUMAP so it's…
NickleDave Aug 10, 2023
0fd0404
Make fixes in eval/parametric_umap -- get default padding for ConvEnc…
NickleDave Aug 10, 2023
ea52abb
Remove passing parameter 'spect_scaler_path' into vak.eval.eval_param…
NickleDave Aug 10, 2023
23b19d6
WIP: Add missing docstrings in src/vak/datasets/frame_classification/…
NickleDave Aug 11, 2023
c573f07
WIP: Add missing docstrings in src/vak/datasets/frame_classification/…
NickleDave Aug 11, 2023
362af10
Revise docstrings in src/vak/datasets/parametric_umap/parametric_umap…
NickleDave Aug 11, 2023
fad7b53
WIP: Add src/vak/predict/parametric_umap.py
NickleDave Aug 11, 2023
84bd98b
Rename ParametricUMAPTrainingDataset -> ParametricUMAPDataset
NickleDave Aug 14, 2023
96a8d56
Remove transform_params table from ConvEncoderUMAP configs
NickleDave Aug 14, 2023
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -31,6 +31,7 @@ dependencies = [
"pytorch-lightning >=1.8.4.post0, <2.0",
"matplotlib >=3.3.3",
"numpy >=1.18.1",
"pynndescent >=0.5.10",
"scipy >=1.4.1",
"SoundFile >=0.10.3",
"pandas >=1.0.1",
@@ -39,6 +40,7 @@ dependencies = [
"torch >=1.7.1, <2.0.0",
"torchvision >=0.5.0",
"tqdm >=4.42.1",
"umap-learn >=0.5.3",
]

[project.optional-dependencies]
4 changes: 3 additions & 1 deletion src/vak/cli/eval.py
@@ -58,8 +58,10 @@ def eval(toml_path):
  checkpoint_path=cfg.eval.checkpoint_path,
  labelmap_path=cfg.eval.labelmap_path,
  output_dir=cfg.eval.output_dir,
- window_size=cfg.dataloader.window_size,
  num_workers=cfg.eval.num_workers,
+ batch_size=cfg.eval.batch_size,
+ transform_params=cfg.eval.transform_params,
+ dataset_params=cfg.eval.dataset_params,
  spect_scaler_path=cfg.eval.spect_scaler_path,
  device=cfg.eval.device,
  post_tfm_kwargs=cfg.eval.post_tfm_kwargs,
5 changes: 4 additions & 1 deletion src/vak/cli/learncurve.py
@@ -61,10 +61,13 @@ def learning_curve(toml_path):
  model_name=model_name,
  model_config=model_config,
  dataset_path=cfg.learncurve.dataset_path,
- window_size=cfg.dataloader.window_size,
  batch_size=cfg.learncurve.batch_size,
  num_epochs=cfg.learncurve.num_epochs,
  num_workers=cfg.learncurve.num_workers,
+ train_transform_params=cfg.learncurve.train_transform_params,
+ train_dataset_params=cfg.learncurve.train_dataset_params,
+ val_transform_params=cfg.learncurve.val_transform_params,
+ val_dataset_params=cfg.learncurve.val_dataset_params,
  results_path=results_path,
  post_tfm_kwargs=cfg.learncurve.post_tfm_kwargs,
  normalize_spectrograms=cfg.learncurve.normalize_spectrograms,
4 changes: 2 additions & 2 deletions src/vak/cli/predict.py
@@ -52,9 +52,9 @@ def predict(toml_path):
  dataset_path=cfg.predict.dataset_path,
  checkpoint_path=cfg.predict.checkpoint_path,
  labelmap_path=cfg.predict.labelmap_path,
- window_size=cfg.dataloader.window_size,
  num_workers=cfg.predict.num_workers,
- spect_key=cfg.spect_params.spect_key,
+ transform_params=cfg.predict.transform_params,
+ dataset_params=cfg.predict.dataset_params,
  timebins_key=cfg.spect_params.timebins_key,
  spect_scaler_path=cfg.predict.spect_scaler_path,
  device=cfg.predict.device,
5 changes: 4 additions & 1 deletion src/vak/cli/train.py
@@ -61,7 +61,10 @@ def train(toml_path):
  model_name=model_name,
  model_config=model_config,
  dataset_path=cfg.train.dataset_path,
- window_size=cfg.dataloader.window_size,
+ train_transform_params=cfg.train.train_transform_params,
+ train_dataset_params=cfg.train.train_dataset_params,
+ val_transform_params=cfg.train.val_transform_params,
+ val_dataset_params=cfg.train.val_dataset_params,
  batch_size=cfg.train.batch_size,
  num_epochs=cfg.train.num_epochs,
  num_workers=cfg.train.num_workers,
1 change: 0 additions & 1 deletion src/vak/config/__init__.py
@@ -1,7 +1,6 @@
  """sub-package that parses config.toml files and returns config object"""
  from . import (
  config,
- dataloader,
  eval,
  learncurve,
  model,
8 changes: 0 additions & 8 deletions src/vak/config/config.py
@@ -1,7 +1,6 @@
  import attr
  from attr.validators import instance_of, optional

- from .dataloader import DataLoaderConfig
  from .eval import EvalConfig
  from .learncurve import LearncurveConfig
  from .predict import PredictConfig
@@ -20,8 +19,6 @@ class Config:
  represents ``[PREP]`` section of config.toml file
  spect_params : vak.config.spect_params.SpectParamsConfig
  represents ``[SPECT_PARAMS]`` section of config.toml file
- dataloader : vak.config.dataloader.DataLoaderConfig
- represents ``[DATALOADER]`` section of config.toml file
  train : vak.config.train.TrainConfig
  represents ``[TRAIN]`` section of config.toml file
  eval : vak.config.eval.EvalConfig
@@ -31,14 +28,9 @@ class Config:
  learncurve : vak.config.learncurve.LearncurveConfig
  represents ``[LEARNCURVE]`` section of config.toml file
  """
-
  spect_params = attr.ib(
  validator=instance_of(SpectParamsConfig), default=SpectParamsConfig()
  )
- dataloader = attr.ib(
- validator=instance_of(DataLoaderConfig), default=DataLoaderConfig()
- )
-
  prep = attr.ib(validator=optional(instance_of(PrepConfig)), default=None)
  train = attr.ib(validator=optional(instance_of(TrainConfig)), default=None)
  eval = attr.ib(validator=optional(instance_of(EvalConfig)), default=None)
16 changes: 0 additions & 16 deletions src/vak/config/dataloader.py

This file was deleted.

25 changes: 24 additions & 1 deletion src/vak/config/eval.py
@@ -99,10 +99,17 @@ class EvalConfig:
  a float value for ``min_segment_dur``.
  See the docstring of the transform for more details on
  these arguments and how they work.
+ transform_params: dict, optional
+ Parameters for data transform.
+ Passed as keyword arguments.
+ Optional, default is None.
+ dataset_params: dict, optional
+ Parameters for dataset.
+ Passed as keyword arguments.
+ Optional, default is None.
  """
  # required, external files
  checkpoint_path = attr.ib(converter=expanded_user_path)
- labelmap_path = attr.ib(converter=expanded_user_path)
  output_dir = attr.ib(converter=expanded_user_path)

  # required, model / dataloader
@@ -118,6 +125,10 @@ class EvalConfig:
default=None,
)

# "optional" but actually required for frame classification models
# TODO: check model family in __post_init__ and raise ValueError if labelmap
# TODO: not specified for a frame classification model?
labelmap_path = attr.ib(converter=converters.optional(expanded_user_path), default=None)
# optional, transform
spect_scaler_path = attr.ib(
converter=converters.optional(expanded_user_path),
@@ -133,3 +144,15 @@ class EvalConfig:
# optional, data loader
num_workers = attr.ib(validator=instance_of(int), default=2)
device = attr.ib(validator=instance_of(str), default=device.get_default())

transform_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)

dataset_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)
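The TODO in the EvalConfig change notes that labelmap_path is now nominally optional but still needed for frame classification models. A minimal sketch of such a check follows, using a plain dataclass rather than attrs. The class name, the `model_family` field, and the family strings are all hypothetical and are not taken from vak.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class EvalConfigSketch:
    """Hypothetical stand-in for EvalConfig; not vak's actual class."""

    checkpoint_path: Path
    output_dir: Path
    model_family: str  # assumed field, e.g. "frame_classification" or "parametric_umap"
    labelmap_path: Optional[Path] = None

    def __post_init__(self):
        # Only the frame classification family requires a labelmap here;
        # any other family may leave labelmap_path as None.
        if self.model_family == "frame_classification" and self.labelmap_path is None:
            raise ValueError(
                "labelmap_path is required for frame classification models"
            )
```

With this sketch, constructing a config for the assumed "parametric_umap" family without a labelmap succeeds, while the same call for "frame_classification" raises ValueError.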
5 changes: 0 additions & 5 deletions src/vak/config/parse.py
@@ -4,7 +4,6 @@
  from toml.decoder import TomlDecodeError

  from .config import Config
- from .dataloader import DataLoaderConfig
  from .eval import EvalConfig
  from .learncurve import LearncurveConfig
  from .predict import PredictConfig
@@ -15,7 +14,6 @@


  SECTION_CLASSES = {
- "DATALOADER": DataLoaderConfig,
  "EVAL": EvalConfig,
  "LEARNCURVE": LearncurveConfig,
  "PREDICT": PredictConfig,
@@ -25,10 +23,8 @@
  }

  REQUIRED_OPTIONS = {
- "DATALOADER": None,
  "EVAL": [
  "checkpoint_path",
- "labelmap_path",
  "output_dir",
  "model",
  ],
@@ -38,7 +34,6 @@
  ],
  "PREDICT": [
  "checkpoint_path",
- "labelmap_path",
  "model",
  ],
  "PREP": [
20 changes: 20 additions & 0 deletions src/vak/config/predict.py
@@ -68,6 +68,14 @@ class PredictConfig:
spectrogram with `spect_path` filename `gy6or6_032312_081416.npz`,
and the network is `TweetyNet`, then the net output file
will be `gy6or6_032312_081416.tweetynet.output.npz`.
transform_params: dict, optional
Parameters for data transform.
Passed as keyword arguments.
Optional, default is None.
dataset_params: dict, optional
Parameters for dataset.
Passed as keyword arguments.
Optional, default is None.
"""

# required, external files
@@ -109,3 +117,15 @@ class PredictConfig:
)
majority_vote = attr.ib(validator=instance_of(bool), default=True)
save_net_outputs = attr.ib(validator=instance_of(bool), default=False)

transform_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)

dataset_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)
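The docstrings above say transform_params and dataset_params are optional dicts that are "passed as keyword arguments". A minimal sketch of that pattern follows. Both `resolve_params` and `make_dataset` are hypothetical helpers written for illustration, and the default values are assumptions, not vak's.

```python
from typing import Optional


def resolve_params(params: Optional[dict]) -> dict:
    """Return a copy of ``params``, or an empty dict when it is None."""
    return dict(params) if params is not None else {}


def make_dataset(window_size: int = 88, stride: int = 1) -> dict:
    """Hypothetical dataset factory; stands in for a real dataset class."""
    return {"window_size": window_size, "stride": stride}


# dataset_params as it might arrive from a config section, or None if omitted
dataset_from_config = make_dataset(**resolve_params({"window_size": 80}))
dataset_default = make_dataset(**resolve_params(None))
```

Treating None as an empty dict lets the callee's own defaults apply whenever the option is left out of the config file.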
24 changes: 24 additions & 0 deletions src/vak/config/train.py
@@ -118,3 +118,27 @@ class TrainConfig:
converter=converters.optional(expanded_user_path),
default=None,
)

train_transform_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)

train_dataset_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)

val_transform_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)

val_dataset_params = attr.ib(
converter=converters.optional(dict),
validator=validators.optional(instance_of(dict)),
default=None,
)
16 changes: 12 additions & 4 deletions src/vak/config/valid.toml
@@ -33,9 +33,6 @@ freqbins_key = 'f'
  timebins_key = 't'
  audio_path_key = 'audio_path'

- [DATALOADER]
- window_size = 88
-
  [TRAIN]
  model = 'TweetyNet'
  root_results_dir = './tests/test_data/results/train'
@@ -52,6 +49,10 @@ patience = 4
results_dir_made_by_main_script = '/some/path/to/learncurve/'
checkpoint_path = '/home/user/results_181014_194418/TweetyNet/checkpoints/'
spect_scaler_path = '/home/user/results_181014_194418/spect_scaler'
train_transform_params = {'resize' = 128}
train_dataset_params = {'window_size' = 80}
val_transform_params = {'resize' = 128}
val_dataset_params = {'window_size' = 80}

[EVAL]
dataset_path = 'tests/test_data/prep/learncurve/032312_prep_191224_225910.csv'
@@ -64,6 +65,8 @@ num_workers = 4
device = 'cuda'
spect_scaler_path = '/home/user/results_181014_194418/spect_scaler'
post_tfm_kwargs = {'majority_vote' = true, 'min_segment_dur' = 0.01}
transform_params = {'resize' = 128}
dataset_params = {'window_size' = 80}

[LEARNCURVE]
model = 'TweetyNet'
@@ -80,7 +83,10 @@ results_dir_made_by_main_script = '/some/path/to/learncurve/'
  post_tfm_kwargs = {'majority_vote' = true, 'min_segment_dur' = 0.01}
  num_workers = 4
  device = 'cuda'
-
+ train_transform_params = {'resize' = 128}
+ train_dataset_params = {'window_size' = 80}
+ val_transform_params = {'resize' = 128}
+ val_dataset_params = {'window_size' = 80}

  [PREDICT]
  dataset_path = 'tests/test_data/prep/learncurve/032312_prep_191224_225910.csv'
@@ -96,3 +102,5 @@ spect_scaler_path = '/home/user/results_181014_194418/spect_scaler'
min_segment_dur = 0.004
majority_vote = false
save_net_outputs = false
transform_params = {'resize' = 128}
dataset_params = {'window_size' = 80}
2 changes: 2 additions & 0 deletions src/vak/datasets/__init__.py
@@ -1,8 +1,10 @@
from . import (
frame_classification,
parametric_umap
)


__all__ = [
"dimensionality_reduction",
"frame_classification",
]
19 changes: 17 additions & 2 deletions src/vak/datasets/frame_classification/frames_dataset.py
@@ -44,12 +44,15 @@ def __init__(
  dataset_df = dataset_df[dataset_df.split == split].copy()
  self.dataset_df = dataset_df
  self.frames_paths = self.dataset_df[constants.FRAMES_NPY_PATH_COL_NAME].values
- self.frame_labels_paths = self.dataset_df[constants.FRAME_LABELS_NPY_PATH_COL_NAME].values
+ if split != 'predict':
+ self.frame_labels_paths = self.dataset_df[constants.FRAME_LABELS_NPY_PATH_COL_NAME].values
+ else:
+ self.frame_labels_paths = None

  if input_type == 'audio':
  self.source_paths = self.dataset_df['audio_path'].values
  elif input_type == 'spect':
- self.source_paths = self.dataset_df['audio_path'].values
+ self.source_paths = self.dataset_df['spect_path'].values
  else:
  raise ValueError(
  f"Invalid `input_type`: {input_type}. Must be one of {{'audio', 'spect'}}."
@@ -97,6 +100,18 @@ def from_dataset_path(
split: str = "val",
item_transform: Callable | None = None,
):
"""

Parameters
----------
dataset_path
split
item_transform

Returns
-------

"""
dataset_path = pathlib.Path(dataset_path)
metadata = Metadata.from_dataset_path(dataset_path)
frame_dur = metadata.frame_dur
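The frames_dataset.py change does two things: it sets frame_labels_paths to None for the 'predict' split, and it reads source paths from the 'spect_path' column when input_type is 'spect'. The same selection logic can be sketched as two standalone functions. The function names are hypothetical; the column names and error message follow the diff.

```python
def get_source_column(input_type: str) -> str:
    """Return the name of the dataframe column that holds source file paths."""
    if input_type == "audio":
        return "audio_path"
    elif input_type == "spect":
        return "spect_path"
    raise ValueError(
        f"Invalid `input_type`: {input_type}. Must be one of {{'audio', 'spect'}}."
    )


def uses_frame_labels(split: str) -> bool:
    """Frame label paths are loaded for every split except 'predict'."""
    return split != "predict"
```

Keeping the column lookup in one place makes the 'audio' and 'spect' branches harder to confuse, which is the mistake the diff corrects.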
15 changes: 15 additions & 0 deletions src/vak/datasets/frame_classification/window_dataset.py
@@ -197,6 +197,21 @@ def from_dataset_path(
transform: Callable | None = None,
target_transform: Callable | None = None
):
"""

Parameters
----------
dataset_path
window_size
stride
split
transform
target_transform

Returns
-------

"""
dataset_path = pathlib.Path(dataset_path)
metadata = Metadata.from_dataset_path(dataset_path)
frame_dur = metadata.frame_dur
8 changes: 8 additions & 0 deletions src/vak/datasets/parametric_umap/__init__.py
@@ -0,0 +1,8 @@
from .metadata import Metadata
from .parametric_umap import ParametricUMAPDataset


__all__ = [
'Metadata',
'ParametricUMAPDataset'
]