TST/CLN: Fix unit tests (#693)

* Remove dataset_path option from tests/data_for_tests/configs/ConvEncoderUMAP_eval_audio_cbin_annot_notmat.toml

* Fix use_dataset_from_config option to be null for ConvEncoderUMAP_eval, which stops the generate-test-data script from crashing

* Fix 'teentytweetynet' -> 'TweetyNet' in SPECT_DIR_NPZ constant in tests/fixtures/spect.py

* Fix config name declared as constant 'teenytweetynet' -> 'TeenyTweetyNet'; fix reference to Metadata in tests/test_datasets/test_window_dataset/conftest.py

* Change options in ConvEncoderUMAP configs so training doesn't take forever

* Set num_workers = 16 in all test data configs

* Further modify config options in tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml to make training run faster

* Fix how we handle labelmap_path in vaktestdata/configs.py

* Add tests/test_datasets/test_frame_classification/ with test_window_dataset.py

* Remove tests/test_datasets/test_window_dataset/

* Move test_datasets/test_metadata.py into test_frame_classification, fix unit tests

* Delete tests/test_models/test_das.py for now

* Add missing parameter name 'split' to parametrize in test_frame_classification/test_window_dataset.py

* Fix 'WindowedFrameClassificationModel' -> 'FrameClassificationModel' in tests/test_models/test_decorator.py

* Remove tests/test_nets/test_das for now

* Remove tests/test_prep/test_frame_classification/test_helper.py -- helper module no longer exists

* Remove extra tests and set single unit test to assert False for now in tests/test_prep/test_audio_dataset.py

* Fix typo in docstring in src/vak/prep/frame_classification/dataset_arrays.py

* Add two line breaks after imports in src/vak/prep/frame_classification/learncurve.py

* WIP: fix tests/test_prep/test_frame_classification/test_dataset_arrays.py

* WIP: fix tests/test_prep/test_frame_classification/test_learncurve.py

* Rename 'labeled_timebins' -> 'frame_labels' in test_transforms/

* Fix key used by list_of_schematized_configs fixture -- 'configs' -> 'config_metadata'

* Change 'teenytweetynet', 'tweetynet' -> 'TeenyTweetyNet', 'TweetyNet' throughout tests

* Fix unit tests in tests/test_datasets/test_frame_classification/test_metadata.py

* Remove DATALOADER from parametrize in a unit test in tests/test_config/test_parse.py

* Fix how we load metadata in fixture in tests/fixtures/csv.py

* Fix order of attributes in Metadata docstring in src/vak/datasets/frame_classification/metadata.py

* Fix unit test in tests/test_datasets/test_frame_classification/test_window_dataset.py

* Move test_datasets/test_seq/test_validators.py -> tests/test_prep/test_sequence_dataset.py, fix unit test

* Import sequence_dataset in prep/__init__.py

* Fix tests/test_eval/test_eval.py so it works

* Rewrite test_eval/test_eval.py as tests/test_eval/test_frame_classification.py

* Rewrite test_learncurve/test_learncurve.py as tests/test_learncurve/test_frame_classification.py

* Add/fix imports in src/vak/learncurve/__init__.py

* Fix module-level docstring in tests/test_eval/test_frame_classification.py

* Fix unit tests in tests/test_learncurve/test_frame_classification.py

* Make results_path not default to None, fix docstring in src/vak/learncurve/frame_classification.py

* Remove stray backslash in docstring in src/vak/nets/tweetynet.py

* Fix unit tests so they run in tests/test_models/test_base.py

* Fix __init__ docstring for TweetyNet and TeenyTweetyNet so they define num_input_channels + num_freqbins, not input_shape

* Revise docstring for register_model

* Import 'model' and 'model_family' decorators in vak/models/__init__.py

* Add MockModelFamily to tests/test_models/conftest.py and revise some of the docstrings there

* Fix unit tests for vak.models.decorator.model in tests/test_models/test_decorator.py

* Remove a .lower in a test in tests/test_models/test_windowed_frame_classification_model.py so that it doesn't fail

* Reorganize / revise docstrings / add classes in tests/test_models/conftest.py

* WIP: Add unit tests to tests/test_models/test_registry.py

* Fix type annotation in src/vak/models/registry.py

* Finish writing unit tests in tests/test_models/test_registry.py

* Rename test_models/test_windowed_frame_classification_model.py -> test_frame_classification_model.py and fix tests

* Refactor src/vak/models/registry.py to just use MODEL_REGISTRY dict -- previous way was unnecessarily convoluted

* Have src/vak/models/get.py use registry.MODEL_REGISTRY

* Fix tests in tests/test_models/test_registry.py after refactoring module

* Fix unit test in tests/test_models/test_decorator.py so it removes the models it registers -- this way we don't raise errors in other unit tests because MockModel is already registered
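
  A minimal sketch of the cleanup idiom, assuming the registry is the plain MODEL_REGISTRY dict that later commits in this PR refactor vak.models.registry into; 'MockModel' is an illustrative key, not the exact name the tests use:

  ```python
  import vak

  registry = vak.models.registry.MODEL_REGISTRY  # assumed: a plain dict keyed by model name
  try:
      registry["MockModel"] = object()  # stand-in for what the ``model`` decorator registers
      # ... assertions about the registered model go here ...
  finally:
      registry.pop("MockModel", None)   # leave no trace, so other tests can register MockModel
  ```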

* Fix model family name / arguments in test_tweetynet.py + test_teenytweetynet.py

* Add ignore for torchmetrics warning in pyproject.toml

* Remove reference to pytest.Mark in tests/conftest.py that caused warning -- was unused anyway

* Fix unit tests in tests/test_nets/test_tweetynet.py

* Remove unused variable in tests/conftest.py

* Remove unused import in tests/test_models/test_teenytweetynet.py

* Fix unit tests in tests/test_nets/test_teenytweetynet.py

* WIP: Fix tests in tests/test_predict/test_frame_classification.py

* Remove stale comment from src/vak/eval/parametric_umap.py

* Remove get_default_padding function from src/vak/models/convencoder_umap.py -- decided not to do this

* Fix output_dir option in tests/data_for_tests/configs/ConvEncoderUMAP_eval_audio_cbin_annot_notmat.toml

* Remove calls to convencoder_umap.get_default_padding in src/vak/train/parametric_umap.py

* Remove call to convencoder_umap.get_default_padding in src/vak/eval/parametric_umap.py

* Do not add padding in src/vak/transforms/defaults/parametric_umap.py

* Have ConvEncoderUMAP eval config re-use the dataset from the train config, so there's no issue with the input shape being different, which would lead to cryptic 'incorrect parameter size' errors when we re-load the checkpoint
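
  A small self-contained illustration of the failure mode this avoids, with a bare torch.nn.Linear standing in for a network whose layer sizes are derived from the dataset's input shape:

  ```python
  import torch

  # Stand-in for a network whose layer sizes were derived from the *training*
  # dataset's input shape when the checkpoint was saved.
  net_at_train_time = torch.nn.Linear(in_features=128, out_features=32)
  checkpoint = net_at_train_time.state_dict()

  # If the eval config preps its own dataset with a different input shape,
  # the rebuilt network no longer matches the checkpoint, and loading fails
  # with a cryptic size-mismatch error.
  net_at_eval_time = torch.nn.Linear(in_features=64, out_features=32)
  try:
      net_at_eval_time.load_state_dict(checkpoint)
  except RuntimeError as err:
      print(err)  # size mismatch for weight: checkpoint (32, 128) vs current model (32, 64)
  ```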

* Clean src/vak/cli/prep.py

- Import annotations from __future__ to be able to use pipe
  for type annotations
- Add type annotations to `purpose_from_toml`
- Change `Path` -> `pathlib.Path`, to be explicit

* Fix vaktestdata/configs.py so we get dataset paths from the right section

* Add test_dur in tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml so that we can re-use the same dataset for the eval config

* Clean src/vak/common/tensorboard.py -- add type annotations, fix formatting in docstrings

* Fix 'vak.datasets.metadata.Metadata' -> 'vak.datasets.frame_classification.Metadata' in tests/test_predict/test_frame_classification.py

* Clean src/vak/prep/audio_dataset.py

- Fix order of parameters to `prep_audio_dataset`
- Fix type annotation, remove default for parameter `data_dir`
- Also fix parameter ordering in docstring
- Fix validation of `data_dir` in pre-condition section of function
- Use `vak.common.typing.PathLike` for type hint

* Rewrite fixtures in tests/fixtures/audio.py so we can import as constants in tests where needed, to parametrize specific unit tests
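
  A hedged sketch of that pattern; the paths and names are illustrative, not the actual contents of tests/fixtures/audio.py. The point is that module-level constants can be imported into @pytest.mark.parametrize, which fixture functions alone don't allow:

  ```python
  # Illustrative sketch of the fixtures-as-constants pattern
  import pathlib

  import pytest

  SOURCE_TEST_DATA_ROOT = pathlib.Path("tests/data_for_tests/source")  # assumed location
  AUDIO_DIR_CBIN = SOURCE_TEST_DATA_ROOT / "audio_cbin_annot_notmat"
  AUDIO_LIST_CBIN = sorted(AUDIO_DIR_CBIN.glob("**/*.cbin"))


  @pytest.fixture
  def audio_list_cbin():
      """The fixture just returns the module-level constant."""
      return AUDIO_LIST_CBIN
  ```

  A test module can then write `from ..fixtures.audio import AUDIO_LIST_CBIN` and parametrize over the constant directly.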

* WIP: Fix tests/test_prep/test_audio_dataset.py so it actually tests correctly -- need to add more cases to parametrize

* Rename vak/train/train.py -> train_.py so we can still import train from train_ in vak/train/__init__.py and write 'vak.train.train', but *also* use unittest.mock.patch on functions where they are looked up in the train_ module
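
  A short sketch of the pattern this rename enables, using only the one target we know exists after the rename; real tests would patch whatever names the module under test actually looks up:

  ```python
  from unittest import mock

  import vak

  # After the rename there are two distinct names: ``vak.train.train_`` (the module)
  # and ``vak.train.train`` (the function re-exported from it in ``__init__.py``),
  # so a test can patch a name where it is looked up without the module and the
  # function shadowing each other.
  with mock.patch("vak.train.train_.train") as mock_train:
      vak.train.train_.train()      # looked up in ``train_`` at call time, so the mock is hit
      mock_train.assert_called_once()
  # outside the ``with`` block, the real function is restored
  ```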

* Fix imports in src/vak/train/__init__.py after renaming train.py -> train_.py

* Write unit test for tests/test_train/test_train.py

* Rename vak/eval/eval.py -> eval_.py as for train

* Rename vak/predict/predict.py -> predict_.py as for train

* Rename vak/prep/prep.py -> prep_.py as for train

* Fixup tests/test_train/test_train.py

* Add a 'break' in tests/fixtures/config.py fixture 'specific_config', so we don't loop unnecessarily through all configs

* WIP: Add unit test in tests/test_eval/test_eval.py

* Fix test in tests/test_eval/test_eval.py

* Fix unit test names in tests/test_eval/test_frame_classification.py

* Fix docstring, remove device fixture in tests/test_eval/test_eval.py

* Remove device fixture in tests/test_train/test_train.py -- not needed since we're mocking anyways

* Add tests/test_predict/test_predict.py

* Fix module-level docstring in tests/test_predict/test_predict.py

* Fix tests in tests/test_train/test_frame_classification.py

* Add input_shape attribute to ConvEncoder neural network

* Add tests/test_models/test_parametric_umap_model.py

* Fix docstring, remove unused variable and unused import in tests/test_models/test_frame_classification_model.py

* Add tests/test_train/test_parametric_umap.py

* Fix docstring in tests/test_datasets/test_frame_classification/test_window_dataset.py

* Add tests/test_datasets/test_frame_classification/test_frames_dataset.py

* Add tests/test_datasets/test_parametric_umap/

* Add tests/test_models/test_ed_tcn.py

* Add tests/test_models/test_convencoder_umap.py

* Add tests/test_nets/test_ed_tcn.py

* Add tests/test_nets/test_convencoder.py

* Fix a unit test in tests/test_transforms/test_frame_labels/test_functional.py

* Fix a test in tests/test_transforms/test_transforms.py

* Fix undeclared variable 'device' in tests/test_train/test_train.py

* Fix undeclared variable 'device' in tests/test_eval/test_eval.py

* Fix undeclared variable 'device' in tests/test_predict/test_predict.py

* Make other minor fixes in tests/test_predict/test_predict.py

* Make input size smaller to speed up test in tests/test_models/test_convencoder_umap.py

* Modify ConvEncoderUMAP configs to make dataset smaller, speed up tests

* Fix test in tests/test_prep/test_prep.py

* Fix docstring in src/vak/prep/frame_classification/dataset_arrays.py and fix function so that it does not add 'index' or 'level_0' columns to dataframes

* Fix tests in tests/test_prep/test_frame_classification/test_dataset_arrays.py

* Fix src/vak/prep/frame_classification/learncurve.py so it resets index on returned dataframe

* Fix how we reset index on dataframe (again) in src/vak/prep/frame_classification/dataset_arrays.py

* Fix how we reset index on dataframe in src/vak/prep/frame_classification/learncurve.py

* Fix tests in tests/test_prep/test_frame_classification/test_frame_classification.py

* Change LABELSET_YARDEN in tests/fixtures/annot.py to match what we use in config files in test data

* Add return type in annotations on from_path classmethod in src/vak/datasets/frame_classification/metadata.py

* Fix typo in docstring in src/vak/prep/split/split.py

* Rewrite fixtures in tests/fixtures/spect.py to return constants we define at module level so we can import those in tests where needed to parametrize

* Rewrite/fix tests for split_frame_classification_dataframe in tests/test_prep/test_split/test_split.py

* Add unit tests for split.unit_dataframe to tests/test_prep/test_split/test_split.py

* Rewrite one-line definition of prep_audio_dataset in src/vak/prep/audio_dataset.py for clarity

* Revise docstring of prep_spectrogram_dataset and add return type to type annotations, in src/vak/prep/spectrogram_dataset/spect_helper.py

* Fix how we build constants in tests/fixtures/spect.py so we don't clobber names of fixtures in other modules

* Fix SPECT_DIR_NPZ and the glob of SPECT_DIR_NPZ that produces SPECT_LIST_NPZ so that we are using a specific 'spectrograms_generated' directory inside a dataset dir

* Remove 'spect_annot_map' arg from src/vak/prep/spectrogram_dataset/spect_helper.py, and no longer do recursive glob of spect_dir

* Rewrite/fix unit tests in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Remove unused variable, add line break in docstring in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Fix unit test in tests/test_prep/test_frame_classification/test_learncurve.py

* Revise docstring in src/vak/prep/frame_classification/learncurve.py

* Add fixture 'specific_audio_list' in tests/fixtures/audio.py

* Fix variable name in tests/fixtures/audio.py

* Fix/rewrite unit tests in tests/test_prep/test_spectrogram_dataset/test_prep.py

* Change variable names for clarity in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Fix tests in tests/test_train/test_parametric_umap.py -- use correct models, remove inappropriate asserts

* Add tests/test_eval/test_parametric_umap.py

* Add tests/vak.tests.config.toml

* Use vak.tests.config.toml in tests/conftest.py to set default for command-line arg 'models'

* Use vak.tests.config.toml in noxfile.py, for running tests and for generating test data

* Fix root_results_dir option in train_continue configs

* Change default parameters for ConvEncoderUMAP + add maxpool layers to reduce checkpoint size

* Update GENERATED_TEST_DATA_ALL_URL in noxfile.py

* Rewrite tests/vak.tests.config.toml as tests/vak.tests.config.json

* Use json to load vak tests config in tests/conftest.py

* Use json to load vak tests config in noxfile.py

* Comment out calling fix_prep_csv_paths to see if we actually need to run it

* Fix how we build DEFAULT_MODELS constant in noxfile.py

* Remove constraints on dependencies in pyproject.toml to get pip to work

* Fix path in conftest.py to avoid FileNotFoundError

* Fix unit test in tests/test_cli/test_learncurve.py -- we just need to test that cli calls the right function

* Fix unit test in tests/test_cli/test_predict.py to not use 'model' fixture -- we just need to test that cli calls the right function

* Fix unit test in tests/test_cli/test_train.py to not use 'model' fixture -- we just need to test that cli calls the right function

* Fix unit test in tests/test_config/test_parse.py to not use 'model' fixture -- we're not testing something model specific here

* Fix 'accelerator' in src/vak/common/trainer.py so it is not set to None
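
  This and the matching 'accelerator' fixes in the eval, predict, and train modules below follow the same idea. A minimal sketch, assuming we want an explicit accelerator string for a Lightning Trainer rather than None; the helper name is illustrative, not vak's actual API:

  ```python
  import torch
  import pytorch_lightning as lightning


  def get_default_accelerator() -> str:
      """Illustrative helper: pick an explicit accelerator string instead of passing None."""
      return "gpu" if torch.cuda.is_available() else "cpu"


  trainer = lightning.Trainer(accelerator=get_default_accelerator(), devices=1, max_epochs=1)
  ```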

* Fix 'accelerator' in src/vak/eval/frame_classification.py so it is not set to None

* Fix unit test in tests/test_eval/test_frame_classification.py to not use 'model' fixture -- we don't want to use ConvEncoderUMAP model here

* Fix 'accelerator' in src/vak/eval/parametric_umap.py so it is not set to None

* Add back lower bounds for pytorch-lightning + torch and torchvision in pyproject.toml

* Remove commented code in noxfile.py

* Delete tests/scripts/fix_prep_csv_paths.py, no longer needed

* Change unit test in tests/test_models/test_base.py to use locally parametrized model_name instead of model fixture

* Fix 'accelerator' in src/vak/predict/frame_classification.py so it is not set to None

* Fix 'accelerator' in src/vak/predict/parametric_umap.py so it is not set to None

* Fix 'parametric UMAP' -> 'parametric umap' in tests/fixtures/csv.py

* Fix fixture in tests/test_predict/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Fix test in tests/test_prep/test_sequence_dataset.py to use locally parametrized 'model_name' instead of model fixture

* Fix test in tests/test_learncurve/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Delete TeenyTweetyNet configs in tests/data_for_tests/configs

* Delete TeenyTweetyNet from vak/nets

* Fix test in tests/test_train/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Delete TeenyTweetyNet from vak/models

* Add [TweetyNet.network] table to all TweetyNet configs in tests/data_for_tests/configs that makes a 'tiny' TweetyNet

* Remove metadata for TeenyTweetyNet configs from tests/data_for_tests/configs/configs.json after deleting those configs

* Add [ConvEncoderUMAP.network] table to all ConvEncoderUMAP configs in tests/data_for_tests/configs that makes a 'tiny' ConvEncoder

* Delete tests/test_models/test_teenytweetynet.py and tests/test_nets/test_teenytweetynet.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' in tests/fixtures/dataframe.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' in tests/test_cli/test_eval.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' many places in tests

* Remove TeenyTweetyNet from modules in tests/test_models

* Fix test in tests/test_models/test_base.py to use network config from .toml file so we don't get tensor size mismatch errors

* Fix a unit test in tests/test_models/test_frame_classification_model.py

* Mark a test xfail in tests/test_models/test_parametric_umap_model.py because fixing it will require fixing/changing how we parse config files

* Fix 'accelerator' in src/vak/train/parametric_umap.py so it is not set to None

* Remove models command-line argument from tests, no longer used

* Add attribute 'dataset_type' to PrepConfig docstring

* Use locally parametrized variable 'model_name' in tests/test_cli/test_eval.py instead of 'model' fixture that was removed

* Fix unit tests in tests/test_config/ to not use 'model' fixture that was removed

* Refactor noxfile.py: separate into routinely used sessions at top and less-used sessions specific to test data at bottom. Remove use of model argument in test and coverage sessions, since that fixture was removed

* Fix lower bound on torchvision, '15.2' -> '0.15.2'

* Import annotations from __future__ in src/vak/transforms/transforms.py

* Import annotations from __future__ in src/vak/prep/frame_classification/frame_classification.py

* Import annotations from __future__ in src/vak/prep/parametric_umap/parametric_umap.py

* Import annotations from __future__ in src/vak/prep/prep_.py

* Remove 'running-on-ci' arg from call to nox session 'coverage' in .github/workflows/ci-linux.yml -- arg no longer used in that session
NickleDave authored Sep 11, 2023
1 parent af7f4d1 commit f3c6f4b
Showing 136 changed files with 3,122 additions and 4,192 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-linux.yml
@@ -24,6 +24,6 @@ jobs:
run: |
nox -s test-data-download-source
nox -s test-data-download-generated-ci
nox -s coverage --verbose -- running-on-ci
nox -s coverage --verbose
- name: upload code coverage
uses: codecov/codecov-action@v3
177 changes: 81 additions & 96 deletions noxfile.py
@@ -1,3 +1,4 @@
import json
import os
import pathlib
import shutil
@@ -10,6 +11,11 @@
DIR = pathlib.Path(__file__).parent.resolve()
VENV_DIR = pathlib.Path('./.venv').resolve()


with pathlib.Path('./tests/vak.tests.config.json').open('rb') as fp:
VAK_TESTS_CONFIG = json.load(fp)


nox.options.sessions = ['test', 'coverage']


@@ -62,13 +68,57 @@ def lint(session):
session.run("flake8", "./src", "--max-line-length", "120", "--exclude", "./src/crowsetta/_vendor")


# ---- used by sessions that "clean up" data for tests
def clean_dir(dir_path):
@nox.session
def test(session) -> None:
"""
Run the unit and regular tests.
"""
session.install(".[test]")
if session.posargs:
session.run("pytest", *session.posargs)
else:
session.run("pytest", "-x", "--slow-last")


@nox.session
def coverage(session) -> None:
"""
Run the unit and regular tests, and save coverage report
"""
"clean" a directory by removing all files
(that are not hidden)
without removing the directory itself
session.install(".[test]")
session.run(
"pytest", "--cov=./", "--cov-report=xml", *session.posargs
)


@nox.session
def doc(session: nox.Session) -> None:
"""
Build the docs.
To run ``sphinx-autobuild``, do:
.. code-block::console
nox -s doc -- autobuild
Otherwise the docs will be built once using
"""
session.install(".[doc]")
if session.posargs:
if "autobuild" in session.posargs:
print("Building docs at http://127.0.0.1:8000 with sphinx-autobuild -- use Ctrl-C to quit")
session.run("sphinx-autobuild", "doc", "doc/_build/html")
else:
print("Unsupported argument to docs")
else:
session.run("sphinx-build", "-nW", "--keep-going", "-b", "html", "doc/", "doc/_build/html")


# ---- sessions below this all have to do with data for tests ----------------------------------------------------
def clean_dir(dir_path):
"""Helper function that "cleans" a directory by removing all files
(that are not hidden) without removing the directory itself."""
dir_path = pathlib.Path(dir_path)
dir_contents = dir_path.glob('*')
for content in dir_contents:
@@ -92,9 +142,7 @@ def clean_dir(dir_path):

@nox.session(name='test-data-clean-source')
def test_data_clean_source(session) -> None:
"""
Clean (remove) 'source' test data, used by TEST_DATA_GENERATE_SCRIPT.
"""
"""Clean (remove) 'source' test data, used by TEST_DATA_GENERATE_SCRIPT."""
clean_dir(SOURCE_TEST_DATA_DIR)


@@ -109,18 +157,14 @@ def copy_url(url: str, path: str) -> None:

@nox.session(name='test-data-tar-source')
def test_data_tar_source(session) -> None:
"""
Make a .tar.gz file of just the 'generated' test data used to run tests on CI.
"""
"""Make a .tar.gz file of just the 'generated' test data used to run tests on CI."""
session.log(f"Making tarfile with source data: {SOURCE_TEST_DATA_TAR}")
make_tarfile(SOURCE_TEST_DATA_TAR, SOURCE_TEST_DATA_DIRS)


@nox.session(name='test-data-download-source')
def test_data_download_source(session) -> None:
"""
Download and extract a .tar.gz file of 'source' test data, used by TEST_DATA_GENERATE_SCRIPT.
"""
"""Download and extract a .tar.gz file of 'source' test data, used by TEST_DATA_GENERATE_SCRIPT."""
session.log(f'Downloading: {SOURCE_TEST_DATA_URL}')
copy_url(url=SOURCE_TEST_DATA_URL, path=SOURCE_TEST_DATA_TAR)
session.log(f'Extracting downloaded tar: {SOURCE_TEST_DATA_TAR}')
@@ -133,9 +177,7 @@ def test_data_download_source(session) -> None:

@nox.session(name='test-data-generate', python="3.10")
def test_data_generate(session) -> None:
"""
Produced 'generated' test data, by running TEST_DATA_GENERATE_SCRIPT on 'source' test data.
"""
"""Produced 'generated' test data, by running TEST_DATA_GENERATE_SCRIPT on 'source' test data."""
session.install(".[test]")
session.run("python", TEST_DATA_GENERATE_SCRIPT)

@@ -145,13 +187,12 @@ def test_data_generate(session) -> None:

@nox.session(name='test-data-clean-generated')
def test_data_clean_generated(session) -> None:
"""
Clean (remove) 'generated' test data.
"""
"""Clean (remove) 'generated' test data."""
clean_dir(GENERATED_TEST_DATA_DIR)


def make_tarfile(name: str, to_add: list):
"""Helper function that makes a tarfile"""
with tarfile.open(name, "w:gz") as tf:
for add_name in to_add:
tf.add(name=add_name)
@@ -161,8 +202,21 @@ def make_tarfile(name: str, to_add: list):
PREP_DIR = f'{GENERATED_TEST_DATA_DIR}prep/'
RESULTS_DIR = f'{GENERATED_TEST_DATA_DIR}results/'

PREP_CI = sorted(pathlib.Path(PREP_DIR).glob('*/*/teenytweetynet'))
RESULTS_CI = sorted(pathlib.Path(RESULTS_DIR).glob('*/*/teenytweetynet'))
PREP_CI: list = []
for model_name in VAK_TESTS_CONFIG['models']:
PREP_CI.extend(
sorted(
pathlib.Path(PREP_DIR).glob(f'*/*/{model_name}')
)
)
RESULTS_CI: list = []
for model_name in VAK_TESTS_CONFIG['models']:
RESULTS_CI.extend(
sorted(
pathlib.Path(RESULTS_DIR).glob(f'*/*/{model_name}')
)
)

GENERATED_TEST_DATA_CI_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-1.x.ci.tar.gz'
GENERATED_TEST_DATA_CI_DIRS = [CONFIGS_DIR] + PREP_CI + RESULTS_CI

@@ -172,30 +226,24 @@ def make_tarfile(name: str, to_add: list):

@nox.session(name='test-data-tar-generated-all')
def test_data_tar_generated_all(session) -> None:
"""
Make a .tar.gz file of all 'generated' test data.
"""
"""Make a .tar.gz file of all 'generated' test data."""
session.log(f"Making tarfile with all generated data: {GENERATED_TEST_DATA_ALL_TAR}")
make_tarfile(GENERATED_TEST_DATA_ALL_TAR, GENERATED_TEST_DATA_ALL_DIRS)


@nox.session(name='test-data-tar-generated-ci')
def test_data_tar_generated_ci(session) -> None:
"""
Make a .tar.gz file of just the 'generated' test data used to run tests on CI.
"""
"""Make a .tar.gz file of just the 'generated' test data used to run tests on CI."""
session.log(f"Making tarfile with generated data for CI: {GENERATED_TEST_DATA_CI_TAR}")
make_tarfile(GENERATED_TEST_DATA_CI_TAR, GENERATED_TEST_DATA_CI_DIRS)


GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/uvgjt/download'
GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/xfp6n/download'


@nox.session(name='test-data-download-generated-all')
def test_data_download_generated_all(session) -> None:
"""
Download and extract a .tar.gz file of all 'generated' test data
"""
"""Download and extract a .tar.gz file of all 'generated' test data"""
session.install("pandas")
session.log(f'Downloading: {GENERATED_TEST_DATA_ALL_URL}')
copy_url(url=GENERATED_TEST_DATA_ALL_URL, path=GENERATED_TEST_DATA_ALL_TAR)
@@ -204,80 +252,17 @@ def test_data_download_generated_all(session) -> None:
tf.extractall(path='.')
session.log('Fixing paths in .csv files')
session.install("pandas")
session.run(
"python", "./tests/scripts/fix_prep_csv_paths.py"
)


GENERATED_TEST_DATA_CI_URL = 'https://osf.io/un2zs/download'


@nox.session(name='test-data-download-generated-ci')
def test_data_download_generated_ci(session) -> None:
"""
Download and extract a .tar.gz file of just the 'generated' test data used to run tests on CI
"""
"""Download and extract a .tar.gz file of just the 'generated' test data used to run tests on CI"""
session.install("pandas")
session.log(f'Downloading: {GENERATED_TEST_DATA_CI_URL}')
copy_url(url=GENERATED_TEST_DATA_CI_URL, path=GENERATED_TEST_DATA_CI_TAR)
session.log(f'Extracting downloaded tar: {GENERATED_TEST_DATA_CI_TAR}')
with tarfile.open(GENERATED_TEST_DATA_CI_TAR, "r:gz") as tf:
tf.extractall(path='.')
session.log('Fixing paths in .csv files')
session.run(
"python", "./tests/scripts/fix_prep_csv_paths.py"
)


@nox.session
def test(session) -> None:
"""
Run the unit and regular tests.
"""
session.install(".[test]")
session.run("pytest", *session.posargs)


@nox.session
def coverage(session) -> None:
"""
Run the unit and regular tests, and save coverage report
"""
session.install(".[test]")
if session.posargs:
if "running-on-ci" in session.posargs:
# on ci, just run `teenytweetynet` model
session.run(
"pytest", "--models", "teenytweetynet", "--cov=./", "--cov-report=xml"
)
return
else:
print("Unsupported argument to coverage")

session.run(
"pytest", "--cov=./", "--cov-report=xml", *session.posargs
)


@nox.session
def doc(session: nox.Session) -> None:
"""
Build the docs.
To run ``sphinx-autobuild``, do:
.. code-block::console
nox -s doc -- autobuild
Otherwise the docs will be built once using
"""
session.install(".[doc]")
if session.posargs:
if "autobuild" in session.posargs:
print("Building docs at http://127.0.0.1:8000 with sphinx-autobuild -- use Ctrl-C to quit")
session.run("sphinx-autobuild", "doc", "doc/_build/html")
else:
print("Unsupported argument to docs")
else:
session.run("sphinx-build", "-nW", "--keep-going", "-b", "html", "doc/", "doc/_build/html")
10 changes: 6 additions & 4 deletions pyproject.toml
@@ -28,7 +28,7 @@ dependencies = [
"dask >=2.10.1",
"evfuncs >=0.3.4",
"joblib >=0.14.1",
"pytorch-lightning >=1.8.4.post0, <2.0",
"pytorch-lightning >=2.0.7",
"matplotlib >=3.3.3",
"numpy >=1.18.1",
"pynndescent >=0.5.10",
@@ -37,8 +37,8 @@ dependencies = [
"pandas >=1.0.1",
"tensorboard >=2.8.0",
"toml >=0.10.2",
"torch >=1.7.1, <2.0.0",
"torchvision >=0.5.0",
"torch >= 2.0.1",
"torchvision >=0.15.2",
"tqdm >=4.42.1",
"umap-learn >=0.5.3",
]
@@ -85,5 +85,7 @@ markers = [
filterwarnings = [
"ignore:::torch.utils.tensorboard",
'ignore:Deprecated call to `pkg_resources.declare_namespace',
'ignore:pkg_resources is deprecated as an API'
'ignore:pkg_resources is deprecated as an API',
'ignore:Implementing implicit namespace packages',
'ignore:distutils Version classes are deprecated.',
]
16 changes: 10 additions & 6 deletions src/vak/cli/prep.py
@@ -1,8 +1,9 @@
# note NO LOGGING -- we configure logger inside `core.prep`
# so we can save log file inside dataset directory
"""Function called by command-line interface for prep command"""
from __future__ import annotations

import shutil
import warnings
from pathlib import Path
import pathlib

import toml

@@ -12,7 +13,7 @@
from ..config.validators import are_sections_valid


def purpose_from_toml(config_toml, toml_path=None):
def purpose_from_toml(config_toml: dict, toml_path: str | pathlib.Path | None = None) -> str:
"""determine "purpose" from toml config,
i.e., the command that will be run after we ``prep`` the data.
@@ -35,6 +36,9 @@ def purpose_from_toml(config_toml, toml_path=None):
return section_name.lower() # this is the "purpose" of the file


# note NO LOGGING -- we configure logger inside `core.prep`
# so we can save log file inside dataset directory

# see https://github.com/NickleDave/vak/issues/334
SECTIONS_PREP_SHOULD_PARSE = ("PREP", "SPECT_PARAMS", "DATALOADER")

@@ -45,7 +49,7 @@ def prep(toml_path):
Parameters
----------
toml_path : str, Path
toml_path : str, pathlib.Path
path to a configuration file in TOML format.
Used to rewrite file with options determined by this function and needed for other functions
@@ -75,7 +79,7 @@ def prep(toml_path):
dataset, and for all rows the 'split' columns for that dataset
will be 'predict' or 'test' (respectively).
"""
toml_path = Path(toml_path)
toml_path = pathlib.Path(toml_path)

# open here because need to check for `dataset_path` in this function, see #314 & #333
config_toml = _load_toml_from_path(toml_path)
