TST/CLN: Fix unit tests (#693)

* Remove dataset_path option from tests/data_for_tests/configs/ConvEncoderUMAP_eval_audio_cbin_annot_notmat.toml

* Fix use_dataset_from_config option to be null for ConvEncoderUMAP_eval, which stops the generate-test-data script from crashing

* Fix 'teentytweetynet' -> 'TweetyNet' in SPECT_DIR_NPZ constant in tests/fixtures/spect.py

* Fix config name declared as constant 'teenytweetynet' -> 'TeenyTweetyNet'; fix reference to Metadata in tests/test_datasets/test_window_dataset/conftest.py

* Change options in ConvEncoderUMAP configs so training doesn't take forever

* Set num_workers = 16 in all test data configs

* Further modify config options in tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml to make training run faster

* Fix how we handle labelmap_path in vaktestdata/configs.py

* Add tests/test_datasets/test_frame_classification/ with test_window_dataset.py

* Remove tests/test_datasets/test_window_dataset/

* Move test_datasets/test_metadata.py into test_frame_classification, fix unit tests

* Delete tests/test_models/test_das.py for now

* Add missing parameter name 'split' to parametrize in test_frame_classification/test_window_dataset.py

* Fix 'WindowedFrameClassificationModel' -> 'FrameClassificationModel' in tests/test_models/test_decorator.py

* Remove tests/test_nets/test_das for now

* Remove tests/test_prep/test_frame_classification/test_helper.py -- helper module no longer exists

* Remove extra tests and set single unit test to assert False for now in tests/test_prep/test_audio_dataset.py

* Fix typo in docstring in src/vak/prep/frame_classification/dataset_arrays.py

* Add two line breaks after imports in src/vak/prep/frame_classification/learncurve.py

* WIP: fix tests/test_prep/test_frame_classification/test_dataset_arrays.py

* WIP: fix tests/test_prep/test_frame_classification/test_learncurve.py

* Rename 'labeled_timebins' -> 'frame_labels' in test_transforms/

* Fix key used by list_of_schematized_configs fixture -- 'configs' -> 'config_metadata'

* Change 'teenytweetynet', 'tweetynet' -> 'TeenyTweetyNet', 'TweetyNet' throughout tests

* Fix unit tests in tests/test_datasets/test_frame_classification/test_metadata.py

* Remove DATALOADER from parametrize in a unit test in tests/test_config/test_parse.py

* Fix how we load metadata in fixture in tests/fixtures/csv.py

* Fix order of attributes in Metadata docstring in src/vak/datasets/frame_classification/metadata.py

* Fix unit test in tests/test_datasets/test_frame_classification/test_window_dataset.py

* Move test_datasets/test_seq/test_validators.py -> tests/test_prep/test_sequence_dataset.py, fix unit test

* Import sequence_dataset in prep/__init__.py

* Fix tests/test_eval/test_eval.py so it works

* Rewrite test_eval/test_eval.py as tests/test_eval/test_frame_classification.py

* Rewrite test_learncurve/test_learncurve.py as tests/test_learncurve/test_frame_classification.py

* Add/fix imports in src/vak/learncurve/__init__.py

* Fix module-level docstring in tests/test_eval/test_frame_classification.py

* Fix unit tests in tests/test_learncurve/test_frame_classification.py

* Make results_path not default to None, fix docstring in src/vak/learncurve/frame_classification.py

* Remove stray backslash in docstring in src/vak/nets/tweetynet.py

* Fix unit tests so they run in tests/test_models/test_base.py

* Fix __init__ docstring for TweetyNet and TeenyTweetyNet so they define num_input_channels + num_freqbins, not input_shape

* Revise docstring for register_model

* Import 'model' and 'model_family' decorators in vak/models/__init__.py

* Add MockModelFamily to tests/test_models/conftest.py and revise some of the docstrings there

* Fix unit tests for vak.models.decorator.model in tests/test_models/test_decorator.py

* Remove a .lower in a test in tests/test_models/test_windowed_frame_classification_model.py so that it doesn't fail

* Reorganize / revise docstrings / add classes in tests/test_models/conftest.py

* WIP: Add unit tests to tests/test_models/test_registry.py

* Fix type annotation in src/vak/models/registry.py

* Finish writing unit tests in tests/test_models/test_registry.py

* Rename test_models/test_windowed_frame_classification_model.py -> test_frame_classification_model.py and fix tests

* Refactor src/vak/models/registry.py to just use MODEL_REGISTRY dict -- previous way was unnecessarily convoluted

* Have src/vak/models/get.py use registry.MODEL_REGISTRY

* Fix tests in tests/test_models/test_registry.py after refactoring module

* Fix unit test in tests/test_models/test_decorator.py so it removes the models it registers -- this way we don't raise errors in other unit tests because MockModel is already registered
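
  A minimal sketch of the cleanup idiom, assuming the registry is the plain MODEL_REGISTRY dict that later commits in this PR refactor vak.models.registry into; 'MockModel' is an illustrative key, not the exact name the tests use:

  ```python
  import vak

  registry = vak.models.registry.MODEL_REGISTRY  # assumed: a plain dict keyed by model name
  try:
      registry["MockModel"] = object()  # stand-in for what the ``model`` decorator registers
      # ... assertions about the registered model go here ...
  finally:
      registry.pop("MockModel", None)   # leave no trace, so other tests can register MockModel
  ```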

* Fix model family name / arguments in test_tweetynet.py + test_teenytweetynet.py

* Add ignore for torchmetrics warning in pyproject.toml

* Remove reference to pytest.Mark in tests/conftest.py that caused warning -- was unused anyway

* Fix unit tests in tests/test_nets/test_tweetynet.py

* Remove unused variable in tests/conftest.py

* Remove unused import in tests/test_models/test_teenytweetynet.py

* Fix unit tests in tests/test_nets/test_teenytweetynet.py

* WIP: Fix tests in tests/test_predict/test_frame_classification.py

* Remove stale comment from src/vak/eval/parametric_umap.py

* Remove get_default_padding function from src/vak/models/convencoder_umap.py -- decided not to do this

* Fix output_dir option in tests/data_for_tests/configs/ConvEncoderUMAP_eval_audio_cbin_annot_notmat.toml

* Remove calls to convencoder_umap.get_default_padding in src/vak/train/parametric_umap.py

* Remove call to convencoder_umap.get_default_padding in src/vak/eval/parametric_umap.py

* Do not add padding in src/vak/transforms/defaults/parametric_umap.py

* Have ConvEncoderUMAP eval config re-use the dataset from the train config, so there's no issue with the input shape being different, which would lead to cryptic 'incorrect parameter size' errors when we re-load the checkpoint
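
  A small self-contained illustration of the failure mode this avoids, with a bare torch.nn.Linear standing in for a network whose layer sizes are derived from the dataset's input shape:

  ```python
  import torch

  # Stand-in for a network whose layer sizes were derived from the *training*
  # dataset's input shape when the checkpoint was saved.
  net_at_train_time = torch.nn.Linear(in_features=128, out_features=32)
  checkpoint = net_at_train_time.state_dict()

  # If the eval config preps its own dataset with a different input shape,
  # the rebuilt network no longer matches the checkpoint, and loading fails
  # with a cryptic size-mismatch error.
  net_at_eval_time = torch.nn.Linear(in_features=64, out_features=32)
  try:
      net_at_eval_time.load_state_dict(checkpoint)
  except RuntimeError as err:
      print(err)  # size mismatch for weight: checkpoint (32, 128) vs current model (32, 64)
  ```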

* Clean src/vak/cli/prep.py

- Import annotations from __future__ to be able to use pipe
  for type annotations
- Add type annotations to `purpose_from_toml`
- Change `Path` -> `pathlib.Path`, to be explicit

* Fix vaktestdata/configs.py so we get dataset paths from the right section

* Add test_dur in tests/data_for_tests/configs/ConvEncoderUMAP_train_audio_cbin_annot_notmat.toml so that we can re-use the same dataset for the eval config

* Clean src/vak/common/tensorboard.py -- add type annotations, fix formatting in docstrings

* Fix 'vak.datasets.metadata.Metadata' -> 'vak.datasets.frame_classification.Metadata' in tests/test_predict/test_frame_classification.py

* Clean src/vak/prep/audio_dataset.py

- Fix order of parameters to `prep_audio_dataset`
- Fix type annotation, remove default for parameter `data_dir`
- Also fix parameter ordering in docstring
- Fix validation of `data_dir` in pre-condition section of function
- Use `vak.common.typing.PathLike` for type hint

* Rewrite fixtures in tests/fixtures/audio.py so we can import as constants in tests where needed, to parametrize specific unit tests
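
  A hedged sketch of that pattern; the paths and names are illustrative, not the actual contents of tests/fixtures/audio.py. The point is that module-level constants can be imported into @pytest.mark.parametrize, which fixture functions alone don't allow:

  ```python
  # Illustrative sketch of the fixtures-as-constants pattern
  import pathlib

  import pytest

  SOURCE_TEST_DATA_ROOT = pathlib.Path("tests/data_for_tests/source")  # assumed location
  AUDIO_DIR_CBIN = SOURCE_TEST_DATA_ROOT / "audio_cbin_annot_notmat"
  AUDIO_LIST_CBIN = sorted(AUDIO_DIR_CBIN.glob("**/*.cbin"))


  @pytest.fixture
  def audio_list_cbin():
      """The fixture just returns the module-level constant."""
      return AUDIO_LIST_CBIN
  ```

  A test module can then write `from ..fixtures.audio import AUDIO_LIST_CBIN` and parametrize over the constant directly.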

* WIP: Fix tests/test_prep/test_audio_dataset.py so it actually tests correctly -- need to add more cases to parametrize

* Rename vak/train/train.py -> train_.py so we can still import train from train_ in vak/train/__init__.py and write 'vak.train.train', but *also* use unittest.mock.patch on functions where they are looked up in the train_ module
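
  A short sketch of the pattern this rename enables, using only the one target we know exists after the rename; real tests would patch whatever names the module under test actually looks up:

  ```python
  from unittest import mock

  import vak

  # After the rename there are two distinct names: ``vak.train.train_`` (the module)
  # and ``vak.train.train`` (the function re-exported from it in ``__init__.py``),
  # so a test can patch a name where it is looked up without the module and the
  # function shadowing each other.
  with mock.patch("vak.train.train_.train") as mock_train:
      vak.train.train_.train()      # looked up in ``train_`` at call time, so the mock is hit
      mock_train.assert_called_once()
  # outside the ``with`` block, the real function is restored
  ```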

* Fix imports in src/vak/train/__init__.py after renaming train.py -> train_.py

* Write unit test for tests/test_train/test_train.py

* Rename vak/eval/eval.py -> eval_.py as for train

* Rename vak/predict/predict.py -> predict_.py as for train

* Rename vak/prep/prep.py -> prep_.py as for train

* Fixup tests/test_train/test_train.py

* Add a 'break' in tests/fixtures/config.py fixture 'specific_config', so we don't loop unnecessarily through all configs

* WIP: Add unit test in tests/test_eval/test_eval.py

* Fix test in tests/test_eval/test_eval.py

* Fix unit test names in tests/test_eval/test_frame_classification.py

* Fix docstring, remove device fixture in tests/test_eval/test_eval.py

* Remove device fixture in tests/test_train/test_train.py -- not needed since we're mocking anyways

* Add tests/test_predict/test_predict.py

* Fix module-level docstring in tests/test_predict/test_predict.py

* Fix tests in tests/test_train/test_frame_classification.py

* Add input_shape attribute to ConvEncoder neural network

* Add tests/test_models/test_parametric_umap_model.py

* Fix docstring, remove unused variable and unused import in tests/test_models/test_frame_classification_model.py

* Add tests/test_train/test_parametric_umap.py

* Fix docstring in tests/test_datasets/test_frame_classification/test_window_dataset.py

* Add tests/test_datasets/test_frame_classification/test_frames_dataset.py

* Add tests/test_datasets/test_parametric_umap/

* Add tests/test_models/test_ed_tcn.py

* Add tests/test_models/test_convencoder_umap.py

* Add tests/test_nets/test_ed_tcn.py

* Add tests/test_nets/test_convencoder.py

* Fix a unit test in tests/test_transforms/test_frame_labels/test_functional.py

* Fix a test in tests/test_transforms/test_transforms.py

* Fix undeclared variable 'device' in tests/test_train/test_train.py

* Fix undeclared variable 'device' in tests/test_eval/test_eval.py

* Fix undeclared variable 'device' in tests/test_predict/test_predict.py

* Make other minor fixes in tests/test_predict/test_predict.py

* Make input size smaller to speed up test in tests/test_models/test_convencoder_umap.py

* Modify ConvEncoderUMAP configs to make dataset smaller, speed up tests

* Fix test in tests/test_prep/test_prep.py

* Fix docstring in src/vak/prep/frame_classification/dataset_arrays.py and fix function so that it does not add 'index' or 'level_0' columns to dataframes

* Fix tests in tests/test_prep/test_frame_classification/test_dataset_arrays.py

* Fix src/vak/prep/frame_classification/learncurve.py so it resets index on returned dataframe

* Fix how we reset index on dataframe (again) in src/vak/prep/frame_classification/dataset_arrays.py

* Fix how we reset index on dataframe in src/vak/prep/frame_classification/learncurve.py

* Fix tests in tests/test_prep/test_frame_classification/test_frame_classification.py

* Change LABELSET_YARDEN in tests/fixtures/annot.py to match what we use in config files in test data

* Add return type in annotations on from_path classmethod in src/vak/datasets/frame_classification/metadata.py

* Fix typo in docstring in src/vak/prep/split/split.py

* Rewrite fixtures in tests/fixtures/spect.py to return constants we define at module level so we can import those in tests where needed to parametrize

* Rewrite/fix tests for split_frame_classification_dataframe in tests/test_prep/test_split/test_split.py

* Add unit tests for split.unit_dataframe to tests/test_prep/test_split/test_split.py

* Rewrite one-line definition of prep_audio_dataset in src/vak/prep/audio_dataset.py for clarity

* Revise docstring of prep_spectrogram_dataset and add return type to type annotations, in src/vak/prep/spectrogram_dataset/spect_helper.py

* Fix how we build constants in tests/fixtures/spect.py so we don't clobber names of fixtures in other modules

* Fix SPECT_DIR_NPZ and the glob of SPECT_DIR_NPZ that produces SPECT_LIST_NPZ so that we are using a specific 'spectrograms_generated' directory inside a dataset dir

* Remove 'spect_annot_map' arg from src/vak/prep/spectrogram_dataset/spect_helper.py, and no longer do recursive glob of spect_dir

* Rewrite/fix unit tests in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Remove unused variable, add line break in docstring in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Fix unit test in tests/test_prep/test_frame_classification/test_learncurve.py

* Revise docstring in src/vak/prep/frame_classification/learncurve.py

* Add fixture 'specific_audio_list' in tests/fixtures/audio.py

* Fix variable name in tests/fixtures/audio.py

* Fix/rewrite unit tests in tests/test_prep/test_spectrogram_dataset/test_prep.py

* Change variable names for clarity in tests/test_prep/test_spectrogram_dataset/test_spect_helper.py

* Fix tests in tests/test_train/test_parametric_umap.py -- use correct models, remove inappropriate asserts

* Add tests/test_eval/test_parametric_umap.py

* Add tests/vak.tests.config.toml

* Use vak.tests.config.toml in tests/conftest.py to set default for command-line arg 'models'

* Use vak.tests.config.toml in noxfile.py, for running tests and for generating test data

* Fix root_results_dir option in train_continue configs

* Change default parameters for ConvEncoderUMAP + add maxpool layers to reduce checkpoint size

* Update GENERATED_TEST_DATA_ALL_URL in noxfile.py

* Rewrite tests/vak.tests.config.toml as tests/vak.tests.config.json

* Use json to load vak tests config in tests/conftest.py

* Use json to load vak tests config in noxfile.py

* Comment out calling fix_prep_csv_paths to see if we actually need to run it

* Fix how we build DEFAULT_MODELS constant in noxfile.py

* Remove constraints on dependencies in pyproject.toml to get pip to work

* Fix path in conftest.py to avoid FileNotFoundError

* Fix unit test in tests/test_cli/test_learncurve.py -- we just need to test that cli calls the right function

* Fix unit test in tests/test_cli/test_predict.py to not use 'model' fixture -- we just need to test that cli calls the right function

* Fix unit test in tests/test_cli/test_train.py to not use 'model' fixture -- we just need to test that cli calls the right function

* Fix unit test in tests/test_config/test_parse.py to not use 'model' fixture -- we're not testing something model specific here

* Fix 'accelerator' in src/vak/common/trainer.py so it is not set to None
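
  This and the matching 'accelerator' fixes in the eval, predict, and train modules below follow the same idea. A minimal sketch, assuming we want an explicit accelerator string for a Lightning Trainer rather than None; the helper name is illustrative, not vak's actual API:

  ```python
  import torch
  import pytorch_lightning as lightning


  def get_default_accelerator() -> str:
      """Illustrative helper: pick an explicit accelerator string instead of passing None."""
      return "gpu" if torch.cuda.is_available() else "cpu"


  trainer = lightning.Trainer(accelerator=get_default_accelerator(), devices=1, max_epochs=1)
  ```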

* Fix 'accelerator' in src/vak/eval/frame_classification.py so it is not set to None

* Fix unit test in tests/test_eval/test_frame_classification.py to not use 'model' fixture -- we don't want to use ConvEncoderUMAP model here

* Fix 'accelerator' in src/vak/eval/parametric_umap.py so it is not set to None

* Add back lower bounds for pytorch-lightning + torch and torchvision in pyproject.toml

* Remove commented code in noxfile.py

* Delete tests/scripts/fix_prep_csv_paths.py, no longer needed

* Change unit test in tests/test_models/test_base.py to use locally parametrized model_name instead of model fixture

* Fix 'accelerator' in src/vak/predict/frame_classification.py so it is not set to None

* Fix 'accelerator' in src/vak/predict/parametric_umap.py so it is not set to None

* Fix 'parametric UMAP' -> 'parametric umap' in tests/fixtures/csv.py

* Fix fixture in tests/test_predict/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Fix test in tests/test_prep/test_sequence_dataset.py to use locally parametrized 'model_name' instead of model fixture

* Fix test in tests/test_learncurve/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Delete TeenyTweetyNet configs in tests/data_for_tests/configs

* Delete TeenyTweetyNet from vak/nets

* Fix test in tests/test_train/test_frame_classification.py to use locally parametrized 'model_name' instead of model fixture

* Delete TeenyTweetyNet from vak/models

* Add [TweetyNet.network] table to all TweetyNet configs in tests/data_for_tests/configs that makes a 'tiny' TweetyNet

* Remove metadata for TeenyTweetyNet configs from tests/data_for_tests/configs/configs.json after deleting those configs

* Add [ConvEncoderUMAP.network] table to all ConvEncoderUMAP configs in tests/data_for_tests/configs that makes a 'tiny' ConvEncoder

* Delete tests/test_models/test_teenytweetynet.py and tests/test_nets/test_teenytweetynet.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' in tests/fixtures/dataframe.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' in tests/test_cli/test_eval.py

* Change 'TeenyTweetyNet' -> 'TweetyNet' many places in tests

* Remove TeenyTweetyNet from modules in tests/test_models

* Fix test in tests/test_models/test_base.py to use network config from .toml file so we don't get tensor size mismatch errors

* Fix a unit test in tests/test_models/test_frame_classification_model.py

* Mark a test xfail in tests/test_models/test_parametric_umap_model.py because fixing it will require fixing/changing how we parse config files

* Fix 'accelerator' in src/vak/train/parametric_umap.py so it is not set to None

* Remove models command-line argument from tests, no longer used

* Add attribute 'dataset_type' to PrepConfig docstring

* Use locally parametrized variable 'model_name' in tests/test_cli/test_eval.py instead of 'model' fixture that was removed

* Fix unit tests in tests/test_config/ to not use 'model' fixture that was removed

* Refactor noxfile.py: separate into routinely used sessions at top and less-used sessions specific to test data at bottom. Remove use of model argument in test and coverage sessions, since that fixture was removed

* Fix lower bound on torchvision, '15.2' -> '0.15.2'

* Import annotations from __future__ in src/vak/transforms/transforms.py

* Import annotations from __future__ in src/vak/prep/frame_classification/frame_classification.py

* Import annotations from __future__ in src/vak/prep/parametric_umap/parametric_umap.py

* Import annotations from __future__ in src/vak/prep/prep_.py

* Remove 'running-on-ci' arg from call to nox session 'coverage' in .github/workflows/ci-linux.yml -- arg no longer used in that session
NickleDave authored Sep 11, 2023
1 parent af7f4d1 commit f3c6f4b
Showing 136 changed files with 3,122 additions and 4,192 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci-linux.yml
@@ -24,6 +24,6 @@ jobs:
run: |
nox -s test-data-download-source
nox -s test-data-download-generated-ci
nox -s coverage --verbose -- running-on-ci
nox -s coverage --verbose
- name: upload code coverage
uses: codecov/codecov-action@v3
177 changes: 81 additions & 96 deletions noxfile.py
@@ -1,3 +1,4 @@
import json
import os
import pathlib
import shutil
@@ -10,6 +11,11 @@
DIR = pathlib.Path(__file__).parent.resolve()
VENV_DIR = pathlib.Path('./.venv').resolve()


with pathlib.Path('./tests/vak.tests.config.json').open('rb') as fp:
VAK_TESTS_CONFIG = json.load(fp)


nox.options.sessions = ['test', 'coverage']


@@ -62,13 +68,57 @@ def lint(session):
session.run("flake8", "./src", "--max-line-length", "120", "--exclude", "./src/crowsetta/_vendor")


# ---- used by sessions that "clean up" data for tests
def clean_dir(dir_path):
@nox.session
def test(session) -> None:
"""
Run the unit and regular tests.
"""
session.install(".[test]")
if session.posargs:
session.run("pytest", *session.posargs)
else:
session.run("pytest", "-x", "--slow-last")


@nox.session
def coverage(session) -> None:
"""
Run the unit and regular tests, and save coverage report
"""
"clean" a directory by removing all files
(that are not hidden)
without removing the directory itself
session.install(".[test]")
session.run(
"pytest", "--cov=./", "--cov-report=xml", *session.posargs
)


@nox.session
def doc(session: nox.Session) -> None:
"""
Build the docs.
To run ``sphinx-autobuild``, do:
.. code-block::console
nox -s doc -- autobuild
Otherwise the docs will be built once using
"""
session.install(".[doc]")
if session.posargs:
if "autobuild" in session.posargs:
print("Building docs at http://127.0.0.1:8000 with sphinx-autobuild -- use Ctrl-C to quit")
session.run("sphinx-autobuild", "doc", "doc/_build/html")
else:
print("Unsupported argument to docs")
else:
session.run("sphinx-build", "-nW", "--keep-going", "-b", "html", "doc/", "doc/_build/html")


# ---- sessions below this all have to do with data for tests ----------------------------------------------------
def clean_dir(dir_path):
"""Helper function that "cleans" a directory by removing all files
(that are not hidden) without removing the directory itself."""
dir_path = pathlib.Path(dir_path)
dir_contents = dir_path.glob('*')
for content in dir_contents:
@@ -92,9 +142,7 @@ def clean_dir(dir_path):

@nox.session(name='test-data-clean-source')
def test_data_clean_source(session) -> None:
"""
Clean (remove) 'source' test data, used by TEST_DATA_GENERATE_SCRIPT.
"""
"""Clean (remove) 'source' test data, used by TEST_DATA_GENERATE_SCRIPT."""
clean_dir(SOURCE_TEST_DATA_DIR)


@@ -109,18 +157,14 @@ def copy_url(url: str, path: str) -> None:

@nox.session(name='test-data-tar-source')
def test_data_tar_source(session) -> None:
"""
Make a .tar.gz file of just the 'generated' test data used to run tests on CI.
"""
"""Make a .tar.gz file of just the 'generated' test data used to run tests on CI."""
session.log(f"Making tarfile with source data: {SOURCE_TEST_DATA_TAR}")
make_tarfile(SOURCE_TEST_DATA_TAR, SOURCE_TEST_DATA_DIRS)


@nox.session(name='test-data-download-source')
def test_data_download_source(session) -> None:
"""
Download and extract a .tar.gz file of 'source' test data, used by TEST_DATA_GENERATE_SCRIPT.
"""
"""Download and extract a .tar.gz file of 'source' test data, used by TEST_DATA_GENERATE_SCRIPT."""
session.log(f'Downloading: {SOURCE_TEST_DATA_URL}')
copy_url(url=SOURCE_TEST_DATA_URL, path=SOURCE_TEST_DATA_TAR)
session.log(f'Extracting downloaded tar: {SOURCE_TEST_DATA_TAR}')
@@ -133,9 +177,7 @@ def test_data_download_source(session) -> None:

@nox.session(name='test-data-generate', python="3.10")
def test_data_generate(session) -> None:
"""
Produced 'generated' test data, by running TEST_DATA_GENERATE_SCRIPT on 'source' test data.
"""
"""Produced 'generated' test data, by running TEST_DATA_GENERATE_SCRIPT on 'source' test data."""
session.install(".[test]")
session.run("python", TEST_DATA_GENERATE_SCRIPT)

@@ -145,13 +187,12 @@ def test_data_generate(session) -> None:

@nox.session(name='test-data-clean-generated')
def test_data_clean_generated(session) -> None:
"""
Clean (remove) 'generated' test data.
"""
"""Clean (remove) 'generated' test data."""
clean_dir(GENERATED_TEST_DATA_DIR)


def make_tarfile(name: str, to_add: list):
"""Helper function that makes a tarfile"""
with tarfile.open(name, "w:gz") as tf:
for add_name in to_add:
tf.add(name=add_name)
@@ -161,8 +202,21 @@ def make_tarfile(name: str, to_add: list):
PREP_DIR = f'{GENERATED_TEST_DATA_DIR}prep/'
RESULTS_DIR = f'{GENERATED_TEST_DATA_DIR}results/'

PREP_CI = sorted(pathlib.Path(PREP_DIR).glob('*/*/teenytweetynet'))
RESULTS_CI = sorted(pathlib.Path(RESULTS_DIR).glob('*/*/teenytweetynet'))
PREP_CI: list = []
for model_name in VAK_TESTS_CONFIG['models']:
PREP_CI.extend(
sorted(
pathlib.Path(PREP_DIR).glob(f'*/*/{model_name}')
)
)
RESULTS_CI: list = []
for model_name in VAK_TESTS_CONFIG['models']:
RESULTS_CI.extend(
sorted(
pathlib.Path(RESULTS_DIR).glob(f'*/*/{model_name}')
)
)

GENERATED_TEST_DATA_CI_TAR = f'{GENERATED_TEST_DATA_DIR}generated_test_data-version-1.x.ci.tar.gz'
GENERATED_TEST_DATA_CI_DIRS = [CONFIGS_DIR] + PREP_CI + RESULTS_CI

@@ -172,30 +226,24 @@ def make_tarfile(name: str, to_add: list):

@nox.session(name='test-data-tar-generated-all')
def test_data_tar_generated_all(session) -> None:
"""
Make a .tar.gz file of all 'generated' test data.
"""
"""Make a .tar.gz file of all 'generated' test data."""
session.log(f"Making tarfile with all generated data: {GENERATED_TEST_DATA_ALL_TAR}")
make_tarfile(GENERATED_TEST_DATA_ALL_TAR, GENERATED_TEST_DATA_ALL_DIRS)


@nox.session(name='test-data-tar-generated-ci')
def test_data_tar_generated_ci(session) -> None:
"""
Make a .tar.gz file of just the 'generated' test data used to run tests on CI.
"""
"""Make a .tar.gz file of just the 'generated' test data used to run tests on CI."""
session.log(f"Making tarfile with generated data for CI: {GENERATED_TEST_DATA_CI_TAR}")
make_tarfile(GENERATED_TEST_DATA_CI_TAR, GENERATED_TEST_DATA_CI_DIRS)


GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/uvgjt/download'
GENERATED_TEST_DATA_ALL_URL = 'https://osf.io/xfp6n/download'


@nox.session(name='test-data-download-generated-all')
def test_data_download_generated_all(session) -> None:
"""
Download and extract a .tar.gz file of all 'generated' test data
"""
"""Download and extract a .tar.gz file of all 'generated' test data"""
session.install("pandas")
session.log(f'Downloading: {GENERATED_TEST_DATA_ALL_URL}')
copy_url(url=GENERATED_TEST_DATA_ALL_URL, path=GENERATED_TEST_DATA_ALL_TAR)
@@ -204,80 +252,17 @@ def test_data_download_generated_all(session) -> None:
tf.extractall(path='.')
session.log('Fixing paths in .csv files')
session.install("pandas")
session.run(
"python", "./tests/scripts/fix_prep_csv_paths.py"
)


GENERATED_TEST_DATA_CI_URL = 'https://osf.io/un2zs/download'


@nox.session(name='test-data-download-generated-ci')
def test_data_download_generated_ci(session) -> None:
"""
Download and extract a .tar.gz file of just the 'generated' test data used to run tests on CI
"""
"""Download and extract a .tar.gz file of just the 'generated' test data used to run tests on CI"""
session.install("pandas")
session.log(f'Downloading: {GENERATED_TEST_DATA_CI_URL}')
copy_url(url=GENERATED_TEST_DATA_CI_URL, path=GENERATED_TEST_DATA_CI_TAR)
session.log(f'Extracting downloaded tar: {GENERATED_TEST_DATA_CI_TAR}')
with tarfile.open(GENERATED_TEST_DATA_CI_TAR, "r:gz") as tf:
tf.extractall(path='.')
session.log('Fixing paths in .csv files')
session.run(
"python", "./tests/scripts/fix_prep_csv_paths.py"
)


@nox.session
def test(session) -> None:
"""
Run the unit and regular tests.
"""
session.install(".[test]")
session.run("pytest", *session.posargs)


@nox.session
def coverage(session) -> None:
"""
Run the unit and regular tests, and save coverage report
"""
session.install(".[test]")
if session.posargs:
if "running-on-ci" in session.posargs:
# on ci, just run `teenytweetynet` model
session.run(
"pytest", "--models", "teenytweetynet", "--cov=./", "--cov-report=xml"
)
return
else:
print("Unsupported argument to coverage")

session.run(
"pytest", "--cov=./", "--cov-report=xml", *session.posargs
)


@nox.session
def doc(session: nox.Session) -> None:
"""
Build the docs.
To run ``sphinx-autobuild``, do:
.. code-block::console
nox -s doc -- autobuild
Otherwise the docs will be built once using
"""
session.install(".[doc]")
if session.posargs:
if "autobuild" in session.posargs:
print("Building docs at http://127.0.0.1:8000 with sphinx-autobuild -- use Ctrl-C to quit")
session.run("sphinx-autobuild", "doc", "doc/_build/html")
else:
print("Unsupported argument to docs")
else:
session.run("sphinx-build", "-nW", "--keep-going", "-b", "html", "doc/", "doc/_build/html")
10 changes: 6 additions & 4 deletions pyproject.toml
@@ -28,7 +28,7 @@ dependencies = [
"dask >=2.10.1",
"evfuncs >=0.3.4",
"joblib >=0.14.1",
"pytorch-lightning >=1.8.4.post0, <2.0",
"pytorch-lightning >=2.0.7",
"matplotlib >=3.3.3",
"numpy >=1.18.1",
"pynndescent >=0.5.10",
@@ -37,8 +37,8 @@ dependencies = [
"pandas >=1.0.1",
"tensorboard >=2.8.0",
"toml >=0.10.2",
"torch >=1.7.1, <2.0.0",
"torchvision >=0.5.0",
"torch >= 2.0.1",
"torchvision >=0.15.2",
"tqdm >=4.42.1",
"umap-learn >=0.5.3",
]
@@ -85,5 +85,7 @@ markers = [
filterwarnings = [
"ignore:::torch.utils.tensorboard",
'ignore:Deprecated call to `pkg_resources.declare_namespace',
'ignore:pkg_resources is deprecated as an API'
'ignore:pkg_resources is deprecated as an API',
'ignore:Implementing implicit namespace packages',
'ignore:distutils Version classes are deprecated.',
]
16 changes: 10 additions & 6 deletions src/vak/cli/prep.py
@@ -1,8 +1,9 @@
# note NO LOGGING -- we configure logger inside `core.prep`
# so we can save log file inside dataset directory
"""Function called by command-line interface for prep command"""
from __future__ import annotations

import shutil
import warnings
from pathlib import Path
import pathlib

import toml

@@ -12,7 +13,7 @@
from ..config.validators import are_sections_valid


def purpose_from_toml(config_toml, toml_path=None):
def purpose_from_toml(config_toml: dict, toml_path: str | pathlib.Path | None = None) -> str:
"""determine "purpose" from toml config,
i.e., the command that will be run after we ``prep`` the data.
@@ -35,6 +36,9 @@ def purpose_from_toml(config_toml, toml_path=None):
return section_name.lower() # this is the "purpose" of the file


# note NO LOGGING -- we configure logger inside `core.prep`
# so we can save log file inside dataset directory

# see https://github.com/NickleDave/vak/issues/334
SECTIONS_PREP_SHOULD_PARSE = ("PREP", "SPECT_PARAMS", "DATALOADER")

@@ -45,7 +49,7 @@ def prep(toml_path):
Parameters
----------
toml_path : str, Path
toml_path : str, pathlib.Path
path to a configuration file in TOML format.
Used to rewrite file with options determined by this function and needed for other functions
@@ -75,7 +79,7 @@ def prep(toml_path):
dataset, and for all rows the 'split' columns for that dataset
will be 'predict' or 'test' (respectively).
"""
toml_path = Path(toml_path)
toml_path = pathlib.Path(toml_path)

# open here because need to check for `dataset_path` in this function, see #314 & #333
config_toml = _load_toml_from_path(toml_path)
