
--mode all,random-asset #43

Closed
yarikoptic opened this issue Mar 6, 2023 · 16 comments · Fixed by #44
@yarikoptic
Member

Since it currently takes forever to go through the entire collection, I want to make it possible to "sparsely sample" by adding a --mode option to the run, where the default, all, is the current behavior. In the case of random-asset:

  • only one (NWB) asset is chosen from each dandiset
  • testing is run on that asset and the results are added/replaced within assets_ok/assets_nok
    • if the current testing environment's versions differ from the prior record in the "top level" versions, the record would no longer be just a string containing the path, but becomes a record with path and versions fields, e.g.:
path: sub-YutaMouse57/sub-YutaMouse57_ses-YutaMouse57-161016_behavior+ecephys.nwb
versions:
 - hdmf: 3.4.7
 - matnwb: v2.5.0.0
 - pynwb: 2.2.0
  • while saving, analyze: if all paths have identical versions (for those that have only a path, take the "top level" versions), just place those identical versions into the top-level versions: record, and renormalize all paths back to plain string records.
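The normalization step described in the last bullet could be sketched roughly like this (the record layout and helper name are hypothetical, modeled on the YAML example above):

```python
def normalize(top_versions, records):
    """If every per-asset record carries the same versions (records that are
    plain path strings inherit the top-level versions), hoist those versions
    to the top level and collapse all records back to plain path strings."""
    expanded = [
        r if isinstance(r, dict) else {"path": r, "versions": top_versions}
        for r in records
    ]
    all_versions = {tuple(sorted(r["versions"].items())) for r in expanded}
    if len(all_versions) == 1:
        common = dict(next(iter(all_versions)))
        return common, [r["path"] for r in expanded]
    return top_versions, records

top = {"hdmf": "3.4.7", "matnwb": "v2.5.0.0", "pynwb": "2.2.0"}
recs = ["sub-A/a.nwb", {"path": "sub-B/b.nwb", "versions": dict(top)}]
new_top, new_recs = normalize(top, recs)
```

When the versions disagree, the function leaves everything as-is, so it is safe to call on every save.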

While RF for this,

  • extend the test_files command (which runs tests on given file(s)) with a --save-results flag, so it would store the results as well (currently they are just printed). This would allow for selective rerunning.
  • optional: to avoid the need for locking in case of parallel execution of check in some mode and, e.g., test_files -- refactor the code so that storing results from each file consists of reloading the results file, modifying the entry for that path, and saving it back. Care would need to be taken to support removal of stale results, i.e. for paths which were already removed from dandisets, so this might need a recheck that all files are still available as symlinks or present.
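The reload-modify-save cycle could look roughly like this (a sketch with hypothetical names; JSON stands in for the real status.yaml to keep the example self-contained):

```python
import json
import os
import tempfile


def save_result(status_file, path, result):
    """Reload the results file, update the entry for one asset path, and
    write it back atomically (write to a temp file, then rename), so that
    concurrent writers at worst lose one update rather than corrupt the
    file.  JSON is used here for a self-contained sketch; the real file
    would be status.yaml."""
    try:
        with open(status_file) as f:
            data = json.load(f)
    except FileNotFoundError:
        data = {"assets_ok": [], "assets_nok": []}
    # Drop the path from both lists, then re-add it under its new status.
    for key in ("assets_ok", "assets_nok"):
        data[key] = [p for p in data[key] if p != path]
    data["assets_ok" if result == "ok" else "assets_nok"].append(path)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(status_file) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
    os.replace(tmp, status_file)  # atomic rename on POSIX
```

The atomic rename means a reader never sees a half-written file; true last-writer-wins semantics would still need a lock, but for one-asset updates a lost update is recoverable by rerunning.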
@yarikoptic
Member Author

another mode could be random-outdated-asset, which would pick a random asset (among ok/nok) whose recorded versions are not current. Then we could sweep with that option, slowly getting to the most recent state without waiting for a long single sweep to finish.

@yarikoptic
Member Author

I like random-outdated-asset; make it random-outdated-asset-first, which would do random-outdated-asset, but if no asset is outdated -- do random-asset. That is pretty much the only extra mode I would like to see ATM, but I wouldn't mind all three of them. This way we can just keep it running forever, slowly going through all assets.
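The fallback behavior could be sketched as follows (hypothetical helper, assuming each asset record carries a versions field as in the proposal above):

```python
import random


def pick_asset(assets, current_versions, mode):
    """Pick which asset to test next.  An asset counts as 'outdated' when
    the versions recorded for it differ from the current environment."""
    if mode == "random-outdated-asset-first":
        outdated = [a for a in assets if a["versions"] != current_versions]
        # Fall back to plain random-asset when nothing is outdated.
        pool = outdated or assets
        return random.choice(pool)
    if mode == "random-asset":
        return random.choice(assets)
    raise ValueError(f"unknown mode: {mode}")
```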

@jwodder
Member

jwodder commented Mar 21, 2023

@yarikoptic To be clear, do you want random-outdated-asset, random-outdated-asset-first, or both?

@jwodder jwodder mentioned this issue Mar 21, 2023
@yarikoptic
Member Author

I think we can be fine with only random-outdated-asset-first ATM, in addition to all; after the refactoring it should be easy to add any other random-* mode if really needed.

@jwodder
Member

jwodder commented Mar 22, 2023

@yarikoptic

extend test_files (which run tests on a given file(s)) command with --save-results flag, so it would store the results as well

This would require the test_files command to know what Dandisets the assets belong to in order to save the results in the proper files. How should that work?

@yarikoptic
Member Author

for each path, it can discover the dandiset by locating dandiset.yaml among the path's parents.
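A minimal sketch of that discovery, assuming only that a Dandiset root is marked by a dandiset.yaml file (function name is hypothetical):

```python
from pathlib import Path


def find_dandiset_root(asset_path):
    """Walk up from an asset file looking for dandiset.yaml; its directory
    is the Dandiset whose status file should receive the results."""
    for parent in Path(asset_path).resolve().parents:
        if (parent / "dandiset.yaml").exists():
            return parent
    raise ValueError(f"{asset_path} is not inside a Dandiset")
```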

@jwodder
Member

jwodder commented Mar 22, 2023

@yarikoptic Should running test_files on files that aren't in a Dandiset be supported?

@yarikoptic
Member Author

Ideally yes; saving concerns only the --save-results mode of operation.
BTW, it is ok to make --save-results STATUS_FILE_PATH take a path to the status.yaml file where results should be updated/saved, so the code doesn't even need to figure out where to save them. It would be explicit, and if the file is not there -- just error out rather than bother with initializing the full thing.

@jwodder
Member

jwodder commented Mar 24, 2023

@yarikoptic Problem: When saving test results obtained with test-files, the script needs to know the various package versions so that they can be properly recorded in status.yaml. However, the MatNWB version is currently only known if MatNWB was downloaded & unpacked, which is only done if test-files was asked to test against MatNWB, as otherwise it's a waste of time. Leaving the MatNWB version empty when it's not known will lead to difficulties with managing and "normalizing" status.yaml-wide vs. per-asset versions.

One option to determine the MatNWB version when testing against pynwb would be to just query GitHub for the tag of MatNWB's latest release and not download any code, but this could lead to status.yaml files being normalized less often than possible if all recorded versions list the same older MatNWB version, in which case we would then record a newer version for just the tested asset(s) that wasn't even used for the test.
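That querying option could be a single unauthenticated call to GitHub's "latest release" API endpoint. A sketch, with the fetcher injected so it can be stubbed in tests (function names are hypothetical):

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    # Production fetcher: one unauthenticated GitHub API request
    # (subject to GitHub's rate limits).
    with urlopen(url) as r:
        return json.load(r)


def latest_release_tag(fetcher=fetch_json,
                       repo="NeurodataWithoutBorders/matnwb"):
    """Tag of the repo's latest GitHub release, without downloading code."""
    url = f"https://api.github.com/repos/{repo}/releases/latest"
    return fetcher(url)["tag_name"]
```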

@yarikoptic
Member Author

I asked ChatGPT to give me a function which would run git describe --tags in the directory where a function is defined; it gave me:

function tag = getGitTagForFunction(funcName)
% This function returns the output of "git describe --tags" command for
% the directory containing the file defining the input function.

% Get the path to the function file
funcFile = which(funcName);

% Get the path to the directory containing the function file
funcDir = fileparts(funcFile);

% Run the "git describe --tags" command in the function directory
[status, result] = system(sprintf('cd "%s" && git describe --tags', funcDir));

% If the command ran successfully, parse the output to get the tag
if status == 0
    tag = strtrim(result);
else
    error('Error running "git describe --tags" command: %s', result);
end
end

which I tested to work on drogon:

(base) dandi@drogon:/tmp$ MATLABPATH=/tmp/matnwb/clone/:/tmp/mymat matlab -nodesktop -batch "disp(getGitTagForFunction('nwbRead'))"
v2.6.0.1-2-gda49922

so if we switch to using a clone of the matnwb repo -- it should work.

As for the master export -- I do not see how they could have made it provide a version based on the tags; we also don't in those projects which use versioneer or your versioningit, right?

@jwodder
Member

jwodder commented Mar 27, 2023

@yarikoptic Why should we switch to using a clone of the matnwb repo? That would mean we'd frequently be using unreleased code, which seems like not the best idea. It'd also increase the number of different matnwb versions listed in status.yaml files, hindering normalization.

Moreover, if we're using a clone of the matnwb repo, what's the point of running a matlab script to get the git describe output instead of just running the command in the clone?

Honestly, I prefer my suggested solution of querying GitHub for the latest matnwb version whenever matnwb isn't downloaded.

@yarikoptic
Member Author

To tell the truth, I forgot that we are using a released version ATM, but I remain of the opinion that a clone would be a better choice: I don't like querying because it relies heavily on the assumption that whatever we installed elsewhere has that version.

  • with the current run time of weeks, the version we actually use might not be the one queried later
  • it should be easy to tune install_matnwb() to clone and still check out that recent release. As a result we would gain the ability to record exactly the version we are actually using instead of the current (possibly already different) release
  • for "Make it possible to test in 'development' scenario" #20 we would need to test against a different (master) version. It would be trivial to parametrize that function to stay at current master and again keep consistent, discoverable version information
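Such a tuned-up install could be sketched like this (a hypothetical variant of install_matnwb(); the url parameter is exposed so the clone source can be swapped, and the "latest release" is approximated by version-sorting tags):

```python
import subprocess


def install_matnwb_clone(target="matnwb",
                         url="https://github.com/NeurodataWithoutBorders/matnwb",
                         ref=None):
    """Clone the matnwb repo and check out a ref (defaulting to the newest
    tag), so the version in use stays discoverable via `git describe`."""
    subprocess.run(["git", "clone", url, target], check=True)
    if ref is None:
        # Newest tag by version-sort order stands in for the latest release.
        tags = subprocess.run(
            ["git", "-C", target, "tag", "--sort=-version:refname"],
            check=True, capture_output=True, text=True,
        ).stdout.split()
        ref = tags[0]
    subprocess.run(["git", "-C", target, "checkout", ref], check=True)
    # Report the version actually checked out.
    return subprocess.run(
        ["git", "-C", target, "describe", "--tags"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
```

Passing ref="master" (or any branch) would cover the #20 development scenario with the same code path.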

@jwodder
Member

jwodder commented Mar 27, 2023

@yarikoptic So, if we switch to cloning, exactly when should the script update the clone to use the latest release?

@yarikoptic
Member Author

ATM we have

if MATNWB_INSTALL_DIR.exists():
to completely skip installing if the target folder exists. We trigger install_matnwb at the beginning of the sweep (check) and in test_files (if the test name includes matnwb), so it seems that ATM we do not upgrade at all if an installation already exists. For the purposes of this issue it could be just that: no updates etc.

For #20 (comment) we would anyway create some kind of scenarios to switch between and decide when to update, which would likely be a separate step.

@jwodder
Member

jwodder commented Mar 27, 2023

@yarikoptic That's not what that line does; if the matnwb folder already exists, it is deleted, and then (regardless of whether it existed or not) the latest release is downloaded & installed.

@yarikoptic
Member Author

doh -- how could I misread rmtree ;) well -- FWIW we can just keep doing the same for now anyway. My point is that the update aspect is orthogonal to whether you get it from a tarball or a git clone, but with a git clone we can discover the version in use.

yarikoptic added a commit that referenced this issue Mar 29, 2023