Determining whether a functional test has failed (changepoint) #780

Closed
MichaelClerx opened this issue Apr 15, 2019 · 10 comments

@MichaelClerx
Member

Chatting this morning we thought it might be good for functional tests to fail like this:

  1. Gather the last N samples*
  2. Run M tests
  3. See if the M results are likely from the distribution approximated by N

There's some extra complexity in step 1, namely (A) ignoring samples that previously failed the test, and (B) if the distribution changes sharply at some point, e.g. after a bug is fixed, manually marking all samples before that switch as points to be ignored.
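
The three steps above could be sketched roughly as follows. This is only an illustration, not the approach the project settled on: the record layout (`value`, `failed`), the `ignore_before` index and both function names are assumptions made up for the example, and a permutation test on the difference in means stands in for whatever comparison is eventually chosen.

```python
import random
import statistics


def usable_history(samples, ignore_before=0):
    """Step 1: keep values that did not fail and were recorded at or after the switch index."""
    return [
        s["value"]
        for i, s in enumerate(samples)
        if i >= ignore_before and not s["failed"]
    ]


def permutation_p_value(history, new, n_perm=2000, seed=0):
    """Step 3: how often does a random relabelling give a mean gap at least as large as observed?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(new) - statistics.mean(history))
    pooled = list(history) + list(new)
    m = len(new)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        gap = abs(statistics.mean(pooled[:m]) - statistics.mean(pooled[m:]))
        if gap >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value would suggest the M new results are unlikely under the distribution approximated by the N historical ones. Some cut-off on the p-value is still needed, but it is at least the same cut-off for every test rather than a per-test threshold.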

This still requires a bit of manual work, but at least we could get rid of some arbitrary thresholds?

Thoughts @ben18785 @martinjrobins ?

@chonlei
Member

chonlei commented Apr 16, 2019

First, have a look at the distribution #783

@MichaelClerx
Member Author

@fcooper8472 what's our thinking on running this?

E.g. will ./funk report call this in a subprocess and wait for it to finish before analysing the output?
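
For reference, a blocking call of this kind might look like the sketch below. It does not show how `./funk report` actually works: the helper name and the child command are placeholders, and the only point is that `subprocess.run` returns after the child process has exited, so its output can be analysed straight away.

```python
import subprocess
import sys


def run_and_collect(cmd):
    """Run cmd, wait for it to finish, and return (exit code, captured stdout)."""
    # subprocess.run blocks until the child exits; check=False leaves
    # the handling of non-zero exit codes to the caller.
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout


# Placeholder child process standing in for the real analysis step.
code, out = run_and_collect([sys.executable, "-c", "print('kld 0.12')"])
```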

MichaelClerx changed the title from "Functional test failures when distribution changes?" to "Functional test failures when distribution changes? (changepoint)" on May 17, 2019
abhidg self-assigned this on Jul 30, 2019
@MichaelClerx
Member Author

@abhidg https://dev.azure.com/OxfordRSE/pints-functional-testing/_build/results?buildId=439

Seems to be falling over with:

ERROR:pfunk._test:Exception in plot: mcmc_banana_EmceeHammerMCMC_3
Creating plot for mcmc_banana_EmceeHammerMCMC_3
Traceback (most recent call last):
  File "./funk", line 14, in <module>
    main()
  File "/home/pints/functional-testing/pfunk/__main__.py", line 494, in main
    args.func(args)
  File "/home/pints/functional-testing/pfunk/__main__.py", line 133, in run
    pfunk.tests.plot(name, args.database, args.show)
  File "/home/pints/functional-testing/pfunk/tests/_tests.py", line 34, in plot
    _tests[name].plot(database, show)
  File "/home/pints/functional-testing/pfunk/_test.py", line 115, in plot
    figs = self._plot(results)
  File "/home/pints/functional-testing/pfunk/tests/mcmc_banana.py", line 157, in _plot
    figs.append(pfunk.ChangePints().data(results['kld']).figure())
  File "/home/pints/functional-testing/pfunk/changepints.py", line 76, in figure
    fig, ax = rpt.display(self.signal, self.breakpoints())
AttributeError: 'ChangePints' object has no attribute 'signal'
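
The traceback only says that `figure()` read `self.signal` on an object where it had never been set. The actual `pfunk/changepints.py` is not shown here, so the class below is a hypothetical stand-in rather than the project's code. It sketches one common way to make a chained `data(...).figure()` call fail with a clearer message: define the attribute in `__init__` and check it before use.

```python
class ChangePointSketch:
    """Hypothetical stand-in for a chained changepoint helper (not pfunk's code)."""

    def __init__(self):
        # Define the attribute up front so later methods can always read it.
        self.signal = None

    def data(self, values):
        # Store the series and return self so that calls can be chained.
        self.signal = list(values)
        return self

    def figure(self):
        if not self.signal:
            raise ValueError("no data set; call data() with a non-empty series first")
        # A real implementation would plot here; the sketch just summarises.
        return {"n_points": len(self.signal)}
```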

@abhidg

abhidg commented Aug 2, 2019

@MichaelClerx
Member Author

Next up: figure out pass/fail, when to email, etc.

MichaelClerx changed the title from "Functional test failures when distribution changes? (changepoint)" to "Determining whether a functional test has failed (changepoint)" on Sep 28, 2019
@MichaelClerx
Member Author

We should probably prioritise this issue, and resolve it in the next few weeks :-)

Have added some new methods, and re-added a few tests we removed earlier because they seemed too hard.

It's interesting that the changepoint code so far hasn't complained about any of the methods. That's

  1. Good! Because it appears more robust than our threshold-based testing
  2. Not ideal, because "consistently bad" probably isn't what we're after :D

So I'm guessing the final criterion would be a combination of what we currently have and the changepoint code?
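
One possible shape for such a combined criterion is sketched below. It is an assumption-laden illustration: a crude mean-shift check stands in for the real changepoint code, and the function names, `window` and `n_sigma` are invented for the example.

```python
import statistics


def mean_shift_detected(series, window=5, n_sigma=3.0):
    """Crude changepoint stand-in: has the mean of the last `window` values moved?"""
    if len(series) < 2 * window:
        return False  # not enough history to judge
    old, recent = series[:-window], series[-window:]
    spread = statistics.stdev(old)
    gap = abs(statistics.mean(recent) - statistics.mean(old))
    if spread == 0:
        return gap > 0
    # Compare the gap with the standard error of a window-sized mean.
    return gap > n_sigma * spread / window ** 0.5


def has_failed(series, threshold, window=5, n_sigma=3.0):
    """Fail if recent results are bad in absolute terms OR the distribution shifted."""
    too_bad = statistics.mean(series[-window:]) > threshold
    return too_bad or mean_shift_detected(series, window, n_sigma)
```

With this combination, a method that is consistently bad fails on the threshold even though no shift is detected, while a method that suddenly gets worse but stays under the threshold fails on the shift.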

@MichaelClerx
Member Author

For single chain MCMC methods, we could also consider adding a test that runs multiple chains and tests whether they've converged?
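
A standard convergence diagnostic for this is the Gelman-Rubin statistic (R-hat), which compares between-chain and within-chain variance. The sketch below uses only the standard library and handles a single parameter; the function names and the synthetic Gaussian "chains" are assumptions for illustration, and no PINTS function is called.

```python
import random
import statistics


def r_hat(chains):
    """Gelman-Rubin statistic for one parameter; chains is a list of equal-length lists."""
    n = len(chains[0])
    means = [statistics.mean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.mean(statistics.variance(c) for c in chains)
    # B / n: variance of the chain means.
    between_over_n = statistics.variance(means)
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


def sample_chain(mean, n, seed):
    """Synthetic stand-in for an MCMC chain: independent Gaussian draws."""
    rng = random.Random(seed)
    return [rng.gauss(mean, 1.0) for _ in range(n)]
```

Values close to 1 suggest the chains agree; a commonly used rule of thumb treats values above about 1.1 as a sign that they have not converged.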

@MichaelClerx
Member Author

For optimisers, see #906

5 participants