
Find out why adaptive covariance func test results have changed slightly #784

Closed

MichaelClerx opened this issue Apr 16, 2019 · 23 comments

@MichaelClerx
Member

And emcee hammer

My current guesses:

  1. Changes I made to functional testing itself, not changes to the methods
  2. Some change in shared code, e.g. picking a default sigma
@MichaelClerx
Member Author

Investigating this is made a bit harder by the fact that I removed the studies that time, which changed a lot of old commit hashes :-|

@MichaelClerx
Member Author

So I've added some functionality to pfunk to re-investigate old commits (but store the results in a temporary directory), and am now running tests with

/funk investigate -c 02be2f50e3abc13485e8ffe9080a5be4ae167a80 mcmc_banana_AdaptiveCovarianceMCMC_1 temp -r 6

on scrambler

@MichaelClerx
Member Author

This is everything since February:

[Plot: mcmc_banana_AdaptiveCovarianceMCMC_1-2019-04-25-04:28:50]

@MichaelClerx
Member Author

Zoomed in on the end:
[Plot: mcmc_banana_AdaptiveCovarianceMCMC_1-2019-04-25-08:59:04]

@MichaelClerx
Member Author

MichaelClerx commented Apr 25, 2019

I think this really means we can't use a threshold for testing. The top graph shows there were several times when exactly this behaviour occurred, and then presumably we fixed it

@MichaelClerx
Member Author

This seems to say the bug is between these two commits:

commit c1218dda4def71a3657a33c7ff626fb891eec9e1
Merge: be6ba60a 90746169
Author: Michael Clerx <[email protected]>
Date:   Fri Feb 22 09:45:06 2019 +0000

    Merge pull request #736 from pints-team/i725-more-log-priors
    
    I725 more log priors

commit 907461692f4a226414c594cc080618fab5ca3431
Author: Fergus Cooper <[email protected]>
Date:   Thu Feb 21 17:19:52 2019 +0000

    #725 Correct support for log normal prior

But I can't really see how that would be the case...

@MichaelClerx
Member Author

Aha! This is getting very interesting. The change happens when the merge occurs. But we can't just go back down the list of commits chronologically, because then commits from master and the merged branch get mixed up: https://stackoverflow.com/questions/10886001/#11212541

@MichaelClerx
Member Author

So some of the "noise" we see before the final jump is the cause, not the final jump!

@mirams
Member

mirams commented Apr 25, 2019

So we aren't just testing the master branch but all commits?

@MichaelClerx
Member Author

Sorry, this got a bit confusing. We're only testing master. But when you merge something into master, the ordering of commits in time gets a bit weird. So what I was doing here was not normal functional testing, but scanning back in time over master, trying to find out exactly where the change happens. That turns out to be a bit harder than I thought, because you can't just order all the previous commits by time.

So in the pair of commits above, the merge commit (top) is definitely what caused the change, but the commit below it isn't necessarily the culprit.
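
For reference, one way around the ordering problem is to walk only the first-parent chain of master, so feature-branch commits don't get interleaved by date with the merges that brought them in. A rough sketch (assuming a local checkout; the helper name is mine, not part of pfunk):

    # Sketch: list master's history following only the first parent of each
    # commit, so feature-branch commits don't get mixed in by date.
    import subprocess

    def master_first_parent_history(repo_path='.'):
        out = subprocess.run(
            ['git', 'log', '--first-parent', '--format=%H %cI %s', 'master'],
            cwd=repo_path, capture_output=True, text=True, check=True)
        return out.stdout.splitlines()

    for line in master_first_parent_history():
        print(line)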

@MichaelClerx
Member Author

commit 6d196ac68196d5e3b8443e97c9dba345da4c3934
Author: Ben Lambert <[email protected]>
Date:   Wed Feb 20 01:39:17 2019 +0000

    Changes to notebooks
    
    - Changes to description in annulus and twisted Gaussian
    - Changes to method in simple egg box

commit c017043350095376e39519fbeefcd4781ebc18a1
Author: Fergus Cooper <[email protected]>
Date:   Tue Feb 19 23:57:33 2019 +0000

    #722 Changes for Michael's review

This is another candidate for the change we're seeing

@mirams
Member

mirams commented Apr 25, 2019

I miss SVN!

@MichaelClerx
Member Author

OK, I think the difference arises here:

commit 69e427ec6d9fcc327896a1e0033107a787920b90 (HEAD)
Author: Ben Lambert <[email protected]>
Date:   Fri Feb 15 01:21:47 2019 +0000

    Various fixes to toy distribution and Hamiltonian MC
    
    - Fixed handling of divergent iterations in HMC
    - Changed all toy example notebooks to remove my import os comment
    - Added simple egg box suggested bounds and sensitivities and tests
    - Corrected twisted Gaussian call and kl_divergence so that first element of covariance matrix (as in paper) is 100
    - Added plot of HMC divergences in twisted Gaussian notebook
    - Added value based tests for twisted Gaussian

commit f2693492f228828406faa05b224afac9cb4a2f89
Author: ben18785 <[email protected]>
Date:   Thu Feb 14 18:21:04 2019 +0000

    Various to simple eggbox
    
    - Changed name to kl_divergence
    - Created distance function

@MichaelClerx
Member Author

We used to have

        # Calculate the Kullback-Leibler divergence between the given samples
        # and the multivariate normal distribution underlying this banana.
        # From wikipedia:
        #
        # k = dimension of distribution
        # dkl = 0.5 * (
        #       trace(s1^-1 * s0)
        #       + (m1 - m0)T * s1^-1 * (m1 - m0)
        #       + log( det(s1) / det(s0) )
        #       - k
        #       )
        #
        # For this distribution, s1 is the identity matrix, and m1 is zero,
        # so it simplifies to
        #
        # dkl = 0.5 * (trace(s0) + m0.dot(m0) - log(det(s0)) - k)
        #
        m0 = np.mean(y, axis=0)
        s0 = np.cov(y.T)
        return 0.5 * (
            np.trace(s0) + m0.dot(m0)
            - np.log(np.linalg.det(s0)) - self._n_parameters)

but then it became this:

        # Calculate the Kullback-Leibler divergence between the given samples
        # and the multivariate normal distribution underlying this banana.
        # From wikipedia:
        #
        # k = dimension of distribution
        # dkl = 0.5 * (
        #       trace(s1^-1 * s0)
        #       + (m1 - m0)T * s1^-1 * (m1 - m0)
        #       + log( det(s1) / det(s0) )
        #       - k
        #       )
        #
        # For this distribution, s1 is the identity matrix, and m1 is zero,
        # so it simplifies to
        #
        # dkl = 0.5 * (trace(s0) + m0.dot(m0) - log(det(s0)) - k)
        #
        m0 = np.mean(y, axis=0)
        s0 = np.cov(y.T)
        s1 = self._sigma
        m1 = np.zeros(self.n_parameters())
        s1_inv = np.linalg.inv(s1)
        return 0.5 * (
            np.trace(np.matmul(s1_inv, s0)) +
            np.matmul(np.matmul(m1 - m0, s1_inv), m1 - m0) -
            np.log(np.linalg.det(s0)) +
            np.log(np.linalg.det(s1)) -
            self._n_parameters)

where

        self._sigma = np.eye(self._n_parameters)
        self._sigma[0, 0] = 100
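
For completeness, here's a rough standalone sketch (not the pints code itself) comparing the general Gaussian KL divergence with the simplified form: the two agree when s1 is the identity and m1 is zero, but once s1[0, 0] = 100 the simplified form is no longer valid, which is the shift the functional tests picked up:

    # Standalone sketch: general KL divergence between N(m0, s0) and N(m1, s1),
    # plus the simplified form that only holds for s1 = I, m1 = 0.
    import numpy as np

    def kl_general(m0, s0, m1, s1):
        k = len(m0)
        s1_inv = np.linalg.inv(s1)
        return 0.5 * (
            np.trace(s1_inv.dot(s0))
            + (m1 - m0).dot(s1_inv).dot(m1 - m0)
            + np.log(np.linalg.det(s1) / np.linalg.det(s0))
            - k)

    def kl_simplified(m0, s0):
        k = len(m0)
        return 0.5 * (np.trace(s0) + m0.dot(m0)
                      - np.log(np.linalg.det(s0)) - k)

    y = np.random.default_rng(1).normal(size=(5000, 2))
    m0, s0 = np.mean(y, axis=0), np.cov(y.T)
    m1, s1 = np.zeros(2), np.eye(2)

    print(kl_general(m0, s0, m1, s1), kl_simplified(m0, s0))  # these agree

    s1[0, 0] = 100
    print(kl_general(m0, s0, m1, s1), kl_simplified(m0, s0))  # these don't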

@ben18785
Collaborator

ben18785 commented Apr 25, 2019 via email

@MichaelClerx
Member Author

There's a (m1 - m0) transpose in the wiki equation I don't see in the code - will check if that matters - but otherwise looks good to me
We should update the comment above too

@MichaelClerx
Member Author

Yeah it's a vector so the transpose doesn't matter. Have stuck it in anyway and adapted the comment above. Would be good if someone could review quickly!
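
A quick illustration (not from the codebase) of why the transpose doesn't matter here:

    # The difference vector is 1-D, so .T is a no-op and both spellings of the
    # quadratic form (m1 - m0)^T s1^-1 (m1 - m0) give the same number.
    import numpy as np

    d = np.array([1.0, -2.0])
    s1_inv = np.linalg.inv(np.diag([100.0, 1.0]))
    print(np.allclose(d.T.dot(s1_inv).dot(d), d.dot(s1_inv).dot(d)))  # True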

@MichaelClerx
Member Author

Checked the other "new" failures in the functional tests, and they're all banana tests from the same commit. So this is definitely what we're seeing.

@MichaelClerx
Member Author

Double-checked the paper and agree we now have the banana distribution correct

@MichaelClerx
Member Author

Conclusion: Not a bug! We just made the problem harder.

Footnote about versioning: when you use branches and merge them, your history isn't linear - it's got branches, after all! So you have to be careful when interpreting a linearised view of your history. (Not a git-specific thing, it turns out, Gary ;-)

@MichaelClerx
Member Author

Wondering now whether to delete the test results from before this change. Thoughts @martinjrobins @ben18785 ?

@martinjrobins
Member

martinjrobins commented Apr 25, 2019 via email

@MichaelClerx
Member Author

We need a general method to mark (and ignore) false failures. Should be easy to run, but happy to go with the delete option for now until we figure it out.

As part of #780 I imagine we'd have some sort of "known good result" marker indicating where the region whose distribution we estimate starts. We could plot a nice line for this in the graph...

Actually, @chonlei might already need that line as part of #783
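
Something along these lines, perhaps (a purely hypothetical sketch; none of these names exist in pfunk): keep a per-test marker and only use results at or after it when estimating the reference distribution:

    # Hypothetical sketch: ignore results recorded before a per-test
    # "known good" marker when estimating the reference distribution.
    KNOWN_GOOD = {
        'mcmc_banana_AdaptiveCovarianceMCMC_1': '2019-04-25',  # assumed marker
    }

    def usable_results(test_name, results):
        """Filter (date, score) pairs to those at or after the marker."""
        marker = KNOWN_GOOD.get(test_name)
        if marker is None:
            return results
        return [(date, score) for date, score in results if date >= marker]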
