Nominal weights when correcting already weighted original #56

Open
RDMoise opened this issue Dec 20, 2018 · 1 comment


RDMoise commented Dec 20, 2018

Hi, I'm trying to correct the distribution of D in an original (MC) sample that already carries some weights, say w_i, which correct something else (say Dp). Currently I obtain weights, say x_i, by calling predict_weights(original = D_array, original_weight = w).
My question is the following: once I've done this, should I use x_i or w_i * x_i as the nominal weights for my MC (i.e. to have both D and Dp corrected)? If the answer is x_i, then very naively one could assume that the ratio of the two sets of corrections, x_i / w_i, would yield something that corrects D but not Dp. Is this assumption correct?

Cheers,
Dan

arogozhnikov (Owner) commented Dec 22, 2018

Hello Dan,

Here is how weight prediction is implemented:

In [2]: GBReweighter.predict_weights??
Signature: GBReweighter.predict_weights(self, original, original_weight=None)
Source:   
    def predict_weights(self, original, original_weight=None):
        """
        Returns corrected weights. Result is computed as original_weight * reweighter_multipliers.

        :param original: values from original distribution of shape [n_samples, n_features]
        :param original_weight: weights of samples before reweighting.
        :return: numpy.array of shape [n_samples] with new weights.
        """
        original, original_weight = self._normalize_input(original, original_weight)
        multipliers = numpy.exp(self.gb.decision_function(original))
        return multipliers * original_weight

So the multiplication is already done for you (see the last line): just use the output of this method as the nominal weights. Note that when training the reweighter you should also provide the weights you previously used to correct Dp; then it should work as expected.
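To make the weight bookkeeping concrete, here is a minimal sketch. It does not use hep_ml: `HistReweighter` is a hypothetical one-dimensional histogram-ratio reweighter written only for illustration, mirroring the `fit` / `predict_weights` contract quoted above (the result is `multipliers * original_weight`). The toy distributions, bin edges and the existing weights `w` are all assumptions.

```python
import numpy as np


class HistReweighter:
    """Hypothetical 1D histogram-ratio reweighter (a stand-in, not hep_ml's GBReweighter)."""

    def __init__(self, bins):
        self.bins = np.asarray(bins, dtype=float)

    def fit(self, original, target, original_weight=None, target_weight=None):
        if original_weight is None:
            original_weight = np.ones(len(original))
        if target_weight is None:
            target_weight = np.ones(len(target))
        h_orig, _ = np.histogram(original, bins=self.bins, weights=original_weight)
        h_targ, _ = np.histogram(target, bins=self.bins, weights=target_weight)
        # Normalise both histograms so the ratio compares shapes, not yields.
        h_orig = h_orig / h_orig.sum()
        h_targ = h_targ / h_targ.sum()
        # Per-bin multiplier; bins with no original events keep a multiplier of 1.
        self.ratio = np.divide(h_targ, h_orig, out=np.ones_like(h_targ), where=h_orig > 0)
        return self

    def predict_weights(self, original, original_weight=None):
        if original_weight is None:
            original_weight = np.ones(len(original))
        idx = np.clip(np.digitize(original, self.bins) - 1, 0, len(self.ratio) - 1)
        multipliers = self.ratio[idx]
        # Same contract as the method quoted above: multipliers * original_weight.
        return multipliers * original_weight


rng = np.random.default_rng(0)
D_mc = rng.normal(0.0, 1.0, 20000)    # toy MC distribution of D (assumed)
D_data = rng.normal(0.3, 1.0, 20000)  # toy target distribution of D (assumed)
w = rng.uniform(0.5, 1.5, 20000)      # pre-existing weights w_i (assumed)

bins = np.linspace(-8.0, 8.0, 33)
rw = HistReweighter(bins).fit(D_mc, D_data, original_weight=w)

# The output already contains w_i, so it is used directly as the nominal weight.
nominal = rw.predict_weights(D_mc, original_weight=w)

# The bare multipliers can be recovered by dividing w_i back out.
x_only = nominal / w

print(np.average(D_mc, weights=w), np.average(D_mc, weights=nominal), D_data.mean())
```

In this sketch, multiplying `nominal` by `w` again would apply the first correction twice, which is the pitfall the question is about.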

Also note that the second correction step may break the corrections from the first step if you don't require the reweighter to correct Dp too. In many practical situations you may not care about this, provided D and Dp are fairly independent.
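A toy check of this caveat, again with plain NumPy rather than hep_ml. Everything here is an assumption chosen for illustration: the correlated Gaussian MC, the shifted target, the analytic first-step weights, and the hypothetical helper `hist_multipliers`, which is a simple histogram ratio and not the gradient-boosted reweighter.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
rho = 0.8  # assumed correlation between D and Dp in the MC

# MC: correlated standard normals.
Dp_mc = rng.normal(size=n)
D_mc = rho * Dp_mc + np.sqrt(1 - rho**2) * rng.normal(size=n)

# Target ("data"): Dp ~ N(0.5, 1) and D ~ N(1.0, 1), drawn independently for simplicity.
Dp_data = rng.normal(0.5, 1.0, size=n)
D_data = rng.normal(1.0, 1.0, size=n)

# Step 1: existing weights w_i correcting Dp (analytic ratio N(0.5, 1) / N(0, 1)).
w = np.exp(0.5 * Dp_mc - 0.125)


def hist_multipliers(orig, targ, orig_w, bins):
    """Per-event multipliers from a weighted 1D histogram ratio (hypothetical helper)."""
    h_o, _ = np.histogram(orig, bins=bins, weights=orig_w)
    h_t, _ = np.histogram(targ, bins=bins)
    h_o, h_t = h_o / h_o.sum(), h_t / h_t.sum()
    ratio = np.divide(h_t, h_o, out=np.ones_like(h_t), where=h_o > 0)
    idx = np.clip(np.digitize(orig, bins) - 1, 0, len(ratio) - 1)
    return ratio[idx]


# Step 2: reweight in D only, passing w as the original weights.
bins = np.linspace(-8.0, 8.0, 65)
nominal = w * hist_multipliers(D_mc, D_data, w, bins)

mean_Dp_step1 = np.average(Dp_mc, weights=w)        # close to the target mean of Dp
mean_Dp_step2 = np.average(Dp_mc, weights=nominal)  # pulled away by the D correction
mean_D_step2 = np.average(D_mc, weights=nominal)    # close to the target mean of D

print(mean_Dp_step1, mean_Dp_step2, mean_D_step2, Dp_data.mean())
```

Because D and Dp are strongly correlated in this toy MC, correcting D alone drags the weighted Dp distribution away from its target. One reading of the remedy in the reply is to train the second reweighter on both D and Dp, so that it is also required to keep Dp corrected; with `rho = 0` the effect in this sketch largely disappears.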
