Student's t-distribution as base distribution #31
-
Hello guys,

I am currently working on implementing the techniques proposed in this paper (https://arxiv.org/pdf/1907.04481.pdf), which focuses on improving flow results in the tails of distributions and can be very useful in many physics analyses. The authors suggest using a Student's t-distribution instead of a multivariate diagonal Gaussian as the base distribution in the loss. Additionally, they introduce the number of degrees of freedom of the t-distribution as a learnable parameter.

I think I have successfully implemented the Student's t-distribution in place of the Gaussian. The corresponding code can be found here:

```python
class MultiStudentT(Independent):
    r"""Creates a multivariate Student's t-distribution parametrized by the
    degrees of freedom :math:`\nu`, mean :math:`\mu` and standard deviation
    :math:`\sigma`, but assumes no correlation between the variables.

    Arguments:
        df: The number of degrees of freedom of the distribution.
        loc: The mean :math:`\mu` of the variables.
        scale: The standard deviation :math:`\sigma` of the variables.
        ndims: The number of batch dimensions to interpret as event dimensions.

    Example:
        >>> d = MultiStudentT(torch.tensor(2.5), torch.zeros(3), torch.ones(3))
        >>> d.event_shape
        torch.Size([3])
        >>> d.sample()
        tensor([-0.9570,  1.0004,  0.4297])
    """

    def __init__(self, df: Tensor, loc: Tensor, scale: Tensor, ndims: int = 1):
        super().__init__(
            torch.distributions.studentT.StudentT(
                torch.as_tensor(df), torch.as_tensor(loc), torch.as_tensor(scale)
            ),
            ndims,
        )

    def __repr__(self) -> str:
        return 'Diag' + repr(self.base_dist)

    def expand(self, batch_shape: Size, new: Distribution = None) -> Distribution:
        new = self._get_checked_instance(MultiStudentT, new)
        return super().expand(batch_shape, new)
```

But I was still not able to make the degrees of freedom learnable. What I have tried so far is:

```python
flow = zuko.flows.NSF(
    self.training_inputs.size()[1],
    self.training_conditions.size()[1],
    transforms=5,
    bins=8,
    hidden_features=[256] * 3,
)

self.t_degress_of_fredom = nn.Parameter(torch.tensor(5.0), requires_grad=True)

d = Unconditional(MultiStudentT, df=self.t_degress_of_fredom, loc=mu, scale=sigma, buffer=True)
flow = zuko.flows.Flow(flow.transform, base=d)

optimizer = torch.optim.Adam(flow.parameters(), 1e-3)
```

But the degrees of freedom parameter does not show up in the `flow.parameters()` list, and it does not change during training. Do you have any leads on how I could implement this extra learnable parameter?

Cheers,
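For reference, the underlying PyTorch behavior can be sketched without zuko (the class and variable names below are illustrative, not from the original code): an `nn.Parameter` only shows up in `parameters()` when it is assigned as an attribute of an `nn.Module`, so a parameter that is merely passed around, e.g. as a keyword argument, stays invisible to the optimizer.

```python
import torch
import torch.nn as nn

# Minimal sketch (plain PyTorch, no zuko): a Parameter is only returned by
# module.parameters() if it is registered as an attribute of an nn.Module.
class Base(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_df = nn.Parameter(torch.tensor(1.6))  # registered on the module

module = Base()
loose = nn.Parameter(torch.tensor(5.0))  # never attached to any module

params = list(module.parameters())
assert len(params) == 1                      # log_df is found
assert all(p is not loose for p in params)   # the loose parameter is invisible
```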
-
Hi,

An update: I think one can solve this by simply adding the additional parameter to the optimizer like this:

```python
optimizer = torch.optim.Adam(
    itertools.chain(flow.parameters(), (self.t_degress_of_fredom,)), 1e-3
)
```

In the end the question was not related to Zuko itself, I am sorry for the noise! But perhaps this might be useful to others who also want to try this.

Best,
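As a sanity check, the chaining trick can be exercised with a stand-in model instead of a full zuko flow (the linear model and toy loss below are placeholders, not the original training setup): any `nn.Parameter` handed to the optimizer, whether or not it belongs to a module, receives gradients and gets updated.

```python
import itertools
import torch
import torch.nn as nn

# Stand-in for the flow: any small module works for this check.
model = nn.Linear(3, 3)
df = nn.Parameter(torch.tensor(5.0))  # extra parameter, not part of the model

# Chain the extra parameter into the optimizer, as in the workaround above.
optimizer = torch.optim.Adam(itertools.chain(model.parameters(), (df,)), lr=1e-3)

x = torch.randn(8, 3)
loss = model(x).pow(2).mean() + (df - 2.0).pow(2)  # toy loss that touches df
optimizer.zero_grad()
loss.backward()
optimizer.step()

assert df.grad is not None   # df received a gradient
assert df.item() != 5.0      # and was updated by the optimizer step
```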
-
Hello @CaioDaumann, thank you for your question!

There are several ways to create a custom base distribution, but it must be a `LazyDistribution`, that is, a module that returns a distribution when called.

The first way is to create a function (or a class constructor) that returns a `Distribution` when called and wrap it inside `Unconditional`:

```python
def student_t(log_df: Tensor) -> Distribution:
    return Independent(StudentT(df=log_df.exp()), 1)

base = Unconditional(student_t, torch.randn(5))
```

There are a few subtleties with `Unconditional`. First, keyword arguments are not considered as parameters (or buffers); they will be passed unmodified to the function during the forward pass. Note that parameters should always be "unconstrained", meaning that they can take any value in ℝ, which is why the degrees of freedom are parametrized by their logarithm here.

The second way is to subclass `LazyDistribution` directly:

```python
class LazyStudentT(zuko.flows.LazyDistribution):
    def __init__(self, features: int):
        super().__init__()
        self.log_df = torch.nn.Parameter(torch.randn(features))

    def forward(self, c: Tensor = None) -> Distribution:
        return Independent(StudentT(df=self.log_df.exp()), 1)

base = LazyStudentT(features=5)
```

Now that your base is a proper module, its parameters are registered, and you can use it to replace the standard Gaussian of an existing flow:

```python
flow = zuko.flows.NSF(features=5, ...)
flow.base = base  # replace Gaussian with StudentT
```

I hope this helps!
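A quick check that this setup actually trains the degrees of freedom, using plain torch distributions (zuko is not needed for the check itself, and the batch below is a stand-in): the Student-t log-density is differentiable with respect to `df`, so gradients reach `log_df` through the negative log-likelihood.

```python
import torch
from torch.distributions import Independent, StudentT

# log_df is unconstrained; df = exp(log_df) is guaranteed positive.
log_df = torch.nn.Parameter(torch.randn(5))
base = Independent(
    StudentT(df=log_df.exp(), loc=torch.zeros(5), scale=torch.ones(5)), 1
)

x = torch.randn(16, 5)           # stand-in batch of samples
loss = -base.log_prob(x).mean()  # negative log-likelihood
loss.backward()

assert log_df.grad is not None          # gradients flow through df
assert tuple(log_df.grad.shape) == (5,) # one gradient entry per feature
```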