Robust loss functions for outlier rejection #332
Not at the moment but it looks like a relatively simple modification of RLM to support it. The rescaling described in http://ceres-solver.org/nnls_modeling.html#theory looks like it would work for RLM too.
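For illustration, a minimal Euclidean sketch of that rescaling (the names `robustify` and `dρ` are not Manopt.jl API, and the second-order Triggs correction is ignored):

```julia
# For a cost ρ(‖F‖²): scaling residual and Jacobian by sqrt(ρ′(‖F‖²))
# reproduces the robustified Gauss-Newton/LM normal equations
# J̃ᵀJ̃ Δ = -J̃ᵀF̃ when the second derivative ρ″ is neglected.
function robustify(F::Vector, J::Matrix, dρ)
    w = sqrt(dρ(sum(abs2, F)))  # ρ′ at the squared residual norm
    return w .* F, w .* J       # rescaled residual and Jacobian
end

# Example with the soft-L1 loss ρ(s) = 2(√(1+s) - 1), i.e. ρ′(s) = 1/√(1+s):
Fr, Jr = robustify([3.0, 4.0], [1.0 0.0; 0.0 1.0], s -> 1 / sqrt(1 + s))
```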
@kellertuer do you think it would be better to add robustification support to the current implementation of RLM or make a separate implementation?
In the Exact Penalty Method (EPM) the smoothing is included in the method itself, so I think that would be fitting here as well. We even have types already for different types of relaxations (e.g. Huber). If we extend those and use them in RLM as well, I think that would be great.

edit: to provide a link, we currently have https://manoptjl.org/stable/solvers/exact_penalty_method/#Manopt.SmoothingTechnique – those could either be extended or combined into a common framework of robustification / smoothing.
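For context, a minimal sketch of such a relaxation as plain functions (Huber in the form the Ceres docs use, acting on the squared residual; the threshold δ is illustrative, and these are not the existing Manopt.jl types):

```julia
# Huber loss on the squared residual s with threshold δ: identical to least
# squares for small residuals, only linear growth in ‖r‖ for large ones.
huber(s; δ = 1.0)       = s <= δ^2 ? s : 2δ * sqrt(s) - δ^2
huber_prime(s; δ = 1.0) = s <= δ^2 ? 1.0 : δ / sqrt(s)
```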
Cool 👍. RLM seems to need a somewhat different interface though.
Sure, no problem – maybe we can also revise the interface with EPM a bit to have a common one then.
I revisited this and for the most general case (outside of RLM), I have no real idea how that could be done or what that would even mean for an algorithm like Douglas-Rachford, for example. Within RLM, one could really just store the robust loss function in the objective.

Ah, I am not so sure we need the second derivative for now? We only use first-order information in RLM for now, I think. Then besides that field only the first derivative would be needed.
I don't think it can work for any loss function, only those without splitting: nonlinear least squares and most likely stochastic optimization, though I couldn't find any papers about what the Euclidean variant would be. Maybe let's tackle each case separately?
As far as I can tell, storing those in the objective should work.

In an earlier post you had the idea of combining robust loss functions with the smoothing techniques from EPM – aren't those essentially the same thing?
You are right, in full generality that might not be possible, so let's just do it for RLM for now.

Yes, storing the loss function and its derivative should work.

Yes, the smoothing techniques are basically the same, though there we handle that with storing a symbol (and only support 2 functions). I am not yet sure how well this can be combined, but it would most probably be the EPM being adapted to the mode here (since their symbol-approach is far more restrictive than storing the smoothing function).
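To make the “storing” idea concrete, a purely hypothetical sketch (none of these names exist in Manopt.jl) of an objective that carries the smoothing function and its derivative next to the vectorial cost:

```julia
# Hypothetical wrapper: a least-squares objective together with a robust
# loss ρ (acting on the squared residual norm) and its first derivative ρ′;
# RLM could consult these when assembling the rescaled subproblem.
struct RobustifiedObjective{O,R,DR}
    objective::O  # e.g. a NonlinearLeastSquaresObjective-like object
    ρ::R          # smoothing / robust loss function
    dρ::DR        # its first derivative
end
```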
Hi, one thing I am not yet so sure about is whether we need the second derivative in our tangent-space-Jacobian-model subproblem; I will have to think about that. Maybe we do not. For now I think, since we solve the subproblem (2.1) from https://arxiv.org/pdf/2210.00253, this fixes a Jacobian and does not involve a derivative of the Jacobian. In the Theory section of the Ceres docs that would just be the gradient (since they consider a single function) and then they introduce an approximate Hessian. Our approach, instead, basically approximates the Hessian implicitly, and hence does not require the second derivative.

For now (yesterday, that is) I just worked a bit on generalising the vector functions we have to be able to do LM with them. Then the chain rule with the smoothing should not be that complicated. Will open a PR once this seems to work or once I am stuck somewhere.
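A tiny Euclidean sketch of that chain rule, with hypothetical names, just to record the formula:

```julia
# Chain rule for the robustified cost c(p) = ρ(‖F(p)‖²):
#   grad c(p) = 2 ρ′(‖F(p)‖²) Jᵀ F(p)
# Only ρ′ enters; no derivative of the Jacobian (and no ρ″) shows up here.
robust_gradient(F::Vector, J::Matrix, dρ) = 2 * dρ(sum(abs2, F)) * (J' * F)
```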
Scaling by √ρ′, as in the Ceres docs, should be enough for a first version.
I think Triggs' paper is a bit more clear than the Ceres docs when it comes to the use of the Hessian. It won't modify the structure of the subproblem (2.1) but it will modify what we take as the residual and the Jacobian there.
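For reference (a sketch, not how Manopt.jl currently models the subproblem): in the Euclidean setting, the Ceres theory section, following Triggs et al., uses for the cost $E(x) = \tfrac{1}{2}\rho(\lVert f(x)\rVert^2)$ the gradient and Gauss–Newton-style approximate Hessian

$$
g(x) = \rho'\, J(x)^\top f(x), \qquad
H(x) \approx J(x)^\top \bigl(\rho'\, I + 2\rho''\, f(x) f(x)^\top\bigr) J(x),
$$

so $\rho''$ only enters the approximate Hessian, never the gradient.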
Sure, I can do the scaling by √ρ′.

And yes, Triggs' paper is my next go-to. Hopefully I understand it a bit better then. (And I will call the smoothing function ρ here as well.)
For me, Triggs stays as vague as the Ceres docs at the end of Sec. 4.3 (Eq. 10 and the following text), especially when it comes to understanding how those formulas are derived.
The derivations were left as an exercise to the reader 😃. Some derivations are standard for Gauss-Newton, so you can look for that if you want to find a more detailed description. I can check it if you have trouble figuring out some part.
Yeah, and the references are left to the reader's imagination as well. I think my main “mismatch” currently is that we use JᵀJ (while Triggs has some other J) and Triggs uses the Hessian, which we do not use. So to some extent the methods are quite different, at least in derivation. And I am too lazy (or stupid) to derive the method with the Riemannian Hessian via the chain rule.

So I do understand the derivation of Triggs – sure. Just that our model approximation in the tangent space seems to be a different one – and I am not able to match both.
I guess we don't need the Riemannian Hessian, just as the Riemannian LM paper doesn't use one, though then we have to derive the corrections from scratch. And given that Ceres also works with box constraints, this starts to look like material for a slightly larger paper 😉. That's something I'd like to work on but first we need to finish the Kalman filter one.
That is what I mean, let's first do the Kalman project ;) I worked a bit on the code today, and I think I am nearly done with a rework that still uses a slightly wrong lambda_k, but the rest should soon (tm) be done.
While #432 did rework LM to work much more nicely with vectorial functions, and hence allows more ways to provide first-order information, it will not resolve this issue.
So this issue had the nice side effect of some restructuring. But to answer the original question: no, there does not yet exist enough theory to use robust loss functions in Manopt.jl. If it is ok, I would prefer to close this issue, since without the theory there is no real perspective on this being doable.
Is it possible to use loss functions such as Huber, Tukey, and adaptive losses in Manopt.jl (specifically RLM)?
Similar to the Ceres solver: http://ceres-solver.org/nnls_modeling.html#lossfunction
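For reference, a sketch of the Tukey (biweight) loss in the same style (acting on the squared residual, with an illustrative threshold b; not existing Manopt.jl functionality):

```julia
# Tukey biweight loss on the squared residual s with threshold b: residuals
# beyond b contribute a constant cost (zero gradient), so gross outliers
# stop influencing the fit entirely.
tukey(s; b = 1.0) = s <= b^2 ? (b^2 / 3) * (1 - (1 - s / b^2)^3) : b^2 / 3
```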