add PD3O with tests #1834
Conversation
Thanks @epapoutsellis, this looks great! We're not far from releasing 24.1, but we'll be planning the next release in a couple of weeks and we can add this to the workplan.
Thanks @epapoutsellis! It looks good. I added a few todos for myself around testing and documentation. Feel free to answer them if you have time before I do, but I will aim to look again around the beginning of July.
Hi @epapoutsellis - I am going through this in detail and have a few questions:
Actually, playing with this further, I considered:

```python
def test_pd3o_vs_fista(self):
    alpha = 0.1
    G = alpha * TotalVariation(max_iteration=5, lower=0)
    F = 0.5 * L2NormSquared(b=self.data)
    algo = FISTA(f=F, g=G, initial=0*self.data)
    algo.run(200)

    F1 = 0.5 * L2NormSquared(b=self.data)
    H1 = alpha * MixedL21Norm()
    operator = GradientOperator(self.data.geometry)
    G1 = IndicatorBox(lower=0)
    norm_op = operator.norm()

    gamma = 2. / F.L                    # proposed default step sizes,
    delta = 1. / (gamma * norm_op**2)   # see discussion below
    algo_pd3o = PD3O(f=F1, g=G1, h=H1, operator=operator,
                     initial=0*self.data, gamma=gamma, delta=delta)
    algo_pd3o.run(500)

    np.testing.assert_allclose(algo.solution.as_array(), algo_pd3o.solution.as_array(), atol=1e-2)
    np.testing.assert_allclose(algo.objective[-1], algo_pd3o.objective[-1], atol=1e-2)
```

With the proposed default step sizes, `gamma = 2./F.L` and `delta = 1./(gamma*norm_op**2)`, PD3O does not converge to the correct solution; however, with `gamma = 0.99*2./F.L` and `delta = 1./(gamma*norm_op**2)` it does converge. Any thoughts?
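(If I recall the convergence conditions in the paper correctly, they are roughly

$$\gamma < \frac{2}{L_f}, \qquad \gamma\,\delta\,\lambda_{\max}(AA^\top) \le 1,$$

with $L_f$ the Lipschitz constant of $\nabla f$ and $\lambda_{\max}(AA^\top) = \|A\|^2$. If the first inequality is indeed strict, then `gamma = 2./F.L` sits exactly on the boundary, which would explain why the 0.99 safety factor is needed. Please double-check this against the paper.)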
For the step size, the "safety factor" of 0.99 sounds sensible. For your question 1, on AA^T having gone missing: maybe I misunderstand, but I read the implemented code as based on equation 4 (the code comment also states this), which is reformulated from equation 3, and the AA^T appears only in eq. 3. I'm not sure whether this affects whether AA^T should be included in the step-size bound as well.
Eq. (4a) contains the gradient of l^*. I don't see that in the code?
Aaah - I was looking at the arXiv version of the paper, which had a different eq. (4). That makes more sense.
For reference, the link to the published version: https://doi.org/10.1007/s10915-018-0680-3. In the lines just below eq. (4) it says l^* is considered equal to zero, so perhaps this implementation assumes l^* = 0. @epapoutsellis, can you confirm?
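For concreteness, here is my transcription of the eq. (4) iterations with the ∇l^* term dropped (i.e. assuming l^* ≡ 0); please check it against the published version:

$$
\begin{aligned}
x^{k+1} &= \operatorname{prox}_{\gamma g}\!\left(x^{k} - \gamma \nabla f(x^{k}) - \gamma A^{\top} s^{k}\right),\\
s^{k+1} &= \operatorname{prox}_{\delta h^{*}}\!\left(s^{k} + \delta A\!\left(2x^{k+1} - x^{k} + \gamma \nabla f(x^{k}) - \gamma \nabla f(x^{k+1})\right)\right).
\end{aligned}
$$

Setting f = 0 makes the gradient terms vanish and the iteration reduces to PDHG, which is consistent with the special-case discussion below.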
Re. question 3, on whether to warn: I think it makes sense to allow the algorithm to run with a ZeroFunction being passed, and I think that is also consistent with allowing this in PDHG and FISTA. Have you checked whether the algorithm behaves sensibly with a ZeroFunction passed, including converging to the expected result? I think having a warning recommending the user switch to PDHG is okay, but the algorithm should be allowed to run. If it runs and behaves as expected I'd also be okay with omitting the warning.
I think that makes sense, if …
Yes, one of the tests compares the behaviour of PDHG and PD3O with a zero function. I think in PDHG we have the strong-convexity additions, so it is probably better for the user to use that over PD3O.
OK great, then I suggest keeping the warning but allowing users to run PD3O with ZeroFunction, so that they can deliberately create the special-case algorithm and compare with our PDHG implementation, and having the warning recommend the user switch to PDHG for efficiency reasons.
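For reference, the comparison I have in mind looks something like this (a minimal sketch, not the actual test in the PR; `data` is assumed to be an ImageData as in the test above, and the step sizes are left at their defaults):

```python
import numpy as np
from cil.optimisation.algorithms import PDHG, PD3O
from cil.optimisation.functions import ZeroFunction, L2NormSquared, MixedL21Norm
from cil.optimisation.operators import GradientOperator

operator = GradientOperator(data.geometry)
g = 0.5 * L2NormSquared(b=data)
h = MixedL21Norm()

# PD3O with f = 0: the gradient terms drop out of the update
pd3o = PD3O(f=ZeroFunction(), g=g, h=h, operator=operator, initial=0*data)
pd3o.run(500)

# The equivalent PDHG problem: min_x h(Kx) + g(x)
pdhg = PDHG(f=h, g=g, operator=operator, initial=0*data)
pdhg.run(500)

np.testing.assert_allclose(pdhg.solution.as_array(), pd3o.solution.as_array(), atol=1e-2)
```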
BTW, a few places appear to use the number 0 instead of the letter O, i.e. "PD30" instead of "PD3O" - please search and replace.
Yes, I use the equations below from https://doi.org/10.1007/s10915-018-0680-3
Thanks Vaggelis, I think I have done this.
I have set the same defaults as PDHG when a ZeroFunction is passed.
@epapoutsellis @jakobsj - please can I revive this PR? Do you have any more comments?
Took a brief look at the updates and everything looks good to me. Exciting to have PD3O added - thanks!
This could be more memory efficient with some simple changes to the order of operations in `update`. This will impact what needs to be in `set_up` and `update_objective` too.
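For illustration, the kind of reordering I mean (a hedged sketch only, assuming CIL's usual `out=` in-place conventions; the attribute names `self.grad_f` and `self.tmp` are placeholders, not necessarily what the PR uses):

```python
def update(self):
    # x-step: x = prox_{gamma*g}( x_old - gamma*(grad f(x_old) + A^T y) )
    self.f.gradient(self.x_old, out=self.grad_f)   # grad f(x_old), preallocated in set_up
    self.operator.adjoint(self.y, out=self.tmp)    # A^T y, reusing one temporary
    # tmp <- -gamma*tmp - gamma*grad_f, computed in place via sapyb
    self.tmp.sapyb(-self.gamma, self.grad_f, -self.gamma, out=self.tmp)
    self.tmp += self.x_old
    self.g.proximal(self.tmp, self.gamma, out=self.x)
    # the dual (s-) step can reuse the same temporaries in the same way
```

Anything preallocated here (e.g. `self.grad_f`, `self.tmp`) would then need to be created in `set_up`, and `update_objective` adjusted to use the stored quantities rather than recomputing them.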
Description
Adding the PD3O algorithm with unit tests.
Implementation based on *A new primal-dual algorithm for minimizing the sum of three functions with a linear operator*. With this algorithm we get two additional algorithms for free: PAPC and Davis-Yin (sometimes called PDDY). PDHG is a special case. We can also obtain Condat-Vu and AFBA; see details in the above paper. Finally, all of the above algorithms can be combined with stochastic (variance-reduced) algorithms.
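For context, the three-function problem PD3O targets (my notation, following the paper's title):

$$\min_{x}\; f(x) + g(x) + h(Ax),$$

where f is smooth with Lipschitz gradient, g and h are proximable, and A is a linear operator. Setting f = 0 recovers the PDHG problem, and, as I understand the paper, taking A to be the identity recovers Davis-Yin splitting.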
Below are some results, presented two years ago (at the Workshop on modern image reconstruction algorithms and practices for medical imaging), using the stochastic framework just before the 1/n weight convention was removed.
Related issues/links
Closes #1890
Checklist
Contribution Notes
Please read and adhere to the developer guide and local patterns and conventions.