Add differentiable coverage probability regularizer #15
Conversation
The branch was force-pushed from 31247d8 to 9c472ec.
@francois-rozet please have a look at this one :) I added all the functionality expected in this PR; however, I can imagine you would want some parts implemented or organized differently, so I am keeping it as a draft. Maybe we should arrange a call to discuss it? As already discussed, the new dependency …
Hello @macio232, I am looking at this right now. I read the paper more carefully to understand the method. Is it expected that the formula written in the docstrings does not correspond to Eq. (12) in your paper? I also don't understand where the …

In addition, could we drop the need for the sort? The rank … This is actually mentioned in Section 3.2. Did you try this, or did you just go for the sort? Also, did you try a soft step function (e.g. a sigmoid) instead of the STE?

Anyway, if you want to keep the sort as in your paper, what I propose is to only import torchsort in the losses and refer to their repository for installation. This way, torchsort would not be a dependency of LAMPE.
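For context, the sort-free alternative alluded to here could look something like the following sketch. This is only an illustration of the counting idea combined with a sigmoid relaxation, not code from the PR, and all names are made up:

```python
import torch

def soft_coverage_indicator(log_p_star, log_p_samples, temperature=0.1):
    """Differentiable stand-in for the rank-based coverage indicator.

    Instead of sorting, count the posterior samples whose log-density
    exceeds that of the nominal parameter, relaxing the hard comparison
    with a sigmoid so that gradients can flow.

    log_p_star:    (batch,)   log q(theta* | x)
    log_p_samples: (batch, m) log q(theta_i | x), with theta_i ~ q(. | x)
    """
    diff = log_p_samples - log_p_star.unsqueeze(-1)
    # Hard version: (diff > 0).float().mean(-1), i.e. the rank as a fraction.
    return torch.sigmoid(diff / temperature).mean(dim=-1)
```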
I just pushed a commit that moves the inference components into different files, which should make the code easier to read, extend, and maintain. I think you can move your two losses into a new file (e.g. …).
It depends on your preference :) What I put in the docstring is meant to describe the idea behind the method (corresponding to Eq. (7) in the paper), while Eq. (12) describes precisely how this idea is implemented. I can modify the docstrings so that they match Eq. (12), but this requires introducing additional symbols, and probably some explanation. Do you prefer the short but more abstract formulation, or the precise but long one?
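For readers without the paper at hand, the abstract formulation being discussed is, roughly, the expected coverage calibration condition. The notation below is generic, not lifted from the paper:

```latex
% Expected coverage of the (1 - \alpha) highest-density credible region
\mathrm{ECP}(1 - \alpha)
  = \mathbb{E}_{(\theta^*, x) \sim p(\theta, x)}
    \left[ \mathbb{1}\!\left( \theta^* \in \Theta_{1-\alpha}(x) \right) \right]

% A calibrated (neither over- nor under-confident) estimator satisfies
\mathrm{ECP}(1 - \alpha) = 1 - \alpha \quad \text{for all } \alpha \in (0, 1)
```

The regularizer penalizes deviations from this identity, with Eq. (12) being its concrete, differentiable estimate.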
Short answer: I tried it, and it didn't work well.
Not in the full experiments, but I tried it as a proof of concept and it didn't work well. I can imagine that under certain circumstances it could work, though. In fact, a recent arXiv preprint does exactly this, if I understood correctly.
Sounds good. Where would you like the information about the additional requirement (torchsort) placed? Docstrings? The README in the repo? Another direction is to try to replace torchsort with diffsort, which is easier to set up. But that method is based on sorting networks, which means it is slower, and I have no idea about the "quality" of the gradients it gives. At some point, I will try to evaluate it empirically.
As you see, the idea of using the calibration (conservativeness) error as a regularizer can be implemented in many ways. The idea of this PR is to implement what I described in the paper; I hope to see people exploring the alternatives and sharing them in the library. However, there is a question of how you would like to maintain it. Let me briefly discuss version a) using the example of "sorting-based" vs "direct". Once we go for direct, there is the question of the levels at which the loss should be computed. I see at least four options immediately (and combinations of them): …
Btw, do you have an idea what to call the two loss classes? As a placeholder, I used … The idea of the method is to control the (Expected) Coverage Probability, so how about …?
As you prefer. The docstring can be as long (see NRE) or as short (see FMPE) as you want; it is just a means to give some context. By the way, with the new file organization, you can have a long explanation in the module docstring and a short one in the class docstrings (see NRE). Do whatever you think is better.
Preferably in the module docstring. You can even write a little "installation" section like

Installation
------------

blabla `torchsort` blabla

.. code-block:: bash

    $ pip install torchsort
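Concretely, the import-in-the-loss idea could look like this minimal sketch; the class name is a placeholder (see the naming discussion above), and only the pattern matters:

```python
class DCPLoss:  # placeholder name, not the PR's final class
    def __init__(self, estimator):
        try:
            # Deferred import: LAMPE itself does not depend on torchsort;
            # only instantiating this loss requires it.
            import torchsort
        except ImportError as e:
            raise ImportError(
                "DCPLoss requires torchsort. See "
                "https://github.com/teddykoker/torchsort for installation."
            ) from e

        self.soft_sort = torchsort.soft_sort
        self.estimator = estimator
```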
I agree, sticking with torchsort for now is probably the most sensible way. Discussion about improvements can come later.
It depends on the differences, I guess. If a flag that switches between the two behaviors is enough, the same class is fine. If the code is completely different, I would say two classes. However, don't make your class significantly more complex just to be "future proof". Let's go for something simple for now.
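As an illustration of the flag option (class, argument, and method names here are all hypothetical):

```python
class CoverageLoss:
    """One class, two behaviors, switched by a constructor flag."""

    def __init__(self, estimator, method: str = 'sort'):
        assert method in ('sort', 'count'), "unknown method"
        self.estimator = estimator
        self.method = method

    def __call__(self, theta, x):
        if self.method == 'sort':
            ...  # sorting-based estimate of the coverage error
        else:
            ...  # direct, counting-based estimate
```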
Maybe …
@francois-rozet I applied all the modifications that you suggested.
```
❯ sphinx-build . html
Running Sphinx v7.2.6

Configuration error:
There is a programmable error in your configuration file:

Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/sphinx/config.py", line 358, in eval_config_file
    exec(code, namespace)  # NoQA: S102
    ^^^^^^^^^^^^^^^^^^^^^
  File "<...>/lampe/docs/conf.py", line 6, in <module>
    import lampe
ModuleNotFoundError: No module named 'lampe'
```

despite being in an environment where `lampe` is installed.
Did you install the docs dependencies? Your env has Sphinx 7.2.6, which is not what we currently use.
Also, what is the version of Python in your env? I usually stick to 3.9, because there are still issues with 3.11.
In this PR.
I think a single tutorial (either NRE or NPE + DCP) is enough. For the explanations, the level of the FMPE tutorial should suffice.
By the way, for the tests, I think it will be necessary to skip testing …
I thought so, but that was not the case. It works fine now.
Indeed. The tests don't use backprop, so we could fall back to torch's default sorting. What do you think?
The tests should try to back-prop through the loss. I think using …
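A minimal sketch of such a test; the loss below is a stand-in, and the real test would instantiate the estimator and the DCP loss instead:

```python
import torch

def test_backprop_through_loss():
    # Stand-in for `loss = DCPLoss(...); l = loss(theta, x)`.
    params = torch.randn(4, requires_grad=True)
    l = (params ** 2).sum()  # any differentiable scalar loss

    l.backward()  # the point of the test: backprop must succeed

    assert params.grad is not None
    assert torch.isfinite(params.grad).all()
```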
Long week it has been :) Do you want me to prepare the notebook(s) before your review?
Ah sorry @macio232, my comment was not clear: I was waiting for you to finish the PR before reviewing it. Do you want me to do a first pass before you add the tests and tutorials?
… module's docstring
Hello @macio232, I looked at your PR and it seems good to me. Thank you for including the installation instructions. I have rebased the PR onto the master branch. Some tests were failing and I fixed them, but there is also a bug on the master branch due to the release of NumPy 2.0... I'll fix that bug, then merge your PR.
I tried the DCP losses by replacing the normal losses (e.g. …) … I will merge this PR now.
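For reference, the kind of drop-in replacement described above would look roughly like this. The DCP class name is assumed, since the comment is truncated; `NPE`/`NPELoss` follow LAMPE's documented API:

```python
import torch
from lampe.inference import NPE, NPELoss

estimator = NPE(3, 5)       # 3 parameters, 5 observables
loss = NPELoss(estimator)   # baseline loss
# loss = DCPNPELoss(estimator)  # hypothetical name for the DCP variant

optimizer = torch.optim.Adam(estimator.parameters(), lr=1e-3)

theta, x = torch.randn(256, 3), torch.randn(256, 5)  # toy training batch

optimizer.zero_grad()
loss(theta, x).backward()
optimizer.step()
```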
Resolve #14