
Explicit search reimplementation #144

Open
albertz opened this issue Apr 29, 2022 · 3 comments

Comments


albertz commented Apr 29, 2022

I'm considering rewriting the whole search concept on the returnn-common side.

I think this is actually not too difficult.

Every function should treat unhandled dims just like the batch dim. So having an additional dim for the beam should be no problem at all, and we don't need to merge the beam into the batch dim. This simplifies many things and also makes it much cleaner. If there is any function which does not behave this way, we should fix it in any case, independent of this proposal. This is a fundamental concept of the building blocks of returnn-common.

We now have the very generic top_k function (#140, #143) which can operate over multiple axes together.

So consider an input [batch, beam, classes]: top_k can operate on [beam, classes]. It then returns two indices, one for the (source) beam and one for the classes.

The output is of shape [batch, k]. We can set k as the new beam.
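The idea can be sketched in plain NumPy. This is not the actual returnn-common top_k API, just an illustration (the function name and the flatten/argsort approach are mine) of how a top-k over the joint [beam, classes] axes yields both a source-beam index and a class index of shape [batch, k]:

```python
import numpy as np

# Illustrative sketch (not the returnn-common API): top-k over the joint
# (beam, classes) axes of log-prob scores shaped [batch, beam, classes].
def top_k_over_beam_and_classes(scores, k):
    batch, beam, classes = scores.shape
    flat = scores.reshape(batch, beam * classes)       # [batch, beam*classes]
    flat_idx = np.argsort(-flat, axis=1)[:, :k]        # k best joint entries per batch
    top_scores = np.take_along_axis(flat, flat_idx, axis=1)
    src_beam = flat_idx // classes                     # which source beam each hyp came from
    labels = flat_idx % classes                        # which class was chosen
    return top_scores, src_beam, labels                # all of shape [batch, k]

# Example: batch=2, old beam=3, classes=5; k=4 becomes the new beam dim.
scores = np.random.default_rng(0).standard_normal((2, 3, 5))
top, src_beam, labels = top_k_over_beam_and_classes(scores, k=4)
```

The two returned index tensors are exactly what the later steps need: labels extends the hypotheses, and src_beam tells which old-beam entries survived.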

Combining [batch, old_beam, dim] + [batch, new_beam, dim] would not directly work like it does in RETURNN. But we can introduce an explicit function (translate_beam or so, analogous to how it is done in RETURNN), and then the user needs to take care of this explicitly.
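Such an explicit function is essentially a gather along the old beam dim. A minimal NumPy sketch, assuming the hypothetical name translate_beam and the src_beam indices from top_k (again illustrative, not the RETURNN implementation):

```python
import numpy as np

# Illustrative sketch: reorder a tensor that still carries the old beam dim
# so it aligns with the new beam, using the source-beam indices from top_k.
def translate_beam(x, src_beam):
    # x: [batch, old_beam, dim], src_beam: [batch, new_beam] -> [batch, new_beam, dim]
    batch_idx = np.arange(x.shape[0])[:, None]  # [batch, 1], broadcasts against src_beam
    return x[batch_idx, src_beam]

x = np.arange(24).reshape(2, 3, 4)              # [batch=2, old_beam=3, dim=4]
src_beam = np.array([[2, 0], [1, 1]])           # [batch=2, new_beam=2]
y = translate_beam(x, src_beam)                 # [batch=2, new_beam=2, dim=4]
```

After this gather, both tensors share the new beam dim and can be combined as usual.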

If this is inside a loop, you probably later want to trace back through the beam indices to construct the k output sequences. This needs some further manual, explicit logic.
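That manual traceback logic could look roughly like the following NumPy sketch (function name and data layout are my assumptions): given the per-step chosen labels and source-beam indices collected over the loop, follow the backpointers from the last step backwards to recover the k sequences.

```python
import numpy as np

# Illustrative sketch of the beam traceback: labels_per_step and
# src_beam_per_step are lists of T arrays, each of shape [batch, beam].
def beam_traceback(labels_per_step, src_beam_per_step):
    T = len(labels_per_step)
    batch, beam = labels_per_step[0].shape
    batch_idx = np.arange(batch)[:, None]                       # [batch, 1]
    beam_idx = np.arange(beam)[None, :].repeat(batch, axis=0)   # [batch, beam]
    out = []
    for t in reversed(range(T)):
        out.append(labels_per_step[t][batch_idx, beam_idx])     # label chosen at step t
        beam_idx = src_beam_per_step[t][batch_idx, beam_idx]    # follow backpointer
    return np.stack(out[::-1], axis=-1)                         # [batch, beam, T]

# Tiny example: batch=1, beam=2, T=2. At the last step, hyp 0 came from beam 1.
labels_per_step = [np.array([[1, 2]]), np.array([[3, 4]])]
src_beam_per_step = [np.array([[0, 0]]), np.array([[1, 0]])]
seqs = beam_traceback(labels_per_step, src_beam_per_step)
```

In the example, final hypothesis 0 chose label 3 and points back to beam 1 of the previous step, so its sequence is [2, 3].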


albertz commented Apr 29, 2022

@patrick-wilken @michelwi any thoughts?

Related is rwth-i6/returnn#714 but this here would provide even more flexibility.

@patrick-wilken

I don't really consider myself a reviewer for returnn_common as I have never used it so far. The RETURNN search implementation works fine for us, so no immediate need to implement something else.
Regarding rwth-i6/returnn#714, I'm sorry, but this just hasn't had high priority for me. If others need this too, I'm happy to start working on it.


albertz commented May 6, 2022

returnn-common is work-in-progress. You should consider yourself a reviewer, as probably many upcoming setups will make use of it: it simplifies sharing setups and building blocks among us, and also simplifies writing models, or any other code. That is the main motivation behind returnn-common, that we can more easily share things. So it does not make sense to ignore it now.

rwth-i6/returnn#714 probably also becomes obsolete if we follow the proposal here.

But if you say you currently don't need more flexibility in the search, then this is not so important for you. But I thought that you actually do need more flexibility. You started rwth-i6/returnn#714 and rwth-i6/returnn#649 about this.
