ValueError: tuple.index(x): x not in tuple #14

Open
saadnaeem-dev opened this issue Oct 17, 2023 · 6 comments

Comments

saadnaeem-dev commented Oct 17, 2023

Error in decoding.py

This line in decoding.py is causing the issue:

self.sot_index: int = self.initial_tokens.index(tokenizer.sot)

It raises:

ValueError: tuple.index(x): x not in tuple

In the failing case, the values are:

self.initial_tokens
Out[10]: (50257,)
type(self.initial_tokens)
Out[11]: tuple
type(tokenizer.sot)
Out[12]: int
tokenizer.sot
Out[13]: 50333

Since tokenizer.sot (50333) is not in the tuple (50257,), we get ValueError: tuple.index(x): x not in tuple
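
For anyone who wants to reproduce the failure without loading a model, here is a minimal sketch using only the two values printed in the failing session. The `try_index` helper is hypothetical and exists only for illustration; it is not part of whisper.

```python
# Values copied from the failing session above.
initial_tokens = (50257,)
sot = 50333


def try_index(tokens, token):
    """Return the position of token in tokens, or None if it is absent."""
    try:
        return tokens.index(token)
    except ValueError:
        # tuple.index raises ValueError when the value is missing.
        return None


failing = try_index(initial_tokens, sot)    # sot is not in the tuple
working = try_index(initial_tokens, 50257)  # the value the tuple does hold
print(failing, working)
```

The first lookup is the one that fails in decoding.py; the second shows that the same tuple is fine when the start-of-transcript id matches.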

In the working case (using the latest openai-whisper), the values are:

self.initial_tokens
Out[5]: (50257,)
type(self.initial_tokens)
Out[6]: tuple
tokenizer.sot
Out[7]: 50257
type(tokenizer.sot)
Out[8]: int

This works because 50257 is in the tuple (50257,), so we can get its index.
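
Until the root cause is known, one option might be to wrap the lookup so that a mismatch reports both values instead of the bare tuple error. This is only a sketch: `find_sot_index` is a hypothetical helper, not something that exists in whisper, and it does not fix the mismatch itself.

```python
from typing import Sequence


def find_sot_index(initial_tokens: Sequence[int], sot: int) -> int:
    """Hypothetical helper: locate sot, or fail with a descriptive message."""
    if sot not in initial_tokens:
        raise ValueError(
            f"sot token {sot} not found in initial_tokens {tuple(initial_tokens)}; "
            "the tokenizer and the decoding code may disagree on token ids"
        )
    return initial_tokens.index(sot)


# Working case from above: the id is present, so the index is found.
print(find_sot_index((50257,), 50257))

# Failing case from above: the error now names both values.
try:
    find_sot_index((50257,), 50333)
except ValueError as exc:
    print(exc)
```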

@vidalfer

same here

@marcoyang1998

same here

@XuJingye2022

same here

@cglackin

Same here. Has anyone found a workaround yet?

constan1 commented Dec 15, 2023

I found a minor workaround, but I don't know exactly why it works. It can't find the 50334 token, but I don't see where that token is being set. If anyone knows, please let me know.

Main.py file
image

And in decoding.py:
image

This seems to work with batches of 3 per GPU, but not more. The original code below gives the error in the OP.

image

image
for

image

line 487 decoder.py

It saves the [50258] token in the sot_sequence from get_tokenizer. But then [50334] comes out of nowhere, and it can't be found in the original list.

image
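
To compare the two ids side by side, a small diagnostic along these lines might help. It is a sketch that assumes only that the tokenizer object exposes `sot` and `sot_sequence` attributes, as mentioned in this thread; `describe_sot` and the stand-in object are hypothetical, and the ids are the ones quoted above.

```python
from types import SimpleNamespace


def describe_sot(tokenizer):
    """Hypothetical diagnostic: report sot, sot_sequence, and whether they agree."""
    sot = tokenizer.sot
    sot_sequence = tuple(tokenizer.sot_sequence)
    return {
        "sot": sot,
        "sot_sequence": sot_sequence,
        "sot_in_sequence": sot in sot_sequence,
    }


# Stand-in object carrying the ids quoted above; not a real tokenizer.
fake_tokenizer = SimpleNamespace(sot=50334, sot_sequence=(50258,))
report = describe_sot(fake_tokenizer)
print(report)
```

Running the same function on the real tokenizer right before the failing line would show whether the mismatch is already present when the tokenizer is created.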

@mohith7548

any fix for this?


7 participants