
Initializing via AutoImageProcessor before AutoProcessor is imported causes AttributeError #34307

Open · 1 of 4 tasks
alex-jw-brooks opened this issue Oct 22, 2024 · 15 comments · May be fixed by #36184
Labels: bug, WIP

Comments

@alex-jw-brooks (Contributor) commented Oct 22, 2024

System Info

  • transformers version: 4.45.2
  • Platform: Linux-4.18.0-513.18.1.el8_9.x86_64-x86_64-with-glibc2.28
  • Python version: 3.10.4
  • Huggingface_hub version: 0.24.5
  • Safetensors version: 0.4.4
  • Accelerate version: 0.33.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.0+cu121 (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: No

Who can help?

Probably @zucchini-nlp, @amyeroberts, and @qubvel

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

There seems to be an edge case in the loading behavior that can be hit if something is initialized with from_pretrained through AutoImageProcessor before AutoProcessor is imported, and from_pretrained is then called on AutoProcessor.

Repro case: this works as expected

model_name = "microsoft/Phi-3.5-vision-instruct"
from transformers import AutoImageProcessor
from transformers import AutoProcessor
AutoImageProcessor.from_pretrained(model_name, trust_remote_code=True)
AutoProcessor.from_pretrained(model_name, trust_remote_code=True)

But this breaks:

model_name = "microsoft/Phi-3.5-vision-instruct"
from transformers import AutoImageProcessor
AutoImageProcessor.from_pretrained(model_name, trust_remote_code=True)

from transformers import AutoProcessor
AutoProcessor.from_pretrained(model_name, trust_remote_code=True)  # raises AttributeError

with AttributeError: module transformers has no attribute Phi3VImageProcessor.

I've tried to reproduce this with a few other models but haven't been able to yet. It is also worth noting that for this model, AutoProcessor does not return the same class as AutoImageProcessor, which might matter.

Expected behavior

AutoImageProcessor and AutoProcessor from_pretrained should behave the same where possible, regardless of import and invocation order.

@LysandreJik (Member)

Seems like a bug indeed, thanks for the great report! cc @Rocketknight1, can you take a look at this?

@Rocketknight1 (Member)

Confirmed the bug, and thanks for the clean reproduction script! I suspect this is an issue with how we construct the autoclass mappings, I'm working on it now.

@alex-jw-brooks (Contributor, Author)

Awesome, thank you both! 🙂

@Rocketknight1 (Member)

After investigation, the issue seems to be here. Manually setting an attribute of the transformers library like that seems to break other things in our imports, and it interacts with the AutoProcessor import in ways I don't fully understand.
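
For reference, the workaround being described amounts to something like the following (a paraphrased sketch, not the verbatim Phi repo code):

# Paraphrased sketch of the remote-code workaround in the Phi repo.
import transformers
from transformers.image_processing_utils import BaseImageProcessor

class Phi3VImageProcessor(BaseImageProcessor):
    """Custom image processor shipped as remote code (body elided)."""

# Monkey-patch the custom class onto the transformers root so that a later
# getattr(transformers, "Phi3VImageProcessor") lookup succeeds.
transformers.Phi3VImageProcessor = Phi3VImageProcessor

Because this mutates the global state of the transformers module, whether the lookup succeeds depends on what has already been imported, which matches the order-dependence reported above.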

@Rocketknight1 (Member)

Final diagnosis:

  • The fundamental cause is our code here
  • This code grabs all the attributes (i.e., tokenizers, image processors) that belong to a processor class, and gets arguments from each of them
  • However, to get the arguments, it tries to import those classes from the transformers root. As a result, it fails when the image processor class is custom code and not present in the transformers root.
  • The Microsoft team added the line here, obviously to work around this issue. However, this line has weird side effects and breaks if we import classes in the wrong order, as @alex-jw-brooks noticed.
  • To fix this, we should rewrite _get_arguments_from_pretrained so that it can handle custom code classes correctly (see the sketch after this list), and then make a PR to the Microsoft repo to remove the line that assigns the class as a root attribute of transformers.
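
To make that concrete, here is a minimal sketch of the fallback idea; resolve_attribute_class is a hypothetical helper for illustration, not the actual transformers implementation:

# Hypothetical helper illustrating a custom-code-aware class lookup.
import transformers

def resolve_attribute_class(class_name, auto_class):
    """Resolve a processor attribute class by name from the transformers
    root, falling back to the given Auto class when the name refers to
    custom code that is not present there."""
    attribute_class = getattr(transformers, class_name, None)
    if attribute_class is None:
        # Custom-code class: defer to the Auto class, which can dispatch
        # via from_pretrained(..., trust_remote_code=True).
        attribute_class = auto_class
    return attribute_class

With something along these lines, the lookup for "Phi3VImageProcessor" would resolve to AutoImageProcessor instead of raising, and the Hub repo would no longer need to patch the transformers root.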

cc @zucchini-nlp - can you see an obvious fix to that method that would work for custom classes? I can take this if you're overloaded!

@zucchini-nlp (Member)

Hey @Rocketknight1, thanks for investigating this.

The quickest workaround I see is to change this line in the Microsoft repo to be an AutoImageProcessor (sketched below). I just tried it by modifying the code locally, and it works as expected because the Auto class is importable in any case. I'm not sure, though, that forcing an Auto class within our loading code would work in all cases, since some classes can be non-automappable? 🤔
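
Concretely, the suggested change points the processor's attribute class at the Auto class instead of the concrete custom class, along these lines (a sketch following ProcessorMixin conventions; the actual Phi3VProcessor code may differ):

from transformers.processing_utils import ProcessorMixin

class Phi3VProcessor(ProcessorMixin):
    attributes = ["image_processor", "tokenizer"]
    # Before: image_processor_class = "Phi3VImageProcessor" (custom code,
    # not importable from the transformers root).
    # After: the Auto class always exists in the root, so the lookup
    # succeeds regardless of import order.
    image_processor_class = "AutoImageProcessor"
    tokenizer_class = "AutoTokenizer"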

But I agree that we need a better solution here than forcing folks on the Hub to use Auto classes. I can't say right now what the best option would be, and yeah, it would be super cool if you have any ideas for a long-term solution 😄

@Rocketknight1 (Member)

Yeah, that's a really good point! Let me investigate this when I get a chance, maybe I can come up with a (semi-) clean solution.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@DarkLight1337

Any update on this?

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@DarkLight1337

Any update on this?

Bump

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@zucchini-nlp added the WIP label on Jan 11, 2025
@zucchini-nlp (Member)

Adding WIP; I think this needs a fix so that users who upload models to the Hub can get everything working smoothly. @Rocketknight1, am I right that this is on you?

@Rocketknight1 (Member)

Yes, this does need a fix and I just haven't gotten to it yet!

@Rocketknight1 (Member)

The PR at #36184 should resolve the underlying issue, but we might need to update the custom Phi code to stop applying the workaround in order to fully resolve this!
