Skipping cudagraphs for unknown reason #31645
Labels

- Cache
- Compilation: Issues related to torchdynamo and torchinductor
- Feature request: Request for a new feature
- Good Second Issue: Issues that are more difficult to do than "Good First" issues - give it a try if you want!
System Info
transformers version: 4.41.2

Who can help?

@ArthurZucker
Information

Tasks

- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
I read issue #30055 and issue #30351, and Llama works well with `cache_implementation="static"`. However, I am trying to use `torch.compile` with other models such as pythia and phi-2, where `cache_implementation="static"` is not applicable, and it produces errors. Here is my code for reproducing the errors.
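The original reproduction script and error output did not survive extraction. As a stand-in, here is a minimal sketch of the kind of script that triggers the report: compiling a decoder-only model's forward with `torch.compile` in `"reduce-overhead"` mode (which relies on CUDA graphs) and generating with a static cache. The model ID, prompt, and generation settings are illustrative assumptions, not the author's originals.

```python
# Hypothetical reproduction sketch -- NOT the author's original script,
# which was attached to the issue. Model name and generation settings
# are illustrative assumptions.

def build_generation_kwargs(static_cache: bool) -> dict:
    """Generation settings; cache_implementation='static' is the option
    that models like pythia and phi-2 reject per this report."""
    kwargs = {"max_new_tokens": 32, "do_sample": False}
    if static_cache:
        kwargs["cache_implementation"] = "static"
    return kwargs

if __name__ == "__main__":
    # Heavy imports kept inside the guard so the helper above can be
    # read and tested without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "EleutherAI/pythia-1.4b"  # assumed; phi-2 reportedly fails the same way
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="cuda"
    )
    # "reduce-overhead" mode uses CUDA graphs, which is where the
    # "skipping cudagraphs" message in the title originates.
    model.forward = torch.compile(model.forward, mode="reduce-overhead")

    inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
    output = model.generate(**inputs, **build_generation_kwargs(static_cache=True))
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With a model whose cache does not support the static implementation, the `generate` call is where the failure surfaces; with Llama-family models the same script runs and benefits from compilation.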
Expected behavior
Models such as pythia and phi-2 can run with `torch.compile`, and a clear latency improvement can be observed.