Motivation
This might be difficult to implement, but I am facing the following issue:
When running Qwen2-VL on larger images, the preprocessor takes a long time to convert the images to tokens.
It would be awesome to have a way (e.g., extra parameters on the OpenAI-compatible API) to tell the backend to store a request's cache and load it by ID in a later request, so that images (and prompts in general) would not need to be reprocessed on every call.
If my problem can be solved in an easier way, I would be thankful for any input :)
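A minimal sketch of what such a request might look like, assuming hypothetical `store_cache` and `cache_id` extra parameters (neither exists in the current API; the names only illustrate the proposed request shape):

```python
# Sketch of the proposed API extension. `store_cache` and `cache_id`
# are hypothetical fields, not part of any existing OpenAI-compatible API.

def build_request(messages, cache_id=None, store_cache=False):
    """Build an OpenAI-style chat payload with hypothetical cache fields."""
    payload = {
        "model": "Qwen/Qwen2-VL-7B-Instruct",
        "messages": messages,
    }
    if store_cache:
        payload["store_cache"] = True   # ask the backend to keep this request's cache
    if cache_id is not None:
        payload["cache_id"] = cache_id  # reuse a previously stored cache
    return payload

# First call: preprocess the image once and ask the backend to cache the result.
first = build_request(
    [{"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/big.png"}},
        {"type": "text", "text": "Describe this image."},
    ]}],
    store_cache=True,
)

# Follow-up call: reference the stored cache instead of resending the image.
follow_up = build_request(
    [{"role": "user", "content": [{"type": "text", "text": "Now list the objects."}]}],
    cache_id="req-123",
)
```

The second request carries only new text plus a cache ID, so the backend could skip the expensive image-to-token preprocessing entirely.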
Related resources
No response