Add state management and queuing #1

Open
4 tasks
KastanDay opened this issue Oct 11, 2023 · 0 comments

KastanDay commented Oct 11, 2023

State management:

  • Keep track of which model(s) are in memory to support advanced batching (NOT pure FIFO)
  • Prioritization?

Queuing:

  • Inference queue
  • Advanced batching: when the queue contains separate requests for the same model, batch them and run all jobs requesting that model before moving on to the next model. If other jobs are waiting in the queue, cap the time any one model stays in memory at 15-20 minutes. This should balance efficiency (batching) with fairness (FIFO queuing).
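
One possible shape for the batching rule above, as a rough sketch rather than a settled design. The names `Job`, `ModelAwareQueue`, `max_residency_s`, and `next_job` are hypothetical and not taken from this repo. The sketch assumes a single worker with one model in memory at a time, and it leaves out prioritization and the actual model loading.

```python
import itertools
import time
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Optional, Tuple


@dataclass
class Job:
    job_id: str
    model: str  # name of the model this job needs


class ModelAwareQueue:
    """Batch jobs by model, FIFO across models, with a residency cap."""

    def __init__(self, max_residency_s: float = 15 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_residency_s = max_residency_s
        self.clock = clock
        self._seq = itertools.count()  # global arrival order
        self.pending: Dict[str, Deque[Tuple[int, Job]]] = {}
        self.loaded_model: Optional[str] = None  # state: model in memory
        self.loaded_at: float = 0.0

    def submit(self, job: Job) -> None:
        self.pending.setdefault(job.model, deque()).append((next(self._seq), job))

    def next_job(self) -> Optional[Job]:
        if not self.pending:
            return None
        current = self.loaded_model
        has_current = current in self.pending
        others_waiting = any(m != current for m in self.pending)
        expired = self.clock() - self.loaded_at >= self.max_residency_s
        if has_current and not (others_waiting and expired):
            # Keep batching on the model that is already in memory.
            model = current
        else:
            # Switch to the other model whose oldest job arrived first.
            candidates = [m for m in self.pending if m != current]
            model = min(candidates, key=lambda m: self.pending[m][0][0])
            self.loaded_model = model
            self.loaded_at = self.clock()
        _, job = self.pending[model].popleft()
        if not self.pending[model]:
            del self.pending[model]
        return job
```

A worker loop would call `next_job()` repeatedly and load `loaded_model` whenever it changes. The cap only triggers a switch when jobs for another model are waiting, so a lone model is never unloaded just because its time ran out.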