-
Notifications
You must be signed in to change notification settings - Fork 268
Feat/workers per api instance #646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feat/workers per api instance #646
Conversation
- `_resolve_workers_per_device_config` make `resolve_workers_per_device` into dict[api_path, workers_per_device_int] - `_inference_workers_config_for_api` to instantiate workers per device
|
The CI and e2e tests stuck at |
…id` by iterating through APIs since the number of workers can now vary per API
for more information, see https://pre-commit.ci
|
Precommit and tests successful. I hope this PR could get reviewed |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #646 +/- ##
===================================
- Coverage 85% 85% -0%
===================================
Files 39 39
Lines 3212 3261 +49
===================================
+ Hits 2721 2762 +41
- Misses 491 499 +8 🚀 New features to boost your workflow:
|
|
Hmmm, failed on macos. Any suggestion? |
|
request for review. Need approval for pending checks, cc @bhimrazy @aniketmaurya |
|
I've see that all tests (automatic and dispatch) and codecov is passed. Looking to see review about this PR |
|
Weird, it passed before in 3.12. Can elaborate @bhimrazy? |
|
@bhimrazy looks like in main it is passed now, but somehow only fail in ubuntu 3.10 |
yeah, it could be another flaky test 😅 |
|
I actually have no clue here, should I make some change or else? @bhimrazy |
Maybe try triggering the CI again with an empty commit. btw, we’re still waiting on a review/decision from @andyland @aniketmaurya. |
|
Alright, I've added some minimal comment commit to trigger the test. Can you retrigger the pending checks? @bhimrazy |
|
Seems like it passed all the checks required, looking for review of the code owner. cc @bhimrazy @aniketmaurya |
andyland
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR description is inadequate, could you at least detail how the interface changes w/ some example usage of the new functionality
|
Hi @andyland I've updated the details in the PR message. Another examples of this PR coverage is as follows:
server = ls.LitServer(
[sentiment_api, generate_api],
accelerator="cuda",
devices=[0, 1],
workers_per_device=2, # same for all routes
)
server = ls.LitServer(
[sentiment_api, generate_api],
accelerator="cuda",
devices=[0, 1],
workers_per_device={
"/sentiment": 2, # 2 workers per GPU for sentiment
"/generate": 3, # 3 workers per GPU for generation
},
)What this means in worker counts:
server = ls.LitServer(
[sentiment_api, generate_api],
accelerator="cuda",
devices=[0, 1],
workers_per_device=[2, 3], # sentiment then generate (same order as API list)
) |
What does this PR do?
Added feature requested on enabling ability to control
workers_per_deviceper API instance #572 .Interface change: workers_per_device now accepts either:
int(previous behavior, applied to all APIs), orlist[int](per-API in connector order), ordict[str, int]mappingapi_path -> workers_per_device(per-route)Example:
This starts
2 * len(devices)=4inference workers for/sentimentand3 * len(devices)=6for/generate.This directly addresses #572 by allowing a single endpoint (e.g.
/generate) to be backed by multiple worker processes without duplicating API instances into multiple routes.Before submitting
Tests are passing locally. I added coverage in
tests/unit/test_lit_server.pyand verifiedThe PR adds unit coverage in
tests/unit/test_lit_server.py:test_workers_per_device_can_be_configured_per_routevalidates:api_path{"/sentiment": 2, "/generate": 3} with devices=[0,1]{"/sentiment": 4, "/generate": 6}test_workers_per_device_per_route_raises_on_unknown_routevalidates:api_path, otherwise raisesValueErrorOn my device (4 x T4) the tests results in:
Repro (4× T4):
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.
Did you have fun?
Definitely!