feat: add OpenAI-compatible Bedrock provider #3748
Conversation
Force-pushed 59d4cfa to e4d71e7
Use this one as a reference #3707
Force-pushed 14919c1 to 3a9af0c
Please report the results of tests/integration/inference/test_openai_completion.py and the other OpenAI-related tests.
Also, why has uv.lock changed?
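For reference, the integration suite mentioned above can typically be invoked along these lines (the exact flags and any required environment variables are setup-dependent assumptions):

```
pytest tests/integration/inference/test_openai_completion.py -v
```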
- nothing from models.py is used, please remove it
- is the /v1/embeddings endpoint available? if not, add a NotImplementedError stub (see the sketch after this list)
- is the /v1/completions endpoint available? if not...
- great find wrt telemetry and stream usage; after this PR we should consider adding that nugget to the mixin for all providers
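A minimal sketch of the requested stubs, assuming method names in the style of llama-stack's OpenAI-compatible inference API (the signatures here are illustrative, not the project's actual ones):

```python
class BedrockInferenceAdapter:  # the real adapter mixes in OpenAIMixin
    """Stubs for endpoints the Bedrock OpenAI-compatible API does not expose."""

    async def openai_embeddings(self, *args, **kwargs):
        # /v1/embeddings is not available on this endpoint
        raise NotImplementedError(
            "The Bedrock OpenAI-compatible endpoint does not support /v1/embeddings"
        )

    async def openai_completion(self, *args, **kwargs):
        # the legacy /v1/completions API is likewise unavailable
        raise NotImplementedError(
            "The Bedrock OpenAI-compatible endpoint does not support /v1/completions"
        )
```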
Force-pushed 3a9af0c to 56bff11
Force-pushed 56bff11 to 7024e56
Force-pushed 7024e56 to 55aaa6e
Force-pushed 55aaa6e to dd2a1f6
@leseb I addressed the missing comments. I actually thought I had addressed these before; sorry, on a deeper look I found the logger category comment still needed to be addressed. Thanks.

No worries, please rebase.
Force-pushed dd2a1f6 to 296d7a9
- `BedrockConfig` is a `RemoteInferenceProviderConfig`; use `auth_credential` instead of a new `api_key` field (see https://github.com/llamastack/llama-stack/blob/main/src/llama_stack/providers/utils/inference/model_registry.py#L22), and you don't need to override `get_api_key`
- instead of overriding `register_model`, use `async def check_model_availability(self, model: str) -> bool: return True` (see the sketch after this list)
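A rough sketch of what the reviewer is suggesting; the base class below is a stand-in for the real one in model_registry.py, and the `region` field is an assumption:

```python
from pydantic import BaseModel, SecretStr

# stand-in for the real RemoteInferenceProviderConfig in model_registry.py
class RemoteInferenceProviderConfig(BaseModel):
    auth_credential: SecretStr | None = None

class BedrockConfig(RemoteInferenceProviderConfig):
    # no separate api_key field: auth_credential carries the secret,
    # so get_api_key does not need to be overridden
    region: str = "us-east-1"  # illustrative

class BedrockInferenceAdapter:
    async def check_model_availability(self, model: str) -> bool:
        # accept all pre-registered models rather than overriding register_model
        return True
```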
Force-pushed e4a5321 to 96c2793
Implements AWS Bedrock inference provider using the OpenAI-compatible endpoint for Llama models available through Bedrock.

Changes:
- Add BedrockInferenceAdapter using the OpenAIMixin base
- Configure region-specific endpoint URLs
- Add NotImplementedError stubs for unsupported endpoints
- Implement authentication error handling with helpful messages
- Remove unused models.py file
- Add comprehensive unit tests (12 total)
- Add provider registry configuration
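The region-specific URL construction plausibly reduces to something like the following; the endpoint pattern reflects AWS's OpenAI-compatible surface but is an assumption about this PR, not code copied from it:

```python
def get_base_url(region: str) -> str:
    # Bedrock's OpenAI-compatible API lives under the bedrock-runtime host;
    # the /openai/v1 path prefix is an assumption
    return f"https://bedrock-runtime.{region}.amazonaws.com/openai/v1"

print(get_base_url("us-east-1"))
# https://bedrock-runtime.us-east-1.amazonaws.com/openai/v1
```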
Refactor to use auth_credential for consistent credential management and improve error handling with defensive checks.

Changes:
- Use auth_credential instead of api_key for better credential handling
- Simplify model availability check to accept all pre-registered models
- Guard metrics collection when usage data is missing in responses
- Add debug logging for better troubleshooting of API issues
- Update unit tests for the auth_credential refactoring
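The usage guard likely amounts to a defensive check of this shape (names are illustrative; in OpenAI-style streaming, usage appears only on the final chunk, and only when requested):

```python
def maybe_record_usage(chunk, record) -> None:
    # most streaming chunks (and some error responses) carry no usage block,
    # so guard before reading token counts
    usage = getattr(chunk, "usage", None)
    if usage is None:
        return
    record(
        prompt_tokens=usage.prompt_tokens,
        completion_tokens=usage.completion_tokens,
        total_tokens=usage.total_tokens,
    )
```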
Force-pushed 96c2793 to a7a6191
@skamenan7 CI is failing on a few tests.
Revert "…amastack#3748" (this reverts commit 4dd8f25)
Revert "…lamastack#3748" (this reverts commit 6c793af)
Force-pushed 7f1acbd to a7a6191
- Fix test assertion to match the actual error message
- Add missing has_model method to the FakeModelStore mock
- Remove redundant comments and update docs
Force-pushed ba0c4e4 to e0f0edc
Implements AWS Bedrock inference provider using OpenAI-compatible endpoint for Llama models available through Bedrock.
Closes: #3410
What does this PR do?
Adds AWS Bedrock as an inference provider using the OpenAI-compatible endpoint. This lets us use Bedrock models (GPT-OSS, Llama) through the standard llama-stack inference API.
The implementation uses LiteLLM's OpenAI client under the hood, so it gets all the OpenAI compatibility features. The provider handles per-request API key overrides via headers.
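To make the per-request override concrete, a hypothetical client call is sketched below; the `X-LlamaStack-Provider-Data` header exists in llama-stack, but the base URL, the field name `aws_bedrock_api_key`, and the model ID are illustrative assumptions:

```python
import json

from openai import OpenAI

# hypothetical llama-stack server exposing the OpenAI-compatible API
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

resp = client.chat.completions.create(
    model="bedrock-inference/openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "Hello from Bedrock"}],
    # per-request credential override via a provider-data header (illustrative)
    extra_headers={
        "X-LlamaStack-Provider-Data": json.dumps({"aws_bedrock_api_key": "sk-..."})
    },
)
print(resp.choices[0].message.content)
```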
Test Plan
Tested the following scenarios:
Bedrock OpenAI-Compatible Provider - Test Results
Model: `bedrock-inference/openai.gpt-oss-20b-1:0`

Test 1: Model Listing
Request:
Response:
Test 2: Non-Streaming Completion
Request:
Response:
Test 3: Streaming Completion
Request:
Response:
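The exported page dropped the recorded payloads, so here is a hypothetical streaming request of the general shape Tests 2-3 exercise (all values illustrative):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# stream=True exercises the streaming path; include_usage asks the server to
# attach token usage to the final chunk (the case the metrics guard handles)
stream = client.chat.completions.create(
    model="bedrock-inference/openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "Count to three"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```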
Test 4: Error Handling - Invalid Model
Request:
Response:
Test 5: Multi-Turn Conversation
Request 1:
Response 1:
Request 2 (with history):
Response 2:
Context retained across turns
Test 6: System Messages
Request:
Response:
Test 7: Tool Calling
Request:
Response:
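Likewise hypothetical, a tool-calling request of the kind Test 7 covers (the `get_weather` tool is invented for illustration):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, not from the PR
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="bedrock-inference/openai.gpt-oss-20b-1:0",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# a successful tool call surfaces as resp.choices[0].message.tool_calls
print(resp.choices[0].message.tool_calls)
```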
Test 8: Sampling Parameters
Request:
Response:
Test 9: Authentication Error Handling
Subtest A: Invalid API Key
Request:
Response:
Subtest B: Empty API Key (Fallback to Config)
Request:
Response:
Fell back to config key
Subtest C: Malformed Token
Request:
Response: