Python: Refactor completion integration tests #7905
Merged: TaoChenOSU merged 7 commits into microsoft:main from TaoChenOSU:taochen/integration-test-refactor on Aug 15, 2024
Conversation
TaoChenOSU added the `PR: in progress` (Under development and/or addressing feedback) and `python` (Pull requests for the Python Semantic Kernel) labels on Aug 6, 2024.
The github-actions bot changed the title from "Refactor completion integration tests" to "Python: Refactor completion integration tests" on Aug 6, 2024.
TaoChenOSU removed the `PR: in progress` (Under development and/or addressing feedback) label on Aug 15, 2024.
moonbox3 approved these changes on Aug 15, 2024.
alliscode approved these changes on Aug 15, 2024.
LudoCorporateShark pushed a commit to LudoCorporateShark/semantic-kernel that referenced this pull request on Aug 25, 2024.
### Motivation and Context

We currently have two integration test modules for testing our completion services: one for chat completion and one for text completion. We have many use cases, including image inputs, tool call outputs, and others, yet they all live in single test modules. This makes them hard to maintain, especially as we add more connectors.

We are also only testing the output contents of the services, not the types. Testing only the output contents can result in flaky tests because the models don't always return what we expect them to return. Checking the types of the contents makes the tests more robust.

### Description

1. Create a base class to enforce the structure of the completion test modules.
2. Create two new test modules for image content and function calling contents.
3. Simplify the original text completion test and chat completion test to cover only the simplest scenarios.
4. No longer check whether the services return specific words; tests are considered passing as long as they return something non-empty.
5. Fix bugs in function calling for the Azure AI Inference and Google connectors.

### Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
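The base-class-plus-type-checking approach described above could look roughly like the following sketch. All names here (`CompletionTestBase`, `assert_valid_response`, `FakeChatCompletionTest`) are hypothetical illustrations, not the actual identifiers used in the PR:

```python
from abc import ABC, abstractmethod


class CompletionTestBase(ABC):
    """Hypothetical base class enforcing a common shape for completion test modules."""

    @abstractmethod
    def get_response(self, prompt: str) -> str:
        """Each connector-specific test module implements this to call its service."""

    def assert_valid_response(self, response: str) -> None:
        # Check the type and non-emptiness of the output rather than exact
        # wording: models do not always return the expected words, which is
        # what made the old content-matching assertions flaky.
        assert isinstance(response, str), "response must be a string"
        assert response.strip(), "response must be non-empty"


class FakeChatCompletionTest(CompletionTestBase):
    """Stand-in for a connector-specific test module (no real service calls)."""

    def get_response(self, prompt: str) -> str:
        return "hello from a fake model"


if __name__ == "__main__":
    test = FakeChatCompletionTest()
    test.assert_valid_response(test.get_response("Say hello"))
    print("ok")
```

The point of the base class is that every connector's test module inherits the same assertion logic, so adding a new connector only requires implementing the service call, not re-deciding what "passing" means.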