
Python: Refactor completion integration tests #7905

Merged

Conversation

@TaoChenOSU (Contributor) commented Aug 6, 2024

Motivation and Context

We currently have two integration test modules for testing our completion services: one for chat completion and one for text completion.

We have many use cases, including image inputs, tool call outputs, and others, yet they all live in these two test modules. This makes them hard to maintain, especially as we add more connectors.

We are also only testing the output contents of the services, not their types. Testing only the output contents can result in flaky tests, because the models don’t always return exactly what we expect. Checking the types of the contents makes the tests more robust.
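As a minimal sketch of the difference (not the PR's actual code; the import path and attribute names are my assumption of Semantic Kernel's Python package layout):

```python
# Sketch contrasting the old content-based assertion with the new
# type-based one. Import paths assume Semantic Kernel's Python layout.
from semantic_kernel.contents import ChatMessageContent, TextContent


def assert_exact_content(response: ChatMessageContent) -> None:
    # Old style: flaky, because the model must emit an exact phrase.
    assert "Seattle" in response.content


def assert_content_type(response: ChatMessageContent) -> None:
    # New style: only require non-empty items of the expected type;
    # the model's exact wording no longer matters.
    assert response.items, "service returned no content items"
    assert all(isinstance(item, TextContent) for item in response.items)
```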

Description

  1. Create a base class to enforce a common structure for the completion test modules (see the sketch after this list).
  2. Create two new test modules for image contents and function-calling contents.
  3. Simplify the original text completion and chat completion tests so they cover only the simplest scenarios.
  4. No longer check whether the services return specific words; a test passes as long as the service returns something non-empty.
  5. Fix function-calling bugs in the Azure AI Inference and Google connectors.
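
For item 1, a hypothetical sketch of the kind of base class described (names such as `CompletionTestBase` are illustrative, not the PR's actual identifiers):

```python
# Illustrative sketch of a base class that enforces a common shape for
# completion test modules; not the PR's actual code.
from abc import ABC, abstractmethod
from typing import Any


class CompletionTestBase(ABC):
    """Every completion test module implements the same hooks."""

    @abstractmethod
    def services(self) -> dict[str, Any]:
        """Map service ids to configured completion service instances."""

    @abstractmethod
    async def get_response(self, service: Any, inputs: Any) -> Any:
        """Invoke the service under test and return its response."""

    def evaluate(self, response: Any, expected_type: type) -> None:
        """Shared type-based check inherited by all test modules."""
        assert response is not None
        assert isinstance(response, expected_type)
```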

Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
@TaoChenOSU TaoChenOSU added PR: in progress Under development and/or addressing feedback python Pull requests for the Python Semantic Kernel labels Aug 6, 2024
@TaoChenOSU TaoChenOSU self-assigned this Aug 6, 2024
@TaoChenOSU TaoChenOSU requested a review from a team as a code owner August 6, 2024 16:41
@github-actions github-actions bot changed the title Refactor completion integration tests Python: Refactor completion integration tests Aug 6, 2024
@markwallace-microsoft (Member) commented:

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| ----- | ------- | -------- | ------ | ---- |
| 2244 | 1 💤 | 0 ❌ | 0 🔥 | 56.382s ⏱️ |

@TaoChenOSU TaoChenOSU removed the PR: in progress Under development and/or addressing feedback label Aug 15, 2024
@TaoChenOSU TaoChenOSU added this pull request to the merge queue Aug 15, 2024
Merged via the queue into microsoft:main with commit 2d31245 Aug 15, 2024
25 checks passed
@TaoChenOSU TaoChenOSU deleted the taochen/integration-test-refactor branch August 15, 2024 22:49
LudoCorporateShark pushed a commit to LudoCorporateShark/semantic-kernel that referenced this pull request Aug 25, 2024
@TaoChenOSU TaoChenOSU linked an issue Aug 26, 2024 that may be closed by this pull request
Labels
python Pull requests for the Python Semantic Kernel
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Python: Flaky test: Chat Completion Integration Test
4 participants