
Python: Refactor completion integration tests #7905

Merged

Conversation

@TaoChenOSU (Contributor) commented Aug 6, 2024

Motivation and Context

We currently have two integration test modules for testing our completion services: one for chat completion and one for text completion.

We have many use cases, including image inputs, tool call outputs, and others, yet they all live in these two test modules. This makes them hard to maintain, especially as we add more connectors.

We are also only testing the output contents of the services, not their types. Testing only the output contents can result in flaky tests, because the models don’t always return exactly what we expect. Checking the types of the contents makes the tests more robust.
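As a minimal sketch of the difference (not the PR's actual code; the import path and attribute names are my assumption of Semantic Kernel's Python package layout):

```python
# Sketch contrasting the old content-based assertion with the new
# type-based one. Import paths assume Semantic Kernel's Python layout.
from semantic_kernel.contents import ChatMessageContent, TextContent


def assert_exact_content(response: ChatMessageContent) -> None:
    # Old style: flaky, because the model must emit an exact phrase.
    assert "Seattle" in response.content


def assert_content_type(response: ChatMessageContent) -> None:
    # New style: only require non-empty items of the expected type;
    # the model's exact wording no longer matters.
    assert response.items, "service returned no content items"
    assert all(isinstance(item, TextContent) for item in response.items)
```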

Description

  1. Create a base class to enforce a common structure for the completion test modules (see the sketch after this list).
  2. Create two new test modules for image contents and function-calling contents.
  3. Simplify the original text completion and chat completion tests so they cover only the simplest scenarios.
  4. No longer check whether the services return specific words; a test passes as long as the service returns something non-empty.
  5. Fix function-calling bugs in the Azure AI Inference and Google connectors.
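
For item 1, a hypothetical sketch of the kind of base class described (names such as `CompletionTestBase` are illustrative, not the PR's actual identifiers):

```python
# Illustrative sketch of a base class that enforces a common shape for
# completion test modules; not the PR's actual code.
from abc import ABC, abstractmethod
from typing import Any


class CompletionTestBase(ABC):
    """Every completion test module implements the same hooks."""

    @abstractmethod
    def services(self) -> dict[str, Any]:
        """Map service ids to configured completion service instances."""

    @abstractmethod
    async def get_response(self, service: Any, inputs: Any) -> Any:
        """Invoke the service under test and return its response."""

    def evaluate(self, response: Any, expected_type: type) -> None:
        """Shared type-based check inherited by all test modules."""
        assert response is not None
        assert isinstance(response, expected_type)
```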

Contribution Checklist

- [x] The code builds clean without any errors or warnings
- [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations
- [x] All unit tests pass, and I have added new tests where possible
- [x] I didn't break anyone 😄
@TaoChenOSU TaoChenOSU added PR: in progress Under development and/or addressing feedback python Pull requests for the Python Semantic Kernel labels Aug 6, 2024
@TaoChenOSU TaoChenOSU self-assigned this Aug 6, 2024
@TaoChenOSU TaoChenOSU requested a review from a team as a code owner August 6, 2024 16:41
@github-actions github-actions bot changed the title Refactor completion integration tests Python: Refactor completion integration tests Aug 6, 2024
@markwallace-microsoft (Member) commented:

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
| ----- | ------- | -------- | ------ | ---- |
| 2244 | 1 💤 | 0 ❌ | 0 🔥 | 56.382s ⏱️ |

@TaoChenOSU TaoChenOSU removed the PR: in progress Under development and/or addressing feedback label Aug 15, 2024
@TaoChenOSU TaoChenOSU added this pull request to the merge queue Aug 15, 2024
Merged via the queue into microsoft:main with commit 2d31245 Aug 15, 2024
25 checks passed
@TaoChenOSU TaoChenOSU deleted the taochen/integration-test-refactor branch August 15, 2024 22:49
LudoCorporateShark pushed a commit to LudoCorporateShark/semantic-kernel that referenced this pull request Aug 25, 2024
@TaoChenOSU TaoChenOSU linked an issue Aug 26, 2024 that may be closed by this pull request
Labels
python Pull requests for the Python Semantic Kernel
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Python: Flaky test: Chat Completion Integration Test
4 participants