ci: Add vLLM support to integration testing infrastructure #3128
Conversation
force-pushed from 1f29aaa to 6a4da14
force-pushed from 21f2737 to 0f0e9ca
@ashwinb With this we'll need to run the record tests for 2 providers, but they can't be run in parallel (it works if you run them sequentially). @ashwinb, to avoid conflicts, what would you think about removing the index.sqlite file altogether?
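For context, the sequential flow being described would look roughly like the sketch below. The script name and flags are hypothetical placeholders for the repository's actual test runner, not its real interface:

    # Record provider interactions one setup at a time; running both recorders
    # at once can conflict on the shared recording index (index.sqlite).
    ./run_integration_tests.sh --setup ollama --inference-mode record
    ./run_integration_tests.sh --setup vllm --inference-mode record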
force-pushed from fbe1472 to 9e9687d
I also dealt with
force-pushed from 2ef7b4a to 79784ff
index.sqlite has been removed here: #3254
force-pushed from 82c7e69 to c38a15a
force-pushed from c38a15a to 0e71d65
force-pushed from eb23960 to 2fdab23
force-pushed from 2fdab23 to 903cffd
force-pushed from 6052457 to c13151e
  # Additional exclusions for vllm setup
  if [[ "$TEST_SETUP" == "vllm" ]]; then
- EXCLUDE_TESTS="${EXCLUDE_TESTS} or test_inference_store_tool_calls"
+ EXCLUDE_TESTS="${EXCLUDE_TESTS} or test_inference_store_tool_calls or test_text_chat_completion_structured_output"
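For readers outside the workflow scripts: an exclusion string accumulated this way is typically passed to pytest as a negated -k expression. The following is an assumed consumption pattern, shown only to illustrate the mechanism, not a quote from the workflow:

    # Illustrative only: deselect the accumulated test names via pytest's -k filter.
    EXCLUDE_TESTS="test_inference_store_tool_calls or test_text_chat_completion_structured_output"
    pytest tests/integration -k "not (${EXCLUDE_TESTS})"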
what about adding these to the skips in the test files directly?
The problem here is because of the model; our skips in the test files are all based on provider.
I put the skips here so that they only apply in CI; anybody running the integration tests with a more capable model will still be able to use them.
If we can get to the point where this job is running, I'll happily test other models to see if I can get rid of this line altogether.
yes please, CI with a model that passes more tests.
having a gap between what CI tests and what developers see in the test suite is going to lead to bugs and confusion.
I've opened an alternative PR that instead uses qwen3: #3545
I can close whichever one we don't want to go with.
force-pushed from 72da0de to 1a759f5
tl;dr: I removed all of the trivial changes that the ollama record CI job produced. I took this commit and removed the trivial changes (these aren't needed).
- Update the earth question to use a more specific multiple-choice format to prevent Llama-3.2-1B-Instruct from rambling about other planets
- Skip test_text_chat_completion_structured_output as it sometimes times out during CI execution, again with Llama-3.2-1B-Instruct on vLLM

Signed-off-by: Derek Higgins <[email protected]>

Add vLLM provider support to the integration test CI workflows alongside the existing Ollama support. Configure provider-specific test execution where vLLM runs only inference-specific tests (excluding vision tests) while Ollama continues to run the full test suite. This enables CI testing of both inference providers while keeping the vLLM footprint small; it can be expanded later if it proves not to be too disruptive.

Signed-off-by: Derek Higgins <[email protected]>
Signed-off-by: Derek Higgins <[email protected]>
force-pushed from 1a759f5 to 0ec2427
Closing this in favour of the qwen version (as per discussion in the community meeting).


- Introduces vLLM provider support to the record/replay testing framework
- Enables both recording and replay of vLLM API interactions alongside the existing Ollama support.

The changes enable testing of vLLM functionality in CI: vLLM tests focus on
inference capabilities, while Ollama continues to exercise the full API surface,
including vision features.
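As background on the record/replay flow referenced above, the usual pattern is to run the suite once against a live provider to capture responses and then replay those captures in CI. The environment variable and paths below are assumptions for illustration, not verified details of this repository:

    # Assumed record/replay switches (names illustrative):
    # 1) capture live vLLM responses once, with a vLLM server reachable
    LLAMA_STACK_TEST_INFERENCE_MODE=record pytest tests/integration/inference
    # 2) subsequent CI runs replay the stored responses; no model server needed
    LLAMA_STACK_TEST_INFERENCE_MODE=replay pytest tests/integration/inference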
Related: #2888
--
See the alternative using qwen here: #3545