nathaliellenaa (Contributor) commented Nov 5, 2025

Description

Add execute tool + scratch pad tests

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

Tests

  • Enhanced security test coverage by adding tests for permission denial scenarios across tool execution components
  • Added integration tests to verify resource limit enforcement and unauthorized access handling
  • Expanded test suite to cover error handling paths and edge cases


mingshl previously approved these changes Nov 5, 2025
super.setUp();
}

public void testScratchpadSizeLimit() throws Exception {

Thanks for verifying the exception in the unit test.

@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 6, 2025 08:30 — with GitHub Actions Error
@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 6, 2025 08:30 — with GitHub Actions Failure
dhrubo-os (Collaborator) commented:

 Could not determine the dependencies of task ':opensearch-ml-plugin:dependencyLicenses'.
> Failed to query the value of task ':opensearch-ml-plugin:dependencyLicenses' property 'dependencies'.
   > Could not resolve all dependencies for configuration ':opensearch-ml-plugin:runtimeClasspath'.
      > Could not resolve software.amazon.awssdk:kms:2.32.29.
        Required by:
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > software.amazon.awssdk:bom:2.32.29
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > org.opensearch:opensearch-remote-metadata-sdk-ddb-client:3.4.0.0-SNAPSHOT:20251106.022320-9
         > Conflict found for module 'software.amazon.awssdk:kms': between versions 2.32.29 and 2.26.3
      > Could not resolve software.amazon.awssdk:dynamodb:2.32.29.
        Required by:
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > software.amazon.awssdk:bom:2.32.29
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > software.amazon.awssdk:bom:2.32.29 > software.amazon.awssdk:dynamodb-enhanced:2.32.29
         > Conflict found for module 'software.amazon.awssdk:dynamodb': between versions 2.32.29 and 2.26.3
      > Could not resolve org.dafny:DafnyRuntime:4.9.0.
        Required by:
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > org.opensearch:opensearch-remote-metadata-sdk-ddb-client:3.4.0.0-SNAPSHOT:20251106.022320-9 > software.amazon.cryptography:aws-database-encryption-sdk-dynamodb:3.9.0
            project :opensearch-ml-plugin > project :opensearch-ml-algorithms > org.opensearch:opensearch-remote-metadata-sdk-ddb-client:3.4.0.0-SNAPSHOT:20251106.022320-9 > software.amazon.cryptography:aws-cryptographic-material-providers:1.11.0

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

It looks like we have started hitting a dependency conflict.

@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 7, 2025 01:59 — with GitHub Actions Error
@mingshl mingshl temporarily deployed to ml-commons-cicd-env-require-approval November 10, 2025 19:25 — with GitHub Actions Inactive
@mingshl mingshl had a problem deploying to ml-commons-cicd-env-require-approval November 10, 2025 19:25 — with GitHub Actions Error
@ylwu-amzn ylwu-amzn had a problem deploying to ml-commons-cicd-env-require-approval November 14, 2025 19:40 — with GitHub Actions Failure
@ylwu-amzn ylwu-amzn had a problem deploying to ml-commons-cicd-env-require-approval November 14, 2025 19:40 — with GitHub Actions Error
nathaliellenaa (Contributor, Author) commented:

The failing tests are unrelated to this PR:

RestBedRockInferenceIT > test_bedrock_multimodal_model_empty_imageInput FAILED
    java.lang.AssertionError: Failing test case name: without_step_size, inference result: {"error":{"root_cause":[{"type":"status_exception","reason":"Error from remote service: {\"message\":\"The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. Consult the service documentation for details.

mingshl (Collaborator) commented Nov 18, 2025

@nathaliellenaa this test is failing consistently. Can you try to fix it by switching the test to a different image URL?

We need the required CI checks to pass before we can merge this.

RestMLRAGSearchProcessorIT > testBM25WithOpenAIWithConversationAndImage FAILED
    org.opensearch.client.ResponseException: method [POST], host [http://[::1]:44963], URI [/test/_search?size=5&search_pipeline=pipeline_test], status line [HTTP/1.1 400 Bad Request]
    {"error":{"root_cause":[{"type":"status_exception","reason":"Error from remote service: {\n  \"error\": {\n    \"message\": \"Error while downloading https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg.\",\n    \"type\": \"invalid_request_error\",\n    \"param\": null,\n    \"code\": \"invalid_image_url\"\n  }\n}"}],"type":"status_exception","reason":"Error from remote service: {\n  \"error\": {\n    \"message\": \"Error while downloading https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg.\",\n    \"type\": \"invalid_request_error\",\n    \"param\": null,\n    \"code\": \"invalid_image_url\"\n  }\n}"},"status":400}

@nathaliellenaa nathaliellenaa temporarily deployed to ml-commons-cicd-env-require-approval November 26, 2025 01:53 — with GitHub Actions Inactive
@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 26, 2025 01:53 — with GitHub Actions Failure
@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 26, 2025 01:53 — with GitHub Actions Error
@nathaliellenaa nathaliellenaa mentioned this pull request Nov 26, 2025
5 tasks
@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval November 28, 2025 20:43 — with GitHub Actions Failure
Signed-off-by: Nathalie Jonathan <[email protected]>
coderabbitai bot commented Dec 1, 2025

Walkthrough

The PR adds comprehensive test coverage for security and error handling scenarios across tool execution components. Changes include new unit tests for SecurityException handling in tool operations, permission validation in integration tests, and size limit verification for scratchpad functionality.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Unit tests for tool security exceptions: ml-algorithms/src/test/java/org/opensearch/ml/engine/algorithms/tool/MLToolExecutorTest.java, ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/ReadFromScratchPadToolTests.java, ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/WriteToScratchPadToolTests.java | Added test cases covering SecurityException handling and permission denial scenarios during tool execution; one assertion removed from WriteToScratchPadToolTests. |
| Integration tests for tool permission and size validation: plugin/src/test/java/org/opensearch/ml/tools/ListIndexToolIT.java, plugin/src/test/java/org/opensearch/ml/tools/ScratchPadToolIT.java | Added testListIndexWithNoPermissions() to verify permission checks with HTTPS clients; introduced new ScratchPadToolIT class with setUp() and testScratchpadSizeLimit() to validate 100 MB oversized content rejection. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • All changes are test-only additions following consistent patterns
  • No production code modifications
  • Homogeneous nature of changes (test case additions) reduces review complexity
  • New test class ScratchPadToolIT follows standard integration test patterns

Poem

🐰 Tests of security, strong and tight,
Permissions checked, exceptions caught right,
Size limits guarded, no overflow here,
Tool execution safe, without any fear! 🛡️

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
| Description check | ❓ Inconclusive | The description lacks substantive detail about what was actually added; it only repeats the title and leaves the Related Issues field incomplete with a placeholder. | Provide specific details about which tests were added and what scenarios they cover (e.g., permission failures, size limits, security exceptions). Replace the issue placeholder with the actual issue number(s). |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main changes: adding test cases for execute tool and scratch pad functionality. |

@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval December 1, 2025 23:32 — with GitHub Actions Failure
@nathaliellenaa nathaliellenaa had a problem deploying to ml-commons-cicd-env-require-approval December 1, 2025 23:32 — with GitHub Actions Error
coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/WriteToScratchPadToolTests.java (2)

215-225: Consider asserting scratchpad mutation in this JSON-array conversion test

testRun_StringConversion_WithJsonArray now only checks the response string; asserting the updated SCRATCHPAD_NOTES_KEY here as well would better match the test name and keep this path fully covered (even though other tests cover similar mutations).


227-242: SecurityException test does not exercise WriteToScratchPadTool behavior

testRun_SecurityException directly calls listener.onFailure(securityException) and then verifies the listener, without invoking tool.run(...). That means the test is not validating how WriteToScratchPadTool itself behaves when a SecurityException occurs; it only checks that Mockito and the listener work as expected.

If the goal is tool-level behavior, consider:

  • Driving tool.run(...) and arranging dependencies so a SecurityException flows through the listener, or
  • Dropping this test and relying on higher‑level tests like MLToolExecutorTest.test_ToolExecutionFailsWithoutProperPermission, which already validate the permission error path end‑to‑end.
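The first option above could look roughly like the following sketch. It is a self-contained illustration and not the ml-commons code: `Listener`, `ScratchPadStore`, and `WriteToolSketch` are hypothetical stand-ins for `ActionListener`, whatever dependency the tool writes through, and `WriteToScratchPadTool`, and a hand-written recording listener replaces Mockito so the example runs on its own. The only point it makes is that the exception originates inside a dependency and reaches the listener through `run(...)`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an async callback such as ActionListener<T>.
interface Listener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

// Hypothetical dependency the tool writes through; assumed for illustration only.
interface ScratchPadStore {
    void append(String note);
}

// Hypothetical tool: forwards any exception from its dependency to the listener.
class WriteToolSketch {
    private final ScratchPadStore store;

    WriteToolSketch(ScratchPadStore store) {
        this.store = store;
    }

    void run(String note, Listener<String> listener) {
        try {
            store.append(note);
            listener.onResponse("Wrote to scratchpad: " + note);
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }
}

// Records what the tool reported so a test can assert on it.
class RecordingListener implements Listener<String> {
    final List<String> responses = new ArrayList<>();
    final List<Exception> failures = new ArrayList<>();

    @Override
    public void onResponse(String response) {
        responses.add(response);
    }

    @Override
    public void onFailure(Exception e) {
        failures.add(e);
    }
}

class SecurityExceptionFlowSketch {
    // Drives run(...) with a store that denies access and returns what the listener saw.
    static RecordingListener runWithDeniedStore() {
        ScratchPadStore denied = note -> {
            throw new SecurityException("no permissions for scratchpad write");
        };
        RecordingListener listener = new RecordingListener();
        new WriteToolSketch(denied).run("note", listener);
        return listener;
    }

    public static void main(String[] args) {
        RecordingListener listener = runWithDeniedStore();
        System.out.println(listener.failures.get(0).getMessage());
    }
}
```

In the actual test class the same shape would presumably use Mockito's `doThrow(...)` on whichever collaborator the tool calls and `verify(listener).onFailure(...)`, but which collaborator that is depends on the tool's implementation, which is not shown in this conversation.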
ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/ReadFromScratchPadToolTests.java (1)

220-233: SecurityException test does not exercise ReadFromScratchPadTool behavior

testRun_SecurityException manually triggers listener.onFailure(securityException) and verifies the captured exception, but never calls tool.run(...). This doesn’t assert anything about how ReadFromScratchPadTool handles permission failures; it just checks the listener wiring.

To make this more meaningful, consider:

  • Refactoring to drive tool.run(...) and simulate a SecurityException from its internal operations, or
  • Removing this test and relying on higher‑level permission tests (e.g., MLToolExecutorTest and the ITs) for SecurityException coverage.
plugin/src/test/java/org/opensearch/ml/tools/ListIndexToolIT.java (1)

44-80: Robust permission IT; only minor polish possible

This test cleanly verifies that a no‑permission user gets a permission/forbidden/unauthorized error over HTTPS, and it correctly cleans up the client and user. The string‑contains checks are flexible enough to tolerate backend wording differences.

If you want to tighten things later, optional tweaks would be:

  • Use Collections.emptyList() instead of new ArrayList<>() for the no‑roles argument.
  • Use try‑with‑resources for RestClient instead of an explicit try/finally.
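Both tweaks can be shown together in a small sketch. `FakeClient` below is a hypothetical stand-in for a closeable HTTP client such as `RestClient`, and its "forbidden" behavior is invented purely to give the example something to catch; it is not how the IT or the security plugin behaves.

```java
import java.io.Closeable;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a closeable HTTP client such as RestClient.
class FakeClient implements Closeable {
    boolean closed = false;

    // Invented behavior: a caller with no roles is rejected.
    String performRequest(List<String> roles) {
        if (roles.isEmpty()) {
            throw new IllegalStateException("forbidden: no permissions");
        }
        return "ok";
    }

    @Override
    public void close() {
        closed = true;
    }
}

class TryWithResourcesSketch {
    // Returns the error message seen by a user with no roles. The client is
    // closed automatically when the try block exits, even on an exception,
    // so no explicit finally block is needed.
    static String requestWithoutRoles(FakeClient client) {
        // Shared immutable empty list instead of new ArrayList<>().
        List<String> noRoles = Collections.emptyList();
        try (FakeClient c = client) {
            return c.performRequest(noRoles);
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        System.out.println(requestWithoutRoles(client));
        System.out.println("closed=" + client.closed);
    }
}
```

Note that with try-with-resources the resource is closed before any `catch` block runs, so assertions inside the `catch` must not rely on the client still being open.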
plugin/src/test/java/org/opensearch/ml/tools/ScratchPadToolIT.java (1)

1-39: Effective size‑limit IT; consider tuning payload size / construction

This IT correctly exercises the WriteToScratchPad tool via the REST endpoint with an oversized payload and asserts that some form of content‑length / 413‑style error is returned, without over‑fitting to a specific exception type or message.

Two optional refinements you might consider:

  • Use a payload just over the configured http.max_content_length (rather than a full 100 MB) to reduce memory pressure in CI while still triggering the same failure.
  • If you ever change largeContent to something more complex than "A", switch to building the JSON body via a helper or escaping utility instead of raw String.format to avoid quoting/escaping pitfalls.
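A minimal sketch of both refinements, under stated assumptions: the limit is passed in as a plain byte count (how the IT would read the configured `http.max_content_length` is not shown here), and the `parameters`/`notes` field names are hypothetical placeholders rather than the tool's confirmed request schema.

```java
import java.nio.charset.StandardCharsets;

class OversizedPayloadSketch {
    // Escapes the characters that would otherwise break a JSON string literal.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a body whose UTF-8 size is limitBytes + 1, assuming the limit is
    // larger than the JSON envelope itself. Field names are hypothetical.
    static String bodyJustOver(int limitBytes) {
        String prefix = "{\"parameters\":{\"notes\":\"";
        String suffix = "\"}}";
        int overhead = (prefix + suffix).getBytes(StandardCharsets.UTF_8).length;
        int fill = Math.max(0, limitBytes + 1 - overhead);
        return prefix + escapeJson("A".repeat(fill)) + suffix;
    }

    public static void main(String[] args) {
        String body = bodyJustOver(1024);
        System.out.println(body.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Sizing the payload from the limit keeps the test's memory use proportional to the configured value plus one byte, and routing the content through an escaping helper means a later change to the filler text cannot silently produce invalid JSON.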
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2854ad2 and 6ed156d.

📒 Files selected for processing (5)
  • ml-algorithms/src/test/java/org/opensearch/ml/engine/algorithms/tool/MLToolExecutorTest.java (1 hunks)
  • ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/ReadFromScratchPadToolTests.java (1 hunks)
  • ml-algorithms/src/test/java/org/opensearch/ml/engine/tools/WriteToScratchPadToolTests.java (1 hunks)
  • plugin/src/test/java/org/opensearch/ml/tools/ListIndexToolIT.java (2 hunks)
  • plugin/src/test/java/org/opensearch/ml/tools/ScratchPadToolIT.java (1 hunks)
🔇 Additional comments (1)
ml-algorithms/src/test/java/org/opensearch/ml/engine/algorithms/tool/MLToolExecutorTest.java (1)

206-226: Good coverage of permission‑denied tool execution path

test_ToolExecutionFailsWithoutProperPermission correctly simulates a SecurityException from the tool and verifies it is propagated unchanged to actionListener, including the message fragment. This nicely complements test_ToolExecutionFailed and the scratchpad tests.
