
Conversation

patrickkon (Contributor)


Description

  • Summary:

Expected Behavior

  • Snapshot of Logging Details:

Performance Evaluation

This PR has been tested using the first quick start example, with the following performance metrics:

  • Cost: $ (with Model:)
  • Correctness:
  • Number of Steps:
  • Execution Time:
  • Uploaded Log File (Optional):

@patrickkon patrickkon requested a review from AmberLJC March 28, 2025 11:51
@patrickkon (Contributor, Author)

@AmberLJC could you double check that the ground truth makes sense? Thanks!

@AmberLJC (Contributor) left a comment


Left some comments. Let's focus on LLM reasoning point 1, which reviewers care about most!

1. **Effect on Increasing Data Size**: ?
2. **Effect on Increasing Mean**: The recall is not significantly affected by changes in the mean, showing robustness.
3. **Effect on Increasing Variance** (two versions appear in the diff):
   • The recall remains mostly stable, with some rare instances of decreased performance.
   • Higher variance leads to a decline in recall.
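On point 2, recall being unaffected by the mean is expected: shifting every vector by the same amount translates the whole dataset and leaves all pairwise Euclidean distances, and hence the true k-NN sets, unchanged. A minimal numpy sketch (hypothetical, not the PR's benchmark code) illustrating this:

```python
import numpy as np

# Hypothetical illustration: adding a constant mean to every vector (and the
# query) is a translation, which preserves every pairwise Euclidean distance,
# so the exact k-NN set -- and therefore recall -- is unchanged
# (up to floating-point near-ties).
rng = np.random.default_rng(0)
base = rng.normal(size=(1000, 8))  # zero-mean Gaussian vectors

def knn_indices(points, query, k=10):
    """Indices of the exact k nearest neighbors of `query` in `points`."""
    dists = np.linalg.norm(points - query, axis=1)
    return set(np.argsort(dists)[:k])

query = base[0]
shift = 5.0  # add a constant mean to every coordinate
same = knn_indices(base, query) == knn_indices(base + shift, query + shift)
print("k-NN set unchanged under mean shift:", same)
```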
Contributor

I think the previous ground truth is more correct

Collaborator

When the variance increases, the vectors become more dispersed, making it harder for the search to capture all true neighboring points, so the recall decreases. Does this reasoning make sense?
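One way to sanity-check this intuition is a small simulation. The sketch below (hypothetical, not the PR's benchmark code) defines recall as the fraction of a query's exact k-NN that a fixed-radius search would return; as the variance grows, points disperse and fewer true neighbors fall inside the radius:

```python
import numpy as np

def recall_at_fixed_radius(variance, n=2000, d=2, k=10, radius=0.5, seed=0):
    """Fraction of the query's exact k-NN lying within a fixed search radius.

    Hypothetical setup: points ~ N(0, variance * I). Increasing the variance
    spreads the points out, so fewer of the true neighbors fall inside the
    fixed radius and recall drops.
    """
    rng = np.random.default_rng(seed)
    pts = rng.normal(0.0, np.sqrt(variance), size=(n, d))
    query = pts[0]                        # query with a known data point
    dists = np.linalg.norm(pts - query, axis=1)
    true_knn = np.argsort(dists)[:k]      # exact k nearest neighbors
    return float((dists[true_knn] <= radius).mean())

for var in (1.0, 4.0, 16.0):
    print(f"variance={var:5.1f}  recall={recall_at_fixed_radius(var):.2f}")
```

The fixed-radius search is just one concrete mechanism by which dispersion can cost recall; index-based methods with a fixed probe budget can show a similar effect, though the exact numbers depend on the algorithm and parameters.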

@patrickkon (Contributor, Author)

Hey @yiliazhuxyxy, could you please review the comments left by @AmberLJC? Thank you!

@AmberLJC (Contributor)

This PR is a duplicate of #46. Maybe we can work on #46 together?

@patrickkon (Contributor, Author)

> This PR is a duplicate of #46. Maybe we can work on #46 together?

Hey @AmberLJC no worries, I will manually handle it. Let's not merge/close #46, I intend to leave it open for quite some time.

3 participants