Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: milvus support #457

Open
wants to merge 11 commits into
base: main
Choose a base branch
from
Open

feat: milvus support #457

wants to merge 11 commits into from

Conversation

jamescalam
Copy link
Member

@jamescalam jamescalam commented Oct 28, 2024

PR Type

enhancement, tests, dependencies


Description

  • Implemented a new MilvusIndex class to integrate Milvus as an index choice, with methods for initialization, adding, querying, and deleting data.
  • Updated the __init__.py to include MilvusIndex in the list of available indexes.
  • Added MilvusIndex to the unit tests, ensuring it is included in the test indexes if pymilvus is available.
  • Updated pyproject.toml to include pymilvus as an optional dependency, allowing for Milvus integration.

Changes walkthrough 📝

Relevant files
Enhancement
__init__.py
Add MilvusIndex to index initialization                                   

semantic_router/index/init.py

  • Added import for MilvusIndex.
  • Updated __all__ to include MilvusIndex.
  • +2/-0     
    milvus.py
    Implement MilvusIndex class for Milvus integration             

    semantic_router/index/milvus.py

  • Implemented MilvusIndex class with methods for initialization, adding,
    querying, and deleting data.
  • Defined constants and fields for managing Milvus collections.
  • Included error handling for missing dependencies.
  • Provided methods for converting metrics and managing collection
    schema.
  • +270/-0 
    Tests
    test_layer.py
    Add MilvusIndex to unit tests                                                       

    tests/unit/test_layer.py

  • Added MilvusIndex to the list of test indexes.
  • Conditional import check for pymilvus.
  • +3/-0     
    Dependencies
    pyproject.toml
    Update dependencies for Milvus support                                     

    pyproject.toml

  • Added pymilvus as an optional dependency.
  • Adjusted version for qdrant-client.
  • Updated optional dependencies list to include Milvus.
  • +3/-1     

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    @jamescalam jamescalam changed the title James/milvus test feat: milvus support Oct 28, 2024
    Copy link

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Error Handling
    The error handling in the MilvusIndex class could be improved. Currently, the code raises general exceptions which might not be informative enough for debugging issues related to Milvus operations.

    Dependency Check
    The code checks for the 'pymilvus' dependency at runtime, which could lead to runtime errors if the dependency is missing. Consider ensuring this dependency is properly managed and documented.

    Method Implementation
    The methods '_remove_and_sync' and '_sync_index' are not implemented and only log an error. This could lead to functionality gaps or unexpected behavior in production.

    Copy link

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Prevent potential runtime errors by ensuring that dimensions is set before initializing the collection

    Ensure that the dimensions attribute is initialized before calling
    _initialize_collection to prevent runtime errors.

    semantic_router/index/milvus.py [147-149]

     self.dimensions = self.dimensions or len(embeddings[0])
    +if self.dimensions is None:
    +    raise ValueError("Dimensions must be set before initializing the collection.")
     if not self.client.has_collection(collection_name=self.index_name):
         self._initialize_collection()
    Suggestion importance[1-10]: 9

    Why: This suggestion addresses a potential runtime error by ensuring that the dimensions attribute is set before initializing the collection, which is crucial for the correct functioning of the MilvusIndex class. It significantly enhances the robustness of the code.

    9
    Enhancement
    Use the index_name attribute instead of a hardcoded string to improve code flexibility and maintainability

    Replace the hardcoded collection name in _clear_collections with the index_name
    attribute to maintain consistency and flexibility.

    semantic_router/index/milvus.py [57-58]

    -if self.client.has_collection(collection_name=DEFAULT_COLLECTION_NAME):
    -    self.client.drop_collection(collection_name=DEFAULT_COLLECTION_NAME)
    +if self.client.has_collection(collection_name=self.index_name):
    +    self.client.drop_collection(collection_name=self.index_name)
    Suggestion importance[1-10]: 8

    Why: Replacing the hardcoded collection name with the index_name attribute increases the flexibility and maintainability of the code, allowing for easier modifications and reducing the risk of errors due to hardcoding.

    8
    Add error handling to the delete method to improve robustness and error reporting

    Implement error handling for the delete method to manage exceptions that may occur
    when the specified route does not exist.

    semantic_router/index/milvus.py [177-178]

     filter_route = f"route == '{route_name}'"
    -self.client.delete(collection_name=self.index_name, filter=filter_route)
    +try:
    +    self.client.delete(collection_name=self.index_name, filter=filter_route)
    +except Exception as e:
    +    logger.error(f"Failed to delete route: {route_name}. Error: {str(e)}")
    Suggestion importance[1-10]: 7

    Why: Implementing error handling in the delete method enhances the robustness of the code by managing exceptions and providing clear error reporting, which is beneficial for debugging and maintaining the code.

    7
    Best practice
    Validate the top_k parameter to ensure it is positive, enhancing the method's reliability and correctness

    Validate the top_k parameter in the query method to ensure it is a positive integer
    before executing the query.

    semantic_router/index/milvus.py [204-207]

     if not self.client.has_collection(collection_name=self.index_name):
         raise ValueError("Index not found.")
    +if top_k <= 0:
    +    raise ValueError("The top_k parameter must be a positive integer.")
     vector = vector.reshape(1, -1)
    Suggestion importance[1-10]: 8

    Why: Validating the top_k parameter ensures that it is a positive integer, which prevents potential logical errors and enhances the reliability and correctness of the query method. This is a good practice for input validation.

    8

    @jamescalam
    Copy link
    Member Author

    @ericljx2020-gmail I'm seeing this issue in tests:

    =========================== short test summary info ============================
    FAILED tests/unit/test_layer.py::TestRouteLayer::test_add_route[MilvusIndex] - pymilvus.exceptions.MilvusException: <MilvusException: (code=1100, message=empty expression should be used with limit: invalid parameter)>
    FAILED tests/unit/test_layer.py::TestRouteLayer::test_add_multiple_routes[MilvusIndex] - pymilvus.exceptions.MilvusException: <MilvusException: (code=1100, message=empty expression should be used with limit: invalid parameter)>
    ====== 2 failed, 440 passed, 23 skipped, 58 warnings in 417.17s (0:06:57) ======
    

    @jamescalam jamescalam self-assigned this Oct 28, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants