
@chethanuk

What type of PR is this?

Implement PVC volume mounting and node selector handling for PVC-backed models in InferenceService controller.

What this PR does / why we need it:

  • UpdatePodSpecVolumes: Detect PVC storage and create PVC volumes instead of HostPath volumes. Maintains backward compatibility with existing HostPath logic for non-PVC storage.
  • UpdatePodSpecNodeSelector: Skip the node selector for PVC storage. PVC-backed models are mounted through Kubernetes volumes on whichever node the pod is scheduled to (subject to the PVC's access mode), unlike downloaded models, which are tied to specific nodes.
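The two behaviours described above can be sketched roughly as follows. This is a minimal illustration, not the controller's code: `Volume` and `ModelStorage` are simplified stand-ins for the Kubernetes API types, `buildModelVolume` and `buildNodeSelector` are hypothetical names, the `"Ready"` label value is an assumption, and the read-only flag follows the bot summary further down rather than anything stated in this description.

```go
package main

import "fmt"

// Volume is a simplified stand-in for a pod volume; exactly one source is set.
type Volume struct {
	Name     string
	PVCName  string // set for PVC-backed models
	ReadOnly bool   // PVC volumes are mounted read-only in this sketch
	HostPath string // set for models downloaded onto a node
}

// ModelStorage is a hypothetical view of a model's storage settings.
type ModelStorage struct {
	PVCName  string // empty when the model is not PVC-backed
	HostPath string
}

// IsPVC reports whether the model lives on a PersistentVolumeClaim.
func (s ModelStorage) IsPVC() bool { return s.PVCName != "" }

// buildModelVolume returns a PVC volume for PVC storage and falls back to
// a HostPath volume otherwise.
func buildModelVolume(name string, s ModelStorage) Volume {
	if s.IsPVC() {
		return Volume{Name: name, PVCName: s.PVCName, ReadOnly: true}
	}
	return Volume{Name: name, HostPath: s.HostPath}
}

// buildNodeSelector returns nil for PVC storage, so the scheduler is not
// pinned to particular nodes. For other storage it selects nodes carrying
// the model label (the "Ready" value is an assumption).
func buildNodeSelector(s ModelStorage, modelLabel string) map[string]string {
	if s.IsPVC() {
		return nil
	}
	return map[string]string{modelLabel: "Ready"}
}

func main() {
	pvc := ModelStorage{PVCName: "llama-weights"}
	local := ModelStorage{HostPath: "/mnt/models/llama"}
	for _, s := range []ModelStorage{pvc, local} {
		fmt.Printf("%+v selector=%v\n",
			buildModelVolume("model", s),
			buildNodeSelector(s, "models.example/llama"))
	}
}
```

Keeping both decisions behind a single `IsPVC` check means the HostPath path is untouched for existing models, which is the backward-compatibility property the description calls out.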

Tests:

  • TestUpdatePodSpecNodeSelector_PVCStorage: 4 test cases covering PVC skip logic and non-PVC node selector behavior
  • TestUpdatePodSpecVolumes_PVCStorage: 3 test cases covering PVC volume creation, namespace handling, and HostPath fallback

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


@gemini-code-assist

Summary of Changes

Hello @chethanuk, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the InferenceService controller by introducing comprehensive support for models stored on Kubernetes PersistentVolumeClaims (PVCs). It modifies the controller to intelligently handle PVC-backed models, including dynamic volume mounting, optimized node selection, and robust validation. A new metadata extraction workflow, powered by Kubernetes Jobs, is implemented to ensure proper model lifecycle management and observability. This change improves flexibility for model storage and enhances the reliability of model serving infrastructure.

Highlights

  • PVC Storage Support: The InferenceService controller now supports models stored on Kubernetes PersistentVolumeClaims (PVCs), allowing for more flexible and robust model storage options.
  • Dynamic Volume Mounting: Pods are automatically configured to mount PVCs as read-only volumes for model access, streamlining the process of integrating PVC-backed models.
  • Node Selector Optimization: Node selector assignment is now skipped for PVC-backed models, as PVCs are generally accessible across any node in the cluster, improving scheduling flexibility.
  • Enhanced PVC Validation: Robust validation has been implemented for PVCs, including checks for cross-namespace access, PVC existence, bound status, and compatible access modes, ensuring secure and correct usage.
  • Metadata Extraction Jobs: Kubernetes Jobs are introduced to extract metadata from PVC-backed models, ensuring proper lifecycle management and up-to-date model information.
  • Retry Mechanism with Exponential Backoff: An exponential backoff and retry logic has been added for transient errors encountered during PVC validation and metadata extraction job creation, enhancing system resilience.
  • Standardized Status Conditions: Standard Kubernetes conditions have been added to the status of BaseModel, ClusterBaseModel, and FinetunedWeights resources, providing clearer observability into PVC-related operations and their states.
  • Dedicated RBAC for Metadata Extractor: New ServiceAccount, ClusterRole, and ClusterRoleBinding resources have been added to provide the necessary permissions for the metadata extraction jobs.
  • Improved Job Configuration: Metadata extraction jobs now include ActiveDeadlineSeconds for timeouts, TTLSecondsAfterFinished for automatic cleanup, a dedicated ServiceAccount, and resource constraints for better management and efficiency.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for PVC-backed models in the InferenceService controller, a significant feature enhancement. The changes include new logic to handle PVC volumes and node selectors, along with corresponding RBAC and CRD updates. The implementation is robust, featuring comprehensive validation, error handling, and retry mechanisms for PVC operations. My review focuses on improving code clarity, performance, and adherence to best practices in the new pvc_handler.go file. I've identified a couple of areas for improvement, mainly around error classification and the removal of a custom utility function in favor of the standard library.

Comment on lines +113 to +132
func classifyError(err error) ErrorType {
	if err == nil {
		return ErrorTypeTransient
	}

	errStr := err.Error()

	// Security-related errors (don't retry)
	if contains(errStr, "cross-namespace") || contains(errStr, "access denied") {
		return ErrorTypeSecurity
	}

	// Validation errors (don't retry)
	if contains(errStr, "not found") || contains(errStr, "not bound") || contains(errStr, "invalid") {
		return ErrorTypeValidation
	}

	// Default to transient (can retry)
	return ErrorTypeTransient
}

high

The classifyError function relies on string matching via a custom contains function, which is brittle. It's better to use errors.Is or functions like apierrors.IsNotFound where possible. For other cases, strings.Contains is more idiomatic and performant than the custom contains function. Please also add import "strings" to the file.

I've updated the function to use apierrors.IsNotFound and strings.Contains.

func classifyError(err error) ErrorType {
	if err == nil {
		return ErrorTypeTransient
	}

	if apierrors.IsNotFound(err) {
		return ErrorTypeValidation
	}

	errStr := err.Error()

	// Security-related errors (don't retry)
	if strings.Contains(errStr, "cross-namespace") || strings.Contains(errStr, "access denied") {
		return ErrorTypeSecurity
	}

	// Validation errors (don't retry)
	if strings.Contains(errStr, "not bound") || strings.Contains(errStr, "invalid") {
		return ErrorTypeValidation
	}

	// Default to transient (can retry)
	return ErrorTypeTransient
}

app.kubernetes.io/component: metadata-extractor
{{- include "ome-resources.labels" . | nindent 4 }}
rules:
# Allow reading ConfigMaps for status updates

medium

The comment on line 21 is misleading. It states 'Allow reading ConfigMaps for status updates', but the verbs ["get", "list", "create", "update", "patch"] grant full management permissions. To improve clarity and adhere to the principle of least astonishment, the comment should accurately reflect the granted permissions.

# Allow managing ConfigMaps for status updates

app.kubernetes.io/name: ome
app.kubernetes.io/component: metadata-extractor
rules:
# Allow reading ConfigMaps for status updates

medium

The comment on line 19 is misleading. It states 'Allow reading ConfigMaps for status updates', but the verbs ["get", "list", "create", "update", "patch"] grant full management permissions. To improve clarity and adhere to the principle of least astonishment, the comment should accurately reflect the granted permissions.

# Allow managing ConfigMaps for status updates

Comment on lines +135 to +139
func contains(s, substr string) bool {
	return len(s) >= len(substr) && (s == substr || len(substr) == 0 ||
		(len(s) > len(substr) && contains(s[1:], substr)) ||
		(len(s) >= len(substr) && s[:len(substr)] == substr))
}

medium

The custom contains function is a very inefficient and complex recursive reimplementation of strings.Contains. It should be removed and strings.Contains from the standard library should be used instead, as suggested for the classifyError function.
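A quick way to gain confidence that the swap preserves behaviour is to compare the two on representative inputs. The sketch below copies the helper under review as `customContains` (renamed here for the comparison) and checks it against `strings.Contains`; the sample inputs are illustrative, not taken from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// customContains reproduces the recursive helper under review.
func customContains(s, substr string) bool {
	return len(s) >= len(substr) && (s == substr || len(substr) == 0 ||
		(len(s) > len(substr) && customContains(s[1:], substr)) ||
		(len(s) >= len(substr) && s[:len(substr)] == substr))
}

// agrees reports whether the custom helper and strings.Contains give the
// same answer for one input pair.
func agrees(s, substr string) bool {
	return customContains(s, substr) == strings.Contains(s, substr)
}

func main() {
	cases := [][2]string{
		{"pvc not bound", "not bound"},
		{"access denied", "cross-namespace"},
		{"", ""},
		{"abc", ""},
		{"", "abc"},
		{"invalid access mode", "invalid"},
	}
	for _, c := range cases {
		if !agrees(c[0], c[1]) {
			fmt.Printf("mismatch for %q / %q\n", c[0], c[1])
		}
	}
	fmt.Println("comparison finished")
}
```

A table-driven unit test or a fuzz test over random string pairs would give stronger assurance than a handful of samples.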
