Fix merging conflict for upgrade from 0.12.1 to 0.14 #719

Closed
wants to merge 282 commits into from

@Jooho Jooho commented Nov 7, 2024

This fixes all merge conflicts for the upgrade from 0.12.1 to 0.14.

mkumatag and others added 30 commits March 17, 2024 19:18
Signed-off-by: Manjunath Kumatagi <[email protected]>
* upgrade knative

Signed-off-by: Andrews Arokiam <[email protected]>

upgrade version

Signed-off-by: Andrews Arokiam <[email protected]>

upgrade version

Signed-off-by: Andrews Arokiam <[email protected]>

* Update kn cli
---------

Signed-off-by: Andrews Arokiam <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
Update README for generative AI model support

Signed-off-by: Dan Sun <[email protected]>
* Update kserve diagram

Signed-off-by: Dan Sun <[email protected]>

* Remove white background

Signed-off-by: Dan Sun <[email protected]>

---------

Signed-off-by: Dan Sun <[email protected]>
Since ODH will support KServe's RawDeployment mode, this modifies the OpenShift-ci setup scripts so that RawDeployment-related E2Es can be run.

The run-e2e-tests.sh script is modified to skip the installation of Service Mesh and Serverless when RawDeployment E2Es are requested. A supporting file, inferenceservice-openshift-ci-raw.yaml, was added to patch KServe's configuration to use RawDeployment mode by default and to use OpenShift Ingress when exposing Inference Services.

Since the E2Es use some annotations in the InferenceService, the changes made to the v1beta1_inference_service.py file in commit ecff079 were reverted. As an alternative, the `enablePassthrough` annotation was moved to the ServingRuntime resources. This is not only cleaner, but also reduces divergence from the upstream repository. Furthermore, this seems to be an auto-generated file that should not be touched.

Signed-off-by: Edgar Hernández <[email protected]>
In Serverless version 1.32.0, the `domain-mapping` deployment no longer exists. This updates the deploy.serverless.sh script used in openshift-ci to no longer wait for this deployment.

Signed-off-by: Edgar Hernández <[email protected]>
…ess-1320

Fix CI: Serverless removed domain-mapping deployment
chore:	fixes the GH [Alert](https://github.com/kserve/kserve/security/code-scanning/12080).
	filepath.Clean sanitizes the directory path and removes any unnecessary components (such as . and ..)
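
To illustrate the kind of normalization described above, here is a minimal Python sketch. It is an analog only: the commit uses Go's `filepath.Clean`, while this sketch uses `posixpath.normpath`, which collapses `.` and `..` components in a similar way. The helper names `clean_path` and `is_within` and the `/mnt/models` directory are hypothetical. Normalization alone does not confine a path to a directory, so the sketch also includes an explicit containment check.

```python
import posixpath


def clean_path(path: str) -> str:
    """Collapse '.', '..' and repeated separators (similar in spirit to filepath.Clean)."""
    return posixpath.normpath(path)


def is_within(base: str, candidate: str) -> bool:
    """Return True if candidate, joined to base and normalized, stays under base."""
    base = posixpath.normpath(base)
    target = posixpath.normpath(posixpath.join(base, candidate))
    return target == base or target.startswith(base + "/")


if __name__ == "__main__":
    print(clean_path("/mnt/models/./a/../b"))
    print(is_within("/mnt/models", "../etc/passwd"))
```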

Signed-off-by: Spolti <[email protected]>
Due to changes in kserve@39b8a67 which added `reinvocationPolicy: IfNeeded` to the WebHook configuration, the injection can (and will) be called multiple times, and needs to be idempotent (which is a good thing anyway).

This commit fixes the array field handling and adds volumes, volumeMounts and containers only if they have not already been added.
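
The idempotency requirement can be sketched as follows. This is an illustrative Python sketch over plain dictionaries, not the actual webhook code (which is not shown here); the function names and the `example-volume` and `example-sidecar` entries are hypothetical. The idea is to look up each entry by name before appending, so that a second invocation leaves the pod spec unchanged.

```python
from typing import Any


def append_if_missing(items: list[dict[str, Any]], new_item: dict[str, Any]) -> bool:
    """Append new_item unless an entry with the same 'name' exists; return True if appended."""
    if any(item.get("name") == new_item["name"] for item in items):
        return False
    items.append(new_item)
    return True


def inject(pod_spec: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical injection step that is safe to call repeatedly on the same pod spec."""
    volume = {"name": "example-volume", "emptyDir": {}}
    mount = {"name": "example-volume", "mountPath": "/mnt/example"}
    sidecar = {"name": "example-sidecar", "image": "example/image:latest"}

    append_if_missing(pod_spec.setdefault("volumes", []), volume)
    append_if_missing(pod_spec.setdefault("containers", []), sidecar)
    # Mount the volume into every container, again only where it is missing.
    for container in pod_spec["containers"]:
        append_if_missing(container.setdefault("volumeMounts", []), dict(mount))
    return pod_spec
```

Calling `inject` twice on the same spec yields the same volumes, containers and mounts as calling it once, which is the property a webhook with `reinvocationPolicy: IfNeeded` depends on.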

Fixes kserve#3506

Signed-off-by: Roland Huß <[email protected]>
…e#3481)

Remove redundant helm chart affinity

labels: 
- app.kubernetes.io/managed-by 
- app.kubernetes.io/instance
- app.kubernetes.io/name: 

with the value modelmesh-controller cause affinity to a non-existent Helm chart

Signed-off-by: Ondrej Trojan <[email protected]>
Add capability to run RawDeployment E2Es in OpenShift-ci
update codeQL to v3

chore:	Update CodeQL to V3 to get rid of this warning:
	`Warning: CodeQL Action v2 will be deprecated on December 5th, 2024`
	Plus, attempt to fix the Snyk Container scan failures due to errors when trying to
	upload the SARIF file:
	`Processing sarif files: ["application/storage-initializer/docker.snyk.sarif"]
	  Uploading results
	  Successfully uploaded results
	Waiting for processing to finish
	Error: Code Scanning could not process the submitted SARIF file:
	could not convert rules: invalid security severity value, is not a number: null
	ConfigurationError: Code Scanning could not process the submitted SARIF file:
	could not convert rules: invalid security severity value, is not a number: null
	    at run (/home/runner/work/_actions/github/codeql-action/v2/lib/upload-sarif-action.js:65:15)`

Signed-off-by: Spolti <[email protected]>
* switch e2e test inference graph to raw mode

Signed-off-by: Andrews Arokiam <[email protected]>

* download xgb server image

Signed-off-by: Andrews Arokiam <[email protected]>

---------

Signed-off-by: Andrews Arokiam <[email protected]>
Pad left for decoder-only architecture models.
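
As a rough illustration of left padding, the sketch below pads a batch of token-id lists so that the real tokens are right-aligned, which keeps the last position of every row a real token for a model that generates from the end of the sequence. The function `pad_batch` and its parameters are hypothetical and are not taken from the commit.

```python
def pad_batch(sequences, pad_id, side="left"):
    """Pad token-id lists to equal length; return (input_ids, attention_mask)."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        padding = [pad_id] * (max_len - len(seq))
        zeros = [0] * len(padding)
        ones = [1] * len(seq)
        if side == "left":
            # Padding goes first, so generation continues from a real token.
            input_ids.append(padding + list(seq))
            attention_mask.append(zeros + ones)
        else:
            input_ids.append(list(seq) + padding)
            attention_mask.append(ones + zeros)
    return input_ids, attention_mask
```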

Signed-off-by: Sivanantham Chinnaiyan <[email protected]>
* CVE-2024-24762 - update fastapi to 0.109.1

chore:	Fix [CVE-2024-24762](https://www.cve.org/CVERecord?id=CVE-2024-24762) - fastapi Regular Expression Denial of Service (ReDoS)
	Plus, update Ray to 2.10 to allow updating fastapi. On previous versions of Ray
	the fastapi version was pinned, which was preventing the fastapi version update.

use the new handle api:

From Ray Serve docs:
Ray 2.7 introduces a new {mod}`DeploymentHandle <ray.serve.handle.DeploymentHandle>` API that will replace the existing `RayServeHandle` and `RayServeSyncHandle` APIs.

Signed-off-by: Spolti <[email protected]>

* add link about the RayServeHandle deprecation

Signed-off-by: Spolti <[email protected]>

---------

Signed-off-by: Spolti <[email protected]>
* wip

Signed-off-by: Yuan Tang <[email protected]>

* comment out

Signed-off-by: Yuan Tang <[email protected]>

* fix wf

Signed-off-by: Yuan Tang <[email protected]>

* helm test

Signed-off-by: Yuan Tang <[email protected]>

* remove mlserver related tests

Signed-off-by: Yuan Tang <[email protected]>

* fix lint

Signed-off-by: Yuan Tang <[email protected]>

* sklearnserver runtime

Signed-off-by: Yuan Tang <[email protected]>

* Fix test

Signed-off-by: Yuan Tang <[email protected]>

* fix

Signed-off-by: Yuan Tang <[email protected]>

* disable check

Signed-off-by: Yuan Tang <[email protected]>

* remove unused imports

Signed-off-by: Yuan Tang <[email protected]>

* Add back mlserver

Signed-off-by: Yuan Tang <[email protected]>

* pre-commit fix

Signed-off-by: Yuan Tang <[email protected]>

* update storage url

Signed-off-by: Yuan Tang <[email protected]>

* fix build

Signed-off-by: Yuan Tang <[email protected]>

* fix codegen

Signed-off-by: Yuan Tang <[email protected]>

* revert uri

Signed-off-by: Yuan Tang <[email protected]>

* int_contents

Signed-off-by: Yuan Tang <[email protected]>

* Remove unused script

Signed-off-by: Yuan Tang <[email protected]>

* remove dockerfile

Signed-off-by: Yuan Tang <[email protected]>

* Empty-Commit

Signed-off-by: Yuan Tang <[email protected]>

* Empty-Commit

Signed-off-by: Yuan Tang <[email protected]>

* Empty-Commit

Signed-off-by: Yuan Tang <[email protected]>

---------

Signed-off-by: Yuan Tang <[email protected]>
* Auto-format all Python files

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Use black for linting

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Don't run poetry check on root pyproject.toml

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Re-add flake8 linting

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Fix linting errors

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Add python path

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Fix linting

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Fix circular dependency

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Fix circular dependency

Signed-off-by: Curtis Maddalozzo <[email protected]>

---------

Signed-off-by: Curtis Maddalozzo <[email protected]>
…serve#3558)

* support model revision and tokenizer revision

Signed-off-by: Lize Cai <[email protected]>

* point to specified commit in test case

Signed-off-by: Lize Cai <[email protected]>

* format code

Signed-off-by: Lize Cai <[email protected]>

---------

Signed-off-by: Lize Cai <[email protected]>
Signed-off-by: Dan Sun <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
[RHOAIENG-5073] - Routing and Headless Service Support in KServe Raw …
* OpenAI data models and endpoints from vLLM

Signed-off-by: Tessa Pham <[email protected]>

* more components for OpenAI endpoints

Signed-off-by: Tessa Pham <[email protected]>

* add OpenAI endpoints to router

Signed-off-by: Tessa Pham <[email protected]>

* modify generate() in data plane

Signed-off-by: Tessa Pham <[email protected]>

* class OpenAIModel

Signed-off-by: Tessa Pham <[email protected]>

* delete and rename files

Signed-off-by: Tessa Pham <[email protected]>

* add create_chat_completion() to OpenAIModel

Signed-off-by: Tessa Pham <[email protected]>

* update routers and lint

Signed-off-by: Tessa Pham <[email protected]>

* Implement streaming

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Add tests for OpenAI data conversion

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Register OpenAI endpoints when appropriate

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Add comments

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Add tests for create_completion and create_chat_completion

Signed-off-by: Curtis Maddalozzo <[email protected]>

* Remove completion types from dataplane methods

Signed-off-by: Curtis Maddalozzo <[email protected]>

* WIP

Signed-off-by: Curtis Maddalozzo <[email protected]>

* fix lint errors

Signed-off-by: Tessa Pham <[email protected]>

* update poetry.lock

Signed-off-by: Tessa Pham <[email protected]>

* update poetry.lock files

Signed-off-by: Tessa Pham <[email protected]>

* add dependency

Signed-off-by: Tessa Pham <[email protected]>

* fix test

Signed-off-by: Tessa Pham <[email protected]>

* revert poetry.lock files

Signed-off-by: Tessa Pham <[email protected]>

* add .itermconfig to .gitignore

Signed-off-by: Tessa Pham <[email protected]>

* add docker-compose.yml to .gitignore

Signed-off-by: Tessa Pham <[email protected]>

* fix build error

Signed-off-by: Tessa Pham <[email protected]>

* fix function descriptions

Signed-off-by: Tessa Pham <[email protected]>

* increase limit for model decompression size

Signed-off-by: Tessa Pham <[email protected]>

* add license & autoformat

Signed-off-by: Tessa Pham <[email protected]>

* make openai dependency mandatory

Signed-off-by: Tessa Pham <[email protected]>

* openai dependency back to optional

Signed-off-by: Tessa Pham <[email protected]>

* fix openai module import error

Signed-off-by: Tessa Pham <[email protected]>

* fix JSON unmarshalling of headers

Signed-off-by: Tessa Pham <[email protected]>

* drop formatting changes in unrelated files

Signed-off-by: Tessa Pham <[email protected]>

* fix openai_is_available()

Signed-off-by: Tessa Pham <[email protected]>

* black reformat

Signed-off-by: Tessa Pham <[email protected]>

---------

Signed-off-by: Tessa Pham <[email protected]>
Signed-off-by: Curtis Maddalozzo <[email protected]>
Co-authored-by: Curtis Maddalozzo <[email protected]>
yuzisun and others added 23 commits October 9, 2024 14:02
* Fix local testing

Signed-off-by: Dan Sun <[email protected]>

* Fix codegen

Signed-off-by: Dan Sun <[email protected]>

---------

Signed-off-by: Dan Sun <[email protected]>
* Add a flag for automount serviceaccount

Signed-off-by: Jin Dong <[email protected]>

* Set default to false

Signed-off-by: Jin Dong <[email protected]>

* Default to true

Signed-off-by: Jin Dong <[email protected]>

* Fix test error

Signed-off-by: Jin Dong <[email protected]>

* Update openapi generated.go

Signed-off-by: Jin Dong <[email protected]>

* Fix python lint

Signed-off-by: Jin Dong <[email protected]>

* Fix config loading

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>
…ainer (kserve#3985)

* Do not set security context on the storage initializer from user container

Signed-off-by: Jin Dong <[email protected]>

* Add securityContext to the default storage container in the helm chart

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>
This adds the model container as an init-container to mitigate a race
condition that would happen if the model container is not present on the
cluster-node. The race condition happens if the cluster is able to fetch
and start the runtime container before the modelcar is fetched. This
would cause the runtime to terminate with an error.

By configuring the model container as an init-container the runtime
won't start until the modelcar is fetched. Although there is still the
risk of a race condition when the cluster schedules the runtime
container first, the pod should stabilize after a few restarts of the
runtime container and should either prevent a CrashLoopBackOff event on
the pod, or the crash event would finish quickly.

This improves compatibility with the runtimes which can now stay
agnostic to the modelcar implementation, until better techniques (like
native sidecars and OCI volume mounts) become mature.
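
The approach described above might look roughly like the following Python sketch over a pod spec dictionary. It is not the actual KServe implementation: the container names `modelcar` and `modelcar-init`, the function name, and the short-lived `echo` command are all assumptions made for illustration. The point is only that an init container using the same image forces the image pull to finish before the regular containers start.

```python
def add_modelcar_prefetch(pod_spec, modelcar_name="modelcar"):
    """Add an init container that reuses the modelcar image (hypothetical sketch)."""
    containers = pod_spec.get("containers", [])
    modelcar = next((c for c in containers if c.get("name") == modelcar_name), None)
    if modelcar is None:
        return pod_spec  # nothing to pre-fetch

    init_name = modelcar_name + "-init"
    init_containers = pod_spec.setdefault("initContainers", [])
    # Only add the init container once, so repeated calls do not duplicate it.
    if all(c.get("name") != init_name for c in init_containers):
        init_containers.append(
            {
                "name": init_name,
                "image": modelcar["image"],
                # Assumed command: exit right away, the image pull is what matters.
                "command": ["sh", "-c", "echo 'model image pulled'"],
            }
        )
    return pod_spec
```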

Signed-off-by: Edgar Hernández <[email protected]>
* Initial commit for headers passing issue

Signed-off-by: Andrews Arokiam <[email protected]>

* modifying the e2e test for rebase conflict

Signed-off-by: Andrews Arokiam <[email protected]>

* bug fix on unittest

Signed-off-by: Andrews Arokiam <[email protected]>

* review changes

Signed-off-by: Andrews Arokiam <[email protected]>

* fix for test failure

Signed-off-by: Andrews Arokiam <[email protected]>

* bug fix on e2e test

Signed-off-by: Andrews Arokiam <[email protected]>

* overriding the entrypoint of custom model images

Signed-off-by: Andrews Arokiam <[email protected]>

* custom response header

Signed-off-by: Andrews Arokiam <[email protected]>

* fix for unittest failure

Signed-off-by: Andrews Arokiam <[email protected]>

* added custom response headers in post process

Signed-off-by: Andrews Arokiam <[email protected]>

* added predict time latency in example response header

Signed-off-by: Andrews Arokiam <[email protected]>

* fix OOM

---------

Signed-off-by: Andrews Arokiam <[email protected]>
Co-authored-by: Dan Sun <[email protected]>
* security update

Signed-off-by: udai <[email protected]>

* adding sign off

Signed-off-by: udai <[email protected]>

---------

Signed-off-by: udai <[email protected]>
* temp commit

Signed-off-by: Jin Dong <[email protected]>

* python-release.sh

Signed-off-by: Jin Dong <[email protected]>

---------

Signed-off-by: Jin Dong <[email protected]>
…14-upgrade

Code sync with upstream, up to v0.14.

Signed-off-by: Edgar Hernández <[email protected]>
Signed-off-by: Edgar Hernández <[email protected]>
…4-upgrade

Code sync for upstream v0.14.0
Reduce E2Es dependency on CI environment

Some E2E code assumes the environment is GitHub because it refers to GitHub-specific variables. This PR focuses on the `kserve/custom-model-grpc` container image, so that no E2E Python code using this image references the `github_sha` variable.

Also, the `get_isvc_endpoint` utility function is slightly improved to use the scheme of the endpoint specified in the status of the InferenceService, rather than hard-coding plain-text HTTP. This adds compatibility with CI environments where the KServe ConfigMap has been configured with `urlScheme: https` for the Ingress.
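
The scheme handling described above could be sketched like this. It is a minimal illustration, not the real `get_isvc_endpoint` code: the helper names, the `http` fallback, and the example host are assumptions. The scheme is read from the URL reported in the status instead of being fixed to `http`.

```python
from urllib.parse import urlsplit


def scheme_and_host(status_url: str, default_scheme: str = "http"):
    """Split a status URL into (scheme, host), falling back to default_scheme."""
    parts = urlsplit(status_url)
    if parts.scheme and parts.netloc:
        return parts.scheme, parts.netloc
    # No scheme present: treat the whole string as a bare host name.
    return default_scheme, status_url


def build_url(status_url: str, path: str) -> str:
    """Build a request URL that keeps the scheme advertised in the status."""
    scheme, host = scheme_and_host(status_url)
    return f"{scheme}://{host}{path}"
```

For example, `build_url("https://sklearn.example.com", "/v1/models/sklearn:predict")` keeps `https`, while a bare host name falls back to the default scheme.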

Signed-off-by: Edgar Hernández <[email protected]>
* remove patch for webhookconfiguration

Signed-off-by: jooho lee <[email protected]>

* comment out localmodel for now, this needs to be reverted

Signed-off-by: jooho lee <[email protected]>

* add dsc/dsci objects for e2e test

Signed-off-by: jooho lee <[email protected]>

* fix e2e test

Signed-off-by: jooho lee <[email protected]>

* follow up comments

Signed-off-by: jooho lee <[email protected]>

* fix e2e test after latest sync

Signed-off-by: jooho lee <[email protected]>

---------

Signed-off-by: jooho lee <[email protected]>
[Cherry-pick] To fix e2e test, cherry pick commits to 0.14 release branch
* Multi-Node Inference Implementation (kserve#3972)

Signed-off-by: jooho lee <[email protected]>

* fix lint and unit test for odh

Signed-off-by: jooho lee <[email protected]>

---------

Signed-off-by: jooho lee <[email protected]>
openshift-ci bot commented Nov 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jooho

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Nov 7, 2024
@Jooho Jooho force-pushed the rhoai_0.14_upgrade branch from e41fd40 to d3cb8fe Compare November 7, 2024 04:52

Jooho commented Nov 7, 2024

Closing this PR; a new PR will be created soon.

@Jooho Jooho closed this Nov 7, 2024