
Commit

update gitignore file
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
ArangoGutierrez committed Apr 16, 2024
1 parent e53056c commit 9a780e5
Showing 8 changed files with 415 additions and 1,095 deletions.
108 changes: 58 additions & 50 deletions README.md
@@ -25,10 +25,7 @@
- [Updating Per-Node Configuration With a Node Label](#updating-per-node-configuration-with-a-node-label)
+ [Setting other helm chart values](#setting-other-helm-chart-values)
+ [Deploying with gpu-feature-discovery for automatic node labels](#deploying-with-gpu-feature-discovery-for-automatic-node-labels)
<!--
TODO: We are still in the process of migrating GFD to this repo. Once this is ready we can uncomment this section.
+ [Deploying gpu-feature-discovery in standalone mode](#deploying-gpu-feature-discovery-in-standalone-mode)
-->
* [Deploying via `helm install` with a direct URL to the `helm` package](#deploying-via-helm-install-with-a-direct-url-to-the-helm-package)
- [Building and Running Locally](#building-and-running-locally)
- [Changelog](#changelog)
@@ -42,6 +39,8 @@ The NVIDIA device plugin for Kubernetes is a Daemonset that allows you to automa
- Run GPU enabled containers in your Kubernetes cluster.

This repository contains NVIDIA's official implementation of the [Kubernetes device plugin](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/).
As of v0.15.0 this repository also holds the implementation for GPU Feature Discovery labels;
for further information on GPU Feature Discovery, see [here](docs/gpu-feature-discovery/README.md).
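For example, once the plugin is deployed, a pod can consume a GPU by requesting the `nvidia.com/gpu` resource. A minimal sketch (the image tag and sample workload are illustrative):

```shell
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda-container
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # illustrative CUDA base image
    command: ["nvidia-smi"]                      # print the GPU visible to the container
    resources:
      limits:
        nvidia.com/gpu: 1                        # request a single GPU
  tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
EOF
```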

Please note that:
- The NVIDIA device plugin API is beta as of Kubernetes v1.10.
@@ -559,11 +558,11 @@ $ helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
$ helm repo update
```

Then verify that the latest release (`v0.14.5`) of the plugin is available:
Then verify that the latest release (`v0.15.0`) of the plugin is available:
```
$ helm search repo nvdp --devel
NAME                         CHART VERSION  APP VERSION  DESCRIPTION
nvdp/nvidia-device-plugin    0.14.5         0.14.5       A Helm chart for ...
nvdp/nvidia-device-plugin    0.15.0         0.15.0       A Helm chart for ...
```

Once this repo is updated, you can begin installing packages from it to deploy
@@ -574,7 +573,7 @@ The most basic installation command without any options is then:
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--namespace nvidia-device-plugin \
--create-namespace \
--version 0.14.5
--version 0.15.0
```

**Note:** You only need to pass the `--devel` flag to `helm search repo`
@@ -583,7 +582,7 @@ version (e.g. `<version>-rc.1`). Full releases will be listed without this.

### Configuring the device plugin's `helm` chart

The `helm` chart for the latest release of the plugin (`v0.14.5`) includes
The `helm` chart for the latest release of the plugin (`v0.15.0`) includes
a number of customizable values.

Prior to `v0.12.0` the most commonly used values were those that had direct
@@ -593,7 +592,7 @@ case of the original values is then to override an option from the `ConfigMap`
if desired. Both methods are discussed in more detail below.

The full set of values that can be set can be found
[here](https://github.com/NVIDIA/k8s-device-plugin/blob/v0.14.5/deployments/helm/nvidia-device-plugin/values.yaml).
[here](https://github.com/NVIDIA/k8s-device-plugin/blob/v0.15.0/deployments/helm/nvidia-device-plugin/values.yaml).

#### Passing configuration to the plugin via a `ConfigMap`.

@@ -632,7 +631,7 @@ EOF
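As a sketch, a config file such as `/tmp/dp-example-config0.yaml` might contain options like the following (the values are illustrative, drawn from the plugin's config-file format):

```shell
$ cat <<EOF > /tmp/dp-example-config0.yaml
version: v1
flags:
  migStrategy: "none"        # no MIG devices expected on these nodes
  failOnInitError: true      # fail hard if the plugin cannot initialize
  plugin:
    passDeviceSpecs: false
    deviceListStrategy: envvar
    deviceIDStrategy: uuid
EOF
```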
And deploy the device plugin via helm (pointing it at this config file and giving it a name):
```
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set-file config.map.config=/tmp/dp-example-config0.yaml
@@ -654,7 +653,7 @@ $ kubectl create cm -n nvidia-device-plugin nvidia-plugin-configs \
```
```
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set config.name=nvidia-plugin-configs
@@ -682,7 +681,7 @@ EOF
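As a sketch, a second config file such as `/tmp/dp-example-config1.yaml` could, for example, enable time-slicing (the field values are illustrative):

```shell
$ cat <<EOF > /tmp/dp-example-config1.yaml
version: v1
flags:
  migStrategy: "none"
sharing:
  timeSlicing:
    resources:
    - name: nvidia.com/gpu
      replicas: 5            # advertise each physical GPU as 5 shared replicas
EOF
```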
And redeploy the device plugin via helm (pointing it at both configs with a specified default).
```
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set config.default=config0 \
@@ -701,7 +700,7 @@ $ kubectl create cm -n nvidia-device-plugin nvidia-plugin-configs \
```
```
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set config.default=config0 \
@@ -784,7 +783,7 @@ chart values that are commonly overridden are:
```

Please take a look in the
[`values.yaml`](https://github.com/NVIDIA/k8s-device-plugin/blob/v0.14.5/deployments/helm/nvidia-device-plugin/values.yaml)
[`values.yaml`](https://github.com/NVIDIA/k8s-device-plugin/blob/v0.15.0/deployments/helm/nvidia-device-plugin/values.yaml)
file to see the full set of overridable parameters for the device plugin.

Examples of setting these options include:
@@ -793,7 +792,7 @@ Enabling compatibility with the `CPUManager` and running with a request for
100ms of CPU time and a limit of 512MB of memory.
```shell
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set compatWithCPUManager=true \
@@ -804,7 +803,7 @@ Enabling compatibility with the `CPUManager` and the `mixed` `migStrategy`
Enabling compatibility with the `CPUManager` and the `mixed` `migStrategy`
```shell
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set compatWithCPUManager=true \
@@ -823,7 +822,7 @@ Discovery to perform this labeling.
To enable it, simply set `gfd.enabled=true` during helm install.
```
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version=0.14.5 \
--version=0.15.0 \
--namespace nvidia-device-plugin \
--create-namespace \
--set gfd.enabled=true
@@ -865,8 +864,7 @@ product name, e.g.:
```
nvidia.com/gpu.product = A100-SXM4-40GB-MIG-1g.5gb-SHARED
```
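To verify the labels that gpu-feature-discovery has applied, one option is to inspect a node directly (the node name is a placeholder):

```shell
$ kubectl describe node <node-name> | grep "nvidia.com/"
```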
<!--
TODO: We are still in the process of migrating GFD to this repo. Once this is ready we can uncomment this section.

#### Deploying gpu-feature-discovery in standalone mode

As of `v0.15.0`, the device plugin's helm chart has integrated support to deploy
@@ -893,6 +891,7 @@ Once this repo is updated, you can begin installing packages from it to deploy
the `gpu-feature-discovery` component in standalone mode.

The most basic installation command without any options is then:

```
$ helm upgrade -i nvdp nvdp/nvidia-device-plugin \
--version 0.15.0 \
@@ -929,31 +928,31 @@ Using the default values for the flags:
$ helm upgrade -i nvdp \
--namespace nvidia-device-plugin \
--create-namespace \
https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.14.5.tgz
https://nvidia.github.io/k8s-device-plugin/stable/nvidia-device-plugin-0.15.0.tgz
```
-->

## Building and Running Locally

The next sections are focused on building the device plugin locally and running it.
It is intended purely for development and testing, and not required by most users.
It assumes you are pinning to the latest release tag (i.e. `v0.14.5`), but can
It assumes you are pinning to the latest release tag (i.e. `v0.15.0`), but can
easily be modified to work with any available tag or branch.

### With Docker

#### Build
Option 1, pull the prebuilt image from [Docker Hub](https://hub.docker.com/r/nvidia/k8s-device-plugin):
```shell
$ docker pull nvcr.io/nvidia/k8s-device-plugin:v0.14.5
$ docker tag nvcr.io/nvidia/k8s-device-plugin:v0.14.5 nvcr.io/nvidia/k8s-device-plugin:devel
$ docker pull nvcr.io/nvidia/k8s-device-plugin:v0.15.0
$ docker tag nvcr.io/nvidia/k8s-device-plugin:v0.15.0 nvcr.io/nvidia/k8s-device-plugin:devel
```

Option 2, build without cloning the repository:
```shell
$ docker build \
-t nvcr.io/nvidia/k8s-device-plugin:devel \
-f deployments/container/Dockerfile.ubuntu \
https://github.com/NVIDIA/k8s-device-plugin.git#v0.14.5
https://github.com/NVIDIA/k8s-device-plugin.git#v0.15.0
```

Option 3, if you want to modify the code:
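As a sketch, this option amounts to cloning the repository and building with the same Dockerfile used in Option 2 (the image tag mirrors the `devel` tag above):

```shell
$ git clone https://github.com/NVIDIA/k8s-device-plugin.git
$ cd k8s-device-plugin
$ docker build \
    -t nvcr.io/nvidia/k8s-device-plugin:devel \
    -f deployments/container/Dockerfile.ubuntu \
    .
```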
@@ -1015,29 +1014,38 @@ See the [changelog](CHANGELOG.md)
* You can report a bug by [filing a new issue](https://github.com/NVIDIA/k8s-device-plugin/issues/new)
* You can contribute by opening a [pull request](https://help.github.com/articles/using-pull-requests/)

## Documentation

- [Quick Start](docs/quick_start.md)
* [Prerequisites](docs/quick_start.md#prerequisites)
* [Preparing your GPU Nodes](docs/quick_start.md#preparing-your-gpu-nodes)
* [Node Feature Discovery (NFD)](docs/quick_start.md#node-feature-discovery-nfd)
* [Enabling GPU Support in Kubernetes](docs/quick_start.md#enabling-gpu-support-in-kubernetes)
* [Running GPU Jobs](docs/quick_start.md#running-gpu-jobs)
- [Configuring the NVIDIA device plugin binary](docs/customizing.md)
* [As command line flags or envvars](docs/customizing.md#as-command-line-flags-or-envvars)
* [As a configuration file](docs/customizing.md#as-a-configuration-file)
* [Configuration Option Details](docs/customizing.md#configuration-option-details)
* [Shared Access to GPUs with CUDA Time-Slicing](docs/customizing.md#shared-access-to-gpus-with-cuda-time-slicing)
- [Deployment via `helm`](docs/deployment_via_helm.md)
* [Configuring the device plugin's `helm` chart](docs/deployment_via_helm.md#configuring-the-device-plugins-helm-chart)
+ [Passing configuration to the plugin via a `ConfigMap`.](docs/deployment_via_helm.md#passing-configuration-to-the-plugin-via-a-configmap)
- [Single Config File Example](docs/deployment_via_helm.md#single-config-file-example)
- [Multiple Config File Example](docs/deployment_via_helm.md#multiple-config-file-example)
- [Updating Per-Node Configuration With a Node Label](docs/deployment_via_helm.md#updating-per-node-configuration-with-a-node-label)
+ [Setting other helm chart values](docs/deployment_via_helm.md#setting-other-helm-chart-values)
+ [Deploying with gpu-feature-discovery for automatic node labels](docs/deployment_via_helm.md#deploying-with-gpu-feature-discovery-for-automatic-node-labels)
* [Deploying via `helm install` with a direct URL to the `helm` package](docs/deployment_via_helm.md#deploying-via-helm-install-with-a-direct-url-to-the-helm-package)
- [Building and Running Locally](docs/building_and_running.md)
- [GPU Feature Discovery CMD](docs/gfd_cmd.md)
- [GPU Feature Discovery Labels](docs/gfd_labels.md)
- [Changelog](CHANGELOG.md)
### Versioning

Before v1.10 the versioning scheme of the device plugin had to match exactly the version of Kubernetes.
After the promotion of device plugins to beta this condition was no longer required.
We quickly noticed that this versioning scheme was very confusing for users as they still expected to see
a version of the device plugin for each version of Kubernetes.

This versioning scheme applies to the tags `v1.8`, `v1.9`, `v1.10`, `v1.11`, `v1.12`.

We have now changed the versioning to follow [SEMVER](https://semver.org/). The
first version following this scheme has been tagged `v0.0.0`.

Going forward, the major version of the device plugin will only change
following a change in the device plugin API itself. For example, version
`v1beta1` of the device plugin API corresponds to version `v0.x.x` of the
device plugin. If a new `v2beta2` version of the device plugin API comes out,
then the device plugin will increase its major version to `1.x.x`.

As of now, the device plugin API for Kubernetes >= v1.10 is `v1beta1`. If you
have a version of Kubernetes >= 1.10 you can deploy any device plugin version >
`v0.0.0`.

### Upgrading Kubernetes with the Device Plugin

Upgrading Kubernetes when you have a device plugin deployed doesn't require you
to make any particular changes to your workflow. The API is versioned and is
pretty stable (though it is not guaranteed to be non-breaking). Starting with
Kubernetes version 1.10, you can use `v0.3.0` of the device plugin to perform
upgrades, and Kubernetes won't require you to deploy a different version of the
device plugin. Once a node comes back online after the upgrade, you will see
GPUs re-registering themselves automatically.
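One way to confirm that the GPUs have re-registered is to check each node's allocatable resources, for example:

```shell
$ kubectl get nodes "-o=custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu"
```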

Upgrading the device plugin itself is a more complex task. It is recommended to
drain GPU tasks as we cannot guarantee that GPU tasks will survive a rolling
upgrade. However, we make best efforts to preserve GPU tasks during an upgrade.
79 changes: 0 additions & 79 deletions docs/building_and_running.md

This file was deleted.
