Add IBM Cloud IKS docs #49
Draft: jacobtomlinson wants to merge 32 commits into rapidsai:main from jacobtomlinson:iks

Commits
bd050ee Create single-node (marifse)
2cc0942 Update single-node (marifse)
b0e48a7 Rename single-node to single-node.md (marifse)
91f8358 Update single-node.md (marifse)
68ece07 Update single-node.md (marifse)
bb7e96d Update single-node.md (marifse)
a263762 Create index.md (marifse)
f4f36d4 Update index.md (marifse)
e209dbe Update index.md (marifse)
ac4358e Create iks.md (marifse)
ca89675 Update iks.md (marifse)
ea4537a Update iks.md (marifse)
387530d Update iks.md (marifse)
89b342e Update iks.md (marifse)
2227523 Update iks.md (marifse)
99582b1 Update iks.md (marifse)
041333f Update iks.md (marifse)
d19704d Update iks.md (marifse)
225d05f Update iks.md (marifse)
fb4a85b Update iks.md (marifse)
aea3f04 Update iks.md (marifse)
0660489 Update iks.md (marifse)
26066a1 Update iks.md (marifse)
0383716 Delete source/cloud/IBM directory (marifse)
c21aede Create index.md (marifse)
e3dc7b6 Create single-node.md (marifse)
656cd0a Create iks.md (marifse)
fdda35d Update iks.md (marifse)
4848cd7 Fix linting (jacobtomlinson)
0a39e34 Formatting, layout and reuse of existing documentation (jacobtomlinson)
24f06cd Merge branch 'main' of https://github.com/rapidsai/deployment into ma… (jacobtomlinson)
cc9268a Merge branch 'main' of https://github.com/rapidsai/deployment into iks (jacobtomlinson)
@@ -0,0 +1,76 @@
# IBM Kubernetes Service (IKS)

RAPIDS can be deployed on IBM Cloud via the IBM Cloud managed Kubernetes service (IKS) using any of the [supported Kubernetes installation methods](../../platforms/kubernetes).

## Install pre-requisites

Install and configure the dependencies in your local environment: [kubectl](https://kubernetes.io/docs/tasks/tools/), [helm](https://helm.sh/), the [IBM Cloud CLI](https://cloud.ibm.com/docs/cli?topic=cli-getting-started) and the [IBM Kubernetes Service (KS) plugin](https://cloud.ibm.com/docs/containers?topic=containers-cs_cli_install).

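Before continuing, it can help to confirm that the command-line tools are on your `PATH`. The following is a minimal sketch, assuming a POSIX shell; the `missing_tools` helper name is hypothetical and is not part of any of these CLIs.

```shell
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Prints nothing when all three CLIs are installed.
missing_tools kubectl helm ibmcloud
```

The IKS plugin is not a separate binary, so this check does not cover it; `ibmcloud plugin list` shows the plugins that are installed.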
## Log in to the IBM Cloud CLI

```shell
$ ibmcloud login -a cloud.ibm.com -r <region>
$ ibmcloud target -g <resource group>
```

```{note}
You can list regions with `$ ibmcloud regions` and resource groups with `$ ibmcloud resource groups`.
```

## Create a Kubernetes cluster

```shell
$ ibmcloud ks cluster create classic \
  --name <CLUSTER_NAME> \
  --zone dal10 \
  --flavor gx2-8x64x1v100 \
  --hardware dedicated \
  --workers 1 \
  --version <kubernetes_version>
```

`<CLUSTER_NAME>` = Name of the IKS cluster. This will be auto generated if not specified. <br>
`<kubernetes_version>` = Kubernetes version. The tested version for this deployment is 1.21.14. <br>

Upon successful creation you will receive the cluster ID. Note it down, as it is required in the next step to connect to the cluster.

## Connect to the cluster

```shell
$ ibmcloud ks cluster config --cluster <cluster_id>
```

`<cluster_id>` = The cluster ID returned by the IBM KS CLI when the cluster was created.

## Install GPU drivers

```shell
$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
$ helm repo update
$ helm install --wait --generate-name \
  -n gpu-operator --create-namespace \
  nvidia/gpu-operator
```

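Once the chart is installed, the pods in the `gpu-operator` namespace should settle into `Running` or `Completed`. The following is a minimal sketch for spotting pods that have not, assuming the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...); the `pods_not_ready` helper is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin and
# print the name and status of every pod that is neither Running nor Completed.
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster (commented out so the sketch runs without one):
# kubectl get pods -n gpu-operator --no-headers | pods_not_ready
```

An empty result suggests the operator components have finished starting; any lines printed name the pods still initialising or failing.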
## Install RAPIDS

Follow any of the [Kubernetes installation methods to install and use RAPIDS](../../platforms/kubernetes).

## Delete the cluster

When you are finished, delete the Kubernetes cluster.

Before you delete the cluster, you need to manually delete any services running in the cluster with external IPs to release their network resources.

```shell
$ kubectl get svc --all-namespaces
$ kubectl delete svc <SERVICE_NAME>
```

`<SERVICE_NAME>` = Name of each service that has an `EXTERNAL-IP` value.

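Picking out the services with an external IP can be scripted. The following is a minimal sketch, assuming the default column layout of `kubectl get svc --all-namespaces --no-headers` (NAMESPACE, NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...); the `svcs_with_external_ip` helper is hypothetical.

```shell
# Hypothetical helper: read `kubectl get svc --all-namespaces --no-headers`
# output on stdin and print "<namespace> <name>" for every service whose
# EXTERNAL-IP column (5th field) is anything other than <none>.
svcs_with_external_ip() {
  awk '$5 != "<none>" { print $1, $2 }'
}

# Usage against a live cluster (commented out so the sketch runs without one):
# kubectl get svc --all-namespaces --no-headers | svcs_with_external_ip |
#   while read -r ns name; do kubectl delete svc -n "$ns" "$name"; done
```

Printing the namespace alongside the name matters because `kubectl delete svc` only looks in the current namespace unless `-n` is given. Services shown as `<pending>` are also listed, since a load balancer may already be allocated for them.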
Delete the cluster and its associated nodes.

```shell
$ ibmcloud ks cluster rm --cluster <cluster_name_or_ID>
```
I am unable to find any `gx2` flavor instances. Are these still available? Do we need to enable something specific on our account? cc @marifse
Hey @jacobtomlinson! Thank you for your response. To provision an IBM Kubernetes cluster, you can use the GPU-enabled mg4c.32x384.2xp100 (bare-metal server) instance, though it is billed per month rather than per hour, so be careful. For your information, gx2-8x64x1v100 is only available as a Virtual Server Instance, which is a virtual instance billed per hour. Thank you.
Thanks for the response @marifse.
I'm apprehensive about providing instructions on deploying a very expensive cluster that bills monthly 😅. Your original instructions used `gx2-8x64x1v100`, which I've just copied over here. Is there any way to get per-hour billing for GPUs on IKS?
We also need to test these instructions for every RAPIDS release (every two months) and right now this means launching two nodes and checking that everything runs. We can't really justify ~$1k per test because of a limitation around the billing period.
@marifse gentle nudge here, do you have any suggestions on how we can safely document IKS?
@jacobtomlinson sorry for the late reply. I have spoken to my contact at IBM: for now there is no option in IKS (IBM Kubernetes Service) to use a GPU-enabled instance without a bare-metal profile.
For testing the instructions for every RAPIDS release, he will work out a solution as soon as he is back at work. He is on leave for the next 15 days.
Thanks @marifse, I'll leave this as a draft in the meantime. The single-node docs are already merged and will be in the next stable release, which we hope to get out this week.