108 changes: 108 additions & 0 deletions ansible/roles/host-ocp4-assisted-installer/README.md
@@ -0,0 +1,108 @@
# host-ocp4-assisted-installer

- Provisions an OpenShift cluster using Assisted Installer and KubeVirt VMs running on an existing OpenShift/Kubernetes cluster.
- Handles SNO (Single Node OpenShift, workers=0) and Full (control plane + workers) topologies.
- Creates the KubeVirt VMs and the Assisted Installer cluster and infra-env, injects manifests, configures DNS and Services, waits for host readiness, starts the installation, and downloads credentials and artifacts.

## Flow
```mermaid
flowchart TD
subgraph Local
A1[Determine oc client URL<br/>stable vs specific] --> A2[Download and install oc]
A2 --> A3[Login via openshift_auth<br/>optional]
end

subgraph "KubeVirt/k8s"
K1{workers > 0?}
K2S[SNO: Create Svc LoadBalancer]
K2M[Create masters LB Service]
K2W[Create workers LB Service]
K3[Create OVN secondary network<br/>if ai_install_use_network not set]
K4[Create PVC for installation ISO]
K5[Create Control Plane VMs<br/>+ etcd disk, extra disks]
K6[Create Worker VMs<br/>if workers > 0]
K7[Delete failed installer Pods<br/>label app=installer]
end

subgraph DNS
DNSSEL{DNS provider?}
DNS_NS[Create A records via nsupdate]
DNS_R53[Create A records via Route53]
end

subgraph "Assisted Installer API"
AI1[create_cluster]
AI1a[Set HA mode - Full or None]
AI2[Upload manifests:<br/>etcd disk, router replicas, network config]
AI2b{OCP >= 4.14?}
AI2c[Upload sysctl manifests<br/>control-plane and workers]
AI3[Upload custom MachineConfigs<br/>optional]
AI4[create_infra_env]
AI5[wait_for_hosts]
AI6[install_cluster - async]
AI7[get_credentials +<br/>download_credentials/files]
end

subgraph "Local render"
L1[Generate MACs for NICs and attached networks]
L2[Render static_network_config from template]
L3[Setup ~/.kube for ansible_user and root]
L4[Setup student kubeconfig<br/>optional]
L5[Install oc bash completion]
L6[Print cluster info and user messages]
end

A3 --> K1
K1 -- no --> K2S --> DNSSEL
K1 -- yes --> K2M --> K2W --> DNSSEL
DNSSEL -- nsupdate --> DNS_NS --> K3
DNSSEL -- Route53 --> DNS_R53 --> K3
K3 --> AI1 --> AI1a --> AI2 --> AI2b
AI2b -- yes --> AI2c --> AI3
AI2b -- no --> AI3
AI3 --> L1 --> L2 --> AI4 --> K4 --> K5 --> K6 --> AI5 --> AI6 --> AI7 --> L3 --> L4 --> L5 --> K7 --> L6
```
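
The `OCP >= 4.14?` branch in the flow above can be expressed with Ansible's built-in `version` test. This is only a sketch: the task name and message are illustrative, and the condition the role actually uses is not shown in this README.

```yaml
# Illustrative only: gate a step on the cluster version.
# Quote versions in inventory ("4.20"), otherwise YAML parses 4.20 as the float 4.2.
- name: Decide whether sysctl manifests apply
  ansible.builtin.debug:
    msg: "Upload sysctl manifests for control-plane and workers"
  when: ai_cluster_version is version('4.14', '>=')
```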

## Requirements
- Access to an OpenShift cluster API (`sandbox_openshift_api_url`) with KubeVirt and LoadBalancer services available.
- Assisted Installer credentials and pull secret.
- DNS management via either `nsupdate` (dynamic DNS updates) or AWS Route53, depending on which one is configured.
- Collections used: `kubernetes.core`, `kubevirt.core`, `community.general`, `amazon.aws`, `rhpds.assisted_installer`.

## Variables
Required (must be provided by inventory/group vars):
- `ocp4_installer_version`: OpenShift version (e.g. `4.13` or `4.13.21`).
- `ocp4_ai_pull_secret` or `ai_pull_secret`: Pull secret JSON string/object.
- `ocp4_ai_offline_token` or `ai_offline_token`: Assisted Installer offline token.
- `sandbox_openshift_api_url`: API endpoint of the management cluster running KubeVirt.
- `sandbox_openshift_username`/`sandbox_openshift_password` or `sandbox_openshift_api_key`: For API auth.
- `cluster_name` and `cluster_dns_zone`: Base cluster FQDN components.

Common defaults you may override (see `defaults/main.yml`):
- `ai_cluster_version` (defaults to `ocp4_installer_version`, falling back to `4.13`)
- `ai_cluster_iso_type` (e.g. `minimal-iso`)
- `ai_ocp_namespace` (defaults to `<env_type>-<guid>`)
- `ai_ocp_vmname_master_prefix`, `ai_ocp_vmname_worker_prefix`
- `ai_storage_class`, `ai_local_storageclass`
- `ai_network_prefix`, `ai_service_network_cidr`, `ai_cluster_network_cidr`, `ai_network_mtu`
- `ai_control_plane_cores`, `ai_control_plane_memory`, `ai_workers_cores`, `ai_workers_memory`
- MAC/attached networks lists: `ai_masters_macs*`, `ai_workers_macs*`, `ai_attach_*_networks`, `ai_attach_*_macs`
- Extra disks: `ai_masters_extra_disks`, `ai_workers_extra_disks`
- Output and SSH: `ai_ocp_output_dir`, `ai_ssh_authorized_key`
- Optional: `ai_machineconfigs` (array of MachineConfig objects to upload)
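
As a sketch of what `ai_machineconfigs` might contain, assuming each list item is a complete `MachineConfig` object as the description suggests. The object name and the kernel argument are hypothetical placeholders.

```yaml
ai_machineconfigs:
  - apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      name: 99-worker-example-kargs        # hypothetical name
      labels:
        machineconfiguration.openshift.io/role: worker
    spec:
      kernelArguments:
        - loglevel=7                       # hypothetical kernel argument
```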

Other inputs used by tasks (set by your inventory/parent role):
- `master_instance_count`, `worker_instance_count`
- `env_type`, `guid`, `ansible_user`, `student_name`, `install_student_user`
- DNS (optional): `cluster_dns_server`, `cluster_dns_port`, `cluster_dns_zone`, `ddns_key_name`, `ddns_key_secret`
- Route53 (optional): `route53_aws_zone_id`, `route53_aws_access_key_id`, `route53_aws_secret_access_key`
- Optional network override: `ai_install_use_network`
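
The two DNS options could be supplied as group vars roughly as follows. All values are placeholders, the `vault_*` variable names are hypothetical, and how the role chooses between the providers is not described in this README, so treat the either/or layout as an assumption.

```yaml
# Option A: dynamic DNS updates via nsupdate
cluster_dns_server: 192.0.2.53
cluster_dns_port: 53
cluster_dns_zone: example.com
ddns_key_name: example-key
ddns_key_secret: "{{ vault_ddns_key_secret }}"

# Option B: AWS Route53 (assumed to be used instead of Option A)
# route53_aws_zone_id: "Z123456789"
# route53_aws_access_key_id: "{{ vault_route53_access_key_id }}"
# route53_aws_secret_access_key: "{{ vault_route53_secret_access_key }}"
```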

## Outputs
- Downloads to `{{ ai_ocp_output_dir }}/{{ cluster_name }}/`: kubeconfig(s), kubeadmin-password, ignition files, install-config and custom manifests.
- Writes user info messages with console/API URLs and client download link.
- Configures `/home/{{ ansible_user }}/.kube/config` and `/root/.kube/config`.
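
A follow-up task could use the downloaded kubeconfig to check the new cluster. This sketch assumes the file is named `kubeconfig` inside the output directory, which the README does not state explicitly.

```yaml
- name: List nodes of the new cluster using the downloaded kubeconfig
  kubernetes.core.k8s_info:
    # File name is an assumption; adjust to the actual downloaded artifact.
    kubeconfig: "{{ ai_ocp_output_dir }}/{{ cluster_name }}/kubeconfig"
    api_version: v1
    kind: Node
  register: r_nodes

- name: Show node names
  ansible.builtin.debug:
    msg: "{{ r_nodes.resources | map(attribute='metadata.name') | list }}"
```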

## Notes
- When `worker_instance_count == 0`, the role configures SNO and only creates the SNO Service and DNS.
- If `ai_machineconfigs` is provided, each item is uploaded as an Assisted Installer custom manifest.
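
A playbook invoking the role for an SNO deployment might look like the sketch below. Every value is a placeholder, the `bastion` host group and the `vault_*` variables are hypothetical, and `master_instance_count: 1` is an assumption about what SNO needs in addition to `worker_instance_count: 0`.

```yaml
- hosts: bastion
  gather_facts: false
  roles:
    - role: host-ocp4-assisted-installer
      vars:
        guid: abc123
        env_type: demo
        cluster_name: ocp-abc123
        cluster_dns_zone: example.com
        ocp4_installer_version: "4.13"
        ocp4_ai_pull_secret: "{{ lookup('ansible.builtin.file', 'pull-secret.json') }}"
        ocp4_ai_offline_token: "{{ vault_ai_offline_token }}"
        sandbox_openshift_api_url: https://api.mgmt.example.com:6443
        sandbox_openshift_api_key: "{{ vault_sandbox_api_key }}"
        # SNO topology: no workers (single control-plane node assumed)
        master_instance_count: 1
        worker_instance_count: 0
```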
210 changes: 210 additions & 0 deletions ansible/roles/host-ocp4-hcp-cnv-install/README.md
@@ -0,0 +1,210 @@
host-ocp4-hcp-cnv-install
=========================

Purpose
-------
Provision an OpenShift 4 Hosted Control Plane (HCP) cluster on OpenShift Virtualization (CNV/KubeVirt) using HyperShift APIs in a management cluster. The role:

- Creates TLS for OAuth via cert-manager `ClusterIssuer`
- Creates pull secret and infra kubeconfig secret for HyperShift
- Optionally configures `htpasswd` identity provider and generates users
- Creates `HostedCluster` and `NodePool` (KubeVirt platform)
- Waits for admin kubeconfig and makes it available locally
- Exposes apps via a LoadBalancer Service and creates a Route53 wildcard DNS record
- Copies kubeconfig to `/home/<ansible_user>/.kube/config` and `/root/.kube/config`
- Optionally sets up a student user kubeconfig
- Prints/saves access information via `agnosticd_user_info`

Requirements
------------
- Management OpenShift cluster with HyperShift installed (CRDs: `HostedCluster`, `NodePool`)
- OpenShift Virtualization (CNV/KubeVirt) enabled in the management cluster
- cert-manager installed with a `ClusterIssuer` matching `hcp_cluster_issuer`
- A DNS zone managed in Route53 if DNS automation is desired
- Access to the management cluster via username/password or API token
- Ansible collections in execution context: `kubernetes.core`, `community.okd`, `community.general`, `amazon.aws`
- Python virtualenv at `/opt/virtualenvs/k8s` with Kubernetes client libs (the role sets `ansible_python_interpreter` to this path)

Behavior Overview
-----------------
1. Downloads `oc` client matching the requested OCP version
2. Authenticates to the management cluster (username/password or API token)
3. Resolves `ocp_release_image` from ClusterImageSets for `hcp_cluster_version` (supports `x.y` or `x.y.z`)
4. Applies templates to create:
   - cert-manager `Certificate` for the OAuth endpoint
   - pull secret for the release image
   - infra kubeconfig `Secret` used by HyperShift
   - optional `htpasswd` Secret and identity provider
   - `HostedCluster` and `NodePool` (KubeVirt)
5. Waits for the hosted admin kubeconfig, writes it to `/tmp/kubeconfig`, and copies it to the user home directories
6. Creates a LoadBalancer Service targeting the worker NodePorts and adds a Route53 wildcard `A` record
7. Grants cluster-admin to the configured admin user and prints/saves access data
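
Step 3 could be approximated with the tasks below. This is a sketch, not the role's actual implementation: it assumes `ClusterImageSet` objects come from the `hive.openshift.io/v1` API and expose `spec.releaseImage`, that cluster authentication is already provided through the environment or a kubeconfig, and the fact names `r_image_sets` and `matching_images` are hypothetical. The substring match is loose (for example `4.16.1` would also match `4.16.10`), so a real implementation would want a stricter pattern.

```yaml
- name: List ClusterImageSets on the management cluster
  kubernetes.core.k8s_info:
    api_version: hive.openshift.io/v1
    kind: ClusterImageSet
  register: r_image_sets

- name: Collect release images whose tag starts with the requested version
  ansible.builtin.set_fact:
    matching_images: >-
      {{ r_image_sets.resources
         | map(attribute='spec.releaseImage')
         | select('search', ':' ~ (hcp_cluster_version | string | regex_escape))
         | list
         | community.general.version_sort }}

- name: Fail early when no image set matches
  ansible.builtin.assert:
    that: matching_images | length > 0
    fail_msg: "No ClusterImageSet found for {{ hcp_cluster_version }}"

- name: Use the highest matching release image
  ansible.builtin.set_fact:
    ocp_release_image: "{{ matching_images | last }}"
```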

Diagram
-------
```mermaid
flowchart TD
A[Start role] --> B[Set ansible_python_interpreter /opt/virtualenvs/k8s]
B --> C{ocp4_installer_version format}
C -->|x.y.z| D[Set GA client URL - exact]
C -->|x.y| E[Set GA client URL - stable]
D --> F[Download oc client]
E --> F
F --> G[Authenticate to mgmt cluster]
G --> H{Resolve ocp_release_image
from ClusterImageSets}
H --> I[Create OAuth Certificate]
I --> J[Wait Certificate Ready]
J --> K[Create pull secret]
K --> L[Create infra kubeconfig Secret]
L --> M{hcp_authentication == htpasswd?}
M -->|yes| N[Generate users + htpasswd Secret]
M -->|no| O[Read kubeadmin password Secret]
N --> P[Create HostedCluster - KubeVirt]
O --> P
P --> Q[Create NodePool]
Q --> R[Wait hosted admin kubeconfig]
R --> S[Write /tmp/kubeconfig and copy to users]
S --> T[Expose apps via LB Service]
T --> U[Create Route53 wildcard A record]
U --> V[Grant cluster-admin to admin user]
V --> W[Print and save access info]
```
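
The version-format branch at the top of the diagram could be sketched as follows. The default mirror root, the URL layout, and the archive name are assumptions based on the public OpenShift client mirror; the role's real defaults for `ocp4_installer_root_url` are not shown here, and the underscore-prefixed variables are hypothetical helpers.

```yaml
- name: Choose the oc client URL from the version format (illustrative)
  vars:
    # x.y.z has three dot-separated parts; x.y has two
    _is_exact: "{{ (ocp4_installer_version | string).split('.') | length == 3 }}"
    _channel: "{{ ocp4_installer_version if _is_exact | bool else 'stable-' ~ ocp4_installer_version }}"
    _root: "{{ ocp4_installer_root_url | default('https://mirror.openshift.com/pub/openshift-v4/clients') }}"
  ansible.builtin.set_fact:
    oc_client_url: "{{ _root }}/ocp/{{ _channel }}/openshift-client-linux.tar.gz"
```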

Sequence Diagram
----------------
```mermaid
sequenceDiagram
participant R as Role
participant M as Mgmt OCP API
participant CM as cert-manager
participant HS as HyperShift Operator
participant KV as KubeVirt CNV
participant DNS as Route53

R->>M: Apply Certificate for OAuth
M->>CM: Reconcile Certificate
CM-->>M: Secret oauth-<guid> ready

R->>M: Create pull secret and infra kubeconfig secret

R->>M: Create HostedCluster
M->>HS: Reconcile HostedCluster
HS->>M: Deploy hosted control-plane pods
HS-->>M: Admin kubeconfig secret available
R->>M: Read admin kubeconfig secret
R-->>R: Write /tmp/kubeconfig and copy to users

R->>M: Create NodePool
M->>HS: Reconcile NodePool
HS->>KV: Provision worker VMs

R->>M: Create LoadBalancer Service for apps
M-->>R: Service ingress IP available
R->>DNS: Create wildcard A record
R-->>R: Save and print access info
```

Role Variables (defaults)
------------------------
Defined in `defaults/main.yaml`:

- `num_users` (int, default `1`): Number of non-admin users to create when `htpasswd` auth is enabled
- `hcp_ssh_authorized_key` (string): SSH public key for node access
- `hcp_cluster_name` (string, default `guid`): Cluster name
- `hcp_cluster_version` (string, default `4.16`): OCP version (`x.y` or `x.y.z`)
- `hcp_storage_class` (string, default `ocs-external-storagecluster-ceph-rbd`): Default storage class name
- `hcp_etcd_storage_class` (string, default `hcp_storage_class`): etcd storage class
- `hcp_ocp_namespace` (string): Namespace for hosted cluster resources (defaults to `<env_type>-<guid>`)
- `hcp_cluster_issuer` (string, default `letsencrypt-production-ec2`): cert-manager ClusterIssuer name
- `hcp_admin_password_length` (int, default `16`)
- `hcp_user_password_length` (int, default `16`)
- `hcp_user_base` (string, default `user`)
- `hcp_admin_user` (string, default `admin`)
- `hcp_user_passwords` (list, default `[]`): Populated when generating random passwords
- `hcp_admin_password` (string, default empty): If provided, used as admin password
- `hcp_user_password` (string, default empty): If provided, used for all non-admin users
- `hcp_enable_user_info_messages` (bool, default `true`): Print user info messages
- `hcp_enable_user_info_data` (bool, default `true`): Save user info data
- `hcp_controller_availability_policy` (string, default `SingleReplica`)
- `hcp_authentication` (string, default `htpasswd`): Set to `htpasswd` to configure the htpasswd identity provider; any other value makes the role use the kubeadmin password Secret instead
- `hcp_etcd_pvc_size` (string, default `8Gi`)
- `hcp_worker_cores` (int, default `16`)
- `hcp_worker_memory` (string, default `32Gi`)
- `hcp_worker_root_volume_size` (string, default `100Gi`)
- `hcp_worker_instance_count` (int): Number of worker replicas when autoscaling is disabled
- `hcp_worker_autoscale` (bool, default `false`)
- `hcp_worker_instance_min_count` (int, default `3`)
- `hcp_worker_instance_max_count` (int, default `5`)
- `hcp_quay_api_url` (string): Used to query release tags for tooling (informational)
- `hcp_disable_storage_class` (bool, default `false`): When true, sets storage driver to `None` in HostedCluster
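
As an illustration of overriding these defaults, the following vars enable worker autoscaling with smaller VMs and htpasswd users sharing one password. The values are placeholders and `vault_hcp_user_password` is a hypothetical vaulted variable.

```yaml
hcp_cluster_version: "4.16"        # quoted so YAML keeps it a string
hcp_worker_cores: 8
hcp_worker_memory: 16Gi
hcp_worker_autoscale: true
hcp_worker_instance_min_count: 2
hcp_worker_instance_max_count: 4
hcp_authentication: htpasswd
num_users: 3
hcp_user_password: "{{ vault_hcp_user_password }}"
```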

Additional required/expected variables
--------------------------------------
- `guid` (string): Unique environment identifier
- `cluster_dns_zone` (string): Base DNS zone (e.g., `example.com`)
- `ocp4_pull_secret` (object or string): Pull secret JSON (object or already-serialized string)
- `worker_instance_count` (int) when not using autoscaling
- `sandbox_openshift_api_url` (string): Management cluster API URL
- One of:
  - `sandbox_openshift_username` and `sandbox_openshift_password` (strings)
  - or `sandbox_hcp.sandbox_openshift_api_key` (string)
- `sandbox_hcp.sandbox_openshift_api_url` (string): Management API URL used by HyperShift module defaults
- `sandbox_hcp.sandbox_openshift_namespace` (string): Namespace where hosted resources are created
- `sandbox_hcp.sandbox_openshift_apps_domain` (string): Apps domain of the management cluster (used for OAuth certificate and named certs)
- Route53 (optional, for DNS automation):
  - `route53_aws_access_key_id`, `route53_aws_secret_access_key`, `route53_aws_zone_id`

Optional variables
------------------
- `install_student_user` (bool): If true, copies kubeconfig to `/home/<student_name>/.kube/config`
- `student_name` (string): Target username for student kubeconfig copy
- `ansible_user` (string): Used to place kubeconfig in `/home/<ansible_user>/.kube/config`
- `ocp4_installer_version` (string): `x.y` or `x.y.z`; determines client URL and release image selection
- `ocp4_installer_root_url` (string): Override clients mirror root if needed

Outputs and Side Effects
------------------------
- Resources created in the management cluster namespace `{{ sandbox_hcp.sandbox_openshift_namespace }}`:
  - `Certificate` `oauth-{{ guid }}`
  - `Secret` `hcp-{{ guid }}-pull-secret`
  - `Secret` `hcp-{{ guid }}-infra-credentials`
  - `Secret` `htpasswd-{{ guid }}` (when `hcp_authentication == 'htpasswd'`)
  - `HostedCluster` `hcp-{{ guid }}`
  - `NodePool` `hcp-{{ guid }}`
- LoadBalancer `Service` `svc-{{ guid }}-apps` and Route53 wildcard `A` record `*.apps.hcp-{{ guid }}.{{ cluster_dns_zone }}`
- Local files: kubeconfig copied to `/home/{{ ansible_user }}/.kube/config` and `/root/.kube/config`
- User info printed and stored via `agnosticd_user_info`
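
To confirm the hosted cluster listed above is up, a task along these lines could wait on its status. This is a sketch assuming the `hypershift.openshift.io/v1beta1` API version and an `Available` condition on the `HostedCluster`; the timeout is an arbitrary placeholder.

```yaml
- name: Wait for the HostedCluster to report Available
  kubernetes.core.k8s_info:
    api_version: hypershift.openshift.io/v1beta1
    kind: HostedCluster
    name: "hcp-{{ guid }}"
    namespace: "{{ sandbox_hcp.sandbox_openshift_namespace }}"
    wait: true
    wait_condition:
      type: Available
      status: "True"
    wait_timeout: 1800   # seconds, placeholder
  register: r_hosted_cluster
```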

Example
-------
Playbook snippet:

```yaml
- hosts: bastion
  gather_facts: false
  roles:
    - role: host-ocp4-hcp-cnv-install
      vars:
        guid: abc123
        cluster_dns_zone: example.com
        hcp_cluster_version: "4.16"
        sandbox_openshift_api_url: https://api.mgmt.example.com:6443
        sandbox_hcp:
          sandbox_openshift_api_url: https://api.mgmt.example.com:6443
          sandbox_openshift_api_key: "<token>"
          sandbox_openshift_namespace: hcp-abc123
        ocp4_pull_secret: "{{ lookup('file', 'pull-secret.json') | from_json }}"
        hcp_authentication: htpasswd
        num_users: 5
        route53_aws_access_key_id: "AKIA..."
        route53_aws_secret_access_key: "..."
        route53_aws_zone_id: "Z123456789"
```

Notes
-----
- `hcp_cluster_version` may be specified as `x.y.z` (exact) or `x.y` (latest matching `ClusterImageSet` will be selected). The role fails early if a suitable image set is not found.
- If `hcp_authentication != 'htpasswd'`, the role reads the kubeadmin password from `hcp-{{ guid }}-kubeadmin-password` Secret.
- DNS automation via Route53 is optional; omit the Route53 variables to disable it.
- To remove a hosted cluster, delete the `HostedCluster` and `NodePool` objects (or use a corresponding destroy role if available).
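
A manual teardown along the lines of the last note might look like the sketch below. It assumes the `hypershift.openshift.io/v1beta1` API version and the object names listed under Outputs; it does not remove the Secrets, the LoadBalancer Service, or the Route53 record, and the timeout is a placeholder.

```yaml
- name: Delete the NodePool and HostedCluster (illustrative teardown)
  kubernetes.core.k8s:
    state: absent
    api_version: hypershift.openshift.io/v1beta1
    kind: "{{ item }}"
    name: "hcp-{{ guid }}"
    namespace: "{{ sandbox_hcp.sandbox_openshift_namespace }}"
    wait: true
    wait_timeout: 1800   # seconds, placeholder
  loop:
    - NodePool
    - HostedCluster
```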