@emosbaugh
Member

@emosbaugh emosbaugh commented Nov 7, 2025

Summary

This PR implements the headless install orchestrator for Embedded Cluster v3, enabling automated installations without interactive UI prompts. The orchestrator coordinates the full installation flow including host preflights, infrastructure setup, airgap processing, application configuration, app preflights, and final application installation.

Changes

API & Client Enhancements

  • New Host Preflights Endpoints: Added dedicated endpoints for running and checking host preflight status during headless install
    • POST /api/linux/install/host-preflights/run
    • GET /api/linux/install/host-preflights/status
  • New Airgap Processing Endpoints: Added endpoints for processing airgap bundles during installation
    • POST /api/linux/install/airgap/process
    • GET /api/linux/install/airgap/status
  • Enhanced App Install: Updated app installation endpoint to support ignoring app preflights via request body
  • Client Interface Updates: Extended API client with new methods for host preflights and airgap processing
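The run/status endpoint pairs above suggest a start-then-poll pattern. A minimal sketch of the polling half follows, assuming a status payload with a state and a description. The `Status` type, the state names, and `pollUntilDone` are hypothetical and are not taken from this PR's client; the fetch function stands in for a GET against one of the status endpoints.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Status is a hypothetical view of what a status endpoint reports.
// The state values used below ("Running", "Succeeded", "Failed") are assumptions.
type Status struct {
	State       string
	Description string
}

// pollUntilDone calls fetch until it reports a terminal state
// or the attempt budget is exhausted.
func pollUntilDone(fetch func() (Status, error), interval time.Duration, maxAttempts int) (Status, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		st, err := fetch()
		if err != nil {
			return Status{}, fmt.Errorf("get status: %w", err)
		}
		switch st.State {
		case "Succeeded":
			return st, nil
		case "Failed":
			return st, errors.New("operation failed: " + st.Description)
		}
		// Any other state is treated as "still in progress".
		time.Sleep(interval)
	}
	return Status{}, errors.New("timed out waiting for a terminal state")
}

func main() {
	// Stand-in for GET /api/linux/install/host-preflights/status.
	calls := 0
	fetch := func() (Status, error) {
		calls++
		if calls < 3 {
			return Status{State: "Running"}, nil
		}
		return Status{State: "Succeeded"}, nil
	}
	st, err := pollUntilDone(fetch, time.Millisecond, 10)
	fmt.Println(st.State, err, calls)
}
```

Injecting the fetch function keeps the loop testable without a running API server.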

Controller Architecture

  • Dependency Injection: Added Kubernetes client and metadata client support across all controllers
    • api/api.go:124,138: Added WithKubeClient() and WithMetadataClient() options
    • Updated install/upgrade controllers for both Linux and Kubernetes targets
  • Manager Updates: Infrastructure managers now receive kube and metadata clients for better cluster interaction

Headless Install Orchestrator

  • Mock Client: Updated mock API client with new host preflights and airgap methods for testing
  • Installation Flow: Orchestrator coordinates the complete headless installation workflow:
    1. Installation configuration
    2. Host preflights
    3. Infrastructure setup
    4. Airgap processing (if applicable)
    5. Application configuration
    6. App preflights
    7. Application installation
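The seven-step flow above can be sketched as an ordered list of steps that stops at the first failure and skips airgap processing for online installs. This is an illustration only: `step`, `headlessSteps`, `runSteps`, and `errStepFailed` are hypothetical names, not the orchestrator's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// errStepFailed is a placeholder error used by the demo in main.
var errStepFailed = errors.New("step failed")

// step is one stage of the (hypothetical) headless flow.
type step struct {
	name string
	skip bool
	run  func() error
}

// headlessSteps builds the ordered steps from the PR description.
// Airgap processing is skipped unless airgap is true.
func headlessSteps(airgap bool, run func(name string) error) []step {
	names := []string{
		"Installation configuration",
		"Host preflights",
		"Infrastructure setup",
		"Airgap processing",
		"Application configuration",
		"App preflights",
		"Application installation",
	}
	steps := make([]step, 0, len(names))
	for _, n := range names {
		name := n // capture a per-iteration copy for the closure
		steps = append(steps, step{
			name: name,
			skip: name == "Airgap processing" && !airgap,
			run:  func() error { return run(name) },
		})
	}
	return steps
}

// runSteps executes steps in order and returns the names that completed.
// It stops at the first error and wraps it with the failing step's name.
func runSteps(steps []step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if s.skip {
			continue
		}
		if err := s.run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.name, err)
		}
		done = append(done, s.name)
	}
	return done, nil
}

func main() {
	// Online install where the app preflights step fails.
	done, err := runSteps(headlessSteps(false, func(name string) error {
		if name == "App preflights" {
			return errStepFailed
		}
		return nil
	}))
	fmt.Println(done, err)
}
```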

Dry Run Support

  • Enhanced Dry Run Mode: Improved dry run support for development and testing
    • API now creates helm, kube, and metadata clients in dry run mode (cmd/installer/cli/api.go:112-130)
    • Preflights automatically succeed in dry run mode
    • Records preflight specs for validation

Other Changes

  • Gitignore Update: Removed exclusion of *.tgz files (helm chart dependencies now tracked)
  • Test Updates: Updated client tests to reflect new API request/response structures

Example Usage

$ embedded-cluster install \
  --target=linux \
  --headless \
  --admin-console-password password \
  --license license.yaml \
  --config-values config-values.yaml \
  --yes

No certificate files provided. A self-signed certificate will be used, and your browser will show a security warning.
To use your own certificate, provide both --tls-key and --tls-cert flags.

Continuing with a self-signed certificate...

✔  Application configuration complete
✔  Installation configuration complete
✔  Host preflights passed
✔  Infrastructure setup complete
✔  App preflights passed
✔  Application is ready

Installation completed successfully

Testing

  • Unit tests updated for new client methods and request structures (api/client/client_test.go)
  • Mock client implementation for headless orchestrator testing
  • Dry run mode enables local development without a cluster

TODO (for follow up PRs)

  • Make progress steps more fine-grained, like the v2 install: break down the infrastructure setup and application installation steps
  • Additional dryrun tests as specified in the headless_install.md proposal
  • Minimal E2E test coverage

Related

Part of v3 headless install implementation work.

@github-actions

github-actions bot commented Nov 7, 2025

This PR has been released (on staging) and is available for download with an embedded-cluster-smoke-test-staging-app license ID.

Online Installer:

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci/appver-dev-4f2a1b0" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Airgap Installer (may take a few minutes before the airgap bundle is built):

curl "https://staging.replicated.app/embedded/embedded-cluster-smoke-test-staging-app/ci-airgap/appver-dev-4f2a1b0?airgap=true" -H "Authorization: $EC_SMOKE_TEST_LICENSE_ID" -o embedded-cluster-smoke-test-staging-app-ci.tgz

Happy debugging!

@emosbaugh emosbaugh force-pushed the emosbaugh/sc-130867/headless-install-headless-orchestrator-implementation-2 branch from 2e00122 to 2fc9e43 Compare November 7, 2025 14:23
@emosbaugh emosbaugh force-pushed the emosbaugh/sc-130867/headless-install-headless-orchestrator-implementation-2 branch from 7b94ab5 to 86be12d Compare November 7, 2025 19:29

// --- validate registry --- //
expectedRegistryIP := "10.2.128.11" // lower band index 10
validateCustomCIDR(t, &dr, hcli)
Member Author

this was moved to install_common.go and is used for both v2 and v3 tests

@emosbaugh emosbaugh marked this pull request as ready for review November 7, 2025 19:53
}

if dryrun.Enabled() {
hcli, err := helm.NewClient(helm.HelmOptions{})
Member

shouldn't this be coming from the dryrun package?

Member Author

seems to work as is

Member Author

it uses the factory pattern and, for now, returns a helm client singleton when dryrun is enabled. this will not work for tests running in parallel but it's something we can fix later.

Member

i'm sure it'll work because dryrun overrides globals. but it's still confusing. any reason not to expose the dryrun's helm client via a "HelmClient" public method like the dryrun.KubeClient method below?


func TestV3InstallHeadless_HappyPath(t *testing.T) {
licenseFile, configFile := setupV3HeadlessTest(t)
func TestV3InstallHeadless_HappyPathAirgap(t *testing.T) {
Member

so this test doesn't validate anything? just that there is no error? feels like it should validate many things similar to the v2 tests that have airgap bundles

Member Author

there are no airgap tests in v2. if you look in the TODO section of the description, I intend to add more dryrun tests, but it will drag this pull request out, especially if I'm chasing full coverage.

Member Author

i'm going to validate airgap things like the registry addon and certain preflight checks...

Member Author

added

var (
//go:embed assets/real-license.yaml
realLicenseData string
func TestV3InstallHeadless_HappyPathOnline(t *testing.T) {
Member

feels like these functions should be validating more than just "no error was returned"

Member Author

It seems like that should be up to the targeted tests. What would you have this validate?

Member Author

ok, i'm going to validate non-airgap things like the registry addon and certain preflight checks...

Member Author

added

k0s.Set(client.K0sClient)
k0s.SetClientFactory(func() k0s.K0sInterface {
return &K0s{}
})

Bug: Factory creates broken objects, causing crashes.

The k0s.SetClientFactory returns an uninitialized dryrun.K0s struct with a nil k0s field. When any method like Install, NewK0sConfig, WriteK0sConfig, or PatchK0sConfig is called on the returned instance, it will panic with a nil pointer dereference because these methods call c.k0s.Method(...). The factory should initialize the k0s field with new(k0s.K0s) like the direct initialization does on line 76.
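A minimal sketch of the failure mode described in the comment above, using stand-in types rather than the real k0s and dryrun packages. All names here (`realK0s`, `dryrunK0s`, `brokenFactory`, `fixedFactory`, `callInstall`, and the binary path) are hypothetical. In Go, a method call through a nil pointer panics only once the method dereferences its receiver, so the stand-in inner type reads a field to make the panic observable; whether the real methods dereference their receiver is not shown in this conversation.

```go
package main

import "fmt"

// realK0s stands in for the concrete client that the wrapper delegates to.
type realK0s struct {
	binaryPath string
}

// Install reads a field, so calling it on a nil *realK0s panics.
func (r *realK0s) Install() string {
	return "installing with " + r.binaryPath
}

// dryrunK0s stands in for the dry run wrapper, which forwards calls to an inner client.
type dryrunK0s struct {
	k0s *realK0s
}

func (c *dryrunK0s) Install() string {
	return c.k0s.Install()
}

// brokenFactory mirrors the pattern flagged in the review: the inner field stays nil.
func brokenFactory() *dryrunK0s {
	return &dryrunK0s{}
}

// fixedFactory initializes the inner client before returning the wrapper.
func fixedFactory() *dryrunK0s {
	return &dryrunK0s{k0s: &realK0s{binaryPath: "/usr/local/bin/k0s"}}
}

// callInstall converts a panic into an error so both factories can be compared.
func callInstall(c *dryrunK0s) (msg string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	return c.Install(), nil
}

func main() {
	_, err := callInstall(brokenFactory())
	fmt.Println("broken factory error:", err)

	msg, err := callInstall(fixedFactory())
	fmt.Println("fixed factory:", msg, err)
}
```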

found := false
extraEnvValue := operatorOpts.Values["extraEnv"]
switch extraEnv := extraEnvValue.(type) {
case []any:
Member Author

something is wrong here that needs to be fixed

Member

i think it's the yaml marshal / unmarshal that's ruining this, and that only happens when end user overrides are specified maybe? either way, maybe we should create a helper function so this logic isn't duplicated everywhere
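One possible shape for the helper suggested above, as a sketch only. It assumes `extraEnv` arrives either as a typed `[]map[string]any` (values built in code) or as `[]any` holding maps (the form a YAML round trip into `map[string]any` typically produces). The function name `envVarValue` and the variable name `EXAMPLE_VAR` are hypothetical.

```go
package main

import "fmt"

// envVarValue looks up an env var by name in a Helm values "extraEnv" entry,
// accepting both the typed and the untyped slice shape.
func envVarValue(values map[string]any, name string) (string, bool) {
	var entries []map[string]any
	switch extraEnv := values["extraEnv"].(type) {
	case []map[string]any:
		entries = extraEnv
	case []any:
		for _, item := range extraEnv {
			// Ignore items that are not maps instead of failing.
			if m, ok := item.(map[string]any); ok {
				entries = append(entries, m)
			}
		}
	}
	for _, e := range entries {
		if n, ok := e["name"].(string); ok && n == name {
			v, ok := e["value"].(string)
			return v, ok
		}
	}
	return "", false
}

func main() {
	typed := map[string]any{
		"extraEnv": []map[string]any{{"name": "EXAMPLE_VAR", "value": "1"}},
	}
	untyped := map[string]any{
		"extraEnv": []any{map[string]any{"name": "EXAMPLE_VAR", "value": "1"}},
	}
	fmt.Println(envVarValue(typed, "EXAMPLE_VAR"))
	fmt.Println(envVarValue(untyped, "EXAMPLE_VAR"))
}
```

Normalizing both shapes in one place would let each test call the helper instead of repeating the type switch.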

sgalsaleh previously approved these changes Nov 10, 2025
@emosbaugh emosbaugh enabled auto-merge (squash) November 10, 2025 19:24
@emosbaugh emosbaugh merged commit b55a4f7 into main Nov 10, 2025
98 of 100 checks passed
@emosbaugh emosbaugh deleted the emosbaugh/sc-130867/headless-install-headless-orchestrator-implementation-2 branch November 10, 2025 21:19
crdant pushed a commit that referenced this pull request Nov 12, 2025
* feat(v3): headless install orchestrator implementation

* f

* f

* f

* feedback

* feedback

* feedback

* f

* f

* f

* f

* f

* f

* f

* f

* f

* f