Skip to content

Conversation

@printminion-co
Copy link

@printminion-co printminion-co commented Nov 11, 2025

🎯 Overview

This PR introduces a major optimization to the Nextcloud Workspace build workflow by implementing intelligent cache-based detection that only rebuilds apps when their code has changed. This significantly reduces CI/CD build times and resource consumption.

📊 Impact

Before

  • Every build: All 25+ external apps were rebuilt from scratch
  • Typical build time: 30-45 minutes for full pipeline
  • Resource usage: High compute and network bandwidth

After

  • Smart caching: Only apps with changed code are rebuilt
  • Typical build time: 5-15 minutes (depending on changes)
  • Cache hit rate: 80-95% for typical PRs (only 1-5 apps changed)
  • Resource savings: ~70-90% reduction in build time for incremental changes

🏗️ Architecture

Cache Key Strategy

Cache Key Format: <CACHE_VERSION>-app-build-<APP_NAME>-<GIT_SHA>

Example: v1.0-app-build-mail-a1b2c3d4e5f6g7h8
  • CACHE_VERSION: Environment variable for cache invalidation
  • APP_NAME: Unique app identifier
  • GIT_SHA: Git commit SHA of the app submodule

When to increment CACHE_VERSION:

  • Node.js version changes
  • PHP version changes
  • Build scripts modified
  • Major dependency updates

Workflow Flow

graph TD
    A[prepare-matrix] --> B{Generate Matrix}
    B --> C[Detect Cache Status]
    C --> D{Check Each App}
    D --> E[apps_to_build]
    D --> F[apps_to_restore]
    
    E --> G[build-external-apps]
    G --> H[Build Changed Apps]
    H --> I[Save to Cache]
    I --> J[Upload Artifacts]
    
    F --> K[restore-cached-apps]
    K --> L[Restore from Cache]
    L --> M[Validate Cache]
    M --> N[Upload Artifacts]
    
    J --> O[build-artifact]
    N --> O
    O --> P[Assemble Final Package]
Loading

Job Dependencies

prepare-matrix (always runs)
  ├── build-external-apps (conditional: if apps_to_build != [])
  └── restore-cached-apps (conditional: if apps_to_restore != [])
      └── build-artifact (depends on both completing successfully)

📁 Files Changed

File Changes Purpose
.github/workflows/build-artifact.yml 868 lines modified Optimized workflow with cache detection
.github/workflows/build-artifact-original.yml 924 lines added Preserved original workflow for reference

✅ Testing Instructions

Prerequisites

  • Access to GitHub Actions workflows
  • Permissions to trigger workflow runs
  • Understanding of git submodules

Test Scenario 1: Full Cold Build (No Cache)

This simulates the first build or after cache invalidation.

Steps:

  1. Increment CACHE_VERSION in .github/workflows/build-artifact.yml (e.g., from v1.0 to v1.1)
  2. Create a PR or push to ionos-dev branch
  3. Monitor the workflow run

Expected Results:

  • ✅ All 25+ apps appear in "Needs build" list
  • build-external-apps job runs with full matrix
  • restore-cached-apps job is skipped (no cached apps)
  • ✅ All apps are built and cached
  • ✅ Build time: ~30-45 minutes
  • ✅ Cache hit rate: 0%

Verification:

# Check workflow summary for cache status report
# Should show all apps with "🔨 Needs build" status

Test Scenario 2: Incremental Build (Change 1 App)

This tests the optimization when only a single app is modified.

Steps:

  1. Make a code change in one external app (e.g., apps-external/mail)
    cd apps-external/mail
    # Make a small change (e.g., update a comment in a file)
    git add .
    git commit -m "test: trigger rebuild of mail app"
    cd ../..
    git add apps-external/mail
    git commit -m "chore: update mail app submodule"
  2. Push the changes or create a PR
  3. Monitor the workflow run

Alternative: Test without code changes (delete cache manually)

# Get the SHA of the mail app submodule
cd apps-external/mail
MAIL_SHA=$(git rev-parse HEAD)
cd ../..

# Delete the cached build for mail app to force rebuild
gh cache delete "v1.0-app-build-mail-${MAIL_SHA}" --repo <owner>/<repo>

# Re-run the workflow - mail will be rebuilt while others remain cached

Expected Results:

  • ✅ Only 1 app (mail) appears in "Needs build" list
  • ✅ 24+ apps appear in "Cached" list
  • build-external-apps job runs with 1 app only
  • restore-cached-apps job runs with 24+ apps
  • ✅ Build time: ~5-10 minutes
  • ✅ Cache hit rate: ~96% (24/25)

Verification:

# Check workflow summary
# - "Apps needing build: 1"
# - "Apps with cached builds: 24+"
# - "Cache hit rate: ~96%"

# Check that build-external-apps only has 1 job
# Check that restore-cached-apps has 24+ jobs (all complete quickly)

Test Scenario 3: Multi-App Change

This tests multiple apps being changed simultaneously.

Steps:

  1. Make changes in 3 different apps:
    # Update mail app
    cd apps-external/mail
    git pull origin main
    cd ../..
    
    # Update calendar app
    cd apps-external/calendar
    git pull origin main
    cd ../..
    
    # Update deck app
    cd apps-external/deck
    git pull origin main
    cd ../..
    
    git add apps-external/mail apps-external/calendar apps-external/deck
    git commit -m "chore: update mail, calendar, deck submodules"
  2. Push the changes
  3. Monitor the workflow

Alternative: Test without code changes (delete caches manually)

# Get the SHAs of the app submodules
MAIL_SHA=$(git -C apps-external/mail rev-parse HEAD)
CALENDAR_SHA=$(git -C apps-external/calendar rev-parse HEAD)
DECK_SHA=$(git -C apps-external/deck rev-parse HEAD)

# Delete cached builds for these apps to force rebuild
gh cache delete "v1.0-app-build-mail-${MAIL_SHA}" --repo <owner>/<repo>
gh cache delete "v1.0-app-build-calendar-${CALENDAR_SHA}" --repo <owner>/<repo>
gh cache delete "v1.0-app-build-deck-${DECK_SHA}" --repo <owner>/<repo>

# Re-run the workflow - these 3 apps will be rebuilt while others remain cached

Expected Results:

  • ✅ 3 apps (mail, calendar, deck) in "Needs build" list
  • ✅ 22+ apps in "Cached" list
  • build-external-apps runs with 3 apps in parallel
  • restore-cached-apps runs with 22+ apps in parallel
  • ✅ Build time: ~10-15 minutes
  • ✅ Cache hit rate: ~88% (22/25)

Test Scenario 4: No Changes (Re-run)

This tests pure cache restoration when no code has changed.

Steps:

  1. Re-run an existing successful workflow
  2. Click "Re-run all jobs" in GitHub Actions UI

Expected Results:

  • ✅ All apps appear in "Cached" list
  • build-external-apps job is skipped entirely
  • restore-cached-apps job runs with all apps
  • ✅ Build time: ~3-5 minutes
  • ✅ Cache hit rate: 100%

Verification:

# Workflow summary should show:
# - "Apps needing build: 0"
# - "Apps with cached builds: 25+"
# - "🎉 All apps are cached! No builds needed."

Test Scenario 5: Cache Validation

This tests that cached builds are validated before use.

Steps:

  1. Manually corrupt a cache entry (if possible via GitHub CLI)
  2. Or wait for cache to expire and be deleted
  3. Trigger a build

Expected Results:

  • ✅ Validation step detects corrupted/missing cache
  • ✅ App is moved to "needs build" list automatically
  • ✅ Build completes successfully with fresh build
  • ✅ New cache is saved

Test Scenario 6: New App Addition

This tests adding a brand new app to the matrix.

Steps:

  1. Add a new app submodule (using the tools script):
    ../nc-docs-and-tools/tools/add_app_submodule.sh
    # Follow prompts to add new app
  2. Add build target to IONOS/Makefile
  3. Push changes

Expected Results:

  • ✅ New app appears in matrix automatically
  • ✅ New app is in "Needs build" list (no cache exists)
  • ✅ Existing apps remain cached
  • ✅ Build completes with new app built and cached

Test Scenario 7: Cache Invalidation

This tests the cache version bump mechanism.

Steps:

  1. Change CACHE_VERSION from v1.0 to v2.0 in workflow file
  2. Commit and push:
    git add .github/workflows/build-artifact.yml
    git commit -m "chore: bump cache version to v2.0"
    git push

Expected Results:

  • ✅ All apps appear in "Needs build" list (old cache keys ignored)
  • ✅ Full rebuild of all apps
  • ✅ New cache entries created with v2.0 prefix
  • ✅ Old v1.0 caches eventually garbage collected

Test Scenario 8: Parallel Build Performance

This tests the parallel execution of builds and restores.

Steps:

  1. Trigger a build with mixed changes (some apps cached, some not)
  2. Monitor the "Jobs" view in GitHub Actions

Expected Results:

  • ✅ Up to 20 build-external-apps jobs run concurrently
  • ✅ Up to 20 restore-cached-apps jobs run concurrently
  • ✅ Both job types run in parallel (not sequentially)
  • ✅ Total time is much less than sequential execution

Test Scenario 9: Artifact Assembly

This tests the final artifact creation with mixed built/restored apps.

Steps:

  1. Complete any of the incremental build scenarios above
  2. Wait for build-artifact job to complete
  3. Download the final artifact

Expected Results:

  • ✅ Final artifact contains all 25+ apps
  • ✅ Mix of freshly built and cache-restored apps
  • ✅ All apps are functional and complete
  • ✅ No missing files or broken builds
  • ✅ Artifact size is consistent with previous builds

Verification:

# Download and extract nc-workspace.zip
unzip nc-workspace.zip -d test-extract

# Verify all apps are present
ls test-extract/apps-external/

# Check a built app has compiled assets
ls test-extract/apps-external/mail/js/
ls test-extract/apps-external/mail/css/

# Check a restored app also has compiled assets
ls test-extract/apps-external/calendar/js/
ls test-extract/apps-external/calendar/css/

Test Scenario 10: Error Recovery with Retries

This tests the retry logic for artifact uploads.

Steps:

  1. Monitor a workflow run during periods of potential GitHub API instability
  2. Or simulate by checking logs for retry attempts

Expected Results:

  • ✅ Transient failures are automatically retried (up to 3 attempts)
  • ✅ Exponential backoff is applied (10s, 20s, 40s)
  • ✅ Build succeeds even with intermittent failures
  • ✅ Retry attempts are logged clearly

Test Scenario 11: Artifact Comparison (Old vs New Pipeline)

This critical test validates that the optimized pipeline produces identical artifacts to the original pipeline.

Steps:

  1. Trigger a build using the original workflow (full rebuild):

    # Manually trigger the original workflow or use the preserved version
    gh workflow run build-artifact-original.yml --ref <branch>
    
    # Wait for completion and download artifact
    gh run download <run-id> --name nc-workspace.zip --dir old-pipeline-artifact
  2. Trigger a build using the optimized workflow (with full cache invalidation):

    # Ensure full rebuild by incrementing cache version or deleting all caches
    # Then trigger the optimized workflow
    gh workflow run build-artifact.yml --ref <branch>
    
    # Wait for completion and download artifact
    gh run download <run-id> --name nc-workspace.zip --dir new-pipeline-artifact
  3. Compare the artifacts:

    # Extract both artifacts
    cd old-pipeline-artifact
    unzip nc-workspace.zip -d old-extract
    cd ..
    
    cd new-pipeline-artifact
    unzip nc-workspace.zip -d new-extract
    cd ..
    
    # Compare directory structures
    diff -r old-pipeline-artifact/old-extract new-pipeline-artifact/new-extract \
      --exclude=".git" \
      --exclude="node_modules" \
      --exclude=".cache" \
      --exclude="*.log" > artifact-diff.txt
    
    # Check file counts
    echo "Old pipeline file count:"
    find old-pipeline-artifact/old-extract -type f | wc -l
    
    echo "New pipeline file count:"
    find new-pipeline-artifact/new-extract -type f | wc -l
    
    # Compare specific critical files
    diff old-pipeline-artifact/old-extract/apps-external/mail/js/mail.js \
         new-pipeline-artifact/new-extract/apps-external/mail/js/mail.js
    
    # Check compiled asset sizes
    du -sh old-pipeline-artifact/old-extract/apps-external/*/js
    du -sh new-pipeline-artifact/new-extract/apps-external/*/js
  4. Compare file checksums for critical apps:

    # Generate checksums for old pipeline
    cd old-pipeline-artifact/old-extract/apps-external
    find . -type f -name "*.js" -o -name "*.css" | sort | xargs md5sum > ../../../old-checksums.txt
    cd ../../..
    
    # Generate checksums for new pipeline
    cd new-pipeline-artifact/new-extract/apps-external
    find . -type f -name "*.js" -o -name "*.css" | sort | xargs md5sum > ../../../new-checksums.txt
    cd ../../..
    
    # Compare checksums
    diff old-checksums.txt new-checksums.txt
  5. Functional validation:

    # Deploy both artifacts to test environments and verify:
    # - All apps are present and enabled
    # - App versions match
    # - Frontend assets load correctly
    # - No console errors
    
    # Check app versions in both deployments
    php occ app:list --output=json | jq '.enabled' > old-apps.json  # Old deployment
    php occ app:list --output=json | jq '.enabled' > new-apps.json  # New deployment
    diff old-apps.json new-apps.json

Expected Results:

  • ✅ Both pipelines produce identical file counts
  • ✅ No differences in directory structure
  • ✅ Compiled JavaScript/CSS files are byte-for-byte identical
  • ✅ File checksums match for all built assets
  • ✅ Both artifacts have same total size (±1%)
  • ✅ All apps function identically in both deployments
  • ✅ No missing or extra files in either artifact
  • ✅ appinfo/info.xml files are identical
  • ✅ Composer vendor directories match
  • ✅ No differences in configuration files

Acceptable Differences:
The following differences are acceptable and expected:

  • Timestamps in build logs or metadata files
  • Random UUIDs or session IDs in generated files
  • Source map file paths (if they reference absolute paths)
  • Build cache directories (should be excluded from comparison)

Critical Files to Verify:

# Verify these key files are identical:
apps-external/*/appinfo/info.xml
apps-external/*/js/*.js
apps-external/*/css/*.css
apps-external/*/lib/**/*.php
config/config.php
version.php

Troubleshooting:
If differences are found:

  1. Check if differences are in excluded directories (node_modules, .git)
  2. Verify both builds used same commit/submodule SHAs
  3. Check for timestamp-only differences
  4. Investigate non-deterministic build processes
  5. Ensure both builds used same Node.js/PHP versions

Success Criteria:

  • Zero functional differences between artifacts
  • All user-facing compiled assets (JS/CSS) are identical
  • All PHP backend code is identical
  • App configurations match exactly

🔍 Validation Checklist

For reviewers, please verify:

  • Cache Detection Logic

    • Apps are correctly identified as cached/not cached
    • SHA comparison is accurate
    • Cache keys are properly formatted
  • Build Execution

    • Only changed apps are rebuilt
    • Cached apps are restored correctly
    • All apps end up in final artifact
  • Performance

    • Incremental builds are significantly faster
    • Cache hit rate is reported accurately
    • Parallel execution works as expected
  • Error Handling

    • Invalid cache entries are detected
    • Missing apps are handled gracefully
    • Retry logic works for transient failures
  • Observability

    • Workflow summary shows cache status
    • Build reports are clear and helpful
    • Job dependencies are correct
  • Artifact Integrity

    • New pipeline produces identical artifacts to old pipeline
    • All compiled assets (JS/CSS) are byte-for-byte identical
    • No missing or corrupted files
    • File checksums match between pipelines
    • Functional testing passes for both artifacts

📈 Metrics to Monitor

After deployment, monitor these metrics:

  1. Build Time Reduction

    • Average build time for PRs
    • Build time for full rebuilds
    • Time savings percentage
  2. Cache Hit Rate

    • Overall cache hit rate
    • Per-app cache hit frequency
    • Cache invalidation frequency
  3. Resource Usage

    • GitHub Actions minutes consumed
    • Bandwidth for artifact transfers
    • Storage used for caches
  4. Reliability

    • Build failure rate
    • Retry frequency
    • Cache validation failures

🚀 Rollout Plan

Phase 1: Testing (Current)

  • Deploy to feature branch
  • Run comprehensive test scenarios
  • Gather feedback from team

Phase 2: Canary (After approval)

  • Merge to ionos-dev branch
  • Monitor for 1-2 weeks
  • Collect metrics

Phase 3: Production

  • Merge to ionos-stable branch
  • Full rollout to production builds
  • Continue monitoring

Rollback Plan

If issues are discovered:

  1. Revert to build-artifact-original.yml
  2. Rename original workflow back to build-artifact.yml
  3. Investigate and fix issues
  4. Re-test before re-deployment

Checklist

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch from fb01de5 to 331ca61 Compare November 11, 2025 16:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes the GitHub Actions build workflow by implementing cache-based detection to avoid rebuilding apps that haven't changed. The optimization checks for cached builds using each app's git SHA, only building apps when no cached version exists, significantly reducing build time through smart caching.

Key changes:

  • Added cache detection logic to identify apps that have already been built
  • Introduced separate jobs for building new apps vs. restoring cached apps
  • Added the password_policy app to the build matrix

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
IONOS Updated subproject commit reference
.github/workflows/build-artifact.yml Implemented cache-based app build detection, added cache save/restore logic, split build and cache restoration into separate jobs
.github/workflows/build-artifact-original.yml New file containing the original unoptimized workflow for reference

Copy link

Copilot AI commented Nov 12, 2025

@printminion-co I've opened a new pull request, #128, to work on those changes. Once the pull request is ready, I'll request review from you.

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch 4 times, most recently from 62e4dc9 to 81e70b8 Compare November 12, 2025 11:21
@printminion-co printminion-co changed the base branch from ionos-dev to ionos-donot-merge November 12, 2025 13:36
@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch from 30abc08 to 9460269 Compare November 12, 2025 13:52
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

Copy link

Copilot AI commented Nov 12, 2025

@printminion-co I've opened a new pull request, #130, to work on those changes. Once the pull request is ready, I'll request review from you.

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch 4 times, most recently from c833229 to 283955e Compare November 12, 2025 19:22
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch from 89b742e to acd2289 Compare November 13, 2025 12:01
Copy link

Copilot AI commented Nov 13, 2025

@printminion-co I've opened a new pull request, #131, to work on those changes. Once the pull request is ready, I'll request review from you.

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch from 047e72f to 09c736f Compare November 13, 2025 12:07
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 16 comments.

Comments suppressed due to low confidence (1)

.github/workflows/build-artifact.yml:405

  • The restore-cached-apps job doesn't include a checkout step before restoring the cache. While the cache action will restore files to the specified path, the subsequent upload-artifact step (line 399-405) references ${{ matrix.app.path }} which is a relative path.

Without a workspace context established by a checkout, this relative path resolution might not work as expected. The restored cache files will be uploaded, but the directory structure might not match what's expected in the build-artifact job.

Consider adding a minimal checkout step (even without submodules) to establish the workspace structure, or verify that the artifact upload correctly handles the restored files without a checkout context.

      - name: Restore cached build from cache
        uses: actions/cache/restore@v4
        with:
          path: ${{ matrix.app.path }}
          key: ${{ env.CACHE_VERSION }}-app-build-${{ matrix.app.name }}-${{ matrix.app.sha }}
          fail-on-cache-miss: true

      - name: Validate cached build
        run: |
          APP_PATH="${{ matrix.app.path }}"

          # Check that the directory exists and is not empty
          if [ ! -d "$APP_PATH" ] || [ -z "$(ls -A $APP_PATH)" ]; then
            echo "❌ Cache validation failed: Directory is empty or missing"
            exit 1
          fi

          # Check for appinfo/info.xml (required for all Nextcloud apps)
          if [ ! -f "$APP_PATH/appinfo/info.xml" ]; then
            echo "❌ Cache validation failed: Missing appinfo/info.xml"
            exit 1
          fi

          echo "✅ Cache validation passed for ${{ matrix.app.name }}"

      - name: Upload cached ${{ matrix.app.name }} build artifacts
        uses: actions/upload-artifact@v5
        with:
          retention-days: 1
          name: external-app-build-${{ matrix.app.name }}
          path: |
            ${{ matrix.app.path }}

@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch 2 times, most recently from bfd419e to 6afffe5 Compare November 13, 2025 14:25
@bromiesTM bromiesTM force-pushed the ionos-donot-merge branch 2 times, most recently from 413bc91 to 4ceb9f7 Compare November 20, 2025 14:28
seriAlizations added a commit that referenced this pull request Nov 25, 2025
seriAlizations added a commit that referenced this pull request Nov 25, 2025
@seriAlizations seriAlizations force-pushed the mk/poc/dont_rebuild_already_built_apps branch from 6963427 to 6afffe5 Compare November 25, 2025 14:38
@seriAlizations seriAlizations self-requested a review December 2, 2025 08:23
Copy link

@seriAlizations seriAlizations left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and compared new pipeline with Misha and LGTM, only problem I see is changes in IONOS config submodule not being registered as changes.

for later performance comparison

Signed-off-by: Misha M.-Kupriyanov <[email protected]>
in order not to overwrite optimized artifact

Signed-off-by: Misha M.-Kupriyanov <[email protected]>
lets use underscore

Signed-off-by: Misha M.-Kupriyanov <[email protected]>
drop validation since it is now Redundant (by design)

Signed-off-by: Misha M.-Kupriyanov <[email protected]>
Optimized build workflow that uses cache-based detection
 - Checks cache for each app's current SHA
 - Only builds apps with no cached build
 - Significantly reduces build time through smart caching

Signed-off-by: Misha M.-Kupriyanov <[email protected]>
@printminion-co printminion-co force-pushed the mk/poc/dont_rebuild_already_built_apps branch from 6afffe5 to 80e40e2 Compare December 2, 2025 12:38
@printminion-co printminion-co merged commit f4430d4 into ionos-donot-merge Dec 2, 2025
36 of 38 checks passed
@printminion-co printminion-co deleted the mk/poc/dont_rebuild_already_built_apps branch December 2, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants