HYPERFLEET-752 | ci: Improve E2E CI Test deployment logic#51
HYPERFLEET-752 | ci: Improve E2E CI Test deployment logic#51yingzhanredhat wants to merge 1 commit intoopenshift-hyperfleet:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
WalkthroughAdds centralized debug-log capture and automated cleanup to CLM deploy scripts. Introduces DEBUG_LOG_DIR (default: ${PROJECT_ROOT}/.debug-work) and CLI flag Sequence DiagramsequenceDiagram
actor User
participant DeployScript as Deploy Script
participant Helm
participant Kubernetes
participant DebugLogs as Debug Log Capture
User->>DeployScript: Run install (optional --debug-log-dir)
DeployScript->>Helm: helm upgrade --install <release>-<rand>
alt Helm install succeeds
Helm-->>DeployScript: Release created
DeployScript->>Kubernetes: Run health check probe
alt Health check passes
Kubernetes-->>DeployScript: Healthy
DeployScript-->>User: Installation complete
else Health check fails
Kubernetes-->>DeployScript: Unhealthy
DeployScript->>DebugLogs: capture_debug_logs(namespace, selector, release, DEBUG_LOG_DIR)
DebugLogs->>Kubernetes: kubectl logs/describe/events/workloads/services...
Kubernetes-->>DebugLogs: Diagnostics collected
DebugLogs-->>DeployScript: Logs saved
DeployScript->>Helm: helm uninstall <release>-<rand> -n <ns> --wait --timeout 5m
Helm-->>DeployScript: Uninstall result
DeployScript-->>User: Installation failed
end
else Helm install fails
Helm-->>DeployScript: Install failed
DeployScript->>Helm: helm list -n <ns> --selector adapter-resource-type=...,adapter-name=... -q
alt Matching releases found
Helm-->>DeployScript: Releases listed
DeployScript->>DebugLogs: capture_debug_logs(namespace, selector, release, DEBUG_LOG_DIR)
DebugLogs->>Kubernetes: kubectl logs/describe/events/workloads/services...
DebugLogs-->>DeployScript: Logs saved
DeployScript->>Helm: helm uninstall <matching-release> -n <ns> --wait --timeout 5m
Helm-->>DeployScript: Uninstall result(s)
else No labeled releases
DeployScript->>Helm: fallback: list by release-name prefix
Helm-->>DeployScript: Releases listed
DeployScript->>Helm: helm uninstall <matching-release> -n <ns> --wait --timeout 5m
end
DeployScript-->>User: Installation failed
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deploy-scripts/deploy-clm.sh (1)
445-459: Debug log preservation logic may fail if DEBUG_LOG_DIR is outside WORK_DIR.If a user specifies
--debug-log-dir /var/log/hyperfleet(a path outsideWORK_DIR), this preservation logic would unnecessarily move logs to a temp directory and back. Worse, if the custom path doesn't exist initially, the check passes but subsequent operations might fail.Consider adding a check to only preserve when DEBUG_LOG_DIR is a subdirectory of WORK_DIR:
♻️ Proposed fix
# Clean up work directory (but preserve debug logs) if [[ "${DRY_RUN}" == "false" && "${VERBOSE}" == "false" ]]; then log_verbose "Cleaning up work directory" # Preserve debug logs if they exist - if [[ -d "${DEBUG_LOG_DIR}" ]]; then + if [[ -d "${DEBUG_LOG_DIR}" && "${DEBUG_LOG_DIR}" == "${WORK_DIR}"/* ]]; then local temp_debug_dir temp_debug_dir=$(mktemp -d) mv "${DEBUG_LOG_DIR}" "${temp_debug_dir}/debug-logs" 2>/dev/null || true rm -rf "${WORK_DIR}" mkdir -p "${WORK_DIR}" mv "${temp_debug_dir}/debug-logs" "${DEBUG_LOG_DIR}" 2>/dev/null || true rm -rf "${temp_debug_dir}" else rm -rf "${WORK_DIR}" fi fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy-scripts/deploy-clm.sh` around lines 445 - 459, The preservation logic for DEBUG_LOG_DIR may move or touch paths outside WORK_DIR; update the cleanup block to first resolve paths (e.g., realpath) and only perform the temp-preserve dance when DEBUG_LOG_DIR exists and its resolved path is under WORK_DIR's resolved path (starts-with check); if DEBUG_LOG_DIR is outside WORK_DIR or doesn't exist, skip moving it and simply rm -rf "${WORK_DIR}" as before; reference the DEBUG_LOG_DIR and WORK_DIR variables and the existing temp_debug_dir usage when making this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deploy-scripts/deploy-clm.sh`:
- Around line 445-459: The preservation logic for DEBUG_LOG_DIR may move or
touch paths outside WORK_DIR; update the cleanup block to first resolve paths
(e.g., realpath) and only perform the temp-preserve dance when DEBUG_LOG_DIR
exists and its resolved path is under WORK_DIR's resolved path (starts-with
check); if DEBUG_LOG_DIR is outside WORK_DIR or doesn't exist, skip moving it
and simply rm -rf "${WORK_DIR}" as before; reference the DEBUG_LOG_DIR and
WORK_DIR variables and the existing temp_debug_dir usage when making this
change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: de36bfd4-cfa9-4f04-a8bc-5b0f18af6067
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
|
/test lint |
Code reviewFound 1 issues:
hyperfleet-e2e/deploy-scripts/deploy-clm.sh Lines 446 to 460 in 1ae2387 🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎. |
1ae2387 to
4558a57
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deploy-scripts/lib/common.sh`:
- Around line 195-201: The debug-log capture currently allows empty WORK_DIR and
ignores failures from directory creation and capture commands; update the logic
around the output_dir and capture steps so failures are propagated: validate
that WORK_DIR (used when computing output_dir) is non-empty before defaulting,
check the return code of mkdir -p for "${output_dir}" and exit non-zero with an
error log via log_section/processLogger on failure, and similarly add explicit
error checks for the capture commands referenced around lines 259-264 so any
failed capture causes the function to log the error and return a non-zero exit
code instead of silently succeeding.
- Around line 202-205: The current timestamp used to build log_prefix (variables
timestamp and log_prefix, referencing output_dir and component_name) has only
second-level granularity and can collide; update timestamp generation to be
collision-resistant by including higher-resolution time and a unique process
identifier (e.g., use date +"%Y%m%d-%H%M%S-%N" and append $$ or similar) or use
a safe unique generator (mktemp/uuid) and then rebuild
log_prefix="${output_dir}/${component_name}-${timestamp}" so concurrent runs
cannot overwrite each other.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3d925e03-58fb-4db5-bc75-505579580ac6
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (4)
- deploy-scripts/deploy-clm.sh
- deploy-scripts/lib/api.sh
- deploy-scripts/lib/sentinel.sh
- deploy-scripts/lib/adapter.sh
4558a57 to
bec5898
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deploy-scripts/deploy-clm.sh`:
- Around line 101-103: The help text for the --debug-log-dir flag is
inconsistent with the actual default
(DEBUG_LOG_DIR="${DEBUG_LOG_DIR:-${PROJECT_ROOT}/.debug-work}"); update the
documented default in the option/usage text for --debug-log-dir to
"${PROJECT_ROOT}/.debug-work" so the help matches the implementation (or
alternatively change the DEBUG_LOG_DIR assignment to use ${WORK_DIR}/debug-logs
if you prefer the documented path); ensure you modify the help string that
references ${WORK_DIR}/debug-logs and keep the flag name --debug-log-dir and
variable DEBUG_LOG_DIR in sync.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 446ca419-15e2-44b0-98da-c957b9c3b2ba
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (3)
- deploy-scripts/lib/api.sh
- deploy-scripts/lib/common.sh
- deploy-scripts/lib/sentinel.sh
bec5898 to
e17776f
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
deploy-scripts/lib/adapter.sh (1)
110-112: Inefficient/dev/urandomread pattern.
head /dev/urandomreads until a newline character, but/dev/urandomrarely produces newlines, causing it to buffer a large chunk (up to 64KB) before piping totr. This wastes entropy and CPU cycles.♻️ Proposed fix
# Generate random suffix to prevent namespace conflicts local random_suffix - random_suffix=$(head /dev/urandom | LC_ALL=C tr -dc 'a-z0-9' | head -c 8) + random_suffix=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 8)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deploy-scripts/lib/adapter.sh` around lines 110 - 112, Replace the unbounded "head /dev/urandom" pattern used to generate random_suffix with a fixed-byte read from /dev/urandom so you only consume the entropy you need; update the random_suffix assignment (the random_suffix variable initialization) to read a small, fixed number of bytes (e.g., one block) from /dev/urandom and then filter to [a-z0-9] and cut to 8 characters, rather than piping an open-ended head, to avoid buffering a large chunk and wasting CPU/entropy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@deploy-scripts/lib/adapter.sh`:
- Around line 110-112: Replace the unbounded "head /dev/urandom" pattern used to
generate random_suffix with a fixed-byte read from /dev/urandom so you only
consume the entropy you need; update the random_suffix assignment (the
random_suffix variable initialization) to read a small, fixed number of bytes
(e.g., one block) from /dev/urandom and then filter to [a-z0-9] and cut to 8
characters, rather than piping an open-ended head, to avoid buffering a large
chunk and wasting CPU/entropy.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7f8e5c39-9c43-4c14-b084-787bdb685d6d
📒 Files selected for processing (5)
deploy-scripts/deploy-clm.shdeploy-scripts/lib/adapter.shdeploy-scripts/lib/api.shdeploy-scripts/lib/common.shdeploy-scripts/lib/sentinel.sh
🚧 Files skipped from review as they are similar to previous changes (1)
- deploy-scripts/lib/sentinel.sh
Summary by CodeRabbit
New Features
Improvements
Bug Fixes