Conversation
Integrate HDSentinel into the shared drive-health flow so backup-drive health changes surface in the UI and alerts while staying testable in fake mode.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
This PR integrates HDSentinel into the existing drive-health flow so backup-drive health (via SMART + ML prediction and HDSentinel status) is surfaced in the UI, scheduled checks, and the summary API while remaining usable in fake mode.
Changes:
- Added a unified
drive_health.pymodule to centralize SMART telemetry/prediction plus HDSentinel snapshot/state + alerting. - Updated the Drive Health UI + endpoint handling to display HDSentinel status and allow toggling HDSentinel monitoring/health-change alerts.
- Switched the systemd health-check entrypoint from
check_health.shtocheck_health.pyand updated install/docs to support HDSentinel installation.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| templates/drive_health.html | Adds HDSentinel monitoring section + settings form and routes POST actions via form_action. |
| app.py | Uses the new shared drive-health module; adds HDSentinel data to UI flow and API summary. |
| drive_health.py | New shared implementation for SMART read/prediction/telemetry and HDSentinel snapshot + state + alerting. |
| scripts/check_health.py | New systemd-friendly Python entrypoint that runs the scheduled health check and prints a concise summary. |
| scripts/check_health.sh | Reduced to a compatibility wrapper that delegates to check_health.py. |
| system_utils.py | Installs the new script and updates systemd service ExecStart + config generation for HDSentinel settings. |
| install.sh | Adds optional HDSentinel download/install and ensures required packages are installed. |
| config_manager.py | Adds default [hdsentinel] settings to the generated config. |
| docs/manual_install.md | Documents installing the vendor HDSentinel binary and required packages. |
| docs/drive_health.md | Updates Drive Health documentation to include HDSentinel status/settings + alert behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| except Exception as exc: | ||
| settings_error = f"Failed to save HDSentinel settings: {exc}" |
There was a problem hiding this comment.
The UI error banner for saving HDSentinel settings interpolates the raw exception text into the response. This can leak internal details (paths, config values, stack-adjacent messages) to any authenticated user. Consider logging the exception server-side and returning a generic user-facing message (optionally with a short correlation ID).
| hdsentinel_enabled = False | ||
| if isinstance(hdsentinel_settings, dict): | ||
| hdsentinel_enabled = bool(hdsentinel_settings.get('enabled')) | ||
| else: | ||
| hdsentinel_enabled = bool(getattr(hdsentinel_settings, 'enabled', False)) |
There was a problem hiding this comment.
get_hdsentinel_settings() always returns a dict, but the API summary code includes a dict-vs-attribute branch. This extra branching is dead code and makes the endpoint harder to read/maintain; simplify to a single dict access (and keep the payload shape stable).
| hdsentinel_enabled = False | |
| if isinstance(hdsentinel_settings, dict): | |
| hdsentinel_enabled = bool(hdsentinel_settings.get('enabled')) | |
| else: | |
| hdsentinel_enabled = bool(getattr(hdsentinel_settings, 'enabled', False)) | |
| hdsentinel_enabled = bool(hdsentinel_settings.get('enabled')) |
| ) as tmp_file: | ||
| json.dump(payload, tmp_file, indent=2) | ||
| tmp_path = Path(tmp_file.name) | ||
| tmp_path.replace(path) |
There was a problem hiding this comment.
_write_json_atomically writes the HDSentinel state file but never sets permissions on the resulting file. Because snapshots include drive serial/model data, it’s safer to explicitly restrict readability (e.g., chmod 0o600 after replace) and/or ensure the parent dir permissions are appropriate.
| tmp_path.replace(path) | |
| tmp_path.replace(path) | |
| if os.name == "posix": | |
| try: | |
| os.chmod(path, 0o600) | |
| except OSError as exc: | |
| LOGGER.warning("Failed to set permissions on %s: %s", path, exc) |
| case "$arch" in | ||
| amd64) | ||
| url="https://www.hdsentinel.com/hdslin/hdsentinel-020c-x64.zip" | ||
| ;; | ||
| arm64) | ||
| url="https://www.hdsentinel.com/hdslin/hdsentinel-armv8.zip" | ||
| ;; | ||
| *) | ||
| echo -e "${YELLOW}HDSentinel auto-install skipped: unsupported architecture '${arch:-unknown}'.${NC}" | ||
| return 0 | ||
| ;; | ||
| esac | ||
|
|
||
| tmpdir=$(mktemp -d) | ||
| if ! curl -L --fail --output "$tmpdir/hdsentinel.zip" "$url"; then | ||
| echo -e "${YELLOW}HDSentinel download failed. Continuing without it.${NC}" | ||
| rm -rf "$tmpdir" | ||
| return 0 | ||
| fi | ||
|
|
||
| if ! unzip -o "$tmpdir/hdsentinel.zip" -d "$tmpdir" >/dev/null; then | ||
| echo -e "${YELLOW}HDSentinel extraction failed. Continuing without it.${NC}" | ||
| rm -rf "$tmpdir" | ||
| return 0 | ||
| fi | ||
|
|
||
| if [ ! -f "$tmpdir/HDSentinel" ]; then | ||
| echo -e "${YELLOW}HDSentinel binary not found in downloaded archive. Continuing without it.${NC}" | ||
| rm -rf "$tmpdir" | ||
| return 0 | ||
| fi | ||
|
|
||
| install -m 755 "$tmpdir/HDSentinel" "$HDSENTINEL_BIN" | ||
| rm -rf "$tmpdir" | ||
| echo -e "${GREEN}✔ HDSentinel installed to $HDSENTINEL_BIN.${NC}\n" |
There was a problem hiding this comment.
The installer downloads and installs a third-party binary (HDSentinel) over the network without any integrity verification (checksum/signature) or pinning beyond a URL. This is a supply-chain risk; consider verifying a published checksum/signature, or making the install an explicit opt-in step with clear instructions for manual verification.
| if [ -x "/opt/SimpleSaferServer/scripts/check_health.py" ]; then | ||
| exec /usr/bin/python3 "/opt/SimpleSaferServer/scripts/check_health.py" | ||
| fi | ||
|
|
||
| if [ -z "$parent_device" ]; then | ||
| send_email "Drive Health Check Failed - Cannot Determine Parent Device" "Cannot determine parent device for $partition_device" | ||
| exit 1 | ||
| fi | ||
|
|
||
| echo "Checking health of device: $parent_device" | ||
|
|
||
| # Get SMART attributes using smartctl | ||
| echo "Getting SMART attributes..." | ||
| smart_attrs=$(smartctl -A "$parent_device" | grep -E "^( [0-9]+| [0-9]+)" | grep -E "^( 5| 9| 12|177|194|197|198|199|200|201|202|203|204|205|206|207|208|209|211|212|220|221|222|223|224|225|226|227|228|230|231|232|233|234|235|240|241|242|250|251|252|254|255)" | awk '{print $2 " " $10}') | ||
|
|
||
| if [ -z "$smart_attrs" ]; then | ||
| send_email "Drive Health Check Failed - No SMART Data" "Could not retrieve SMART data from $parent_device" | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Convert to JSON format for Python script | ||
| json_data="{\"smart_attrs\":{" | ||
| while read -r line; do | ||
| if [ ! -z "$line" ]; then | ||
| attr_id=$(echo $line | awk '{print $1}') | ||
| value=$(echo $line | awk '{print $2}') | ||
| json_data="${json_data}\"$attr_id\":$value," | ||
| fi | ||
| done <<< "$smart_attrs" | ||
| json_data="${json_data%,}}}" | ||
|
|
||
| echo "SMART data collected, making health prediction..." | ||
|
|
||
| # Call Python script to make prediction | ||
| prediction=$(python3 /opt/SimpleSaferServer/scripts/predict_health.py "$json_data") | ||
|
|
||
| if [ $? -ne 0 ]; then | ||
| send_email "Drive Health Check Failed - Prediction Error" "Error running health prediction script" | ||
| exit 1 | ||
| fi | ||
|
|
||
| # Parse prediction result | ||
| probability=$(echo $prediction | jq -r '.probability') | ||
| prediction_result=$(echo $prediction | jq -r '.prediction') | ||
|
|
||
| if [ "$prediction_result" = "1" ]; then | ||
| send_email "Drive Health Warning" "Drive health check failed with probability $probability. Drive: $parent_device" | ||
| exit 1 | ||
| else | ||
| echo "Drive health check passed with probability $probability" | ||
| exit 0 | ||
| fi No newline at end of file | ||
| SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" | ||
| exec /usr/bin/python3 "$SCRIPT_DIR/check_health.py" |
There was a problem hiding this comment.
This script hard-codes the Python interpreter path as /usr/bin/python3. On some distros/containers python3 may live elsewhere; using /usr/bin/env python3 (or relying on the shebang in check_health.py) makes the wrapper more portable.
Integrate HDSentinel into the shared drive-health flow so backup-drive health changes surface in the UI and alerts while staying testable in fake mode.