Skip to content

Conversation

@callebtc
Copy link
Collaborator

Description

Here’s a focused code/diff review of the background-persistence work, centered on long‑running foreground service behavior, mesh reliability, notifications, and UI state retention.

Scope Reviewed

  • Git branch: background-persistence
  • Key changes: Foreground service + boot receiver, process‑wide service holder + prefs, UI and VM wiring to service, App-wide init, notification strings.
  • Files: AndroidManifest, BitchatApplication, MainActivity, BluetoothMeshService, MeshForegroundService, MeshServiceHolder, MeshServicePreferences, BootCompletedReceiver,
    AppStateStore, related UI handlers.

Overall Assessment

  • Quality: Good direction with solid separation of concerns. The MeshForegroundService and MeshServiceHolder clearly aim to establish a stable, app‑wide mesh lifecycle. The
    AppStateStore helps mitigate Activity recreation.
  • Organization: Changes are cohesive and discoverable. Manifest/perms align with intended behavior (connectedDevice|dataSync types). UI delegates and service responsibilities are
    thoughtfully split.
  • Risk areas: Foreground service lifecycle edge cases, process death/restart handling, long‑run memory growth, and background service launch policy on newer Android versions.

Strengths

  • Clear FGS channel, notification, and periodic refresh loop with permission gating.
  • Mesh service delegated notifications when UI delegate is detached (good for background DMs).
  • AppStateStore allows UI hydration after Activity recreation.
  • MainActivity cleanly reattaches the mesh delegate and switches background/foreground state.
  • Permissions and battery optimization are considered in onboarding flow.
  • Service type declarations (connectedDevice|dataSync) match BLE use.

Potential Issues Impacting Long-Run Reliability

  • Foreground service start semantics:
    • Application.onCreate calls MeshForegroundService.start(). If POST_NOTIFICATIONS or BL perms are not granted yet, you fall back to startService(); on modern Android, background
      service starts can be restricted or killed if not foregrounded. Risk: service runs without FGS or gets killed before promotion, especially on OEMs with aggressive policies.
    • BootCompletedReceiver unconditionally calls MeshForegroundService.start(). At boot, POST_NOTIFICATIONS (API 33+) is typically not granted, so shouldStartAsForeground()
      returns false and you call startService() from a background context. Risk: background start blocked/killed; also locked boot may run before user unlock and storage/keys might be
      unavailable.
  • FGS promotion and 5s rule:
    • If startForegroundService() is called (only when permissions allow), the service must call startForeground() within 5 seconds or crash. Your shouldStartAsForeground() check
      mitigates this by only using startForegroundService() when both BL and notification perms are granted. That’s good, but edge conditions remain:
      • If perms become available mid‑run, the loop calls startForeground() in updateJob every 5s—OK—but if you ever call startForegroundService() without perms (e.g., future code
        path), you may hit the 5s crash.
  • Service instance lifecycle and coroutine scope:
    • BluetoothMeshService has serviceScope = CoroutineScope(Dispatchers.IO + SupervisorJob()) and stopServices() cancels that scope. If the same service instance is reused
      afterwards (no re-instantiation), subsequent startServices() will try to use a cancelled scope. Risk: periodic tasks, syncs, and internal coroutines won’t restart, causing silent
      malfunction after a stop→start cycle (including process recreation if MeshServiceHolder still references the old instance).
    • MeshServiceHolder currently returns the existing instance without checking its liveness. If the OS destroys the service or you stop it intentionally, MeshServiceHolder can
      hand out a “dead” instance with cancelled scope.
  • Memory growth over very long runs:
    • AppStateStore accumulates messages (public, private, channel) and a global seen set without pruning. Over long sessions this will grow unbounded, increasing memory pressure
      and risking OOM. Also, UI state won’t persist across process death since this store is in‑memory only.
  • Background update loop battery cost:
    • MeshForegroundService updates every 5 seconds; that’s frequent. Wakes + updates can add up over long periods, especially on lower‑end devices. Consider backing off or
      event‑based updates.
  • Boot/locked-boot timing and storage keys:
    • Boot receiver listens to both BOOT_COMPLETED and LOCKED_BOOT_COMPLETED. Starting mesh before device unlock may cause failures for components that need credential‑encrypted
      storage (e.g., Noise keys). No guard is present to defer starting until after user unlock if needed.
  • Notification icon:
    • Using R.mipmap.ic_launcher for notifications can render poorly on newer Android where a non‑adaptive, monochrome small icon is recommended. Not a functional blocker but worth
      adjusting.
  • Service scope cleanup:
    • MeshForegroundService cancels updateJob in onDestroy but not the service’s scope job. Not a leak in practice since only updateJob runs, but better to cancel the scope to be
      safe.
  • Fallbacks when perms are missing:
    • The service runs START_STICKY and keeps looping even when perms are missing. This is okay, but on devices with background restrictions it can keep waking only to fail
      promotion, with no explicit backoff or stopSelf strategy until user grants perms.
  • UI state persistence across process death:
    • Current approach restores state only if process is alive (via AppStateStore). After a long time, the OS may kill the process; on reopening, message state is gone until mesh
      repopulates. If “preserve latest UI state” means last seen chats/messages across process death, consider persistence.

Protocol-Specific Observations

  • Periodic peer announcements and sync start on startServices(), good for ongoing discovery. Ensure reconnections continue after long idle times; the cancel/restart bug mentioned
    above could silently prevent these tasks from resuming after a stop.
  • Duplicate handling and SecurityManager reset on peer removal are in place.
  • Background DM notifications when no delegate is attached are implemented; ensure notification permission gating is enforced to avoid SecurityExceptions on API 33+ (looks covered
    by the NotificationManager path).

Recommended Next Steps

  • Instance liveness and scope fix:
    • Add an “isAlive” or “hasActiveScope” check to BluetoothMeshService, or expose isActive, and have MeshServiceHolder create a new instance if the old one is inactive or its
      scope was cancelled.
    • Alternatively, refactor BluetoothMeshService to create a new serviceScope inside startServices() if the previous one was cancelled.
  • Foreground service start policy:
    • Avoid starting the service from Application.onCreate unless the app is in foreground and perms are satisfied. Gate this behind a setting, UI action, or defer until Activity
      starts and permissions are granted.
    • For boot: only start FGS when requirements are met. If not, schedule a retry via WorkManager with constraints (e.g., device unlocked, BL available) or defer to first user
      unlock.
  • Handle locked boot:
    • If keys or prefs require credential-encrypted storage, skip starting the mesh on LOCKED_BOOT_COMPLETED and wait for BOOT_COMPLETED/user unlock broadcasts before initializing.
  • Notification update cadence:
    • Increase update interval (e.g., 30–60s) or update on mesh events (peer list changes) to reduce wakeups. You already trigger notifications on peer updates; rely on that and
      only run a slow heartbeat as fallback.
  • AppStateStore memory management:
    • Add pruning (e.g., cap per list to last N messages, e.g., 500–1000 per stream).
    • Persist lightweight state as needed (e.g., last selected chat, unread counts) to survive process death.
    • Consider moving message source-of-truth to persistent storage (Room) if you want UI continuity after process kill.
  • Service scope cleanup:
    • In MeshForegroundService.onDestroy(), cancel the service’s scope as well (or ensure nothing else is running on it). Also consider stopSelf() when disabled and no perms for an
      extended period.
  • Notification icon:
    • Provide a proper small monochrome notification icon resource for best compatibility.
  • Defensive checks in BootCompletedReceiver:
    • Validate permissions and, on API 33+, the POST_NOTIFICATIONS state; if not granted, avoid starting any service and maybe schedule a gentle reminder or wait for app launch.
  • Testing checklist to validate long-run behavior:
    • Grant/not grant notification/BL perms and verify: no 5-second FGS crash; background start behavior on O+; app reopen after hours shows updated peer count and notifications
      work.
    • Kill the app process; verify reopening hydrates UI reasonably and mesh resumes; ensure MeshServiceHolder returns a fresh, live service if needed.
    • Revoke POST_NOTIFICATIONS on API 33+ and ensure service doesn’t attempt FGS promotion; no SecurityExceptions from NotificationManager path.
    • Device reboot with/without user unlock; verify boot receiver behavior and deferral if storage not available.
    • Extended runtime (>12h) memory profile to check AppStateStore growth.

If you’d like, I can patch:

  • MeshServiceHolder to recreate the service when inactive/cancelled.
  • BluetoothMeshService to reinitialize serviceScope safely on restart.
  • MeshForegroundService to cancel its scope onDestroy and back off updates.
  • Boot receiver gating to avoid background starts when perms/conditions aren’t satisfied.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants