feat: receive relay events via webhook callbacks#1254
feat: receive relay events via webhook callbacks#1254PierreLeGuen wants to merge 17 commits intostagingfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request overhauls the event reception mechanism from the channel-relay service, transitioning from a pull-based Server-Sent Events (SSE) model to a more efficient and secure push-based webhook system. This change simplifies the internal RelayChannel by removing the need for client-side stream management and introduces a new webhook endpoint with HMAC signature verification to ensure the authenticity and integrity of incoming events. The update also includes necessary adjustments to the RelayClient, configuration, and extension management to support this new event delivery architecture, alongside a version bump and artifact updates. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1c2f499741
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Code Review
This pull request refactors the channel-relay integration to use webhook callbacks instead of an SSE stream for receiving events. This is a great simplification, removing complex reconnection and token renewal logic. The changes are well-implemented across the board, from the client and channel logic to configuration and tests. I've added a few suggestions to improve maintainability and robustness.
Replace the SSE pull model with push-based webhook callbacks from channel-relay. Eliminates the reconnect loop, stream token auth, and SSE parser — events arrive via HTTP POST to /relay/events. - Add webhook handler with HMAC signature verification - Simplify RelayChannel to use mpsc from webhook handler - Remove SSE connect/reconnect/parse logic from RelayClient - Add register_callback() to RelayClient for callback URL registration - Update activation flow to create event channel and register callback - Wire relay webhook endpoint into web gateway
e872c3e to
3167976
Compare
- Return 503 when relay event channel is full/closed (enables retry) - Reject malformed timestamps with 400 instead of proceeding - Allow relay activation without settings store (no-store/ephemeral mode) - Check installed_relay_extensions set in is_relay_channel for no-db mode - Fix staging test constructors for new RelayChannel signature
zmanian
left a comment
There was a problem hiding this comment.
Review: REQUEST CHANGES
Good architectural direction -- replacing SSE pull with push-based webhooks, removing ~870 lines of reconnection logic. HMAC verification is correct (constant-time, timestamp freshness). No new dependencies, no .unwrap() in production code, CI green.
Critical
1. Duplicate handler -- dead code with security weakness
src/channels/relay/webhook.rs contains a full handler (RelayWebhookState, webhook_router(), handler function) that is never wired into the route tree. Only the handler in src/channels/web/server.rs is used in production. The dead version has a timestamp bypass bug (malformed timestamp silently passes instead of returning BAD_REQUEST).
Fix: Strip webhook.rs to just verify_relay_signature and its tests. Remove RelayWebhookState, webhook_router, and the duplicate handler.
Important
2. API key used directly as HMAC signing secret
relay_signing_secret() returns the API key bytes as the HMAC secret. If the API key leaks (logs, error messages), the signing secret is also compromised. At minimum, add a comment documenting this design decision. Ideally, support a separate RELAY_SIGNING_SECRET env var.
3. No retry for callback registration
activate_channel_relay does best-effort registration but the comment "will retry on next event" has no implementation. If initial registration fails, the channel is silently broken with no recovery path.
4. Callback URL construction duplicated
Same logic for constructing callback URL from tunnel_url/callback_url/GATEWAY_HOST+PORT appears in both auth_channel_relay and activate_channel_relay. Extract to a helper.
5. Web gateway CLAUDE.md not updated
The new /relay/events route (public, HMAC-authenticated) should be added to the route table in src/channels/web/CLAUDE.md.
What's good
- HMAC signature verification with
subtle::ConstantTimeEq-- correct approach - Timestamp freshness checking (5-minute window) for replay protection
- Clean removal of SSE complexity (~870 lines)
- Tests for webhook flow, HMAC verification, event processing
- No new dependencies
Adapts the relay integration to the hardened channel-relay security model: - Switch from X-API-Key header to Authorization: Bearer sk-agent-* for all relay API calls (chat-api token verification) - Remove register_callback() — PUT /callbacks endpoint removed - Remove event_callback_url from initiate_oauth() — parameter removed - Make signing_secret a required field in RelayConfig (new env var: CHANNEL_RELAY_SIGNING_SECRET) - Update integration tests for Bearer auth and removed endpoints Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ting - Approval flow now calls POST /approvals to register server-side record, then embeds only the opaque approval_token in button value - Remove instance_id parameter from proxy_provider() — channel-relay no longer accepts it (uses verified identity) - Remove instance_id and user_id from initiate_oauth() — channel-relay derives them from the Bearer token - Add create_approval() to RelayClient Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The channel-relay OAuth flow now accepts webhook_url to set the callback_url during connection creation. IronClaw computes its webhook URL from callback_base + webhook_path and passes it during initiate_oauth. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Channel-relay now derives the callback URL from chat-api's instance_url. IronClaw no longer supplies webhook_url during OAuth — the relay is the authority on where events get delivered. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
IronClaw no longer supplies any URLs to channel-relay. The relay derives all URLs from the trusted instance_url in chat-api. initiate_oauth() takes no parameters. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
zmanian
left a comment
There was a problem hiding this comment.
Re-review: Architecture evolved significantly
The PR has gone through substantial rework since my last review (7 new commits). The implementation shifted from webhook-based callbacks to OAuth + SSE streaming with token renewal.
Previous feedback status:
- Duplicate webhook handler (Critical) -- FIXED.
webhook.rsremoved entirely. - API key as HMAC secret -- MOOT. Architecture no longer uses HMAC webhook signing. OAuth + CSRF tokens handle validation instead.
- No retry for callback registration -- FIXED. Retries moved to SSE reconnect layer with exponential backoff + token renewal on
TokenExpired. - Callback URL construction duplicated -- FIXED. URL construction now only in
auth_channel_relay(). - CLAUDE.md not updated -- NOT FIXED. The
/oauth/slack/callbackroute exists in code butsrc/channels/web/CLAUDE.mddoesn't document it separately.
New observation:
- Stream token passed via OAuth callback query parameter -- visible in browser history/referer headers. Low risk since it's a short-lived token, but worth noting.
Remaining ask: update src/channels/web/CLAUDE.md to document the relay-related routes, then this is good to merge.
Cross-PR note: shared HMAC verification with #1207PR #1207 (WASM webhook improvements for WhatsApp) adds a Both PRs also touch Suggestion: whichever lands second should reuse the shared HMAC verification utility rather than maintaining two implementations. If #1207 lands first, the relay webhook handler can import from |
Re-add nonce generation and secret storage in auth_channel_relay. The nonce is passed to channel-relay as state_nonce param (not a URL). Channel-relay embeds it in the signed state and appends it to the redirect URL so IronClaw's callback handler can validate and activate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
relay_signing_secret() now prefers OPENCLAW_GATEWAY_TOKEN (per-instance) over the shared CHANNEL_RELAY_SIGNING_SECRET. A compromised instance can no longer forge callbacks to other instances on the same relay. CHANNEL_RELAY_SIGNING_SECRET is now optional in RelayConfig. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fallbacks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR migrates the channel-relay integration from an SSE pull model to a push-based webhook callback model, with HMAC signature verification for incoming relay events. It simplifies the RelayChannel runtime by consuming events from an internal mpsc queue fed by the web gateway’s /relay/events endpoint.
Changes:
- Add a public
/relay/eventswebhook endpoint on the web gateway that verifies HMAC signatures and enqueuesChannelEvents. - Refactor
RelayChannel/RelayClientto remove SSE streaming and instead use webhook-delivered events plus new relay APIs (signing-secret fetch, approval token creation). - Update relay configuration/tests to reflect bearer auth and the new signing-secret flow.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/relay_integration.rs | Updates integration tests from SSE streaming to signing-secret + bearer auth + updated proxy call shape. |
| src/extensions/manager.rs | Adds relay event sender + signing-secret cache; updates relay activation/auth/removal paths for webhook model. |
| src/config/relay.rs | Replaces SSE/backoff settings with a webhook_path setting and updates config tests/docs accordingly. |
| src/channels/web/server.rs | Adds /relay/events handler with HMAC verification; updates Slack OAuth callback params/storage to remove stream token. |
| src/channels/relay/webhook.rs | Introduces reusable signature verification helper (and an additional webhook router/handler). |
| src/channels/relay/mod.rs | Exposes the new webhook module. |
| src/channels/relay/client.rs | Removes SSE APIs; adds signing-secret fetch, approval token creation, and bearer auth header usage. |
| src/channels/relay/channel.rs | Simplifies relay channel to read from an mpsc receiver and adds approval-token registration flow. |
Comments suppressed due to low confidence (1)
src/channels/web/server.rs:956
- In the OAuth callback,
team_idis only persisted whenstate.storeisSome, butactivate_stored_relay()is attempted unconditionally. When the gateway runs without a DB store, activation will proceed with an emptyteam_idand (now) fail when fetching the signing secret. Consider either (a) storingteam_idin the secrets store / ExtensionManager memory for no-store mode, or (b) returning a user-visible error before attempting activation whenstate.storeisNone.
let result: Result<(), String> = async {
// Store team_id in settings
if let Some(ref store) = state.store {
let team_id_key = format!("relay:{}:team_id", DEFAULT_RELAY_NAME);
let _ = store
.set_setting(&state.user_id, &team_id_key, &serde_json::json!(team_id))
.await;
}
// Activate the relay channel
ext_mgr
.activate_stored_relay(DEFAULT_RELAY_NAME)
.await
.map_err(|e| format!("Failed to activate relay channel: {}", e))?;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| //! HTTP client for the channel-relay service. | ||
| //! | ||
| //! Wraps reqwest for all channel-relay API calls: OAuth initiation, | ||
| //! SSE streaming, token renewal, and Slack API proxy. | ||
| //! callback registration, and Slack API proxy. | ||
|
|
src/channels/relay/client.rs
Outdated
| /// Proxy an API call through channel-relay for any provider. | ||
| /// | ||
| /// Returns a stream of parsed `ChannelEvent`s and the `JoinHandle` of the | ||
| /// background SSE parser task. The caller is responsible for reconnection | ||
| /// logic on stream end/error and for aborting the handle on shutdown. | ||
| pub async fn connect_stream( | ||
| /// Calls `POST /proxy/{provider}/{method}?team_id=X&instance_id=Y` with the given JSON body. | ||
| /// Register a pending approval with channel-relay and receive an opaque token. | ||
| /// The token is embedded in Slack button values instead of routing fields. | ||
| /// Register a pending approval with channel-relay and receive an opaque token. | ||
| /// The relay derives the authorized approver from the connection's authed_user_id. | ||
| pub async fn create_approval( |
| body.get("signing_secret") | ||
| .and_then(|v| v.as_str()) | ||
| .and_then(|s| hex::decode(s).ok()) | ||
| .ok_or_else(|| RelayError::Protocol("missing signing_secret in response".to_string())) |
src/extensions/manager.rs
Outdated
| @@ -3988,20 +3995,37 @@ impl ExtensionManager { | |||
| ) | |||
| .map_err(|e| ExtensionError::ActivationFailed(e.to_string()))?; | |||
|
|
|||
| // Fetch the per-instance signing secret from channel-relay. | |||
| // This must succeed — there is no fallback. | |||
| let signing_secret = client.get_signing_secret(&team_id).await.map_err(|e| { | |||
| ExtensionError::Config(format!("Failed to fetch relay signing secret: {e}")) | |||
| })?; | |||
| .delete_setting(&self.user_id, &format!("relay:{}:team_id", name)) | ||
| .await; | ||
| } | ||
|
|
| async fn shutdown(&self) -> Result<(), ChannelError> { | ||
| if let Some(handle) = self.reconnect_handle.write().await.take() { | ||
| handle.abort(); | ||
| } | ||
| if let Some(handle) = self.parser_handle.write().await.take() { | ||
| handle.abort(); | ||
| } | ||
| // Nothing to clean up — the event channel will close naturally | ||
| // when the sender is dropped | ||
| Ok(()) | ||
| } |
| pub instance_id: Option<String>, | ||
| /// HTTP request timeout in seconds (default: 30). | ||
| pub request_timeout_secs: u64, | ||
| /// SSE stream long-poll timeout in seconds (default: 86400 = 24 h). | ||
| pub stream_timeout_secs: u64, | ||
| /// Initial exponential backoff in milliseconds (default: 1000). | ||
| pub backoff_initial_ms: u64, | ||
| /// Maximum exponential backoff in milliseconds (default: 60000). | ||
| pub backoff_max_ms: u64, | ||
| /// Path for the webhook callback endpoint (default: `/relay/events`). | ||
| pub webhook_path: String, | ||
| } |
src/channels/relay/webhook.rs
Outdated
| /// Maximum allowed age of a callback timestamp (5 minutes). | ||
| const MAX_TIMESTAMP_AGE_SECS: i64 = 300; | ||
|
|
||
| /// Shared state for the relay webhook endpoint. | ||
| #[derive(Clone)] | ||
| pub struct RelayWebhookState { | ||
| pub event_tx: mpsc::Sender<ChannelEvent>, | ||
| pub signing_secret: Arc<Vec<u8>>, | ||
| } | ||
|
|
||
| /// Build an axum Router for the relay webhook endpoint. | ||
| pub fn webhook_router(state: RelayWebhookState) -> Router { | ||
| Router::new() | ||
| .route("/relay/events", post(relay_events_handler)) | ||
| .with_state(state) | ||
| } | ||
|
|
||
| async fn relay_events_handler( | ||
| State(state): State<RelayWebhookState>, | ||
| headers: HeaderMap, | ||
| body: Bytes, | ||
| ) -> impl IntoResponse { | ||
| // Verify signature | ||
| let signature = match headers | ||
| .get("x-relay-signature") | ||
| .and_then(|v| v.to_str().ok()) | ||
| { | ||
| Some(s) => s.to_string(), | ||
| None => { | ||
| tracing::warn!("relay callback missing X-Relay-Signature header"); | ||
| return (axum::http::StatusCode::UNAUTHORIZED, "missing signature").into_response(); | ||
| } | ||
| }; | ||
|
|
||
| let timestamp = match headers | ||
| .get("x-relay-timestamp") | ||
| .and_then(|v| v.to_str().ok()) | ||
| { | ||
| Some(t) => t.to_string(), | ||
| None => { | ||
| tracing::warn!("relay callback missing X-Relay-Timestamp header"); | ||
| return (axum::http::StatusCode::UNAUTHORIZED, "missing timestamp").into_response(); | ||
| } | ||
| }; | ||
|
|
||
| // Check timestamp freshness | ||
| if let Ok(ts) = timestamp.parse::<i64>() { | ||
| let now = chrono::Utc::now().timestamp(); | ||
| if (now - ts).abs() > MAX_TIMESTAMP_AGE_SECS { | ||
| tracing::warn!( | ||
| timestamp = ts, | ||
| now = now, | ||
| "relay callback timestamp too old" | ||
| ); | ||
| return (axum::http::StatusCode::UNAUTHORIZED, "stale timestamp").into_response(); | ||
| } | ||
| } | ||
|
|
||
| // Verify HMAC | ||
| if !verify_signature(&state.signing_secret, ×tamp, &body, &signature) { | ||
| tracing::warn!("relay callback signature verification failed"); | ||
| return (axum::http::StatusCode::UNAUTHORIZED, "invalid signature").into_response(); | ||
| } | ||
|
|
||
| // Parse event | ||
| let event: ChannelEvent = match serde_json::from_slice(&body) { | ||
| Ok(e) => e, | ||
| Err(e) => { | ||
| tracing::warn!(error = %e, "relay callback invalid JSON"); | ||
| return (axum::http::StatusCode::BAD_REQUEST, "invalid JSON").into_response(); | ||
| } | ||
| }; | ||
|
|
||
| tracing::debug!( | ||
| event_type = %event.event_type, | ||
| sender = %event.sender_id, | ||
| channel = %event.channel_id, | ||
| "received relay callback event" | ||
| ); | ||
|
|
||
| // Push to channel (non-blocking — if full, log and drop) | ||
| if let Err(e) = state.event_tx.try_send(event) { | ||
| tracing::warn!(error = %e, "relay callback event channel full or closed"); | ||
| } | ||
|
|
||
| axum::Json(serde_json::json!({"ok": true})).into_response() | ||
| } | ||
|
|
| /// Returns `None` if the relay channel has not been activated yet. | ||
| pub fn relay_signing_secret(&self) -> Option<Vec<u8>> { | ||
| self.relay_signing_secret_cache.lock().ok()?.clone() |
src/channels/relay/webhook.rs
Outdated
| if let Ok(ts) = timestamp.parse::<i64>() { | ||
| let now = chrono::Utc::now().timestamp(); | ||
| if (now - ts).abs() > MAX_TIMESTAMP_AGE_SECS { | ||
| tracing::warn!( | ||
| timestamp = ts, | ||
| now = now, | ||
| "relay callback timestamp too old" | ||
| ); | ||
| return (axum::http::StatusCode::UNAUTHORIZED, "stale timestamp").into_response(); | ||
| } |
|
Addressed the remaining important relay review issues in Fixed in this patch:
Verified locally:
Verified on GitHub:
|
Summary
/relay/eventson the web gatewayTest plan
Depends on: nearai/channel-relay feat/webhook-callbacks