Use full URL when redirecting to snapshots, for consistency with primary redirects #7373

eddyashton · 2025-10-17T15:24:54Z

First change to solidify snapshot fetching, hoping to be iteratively backportable:

Change the redirect response header from Location: /node/snapshot/XXX to Location: https://1.2.3.4/node/snapshot/XXX. This prevents the client from returning to the load balancer and being silently redirected to a different node, and is consistent with the node frontend's other node-specific redirects.

This requires a frustratingly large diff; I'll justify and explain that inline.

Copilot

Pull Request Overview

This PR changes snapshot redirect responses to use full URLs instead of relative paths, ensuring clients connect directly to the correct node rather than potentially being redirected through a load balancer. The change updates both the redirect generation logic and the client-side handling to use full URLs.

Modifies the node frontend to generate full HTTPS URLs in redirect responses for snapshot endpoints
Updates the snapshot fetching client code to properly handle redirect URLs using CURL's built-in redirect following
Updates tests to validate the new full URL redirect behavior

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
src/node/rpc/node_frontend.h	Generates full HTTPS URLs for snapshot redirects using node's published address
src/snapshots/fetch.h	Refactors redirect handling to use CURL's CURLINFO_REDIRECT_URL and follows redirects iteratively
src/http/curl.h	Changes URL parameter from rvalue reference to const reference for reusability
tests/e2e_operations.py	Updates tests to expect full URLs in redirect responses and uses path for subsequent requests

eddyashton · 2025-10-17T15:25:33Z

src/node/rpc/node_frontend.h

+          auto redirect_url =
+            fmt::format("https://{}/node/snapshot/{}", address, snapshot_path);


This is the core goal - changing this value.

eddyashton · 2025-10-17T15:27:14Z

src/node/rpc/node_frontend.h

+          }
+
+          const auto& address =
+            info->rpc_interfaces[interface_id.value()].published_address;


NB: This is a buggy pattern - what if the interface isn't present on this node? In this specific instance it should be fine (we're currently only asking for an interface on ourselves, based on a request that we are parsing). But we'll fix this (and other uses of the same pattern) separately, and before this redirects to other nodes which may not have the same interfaces.

That should be an error at this point; there is no reliable way to guess a suitable alternative.
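A minimal sketch of the "clear error on interface mismatch" behaviour discussed above. The `Interface` and `Interfaces` types and the `make_snapshot_redirect_url` helper are hypothetical stand-ins for illustration, not CCF's actual types, and plain string concatenation is used in place of `fmt::format` to keep the example dependency-free.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for one entry of the node's interface table.
struct Interface
{
  std::string published_address; // e.g. "1.2.3.4:8080"
};
using Interfaces = std::map<std::string, Interface>;

// Builds the absolute redirect URL for a snapshot, failing loudly when the
// requested interface is unknown. On a std::map, operator[] would instead
// default-construct the missing entry and yield an empty address.
inline std::string make_snapshot_redirect_url(
  const Interfaces& rpc_interfaces,
  const std::string& interface_id,
  const std::string& snapshot_name)
{
  const auto it = rpc_interfaces.find(interface_id);
  if (it == rpc_interfaces.end())
  {
    throw std::runtime_error(
      "No such interface on this node: " + interface_id);
  }
  return "https://" + it->second.published_address + "/node/snapshot/" +
    snapshot_name;
}
```

With an interface table that maps "primary" to 1.2.3.4:8080, asking for a snapshot on "primary" produces a URL under https://1.2.3.4:8080/node/snapshot/, while asking for an unknown interface throws instead of producing a URL with an empty host.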

eddyashton · 2025-10-17T15:35:54Z

src/snapshots/fetch.h

+      // Make initial requests, following redirects to a specific snapshot,
+      // resulting in final path and snapshot size


Diff here looks big but semantic change is smaller.

In pseudocode, previously:

string snapshot_path;
{
  request = do_head_request(initial_url);
  assert(request.response == REDIRECT);
  snapshot_path = request.headers["Location"];
}
size_t content_size;
{
  request = do_head_request(snapshot_path);
  assert(request.response == OK);
  content_size = request.headers["Content-Length"];
}
do_actual_fetch_loop(snapshot_path, content_size);

We need to change the calling pattern slightly to handle the Location header being a full URL, not just a path, and potentially on a different host. Curl's redirect following (CURLOPT_FOLLOWLOCATION) would do the magic, but annoyingly won't then tell us the final path! So we do the loop ourselves, updating snapshot_url each time, and extract content_size from the first non-redirect:

string snapshot_url = initial_url;
size_t content_size;
size_t redirect_count = 0;
while (redirect_count < max_redirects)
{
  request = do_head_request(snapshot_url);
  if (request.response == OK)
  {
    content_size = request.headers["Content-Length"];
    break;
  }
  assert(request.response == REDIRECT);
  snapshot_url = curl_helper_to_get_location_as_full_url(request);
  ++redirect_count;
}
do_actual_fetch_loop(snapshot_url, content_size);
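For reference, the loop described above can be written as a small compilable sketch. Everything here is illustrative: `HeadResponse`, `HeadFn`, and `resolve_snapshot` are hypothetical names, the transport is injected as a callback so the logic can be exercised without a network, and the real client reads these values from curl rather than from a struct.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical response shape, reduced to the fields the loop needs.
struct HeadResponse
{
  int status = 0;
  std::string location; // absolute URL, meaningful only on a redirect
  std::size_t content_length = 0; // meaningful only on a 200
};

// Injected transport: performs one HEAD request against the given URL.
using HeadFn = std::function<HeadResponse(const std::string& url)>;

inline bool is_redirect(int status)
{
  return status == 301 || status == 302 || status == 303 || status == 307 ||
    status == 308;
}

// Follows redirects by hand so that the final URL is known to the caller.
// Returns the final URL and the size reported by the first 200 response.
inline std::pair<std::string, std::size_t> resolve_snapshot(
  const HeadFn& do_head_request, std::string url, std::size_t max_redirects)
{
  for (std::size_t redirects = 0; redirects <= max_redirects; ++redirects)
  {
    const HeadResponse response = do_head_request(url);
    if (response.status == 200)
    {
      return {url, response.content_length};
    }
    if (!is_redirect(response.status) || response.location.empty())
    {
      throw std::runtime_error("Unexpected response for " + url);
    }
    url = response.location; // next request goes straight to this host
  }
  throw std::runtime_error("Too many redirects");
}
```

Unlike the pseudocode, this sketch throws when the redirect limit is exceeded rather than falling through to the fetch with an unset size; whether that matches the behaviour wanted in fetch.h is a design choice for the PR, not something this example settles.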

achamayou · 2025-10-17T17:26:05Z

src/snapshots/fetch.h

+            content_size);
+          if (ec != std::errc())
+          {
+            throw std::runtime_error(fmt::format(


I am not clear on why it matters to get the size ahead of time, and why the HEAD request needs to exist. Isn't there a way to go straight for the snapshot, with a range restriction, and if we get lucky get it right back, or otherwise be redirected?
Why do we need HEAD first to get the size?

The standard HTTP flow seems to be to return a 416 if the Range cannot be satisfied, but if the Range is strictly bigger than the document, we could opt to return the bytes we have without error.

If say a connecting node asks for bytes=0-1000, and the snapshot is only 500 bytes, we could just return 500 bytes. If it's longer we can ask for the next chunk etc. We can return a Content-Range on the first response to give the full size and avoid an error on the last chunk (Client: "give me 0-1000", Server: "Here is 0-1000, by the way the document is 1700 long", Client: "Ok, give me 1001-1700 now then").

It's not standard, but it's going to save at least one round trip and possibly more. We can always gate that behaviour behind a User-Agent if we want to preserve compatibility with standard clients (which we don't expect will call this interface).
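A rough sketch of the client side of that exchange, assuming the server reports the total size in a Content-Range header of the form "bytes first-last/total". The `ContentRange` struct and both helper names are hypothetical. Byte positions in HTTP ranges are zero-based and inclusive, so a 1700-byte snapshot ends at byte 1699.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <optional>
#include <string>

struct ContentRange
{
  std::size_t first = 0;
  std::size_t last = 0; // inclusive
  std::size_t total = 0;
};

// Parses "bytes <first>-<last>/<total>". Returns nullopt when the value does
// not have that shape (including an unknown total, "*") or is inconsistent.
inline std::optional<ContentRange> parse_content_range(
  const std::string& value)
{
  unsigned long long first = 0, last = 0, total = 0;
  int consumed = 0;
  const int matched = std::sscanf(
    value.c_str(), "bytes %llu-%llu/%llu%n", &first, &last, &total, &consumed);
  if (matched != 3 || static_cast<std::size_t>(consumed) != value.size())
  {
    return std::nullopt;
  }
  if (first > last || last >= total)
  {
    return std::nullopt;
  }
  return ContentRange{
    static_cast<std::size_t>(first),
    static_cast<std::size_t>(last),
    static_cast<std::size_t>(total)};
}

// Given the range just received, returns the Range header value for the next
// chunk, or nullopt once the whole snapshot has been fetched.
inline std::optional<std::string> next_range_header(
  const ContentRange& received, std::size_t chunk_size)
{
  if (chunk_size == 0 || received.last + 1 >= received.total)
  {
    return std::nullopt;
  }
  const std::size_t first = received.last + 1;
  const std::size_t last =
    std::min(first + chunk_size - 1, received.total - 1);
  return "bytes=" + std::to_string(first) + "-" + std::to_string(last);
}
```

With a chunk size of 1000, a first response carrying "bytes 0-999/1700" leads to a follow-up request for bytes 1000 to 1699, after which no further request is needed. No HEAD request is involved: the total arrives with the first chunk.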

achamayou

This definitely goes in the right direction, but I think we want a clear error when we have an interface mismatch. Also I wonder if we can do something more efficient (see comment above).

eddyashton added 4 commits October 17, 2025 13:10

Include node address in snapshot redirect, to avoid load balancer misdirection

ab21fb2

Follow redirect chains ourself

2894255

Merge branch 'main' of https://github.com/microsoft/CCF into snapshot_precise_redirect

27ffac0

Hmmm

d2402af

eddyashton requested a review from a team as a code owner October 17, 2025 15:24

Copilot AI review requested due to automatic review settings October 17, 2025 15:24

Copilot AI reviewed Oct 17, 2025

Update src/snapshots/fetch.h

f0d62e0

Co-authored-by: Copilot <[email protected]>

achamayou reviewed Oct 17, 2025

achamayou approved these changes Oct 17, 2025
