RCORE-2160 Make upload completion reporting multiprocess-compatible #7796

Merged
merged 2 commits into from
Jul 1, 2024

Conversation

Member

@tgoyne tgoyne commented Jun 10, 2024

Rather than tracking a bunch of derived state in-memory, check for upload completion by checking if there are any unuploaded changesets. This is both multiprocess-compatible and more precise than the old checks, which had some false negatives and minor inconsistencies. For example, creating local commits which produced empty changesets and then calling wait_for_upload_completion() would previously complete immediately, but pausing and then resuming the session would make it wait until the new session performed the upload scan, which didn't happen until after download completion.
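
As an illustration only, the check described above can be sketched roughly as follows. `HistoryEntry`, `has_unuploaded_changesets`, and the convention that an origin file ident of 0 marks a locally produced changeset are assumptions made for this sketch, not the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the client's sync history.
struct HistoryEntry {
    std::uint64_t origin_file_ident; // assumed: 0 = produced locally, nonzero = integrated from the server
    std::string changeset;           // empty = nothing that needs to be uploaded
};

// `entries` holds the retained history, oldest first, with the last entry
// corresponding to `current_version`. Upload is complete when none of the
// entries newer than `uploaded_version` is a non-empty local changeset.
bool has_unuploaded_changesets(const std::vector<HistoryEntry>& entries,
                               std::uint64_t uploaded_version,
                               std::uint64_t current_version)
{
    assert(uploaded_version <= current_version);
    auto pending = static_cast<std::size_t>(current_version - uploaded_version);
    assert(pending <= entries.size());

    // Only the last `pending` entries have not been acknowledged yet.
    for (std::size_t i = entries.size() - pending; i < entries.size(); ++i) {
        const HistoryEntry& entry = entries[i];
        if (entry.origin_file_ident == 0 && !entry.changeset.empty())
            return true;
    }
    return false;
}
```

Because the answer is derived purely from persisted history rather than from counters held by one process, any process with access to the same file would reach the same conclusion, which is the property the description is after.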

The synchronous completion waits (which are hopefully only used in tests) are now just thin wrappers around the async waits. This exposed a small inconsistency around when completion happened when the sync client is stopped, which is something we don't expose publicly so changing it should be fine.
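
A minimal sketch of what "thin wrapper around the async wait" can look like, assuming a session type that exposes only an asynchronous wait taking a completion handler. `Status`, `FakeSession`, and the free function `wait_for_upload_completion` are hypothetical names for this sketch; the real code uses its own status and future types.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical status type used only in this sketch.
struct Status {
    bool ok;
    std::string reason;
};

using CompletionHandler = std::function<void(Status)>;

// Hypothetical session exposing only an async wait. This fake completes
// inline; a real session would invoke the handler from its event loop thread.
struct FakeSession {
    Status result;
    void async_wait_for_upload_completion(CompletionHandler handler)
    {
        handler(result);
    }
};

// Blocking wait built on top of the async one: register a handler which
// records the result, then block until that handler has run.
template <class Session>
Status wait_for_upload_completion(Session& session)
{
    struct State {
        std::mutex mutex;
        std::condition_variable cv;
        bool done = false;
        Status status{true, ""};
    };
    // Shared ownership keeps the state alive even if the handler outlives this call.
    auto state = std::make_shared<State>();

    session.async_wait_for_upload_completion([state](Status status) {
        std::lock_guard<std::mutex> lock(state->mutex);
        state->status = std::move(status);
        state->done = true;
        state->cv.notify_all();
    });

    std::unique_lock<std::mutex> lock(state->mutex);
    state->cv.wait(lock, [&] { return state->done; });
    assert(state->done);
    return state->status;
}
```

With this shape the blocking wait can only return once the handler has been invoked, so it inherits whatever completion semantics the async wait has, including what happens when the client is stopped.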

@tgoyne tgoyne self-assigned this Jun 10, 2024
@cla-bot cla-bot bot added the cla: yes label Jun 10, 2024
REALM_ASSERT(self->m_actualized);
if (!status.is_ok()) {
Member Author

If post() itself failed we previously never called the completion callback, while now we report the error to the callback. The event loop being able to fail is sort of weird and I'm not sure it can actually happen?

Contributor

I'd expect the only error we could get here to be OperationAborted if the event loop were shut down before the sync client.
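
To make the behavior change in this thread concrete, here is a rough sketch in which a failed post is forwarded to the completion callback instead of being dropped. `EventLoop`, `Status`, and `async_wait` are hypothetical and far simpler than the real sync client; the "operation aborted" failure mirrors the case mentioned above where the event loop is shut down first.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status type used only in this sketch.
struct Status {
    bool ok;
    std::string reason;
};

// Hypothetical event loop: each posted task receives a Status, which is an
// error if the loop was stopped before the task could run.
class EventLoop {
public:
    using Task = std::function<void(Status)>;

    void post(Task task) { m_tasks.push_back(std::move(task)); }
    void run() { drain(Status{true, ""}); }
    void stop() { drain(Status{false, "operation aborted"}); }

private:
    void drain(Status status)
    {
        std::vector<Task> tasks;
        tasks.swap(m_tasks);
        for (auto& task : tasks)
            task(status);
    }

    std::vector<Task> m_tasks;
};

void async_wait(EventLoop& loop, std::function<void(Status)> on_complete)
{
    assert(on_complete != nullptr);
    loop.post([on_complete = std::move(on_complete)](Status status) {
        if (!status.ok) {
            // Forward the failure so the caller is never left waiting forever.
            on_complete(status);
            return;
        }
        // The real work would be scheduled here; the sketch completes immediately.
        on_complete(Status{true, ""});
    });
}
```

The point of the sketch is only that every registered callback is invoked exactly once, with either success or the error from the loop.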

@@ -1564,6 +1497,23 @@ void SessionWrapper::force_close()
m_sess = nullptr;
// Everything is being torn down, no need to report connection state anymore
m_connection_state_change_listener = {};

// All outstanding wait operations must be canceled
Member Author

Moving this from finalize() to force_close() means that in tests we send the notifications when the client is shut down rather than when the session is abandoned, matching the old behavior of blocking wait for completion or client stop. I think this is clearly correct for tests and should have no effect outside of test code.
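
A simplified sketch of canceling outstanding waits at teardown, under the assumption that pending handlers are kept in a plain list. The class below only borrows the names `SessionWrapper` and `force_close()` for readability; `Status`, `Handler`, and the member names are hypothetical and not taken from the diff.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status type used only in this sketch.
struct Status {
    bool ok;
    std::string reason;
};

// Heavily simplified stand-in for a session wrapper that tracks pending waits.
class SessionWrapper {
public:
    using Handler = std::function<void(Status)>;

    void async_wait_for_upload_completion(Handler handler)
    {
        assert(handler != nullptr);
        if (m_closed) {
            // Nothing will ever complete this wait, so fail it right away.
            handler(Status{false, "operation aborted"});
            return;
        }
        m_upload_waiters.push_back(std::move(handler));
    }

    // Called once no unuploaded changesets remain.
    void on_upload_completion() { complete_all(Status{true, ""}); }

    // Teardown: all outstanding wait operations must be canceled.
    void force_close()
    {
        m_closed = true;
        complete_all(Status{false, "operation aborted"});
    }

private:
    void complete_all(Status status)
    {
        // Move the list out first so a handler may register a new wait safely.
        std::vector<Handler> waiters;
        waiters.swap(m_upload_waiters);
        for (auto& handler : waiters)
            handler(status);
    }

    bool m_closed = false;
    std::vector<Handler> m_upload_waiters;
};
```

Where the cancellation lives decides when waiters are notified: doing it in the close path means they hear about it at client shutdown, which is the timing described in the comment above.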

@@ -730,6 +730,7 @@ TEST_CASE("flx: client reset", "[sync][flx][client reset][baas]") {
REQUIRE(mode == ClientResyncMode::Recover);
auto subs = local_realm->get_latest_subscription_set();
subs.get_state_change_notification(sync::SubscriptionSet::State::Complete).get();
subs.refresh();
Member Author

This test was relying on wait_for_upload_completion() waiting for subscription changes to be uploaded (which it would only sometimes do and wasn't actually guaranteed), which happened to result in the subscription state being Complete before the call to get_state_change_notification() so this worked without the refresh.

Contributor

has this been flaky lately or something? i think wait_for_upload_completion() used to guarantee this, but maybe that's changed from under me.

Member Author

check_for_upload_completion() has had specific logic to report completion even if there are unuploaded changesets as long as it had scanned all of the changesets (i.e. as long as all of the remaining ones are empty or from the server) since 2018, and I'm pretty sure there was equivalent behavior achieved differently before that.

The change in functionality that broke this test is that we don't scan the changesets to see if any needed to be uploaded until after the first DOWNLOAD is received, so previously empty changesets made wait_for_uploads() wait for the first DOWNLOAD message and now it doesn't. It's probably possible to preserve that behavior, but it seems really weird and inconsistent (particularly because the presence of empty changesets may not be directly related to anything the developer did).

Either way, the test was incorrect; it should either be waiting on the state change notification and then calling refresh or simply asserting the state without waiting. Waiting then asserting without the refresh in between doesn't really make any sense.

@@ -737,7 +737,7 @@ struct BaasFLXClientReset : public TestClientReset {
if (m_on_post_local) {
m_on_post_local(realm);
}
wait_for_upload(*realm);
wait_for_download(*realm);
Member Author

This was relying on wait_for_upload() waiting for subscription changes after resume(). The thing we're actually waiting for here is a server roundtrip so that we receive the client reset error, which wait_for_download() does guarantee.

coveralls-official bot commented Jun 12, 2024

Pull Request Test Coverage Report for Build thomas.goyne_416

Details

  • 205 of 210 (97.62%) changed or added relevant lines in 10 files are covered.
  • 73 unchanged lines in 15 files lost coverage.
  • Overall coverage decreased (-0.006%) to 90.941%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
src/realm/sync/noinst/client_history_impl.cpp | 34 | 36 | 94.44%
src/realm/sync/client.cpp | 59 | 62 | 95.16%

Files with Coverage Reduction | New Missed Lines | %
src/realm/array_string.cpp | 1 | 87.23%
src/realm/object-store/sync/async_open_task.cpp | 1 | 88.36%
src/realm/sort_descriptor.cpp | 1 | 94.06%
src/realm/util/serializer.cpp | 1 | 90.43%
test/fuzz_tester.hpp | 1 | 57.73%
test/test_util_network.cpp | 1 | 95.56%
src/realm/cluster.cpp | 2 | 75.6%
test/test_all.cpp | 2 | 75.82%
src/realm/sync/client.cpp | 3 | 91.26%
src/realm/sync/noinst/client_impl_base.cpp | 6 | 81.93%

Totals Coverage Status
Change from base Build thomas.goyne_415: -0.006%
Covered Lines: 214551
Relevant Lines: 235923

💛 - Coveralls

Base automatically changed from tg/download-progress to master June 18, 2024 20:08
@realm realm deleted a comment from coveralls-official bot Jun 18, 2024
coveralls-official bot commented Jun 18, 2024

Pull Request Test Coverage Report for Build thomas.goyne_419

Details

  • 112 of 117 (95.73%) changed or added relevant lines in 8 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall first build on tg/upload-completion at 90.955%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
src/realm/sync/noinst/client_history_impl.cpp | 34 | 36 | 94.44%
src/realm/sync/client.cpp | 59 | 62 | 95.16%

Totals Coverage Status
Change from base Build 2430: 91.0%
Covered Lines: 214681
Relevant Lines: 236031

💛 - Coveralls

@tgoyne tgoyne marked this pull request as ready for review June 18, 2024 21:23
coveralls-official bot commented Jun 20, 2024

Pull Request Test Coverage Report for Build thomas.goyne_420

Details

  • 112 of 117 (95.73%) changed or added relevant lines in 8 files are covered.
  • 46 unchanged lines in 16 files lost coverage.
  • Overall coverage decreased (-0.002%) to 90.964%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
src/realm/sync/noinst/client_history_impl.cpp | 34 | 36 | 94.44%
src/realm/sync/client.cpp | 59 | 62 | 95.16%

Files with Coverage Reduction | New Missed Lines | %
src/realm/array_mixed.cpp | 1 | 91.94%
src/realm/sort_descriptor.cpp | 1 | 94.06%
src/realm/sync/noinst/client_impl_base.cpp | 1 | 81.93%
src/realm/sync/noinst/server/server_history.cpp | 1 | 63.7%
src/realm/util/compression.cpp | 1 | 89.62%
test/fuzz_tester.hpp | 1 | 57.73%
test/test_query2.cpp | 1 | 98.73%
test/test_lang_bind_helper.cpp | 2 | 93.2%
src/realm/sync/client.cpp | 3 | 91.26%
src/realm/table.cpp | 3 | 90.42%

Totals Coverage Status
Change from base Build 2432: -0.002%
Covered Lines: 214645
Relevant Lines: 235966

💛 - Coveralls

if (uploaded_version == current_client_version)
return;

BinaryColumn changesets(db.get_alloc());
Collaborator

you can use m_arrays->changesets and m_arrays->origin_file_idents

Member Author

This is a static function.

// empty changesets and did not need to be uploaded. If this is less than
// uploaded_version, we have changesets which have been uploaded but the
// server has not yet told us we can delete and we may need to use for merging.
auto base_version = current_client_version - changesets.size();
Collaborator

I think you should use m_sync_history_base_version here instead

}

auto count = size_t(current_client_version - uploaded_version);
for (size_t i = changesets.size() - count; i < changesets.size(); ++i) {
Collaborator

you can take a look at the loop in trim_sync_history() since you're doing something similar

Member Author

I don't know what this comment means. I have indeed looked at that loop?

Collaborator

I missed that the function is static. I meant that you could make the loop pretty much the same.


void on_upload_completion();
version_type m_upload_completion_requested_version = -1;

void on_download_completion();
Collaborator

@danieltabacaru danieltabacaru Jun 26, 2024

It'd be nice to align all completion handlers at some point given the current refactoring.

Member Author

The end state of all this does need to be that download completion is also determinable by inspecting the Realm file, but it's significantly more complicated to get there for downloads (as the server is the source of truth for download completion rather than the client). I'm trying to split off each of the separate pieces to avoid having another monster PR that changes everything.

Rather than tracking a bunch of derived state in-memory, check for upload
completion by checking if there are any unuploaded changesets. This is both
multiprocess-compatible and is more precise than the old checks, which had some
false-negatives.
coveralls-official bot commented Jul 1, 2024

Pull Request Test Coverage Report for Build thomas.goyne_425

Details

  • 112 of 117 (95.73%) changed or added relevant lines in 8 files are covered.
  • 91 unchanged lines in 18 files lost coverage.
  • Overall coverage decreased (-0.01%) to 90.99%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
src/realm/sync/noinst/client_history_impl.cpp | 34 | 36 | 94.44%
src/realm/sync/client.cpp | 59 | 62 | 95.16%

Files with Coverage Reduction | New Missed Lines | %
src/realm/sync/instructions.hpp | 1 | 76.03%
test/test_table.cpp | 1 | 99.51%
src/realm/array_blobs_big.cpp | 2 | 98.58%
src/realm/query_expression.hpp | 2 | 93.81%
src/realm/mixed.cpp | 3 | 86.46%
src/realm/sync/noinst/protocol_codec.hpp | 3 | 74.07%
src/realm/util/future.hpp | 3 | 95.94%
src/realm/util/fifo_helper.cpp | 4 | 85.11%
test/object-store/util/sync/baas_admin_api.cpp | 5 | 84.93%
src/realm/bplustree.cpp | 6 | 72.55%

Totals Coverage Status
Change from base Build 2454: -0.01%
Covered Lines: 215141
Relevant Lines: 236444

💛 - Coveralls

@tgoyne tgoyne merged commit fb46803 into master Jul 1, 2024
40 checks passed
@tgoyne tgoyne deleted the tg/upload-completion branch July 1, 2024 16:59
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 31, 2024
3 participants