Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sendToDevice tests #412

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

S7evinK
Copy link
Contributor

@S7evinK S7evinK commented Jul 13, 2022

Adds the test from tests/46direct/01directmessage.pl as well as two new tests to test that to-device messages are only delivered once and are ordered by arrival.

Federation tests are the same as the CS api tests.

@S7evinK S7evinK marked this pull request as draft July 13, 2022 13:27
@S7evinK S7evinK marked this pull request as ready for review August 9, 2022 06:24
@S7evinK S7evinK requested review from a team as code owners August 9, 2022 06:24
@richvdh
Copy link
Member

richvdh commented Aug 17, 2022

@S7evinK I haven't (yet) reviewed this, but: we should remove the tests from Synapse at the same time as we add them to Complement, to avoid the having two copies of a test that get out of sync.

(This is a recent change: we discussed it in the Synapse team yesterday and we are now happy enough with our Complement coverage that we can start turning off Sytest tests.)

@richvdh richvdh self-requested a review August 17, 2022 08:55
for _, message := range toDevice.Array() {
for _, check := range checks {
if err := check([]byte(message.Raw)); err != nil {
return fmt.Errorf("%v: %s", err, message.Raw)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means that if we get a to-device message with the wrong values, we'll keep syncing anyway? Which doesn't sound right?

I think the complement way to do this is to have a SendToDeviceSynced method which sends the message and then calls MustSyncUntil to wait for a message to arrive; then have a separate MustSync call which receives the message again and checks its content.

@kegsay any thoughts on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like SendToDeviceSynced could work but would be tricky because this would mean using /sync to verify it has arrived (which would be on the receiver as well, which is different to say SendEventSynced) - if you advance beyond that position then the event will be lost. It's a bit too error prone I think.

This current implementation is a bit too brittle because it will immediately fail the test if the /sync response does not have wantMessageCount, which is probably why the sleep() was added to ensure all 10 to-device messages would be waiting for the user in a single sync response. This is brittle because you cannot know when these events will be ready, and you simply need to keep trying until you reach a sensible timeout value. This is exactly what all the sync methods do, and we should be making use of them more - i.e take SyncToDeviceHas from #439 and then do:

var i int64
bob.MustSyncUntil(t, client.SyncReq{}, client.SyncToDeviceHas(
    func(msg gjson.Result) bool {
        gotReqID := msg.Get("content.request_id").Int()
        if gotReqID == i {
            i++ // we have the next request ID, look for another
        } else {
            // if we see any other request ID, whine about it e.g out-of-order requests
            t.Errorf("unexpected to-device request_id: %d", gotReqID)
        }
        return i == 10 // terminate when we have seen all requests in order
    }
))

This will work for both cases where:

  • /sync returns all 10 events.
  • /sync returns 7 events, then the next one returns 2, then the next one returns 1.

This will also fail gracefully if:

  • events are missing. We time out after 5s, no sleeps needed.
  • events arrive out-of-order. We get a nice error message.

Comment on lines 53 to 54
// clear previously received sendToDevice events
bob.MustSyncUntil(t, client.SyncReq{Since: nextBatch}, verifyToDeviceResp(t, bob, checks, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you expand the comment about why this is needed?

(the words "previously received" imply that these are sendToDevice events that were sent by a previous test. I don't think that's what you mean here.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing to-device messages don't get purged on the initial no-since /sync, hence needing to sync again to ACK the to-device message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know why it's needed. I'm asking that the comment says it :)

nextBatch := bob.MustSyncUntil(t, client.SyncReq{}, verifyToDeviceResp(t, bob, checks, 1))
// clear previously received sendToDevice events
bob.MustSyncUntil(t, client.SyncReq{Since: nextBatch}, verifyToDeviceResp(t, bob, checks, 0))
// Charlie syncs for the first time, so no need to clear events
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing he's explaining why charlie isn't having the same treatment as bob with the since token?

Comment on lines 111 to 112
// sync again, we should get the same event again, as we didn't advance "since"
nextBatch := bob.MustSyncUntil(t, client.SyncReq{}, verifyToDeviceResp(t, bob, checks, 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be a MustSyncUntil given we expect it to pass on the first /sync.

Comment on lines 114 to 115
// advance the next_batch, we shouldn't get an event
nextBatch = bob.MustSyncUntil(t, client.SyncReq{Since: nextBatch}, verifyToDeviceResp(t, bob, checks, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think this should be MustSync rather than MustSyncUntil.

Comment on lines 126 to 128
// act as if the sync "failed", so we're using the previous next batch
t.Logf("Syncing with %s", prevNextBatch)
bob.MustSyncUntil(t, client.SyncReq{Since: prevNextBatch}, verifyToDeviceResp(t, bob, checks, 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure what this is testing to be honest? It's testing that if you use the same sync token twice you get the message both times? but we already did that at lines 111-112?

// verify the results from /sync
nextBatch := bob.MustSyncUntil(t, client.SyncReq{}, verifyToDeviceResp(t, bob, checks, 1))
// clear previously received sendToDevice events
bob.MustSyncUntil(t, client.SyncReq{Since: nextBatch}, verifyToDeviceResp(t, bob, checks, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also: this shouldn't be MustSyncUntil, as it should pass on the first /sync.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using MustSyncUntil would be the right thing to do here, as there is no guarantee that the to-device message will be delivered on the first /sync, even if the PUT 200 OKs (it could be OKd after persistence but before downstream workers/components for the sync API are aware of it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no guarantee that the to-device message will be delivered on the first /sync

but verifyToDeviceResp(t, bob, checks, 0) checks that no to-device messages are delivered in the /sync. So this is allowing an unexpected/repeated to-device event before getting to that state. I don't see why we would explicitly want to allow to-device events that we don't want.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i'm specifically talking about the MustSyncUntil on line 54, in case that is unclear)

Comment on lines 134 to 136
// since we advanced "since", this should return no events anymore
t.Logf("Syncing with %s", prevNextBatch)
bob.MustSyncUntil(t, client.SyncReq{Since: prevNextBatch}, verifyToDeviceResp(t, bob, checks, 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure if this is adding much value.

},
},
})
time.Sleep(time.Millisecond * 100) // Wait a bit for the server to process all messages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like the wrong way of doing this: putting sleep in tests is inherently racy and unreliable.

Why do we need to sleep here at all? Is there any better way of achieving what we need?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I would argue that servers MUST guarantee ordering in that if I send A and it 200 OKs then I send B and it 200 OKs, then A is before B. If this is true, then there is no need to wait between requests as the server will queue them appropriately. The sleep may make more sense to know when the last request has been processed, but still we should be using a different way to detect this e.g MustSyncUntil will timeout after 5s anyway, so it is good enough to just eagerly sync for all the messages which will fail if one gets lost.

"github.com/matrix-org/complement/internal/match"
)

func TestSendToDevice(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at a glance, it looks like this is pretty similar to the C-S version, so my comments apply here too.

@kegsay kegsay self-assigned this Sep 20, 2022
@DMRobertson DMRobertson requested a review from a team February 16, 2023 12:21
@clokep
Copy link
Contributor

clokep commented Mar 3, 2023

Looks like this has a conflict again. 😢 I'm assuming this is still in need of a review?


}

func countSendToDeviceMessages(t *testing.T, bob *client.CSAPI, req client.SyncReq, i int64, wantCount int64, withClear bool) (nextBatch string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have docstrings for these functions please

})
}

func countSendToDeviceMessages(t *testing.T, bob *client.CSAPI, req client.SyncReq, i int64, wantCount int64, withClear bool) (nextBatch string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this the same function as above?

@kegsay kegsay added the stale This issue or PR is old and may be closed soon label Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale This issue or PR is old and may be closed soon
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants