-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Accept use-candidate for ice-lite #739
Conversation
8a368bf
to
fbddc45
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #739 +/- ##
==========================================
+ Coverage 78.53% 78.61% +0.07%
==========================================
Files 41 41
Lines 4762 4764 +2
==========================================
+ Hits 3740 3745 +5
+ Misses 788 785 -3
Partials 234 234
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Interesting scenario. The part which concerns me here is the proposed solution: I don't think that the right solution here should be to disable peer reflexive altogether. This will slow down ICE connectivity in certain cases. |
Thank you @nils-ohlmeier for the feedback. That was my original inclination too (accepting There is this long standing FIrefox issue where it does aggressive nomination even if the peer is ice-lite. I guess in that case the selected pair will change a bunch of times and settle on the final one. Is that correct? |
fbddc45
to
43f85ec
Compare
43f85ec
to
9bcc191
Compare
@nils-ohlmeier Please have a look at this PR when you get a chance and let me know if it looks okay. |
I think we'd better to make it as a default behavior for controlled ice-lite agent insead of all controlled agent. The failure case is chrome/firefox does renomiantion/aggressive-nomination when remote peer is ice lite, and there is no problem for full agent. So the current behavior of pion comply with the rfc, we don't need to break it for the bad behavior cause by remote peer. |
Compare priority is ok for both nomiation type in rfc5245:
Checked the libwebrtc code, it also compares the nominated connections from remote side when it picks controlled role, the conditions are complex but contains the priority. |
9bcc191
to
242239b
Compare
@cnderrauber Thank you for the review. I have updated the code to apply it only for ice-lite agent. Can you please re-review when you get a chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good!
selection.go
Outdated
@@ -241,7 +241,7 @@ func (s *controlledSelector) HandleSuccessResponse(m *stun.Message, local, remot | |||
s.log.Tracef("Found valid candidate pair: %s", p) | |||
if p.nominateOnBindingSuccess { | |||
if selectedPair := s.agent.getSelectedPair(); selectedPair == nil || | |||
(selectedPair != p && selectedPair.priority() <= p.priority()) { | |||
(selectedPair != p && ((!s.agent.enableUseCandidateCheckPriority && s.agent.lite) || selectedPair.priority() <= p.priority())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about make it a agent method needsToCheckPriorityOnNominated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call @cnderrauber . Updated with a helper function.
There could be a mismatch between the two ends in candidate priority when using peer reflexive. It happens in the following scenario 1. Client has two srflx candidates. a. The first one gets discovered by LiveKit server as prflx. b. The second one gets added via ice-trickle first and then gets a STUN ping. So, it is srflx remote candidate from server's point-of-view. 2. This leads to a priority issue. a. Both candidates have same priority from client's point-of-view (both are srflx). b. But, from server's point-of-view, the first candidate has higher priority (prflx). 3. The first candidate establishes connectivity and becomes the selected pair (client is ICE controlling and server is ICE controlled, server is in ICE lite). 4. libwebrtc does a sort and switch some time later based on RTT. As client side has both at same priority, RTT based sorting could make the second candidate the preferred one. So, the client sends useCandidate=1 for the second candidate. pion/ice does not switch because the selected pair is at higher priority due to prflx candidate. 5. STUN pings do not happen and the ICE connection eventually fails. ICE controlled agent should accept use-candidate unconditionally if it is an ICE lite agentt. Just in case existing behaviour is needed, it can be configured using `EnableUseCandidateCheckPriority`. NOTE: With aggressive nomination, the selected pair could change a few times, but should eventually settle on what the controlling side wants.
242239b
to
15ab10a
Compare
@Sean-Der @davidzhao @nils-ohlmeier going to merge this based on discussions in Slack/this PR and @cnderrauber 's review. Want to port this to v2 and start using it in LiveKit project. But, please do review when you get a chance and let me know if you have feedback and I will address it in a subsequent PR. |
There could be a mismatch between the two ends in candidate priority when using peer reflexive. It happens in the following scenario 1. Client has two srflx candidates. a. The first one gets discovered by LiveKit server as prflx. b. The second one gets added via ice-trickle first and then gets a STUN ping. So, it is srflx remote candidate from server's point-of-view. 2. This leads to a priority issue. a. Both candidates have same priority from client's point-of-view (both are srflx). b. But, from server's point-of-view, the first candidate has higher priority (prflx). 3. The first candidate establishes connectivity and becomes the selected pair (client is ICE controlling and server is ICE controlled, server is in ICE lite). 4. libwebrtc does a sort and switch some time later based on RTT. As client side has both at same priority, RTT based sorting could make the second candidate the preferred one. So, the client sends useCandidate=1 for the second candidate. pion/ice does not switch because the selected pair is at higher priority due to prflx candidate. 5. STUN pings do not happen and the ICE connection eventually fails. ICE controlled agent should accept use-candidate unconditionally if it is an ICE lite agentt. Just in case existing behaviour is needed, it can be configured using `EnableUseCandidateCheckPriority`. NOTE: With aggressive nomination, the selected pair could change a few times, but should eventually settle on what the controlling side wants.
@boks1971 sounds good to me! What do you need to move to /v3? If I port livekit/livekit would that be a good start? |
Nothing specifically needed @Sean-Der as the default behaviour is the desired behaviour. I cherry-picked the change to v2 and tagged it v2.3.37 and have updated livekit dependencies to use that tag. |
I was debating adding a settings engine setting to enable the priority check (the flag added in this PR) and thread it through. But, as we discussed the default behaviour added in this PR looks like the right one and the flag is just a safety net. So, decided to wait to see if there are any ill effects before adding another things to SettingsEngine. |
Yeah that one sucks. The problem is that Firefox uses its own ICE stack written in C with lots of pointer magic. Most of the developers don't want to touch it with a ten foot pole. The team is aware of the problem. But my guess is that it is rather unlikely this will get fixed anytime soon. The best known workaround is to implement on the ice-lite server side to use the candidate pair where it receives traffic from. I'm not sure if a workaround like that should go into Pion to be more compatible with Firefox. @Sean-Der do you have thoughts on this? |
I did that approach in a previous life and it worked well. But, that was a few years ago. |
There could be a mismatch between the two ends in candidate priority
when using peer reflexive. It happens in the following scenario
a. The first one gets discovered by LiveKit server as prflx.
b. The second one gets added via ice-trickle first and then
gets a STUN ping. So, it is srflx remote candidate from
server's point-of-view.
a. Both candidates have same priority from client's point-of-view
(both are srflx).
b. But, from server's point-of-view, the first candidate has
higher priority (prflx).
the selected pair (client is ICE controlling and server is
ICE controlled, server is in ICE lite).
As client side has both at same priority, RTT based sorting
could make the second candidate the preferred one.
So, the client sends useCandidate=1 for the second candidate.
pion/ice does not switch because the selected pair is at
higher priority due to prflx candidate.
ICE controlled agent should accept use-candidate unconditionally if
it is an ICE lite agent.
Just in case existing behaviour is needed, it can be configured
using
EnableUseCandidateCheckPriority
.NOTE: With aggressive nomination, the selected pair could change
a few times, but should eventually settle on what the controlling
side wants.