Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[core] Fixed overrideSndSeqNo() not clear buffer. #2753

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gou4shi1
Copy link
Contributor

@gou4shi1 gou4shi1 commented Jun 18, 2023

D:SRT.gs: grp/sendBackup: ... trying @362808828 - resending 11 collected messages...
D:SRT.gs: @362808828: overrideSndSeqNo: sched-seq=1540037828 send-seq=1540037827 (unchanged)
D:SRT.as: @362808828: SND-DROP: %(1540037828-1540037910) n=83pkt 108778B, span=1427 ms, FIRST #160424
D:SRT.as: @362808828: buf:SENDING (BEFORE) srctime:01:38:51.051264000 [STDY] DATA SIZE: 892 sched-SEQUENCE: 1540037828 STAMP: D8C417EF
D:SRT.as: @362808828: buf:SENDING srctime:01:38:51.051264000 [STDY] size=892 #160569 SCHED %1540037828(>> %1540037828) !D8C417EF
D:SRT.as: @362808828: sock:SENDING (END): success, size=892
D:SRT.as: @362808828: buf:SENDING (BEFORE) srctime:01:38:51.133805000 [STDY] DATA SIZE: 1398 sched-SEQUENCE: 1540037829 STAMP: C26BC454
D:SRT.as: @362808828: buf:SENDING srctime:01:38:51.133805000 [STDY] size=1398 #160570 SCHED %1540037829(>> %1540037829) !C26BC454
D:SRT.as: @362808828: sock:SENDING (END): success, size=1398
/SRT:SndQ:w3133*E:SRT.qs: @362808828: IPE: packData: SCHEDULING sequence 1540037828 is behind of EXTRACTION sequence 1540037911, dropping this packet: DIFF=-83 STAMP=D8C417EF

Note the IPE log, the SCHEDULING sequence 1540037828 is behind of EXTRACTION sequence 1540037911, because of the SND-DROP: %(1540037828-1540037910).
In sndDropTooLate(), the m_iSndLastAck was mistakenly updated to 1540037911 by m_pSndBuffer->dropLateData(), so the issue is that m_pSndBuffer was not cleared by overrideSndSeqNo() in main-backup mode.

@codecov-commenter
Copy link

Codecov Report

Merging #2753 (2a5bb72) into master (3cefede) will decrease coverage by 0.75%.
The diff coverage is 0.00%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##           master    #2753      +/-   ##
==========================================
- Coverage   67.51%   66.77%   -0.75%     
==========================================
  Files          99       99              
  Lines       20166    20178      +12     
==========================================
- Hits        13616    13473     -143     
- Misses       6550     6705     +155     
Impacted Files Coverage Δ
srtcore/buffer_snd.cpp 66.77% <0.00%> (-1.94%) ⬇️
srtcore/buffer_snd.h 50.00% <ø> (ø)
srtcore/core.cpp 60.74% <0.00%> (-1.49%) ⬇️
srtcore/group.cpp 37.76% <0.00%> (ø)

... and 9 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ethouris
Copy link
Collaborator

Well, ok, but why is everything dropped from the sender buffer instead of only sequences up to the scheduling sequence?

@gou4shi1
Copy link
Contributor Author

gou4shi1 commented Jun 19, 2023

why is everything dropped from the sender buffer instead of only sequences up to the scheduling sequence?

Correct me if I"m wrong, the scheduling sequence should be the last seq in sender buffer? If no other IPE, "everything" should be same as "only sequences up to the scheduling sequence"?

@maxsharabayko maxsharabayko added Type: Bug Indicates an unexpected problem or unintended behavior [core] Area: Changes in SRT library core labels Jun 19, 2023
@maxsharabayko maxsharabayko added this to the v1.6.0 milestone Jun 19, 2023
@maxsharabayko maxsharabayko modified the milestones: v1.6.0, v1.5.3 Aug 10, 2023
@ethouris
Copy link
Collaborator

ethouris commented Aug 10, 2023

Depends on which sender buffer. We are with backup groups here, but sender buffering is separate for a socket and for a group. Every socket still has its own sender buffering and there's an extra sender buffer for the group, which are stored when group sending happens. The sender buffer for an idle member socket is not filled on regular sending, only when the backup-activation process starts and packets are being resent from the group sender buffer.

The problem is that as the idle link doesn't send packets, it also doesn't store anything in the socket's sender buffer and hence doesn't increase the sending sequence. This gets "tricked", that is, when ACK comes, the anchor sequence of the sender buffer must be shifted to the position of the active link. When the backup activation happens, all packets since the last ACK should be resent, and the sender buffer of the socket should be prepared for this by getting shifted. However, ACK handler and backup activation happen in different threads, so there's no guarantee that the sender buffer will have the right anchor sequence number at the moment of the backup activation. The code must be then prepared for even a worst case scenario, that is, when the anchor sequence of the socket's sender buffer has to be fixed before sending the first packet from the group's sender buffer. It is, however, predicted that the sequence can be shifted forwards, not backwards. The need of backward fixing should be impossible because the anchor sequence either stays unchanged and is then behind the last sent, or it is shifted together with ACK and is equal to the anchor sequence of all other sockets.

Another small problem is that at the moment of backup activation there shouldn't be any packets in the sender buffers of idle member sockets. I don't exactly remember how it is implemented, but anyway: it should catch the empty buffer and only make sure that the anchor sequence of the sender buffer is equal to the last ACK received. ACK should cause dismissal of packets from the group sender buffer as well by obvious reasons - so there shouldn't be also a situation when the sequence number of the packet picked up from the group sender buffer predates the activated member socket's sender buffer's anchor sequence.

@maxsharabayko maxsharabayko modified the milestones: v1.5.3, v1.6.0 Aug 15, 2023
@maxsharabayko maxsharabayko modified the milestones: v1.5.4, v1.6.0 Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[core] Area: Changes in SRT library core Type: Bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants