Conversation
In multi instance flow manager setups, each flow manager gets a slice of the hash table to manage. Due to a logic error in the chunked scanning of the hash slice, instances beyond the first would always rescan the same (first) subslice of their slice. The `pos` variable that is used to keep the state of what the starting position for the next scan was supposed to be, was treated as if it held a relative value. Relative to the bounds of the slice. It was however, holding an absolute position. This meant that when doing it's bounds check it was always considered out of bounds. This would reset the sub- slice to be scanned to the first part of the instances slice. This patch addresses the issue by correctly handling the fact that the value is absolute. Bug: OISF#7365. Fixes: e9d2417 ("flow/manager: adaptive hash eviction timing")
Ticket: 7435
As too many cases are found when splitting tcp payload
As it is also used for HTTP/1 Remove it only for TCP and keep it for UDP. Ticket: 7436
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #12245 +/- ##
==========================================
+ Coverage 83.18% 83.21% +0.03%
==========================================
Files 912 912
Lines 257174 257183 +9
==========================================
+ Hits 213930 214025 +95
+ Misses 43244 43158 -86
Flags with carried forward coverage won't be shown. Click here to find out more. |
|
Information: ERROR: QA failed on SURI_TLPR1_alerts_cmp.
Pipeline 23722 |
Q1: Is the difference in stats on hitting emergency mode? |
No, it's not hitting the emergency mode in the baseline or the PR runs.
It's able to see things more timely for sure, but it's a good question why should lead to more detection. Things should still be covered by the packet path and shutdown timeout handling. |
jufajardini
left a comment
There was a problem hiding this comment.
It's consistent with approved original PRs.
Did you get an answer about the alerts' redistribution? I see that we are missing some, and hitting others.
Question was answered. |
Would it be possible that the way things are now would lead to such a waiting time that some flows would time out before they could be fully inspected, leading to less alerts before? |
Staging: