
next/662/20241207/v1 #12245

Merged: 5 commits into OISF:master on Dec 9, 2024

Conversation

victorjulien
Member

victorjulien and others added 5 commits December 7, 2024 10:23
In multi instance flow manager setups, each flow manager gets a slice
of the hash table to manage. Due to a logic error in the chunked
scanning of the hash slice, instances beyond the first would always
rescan the same (first) subslice of their slice.

The `pos` variable, which tracks the starting position for the next
scan, was treated as if it held a value relative to the bounds of the
slice. It was, however, holding an absolute position. As a result, its
bounds check always considered it out of bounds, resetting the sub-
slice to be scanned to the first part of the instance's slice.

This patch addresses the issue by correctly handling the fact that the
value is absolute.

Bug: OISF#7365.

Fixes: e9d2417 ("flow/manager: adaptive hash eviction timing")
As too many cases are found when splitting TCP payload
As it is also used for HTTP/1
Remove it only for TCP and keep it for UDP.

Ticket: 7436

codecov bot commented Dec 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.21%. Comparing base (a9b36d8) to head (38d7900).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12245      +/-   ##
==========================================
+ Coverage   83.18%   83.21%   +0.03%     
==========================================
  Files         912      912              
  Lines      257174   257183       +9     
==========================================
+ Hits       213930   214025      +95     
+ Misses      43244    43158      -86     
Flag Coverage Δ
fuzzcorpus 61.06% <60.00%> (+0.05%) ⬆️
livemode 19.42% <100.00%> (+0.01%) ⬆️
pcap 44.40% <100.00%> (+<0.01%) ⬆️
suricata-verify 62.79% <100.00%> (+<0.01%) ⬆️
unittests 59.19% <60.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.

@suricata-qa

Information:

ERROR: QA failed on SURI_TLPR1_alerts_cmp.

field                         baseline   test    %
SURI_TLPR1_stats_chk
.app_layer.flow.ftp           32421      36200   111.66%
.app_layer.flow.dcerpc_tcp    40         43      107.5%
.app_layer.error.http.parser  700        729     104.14%
.app_layer.error.ssh.parser   124        128     103.23%
.ftp.memuse                   2906       3102    106.74%

Pipeline 23722

@inashivb
Member

inashivb commented Dec 9, 2024

Information:

ERROR: QA failed on SURI_TLPR1_alerts_cmp.
field                         baseline   test    %
SURI_TLPR1_stats_chk
.app_layer.flow.ftp           32421      36200   111.66%
.app_layer.flow.dcerpc_tcp    40         43      107.5%
.app_layer.error.http.parser  700        729     104.14%
.app_layer.error.ssh.parser   124        128     103.23%
.ftp.memuse                   2906       3102    106.74%

Pipeline 23722

Q1: Is the difference in stats on hitting emergency mode?
Q2: Is the FM able to see more flows after scanning the hash rows correctly in a designated time?

@victorjulien
Member Author

Q1: Is the difference in stats on hitting emergency mode?

No, it's not hitting the emergency mode in the baseline or the PR runs.

Q2: Is the FM able to see more flows after scanning the hash rows correctly in a designated time?

It's able to see things more timely for sure, but it's a good question why that should lead to more detection. Things should still be covered by the packet path and shutdown timeout handling.

Contributor

@jufajardini jufajardini left a comment

It's consistent with approved original PRs.

Did you get an answer about the alerts' redistribution? I see that we are missing some, and hitting others.

@jufajardini
Contributor

It's consistent with approved original PRs.

Did you get an answer about the alerts' redistribution? I see that we are missing some, and hitting others.

Question was answered.

@jufajardini
Contributor

Q2: Is the FM able to see more flows after scanning the hash rows correctly in a designated time?

It's able to see things more timely for sure, but it's a good question why that should lead to more detection. Things should still be covered by the packet path and shutdown timeout handling.

Would it be possible that, previously, the waiting time was long enough that some flows timed out before they could be fully inspected, leading to fewer alerts before?

Contributor

@jufajardini jufajardini left a comment


Consistent with approved PRs. Differences in QA checks seem to be for the better and to reflect the changes added.

@victorjulien victorjulien merged commit 38d7900 into OISF:master Dec 9, 2024
61 checks passed
@victorjulien victorjulien deleted the next/662/20241207/v1 branch December 9, 2024 19:46
5 participants