-
-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow selectors on *_NAMES
collections
#1143
base: main
Are you sure you want to change the base?
Conversation
… non-selectable collection
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1143 +/- ##
==========================================
- Coverage 81.68% 81.55% -0.13%
==========================================
Files 168 168
Lines 9762 9847 +85
==========================================
+ Hits 7974 8031 +57
- Misses 1537 1565 +28
Partials 251 251
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Interesting, thank you very much for your contribution Im a bit worried about how the complexity of variables is growing. Maybe not for this PR, but we need to improve generation of code, even for this "selectable" feature |
// CanBeSelected returns true if the variable supports selection (ie, `:foobar`) | ||
func (v RuleVariable) CanBeSelected() bool { | ||
switch v { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think adding everything here makes sense. Just return true on those who can, and use the default otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although for performance it makes sense, I believe this is easier to maintain and more readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is auto generated so I would not be concern about readability. @blotus could you do a quick benchmark on this matter i.e. adding everything or just true and all the rest on a default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that readability does not really matter here.
I've updated the code generation to only generate the cases where true is returned.
Definitely not for this PR. We should create an issue to refactor generation then. |
*_NAMES
collections*_NAMES
collections
LGTM in general, but I believe this lacks negative tests and its decreasing the general project coverage |
@@ -101,11 +101,41 @@ type NamedCollectionNames struct { | |||
} | |||
|
|||
func (c *NamedCollectionNames) FindRegex(key *regexp.Regexp) []types.MatchData { | |||
panic("selection operator not supported") | |||
var res []types.MatchData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any chance data
is empty? if so I would handle the empty case before this allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there are probably situations where data
can be empty (I haven't tested, but I'd expect a collection like XML
to have an empty data
on a non-XML request )
AFAIK, declaring a slice like this does not perform any actual allocation (other than the header of the slice, which will be all zero), and the actual allocation will be performed the first time we append to it, but I can add a check on data
if you are worried about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, from what I see, this check is not performed in the existing code (here for example)
internal/collections/named.go
Outdated
for k, data := range c.collection.Map.data { | ||
if key.MatchString(k) { | ||
for _, d := range data { | ||
res = append(res, &corazarules.MatchData{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if MatchData is mutable, if so we probably want to reuse the pointer?
@@ -574,13 +574,15 @@ func (tx *Transaction) GetField(rv ruleVariableParams) []types.MatchData { | |||
if m, ok := col.(collection.Keyed); ok { | |||
matches = m.FindRegex(rv.KeyRx) | |||
} else { | |||
panic("attempted to use regex with non-selectable collection: " + rv.Variable.Name()) | |||
// This should probably never happen, selectability is checked at parsing time | |||
tx.debugLogger.Error().Str("collection", rv.Variable.Name()).Msg("attempted to use regex with non-selectable collection") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed time ago that panic is ok, as this is a low level issue and coraza should not run here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I agree with this: coraza is designed as a library, and in my point of view, this means that explicit panics must be avoided at all costs (with very little exceptions, if you can call panic, you can return an error), and not doing anything is almost always better than bringing down a production website.
If a function call can lead to a panic, it should be made very clear to the caller (either with an explicit function name (Must....
) or, at the very least, with some documentation): I don't mind wrapping every call to coraza with a recover, but I need to be aware it's required.
For this specific case, it can only (AFAIK) be triggered by a configuration error, so this means it should be detected when parsing the configuration (and is now thanks to this PR), so the panic has become redundant.
Hello,
This PR aims to allow the use of rules such as
SecRule &REQUEST_COOKIES_NAMES:JSESSIONID "@eq 0" "id:45"
(supported by ModSecurity and also present as an example in the documentation of Coraza), which currently causes Coraza to crash due to an explicitpanic
call.There are 3 main changes:
collections.NamedCollectionNames
implementscollection.Keyed
: this allows the use of a selector for the collections created with.Names()
panics
calls: as Coraza is designed to be embedded in other software, callingpanic
is never a good idea.Parser changes
I've embedded information about whether a collection can be selected or not in the
internal/variables/variables.go
file, as a comment for each collection that does support it (hopefully, I did not miss any), and added aCanBeSelected
method that is called during parsing to check if the selector is allowed or not.I don't know if I'm really happy with embedding information in comments, but it was the least intrusive way I found to handle this.
collections.NamedCollectionNames
implementscollection.Keyed
This one is straightforward,
NamedCollectionNames
now implementsGet
,FindString
andFindRegex
.Because it's a named collection, the key and the value in the returned results will be the same: the name of the key.
Remove runtime
panic
The first two were removed as part of making
namedCollectionNames
implementsKeyed
.The other two (which are the ones that caused the crash mentioned at the beginning of this PR) have been replaced by an error log.
In theory, this log should never occur because selectability is now checked during parsing (in practice, it could happen if a collection is marked as selectable but does not implement
Keyed
).