feat: allow selectors on *_NAMES collections #1143

Merged
jcchavezs merged 21 commits into corazawaf:main from blotus:no-panic-on-non-selectable-col on May 30, 2025

Conversation

@blotus
Contributor

@blotus blotus commented Sep 4, 2024

Hello,

This PR aims to allow the use of rules such as SecRule &REQUEST_COOKIES_NAMES:JSESSIONID "@eq 0" "id:45" (supported by ModSecurity and also present as an example in the documentation of Coraza), which currently causes Coraza to crash due to an explicit panic call.

There are 3 main changes:

  • Check if a collection supports selectors during parsing time, instead of throwing an error at runtime.
  • Make collections.NamedCollectionNames implement collection.Keyed: this allows the use of a selector for the collections created with .Names()
  • Remove runtime panic calls: as Coraza is designed to be embedded in other software, calling panic is never a good idea.

Parser changes

I've embedded information about whether a collection can be selected or not in the internal/variables/variables.go file, as a comment for each collection that does support it (hopefully, I did not miss any), and added a CanBeSelected method that is called during parsing to check if the selector is allowed or not.

I don't know if I'm really happy with embedding information in comments, but it was the least intrusive way I found to handle this.
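
To make the parse-time check concrete, here is a minimal sketch of the idea. It is not the code from this PR: the `RuleVariable` values, the `byName` table and `parseVariable` are hypothetical stand-ins, and only the `CanBeSelected` method name comes from the change itself.

```go
package main

import (
	"fmt"
	"strings"
)

// RuleVariable is a stand-in for Coraza's variable enum. The three values
// below are illustrative only, not the real generated constants.
type RuleVariable int

const (
	RequestURI          RuleVariable = iota // scalar: no selector
	RequestCookies                          // map-like: selector allowed
	RequestCookiesNames                     // names view: selector allowed
)

// CanBeSelected reports whether the variable accepts a `:key` selector.
func (v RuleVariable) CanBeSelected() bool {
	switch v {
	case RequestCookies, RequestCookiesNames:
		return true
	default:
		return false
	}
}

// byName is a hypothetical lookup table from rule syntax to variable.
var byName = map[string]RuleVariable{
	"REQUEST_URI":           RequestURI,
	"REQUEST_COOKIES":       RequestCookies,
	"REQUEST_COOKIES_NAMES": RequestCookiesNames,
}

// parseVariable splits "NAME:selector" and rejects a selector on a variable
// that cannot take one, so the mistake surfaces while loading the rules.
func parseVariable(s string) (RuleVariable, string, error) {
	name, sel, hasSel := strings.Cut(s, ":")
	v, ok := byName[name]
	if !ok {
		return 0, "", fmt.Errorf("unknown variable %q", name)
	}
	if hasSel && !v.CanBeSelected() {
		return 0, "", fmt.Errorf("variable %s does not support selectors", name)
	}
	return v, sel, nil
}

func main() {
	for _, in := range []string{"REQUEST_COOKIES_NAMES:JSESSIONID", "REQUEST_URI:foo"} {
		_, sel, err := parseVariable(in)
		fmt.Printf("input=%q selector=%q err=%v\n", in, sel, err)
	}
}
```

The point of the sketch is only that the error is returned from the parser, so an invalid rule is rejected at load time instead of failing while a request is being processed.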

collections.NamedCollectionNames implements collection.Keyed

This one is straightforward: NamedCollectionNames now implements Get, FindString and FindRegex.
Because it's a named collection, the key and the value in the returned results will be the same: the name of the key.
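
As an illustration of that behaviour, here is a simplified sketch. `MatchData`, `namesView` and the backing map are hypothetical stand-ins rather than Coraza's real types, and case handling of keys is ignored.

```go
package main

import (
	"fmt"
	"regexp"
)

// MatchData is a simplified stand-in for Coraza's match result type.
type MatchData struct {
	Key, Value string
}

// namesView is a hypothetical view over a map-backed collection that
// exposes only the key names.
type namesView struct {
	data map[string][]string
}

// Get returns the key name once per value stored under key.
func (n namesView) Get(key string) []string {
	var res []string
	for range n.data[key] {
		res = append(res, key)
	}
	return res
}

// FindString returns one match per value stored under key; both the key
// and the value of each match are the key name itself.
func (n namesView) FindString(key string) []MatchData {
	var res []MatchData
	for range n.data[key] {
		res = append(res, MatchData{Key: key, Value: key})
	}
	return res
}

// FindRegex does the same for every key name matching re.
// Map iteration order is not deterministic, so neither is the result order.
func (n namesView) FindRegex(re *regexp.Regexp) []MatchData {
	var res []MatchData
	for k, values := range n.data {
		if !re.MatchString(k) {
			continue
		}
		for range values {
			res = append(res, MatchData{Key: k, Value: k})
		}
	}
	return res
}

// jsPrefix is an example pattern used in the usage below.
var jsPrefix = regexp.MustCompile(`^JS`)

func main() {
	cookies := namesView{data: map[string][]string{
		"JSESSIONID": {"abc123"},
		"theme":      {"dark"},
	}}
	fmt.Println(cookies.FindString("JSESSIONID"))
	fmt.Println(len(cookies.FindString("missing")))
	fmt.Println(cookies.FindRegex(jsPrefix))
}
```

With a view like this, a count such as `&REQUEST_COOKIES_NAMES:JSESSIONID` corresponds to the length of the `FindString` result, which is 0 when the cookie is absent.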

Remove runtime panic

The first two were removed as part of making NamedCollectionNames implement Keyed.

The other two (which are the ones that caused the crash mentioned at the beginning of this PR) have been replaced by an error log.
In theory, this log should never occur because selectability is now checked during parsing (in practice, it could happen if a collection is marked as selectable but does not implement Keyed).
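
A minimal sketch of the "log instead of panic" fallback, using hypothetical stand-in types (`Collection`, `Keyed`, `scalar`, `keyedMap`, `lookup`) and the standard `log` package in place of Coraza's debug logger:

```go
package main

import (
	"fmt"
	"log"
)

// Collection and Keyed are simplified stand-ins for Coraza's interfaces.
type Collection interface {
	Name() string
}

type Keyed interface {
	Collection
	FindString(key string) []string
}

// scalar is a hypothetical collection that does not support selectors.
type scalar struct{ name string }

func (s scalar) Name() string { return s.name }

// keyedMap is a hypothetical collection that does support selectors.
type keyedMap struct {
	name string
	data map[string][]string
}

func (k keyedMap) Name() string                   { return k.name }
func (k keyedMap) FindString(key string) []string { return k.data[key] }

// lookup resolves a selector against col. A collection that is not Keyed
// should already have been rejected at parse time, so this branch is only
// a safety net: it logs the problem and reports no match instead of panicking.
func lookup(col Collection, key string) []string {
	keyed, ok := col.(Keyed)
	if !ok {
		log.Printf("attempted to use selector with non-selectable collection: %s", col.Name())
		return nil
	}
	return keyed.FindString(key)
}

func main() {
	cookies := keyedMap{name: "REQUEST_COOKIES", data: map[string][]string{"a": {"1"}}}
	fmt.Println(lookup(cookies, "a"))
	fmt.Println(lookup(scalar{name: "REQUEST_URI"}, "a"))
}
```

The trade-off is that a misconfigured rule silently matches nothing at runtime, which is why the parse-time check is the primary guard and the log is only a last resort.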

@blotus blotus requested a review from a team as a code owner September 4, 2024 12:32
@codecov

codecov bot commented Sep 4, 2024

Codecov Report

Attention: Patch coverage is 59.37500% with 39 lines in your changes missing coverage. Please review.

Project coverage is 83.84%. Comparing base (7f11024) to head (7c1b8fb).
Report is 1 commits behind head on main.

Files with missing lines                  Patch %   Lines
internal/variables/variablesmap.gen.go    50.00%    26 Missing ⚠️
internal/corazawaf/transaction.go         8.33%     11 Missing ⚠️
internal/collections/named.go             93.10%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1143      +/-   ##
==========================================
- Coverage   83.98%   83.84%   -0.15%     
==========================================
  Files         170      170              
  Lines        9824     9909      +85     
==========================================
+ Hits         8251     8308      +57     
- Misses       1329     1357      +28     
  Partials      244      244              
Flag                                             Coverage Δ
coraza.rule.case_sensitive_args_keys             83.80% <59.37%> (-0.15%) ⬇️
coraza.rule.multiphase_evaluation                83.49% <59.37%> (-0.15%) ⬇️
coraza.rule.no_regex_multiline                   83.78% <59.37%> (-0.15%) ⬇️
default                                          83.84% <59.37%> (-0.15%) ⬇️
examples+                                        16.31% <2.08%> (-0.21%) ⬇️
examples+coraza.rule.case_sensitive_args_keys    83.80% <59.37%> (-0.15%) ⬇️
examples+coraza.rule.multiphase_evaluation       83.34% <59.37%> (-0.15%) ⬇️
examples+coraza.rule.no_regex_multiline          83.70% <59.37%> (-0.15%) ⬇️
examples+memoize_builders                        83.81% <59.37%> (-0.15%) ⬇️
examples+no_fs_access                            81.09% <59.37%> (-0.13%) ⬇️
ftw                                              83.84% <59.37%> (-0.15%) ⬇️
memoize_builders                                 83.94% <59.37%> (-0.15%) ⬇️
no_fs_access                                     83.34% <59.37%> (-0.15%) ⬇️
tinygo                                           83.81% <59.37%> (-0.15%) ⬇️

@jptosso
Member

jptosso commented Sep 4, 2024

Interesting, thank you very much for your contribution

I'm a bit worried about how the complexity of variables is growing. Maybe not for this PR, but we need to improve generation of code, even for this "selectable" feature

Comment on lines +215 to +217
// CanBeSelected returns true if the variable supports selection (ie, `:foobar`)
func (v RuleVariable) CanBeSelected() bool {
switch v {
Member

I don't think adding everything here makes sense. Just return true for those that can, and use the default otherwise.

Member

Although for performance it makes sense, I believe this is easier to maintain and more readable

Member

This is auto-generated, so I would not be concerned about readability. @blotus could you do a quick benchmark on this, i.e. listing every variable explicitly versus listing only the true cases and leaving all the rest to a default?

Contributor Author

I agree that readability does not really matter here.
I've updated the code generation to only generate the cases where true is returned.
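
For anyone wanting to run the comparison requested above, a rough sketch of such a benchmark follows. The variable list is a small hypothetical subset (the real generated file covers many more), and no timings are claimed here: results depend on the compiler and the machine.

```go
package main

import (
	"fmt"
	"testing"
)

// RuleVariable is a stand-in enum with a handful of illustrative values.
type RuleVariable int

const (
	Unknown RuleVariable = iota
	RequestURI
	RequestMethod
	Args
	ArgsNames
	RequestCookies
	RequestCookiesNames
	numVariables // sentinel, not a real variable
)

// canBeSelectedExhaustive lists every variable explicitly.
func canBeSelectedExhaustive(v RuleVariable) bool {
	switch v {
	case Unknown:
		return false
	case RequestURI:
		return false
	case RequestMethod:
		return false
	case Args:
		return true
	case ArgsNames:
		return true
	case RequestCookies:
		return true
	case RequestCookiesNames:
		return true
	default:
		return false
	}
}

// canBeSelectedCompact lists only the true cases and defaults to false.
func canBeSelectedCompact(v RuleVariable) bool {
	switch v {
	case Args, ArgsNames, RequestCookies, RequestCookiesNames:
		return true
	default:
		return false
	}
}

// sink keeps the compiler from discarding the calls under test.
var sink bool

// bench runs fn over all variables in a loop using the testing package.
func bench(fn func(RuleVariable) bool) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = fn(RuleVariable(i % int(numVariables)))
		}
	})
}

func main() {
	fmt.Println("exhaustive:", bench(canBeSelectedExhaustive))
	fmt.Println("compact:   ", bench(canBeSelectedCompact))
}
```

Both functions return the same answer for every value, so the choice between them only affects generated code size and, possibly, speed.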

@fzipi
Member

fzipi commented Sep 8, 2024

I'm a bit worried about how the complexity of variables is growing. Maybe not for this PR, but we need to improve generation of code, even for this "selectable" feature

Definitely not for this PR. We should create an issue to refactor generation then.

@fzipi changed the title from "Allow selectors on *_NAMES collections" to "feat: allow selectors on *_NAMES collections" Sep 8, 2024
@jptosso
Member

jptosso commented Sep 18, 2024

LGTM in general, but I believe this lacks negative tests and it's decreasing the overall project coverage


 func (c *NamedCollectionNames) FindRegex(key *regexp.Regexp) []types.MatchData {
-    panic("selection operator not supported")
+    var res []types.MatchData
Member

Is there any chance data is empty? If so, I would handle the empty case before this allocation.

Contributor Author

Yes, there are probably situations where data can be empty (I haven't tested, but I'd expect a collection like XML to have empty data on a non-XML request).

AFAIK, declaring a slice like this does not perform any actual allocation (other than the header of the slice, which will be all zero), and the actual allocation will be performed the first time we append to it, but I can add a check on data if you are worried about it.
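
A small sketch to back this up. `collect` is a hypothetical helper, not code from the PR; it only shows that `var res []T` yields a nil slice with no backing array until the first `append`.

```go
package main

import "fmt"

// collect returns the keys accepted by match. The result slice is declared
// with `var`, so it stays nil (no backing array) until the first append.
func collect(keys []string, match func(string) bool) []string {
	var res []string
	for _, k := range keys {
		if match(k) {
			res = append(res, k)
		}
	}
	return res
}

func main() {
	all := func(string) bool { return true }

	none := collect(nil, all)
	fmt.Println(none == nil, len(none), cap(none)) // true 0 0

	some := collect([]string{"a", "b"}, all)
	fmt.Println(some == nil, len(some)) // false 2
}
```

An early return on empty input would therefore mostly save the loop setup, not an allocation.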

Contributor Author

Also, from what I see, this check is not performed in the existing code (here for example)

Comment thread internal/collections/named.go Outdated
Comment thread internal/collections/named.go Outdated
for k, data := range c.collection.Map.data {
    if key.MatchString(k) {
        for _, d := range data {
            res = append(res, &corazarules.MatchData{
Member

I wonder if MatchData is mutable. If it is not, we probably want to reuse the pointer?

Comment thread internal/variables/generator/variablesmap.go.tmpl Outdated
Comment thread internal/collections/named.go Outdated
 } else {
-    panic("attempted to use regex with non-selectable collection: " + rv.Variable.Name())
+    // This should probably never happen, selectability is checked at parsing time
+    tx.debugLogger.Error().Str("collection", rv.Variable.Name()).Msg("attempted to use regex with non-selectable collection")
Member

We discussed some time ago that a panic is OK here, as this is a low-level issue and Coraza should not keep running in that case

Contributor Author

I'm not sure I agree with this: Coraza is designed as a library, and from my point of view, this means that explicit panics must be avoided at all costs (with very few exceptions, if you can call panic, you can return an error). Doing nothing is almost always better than bringing down a production website.

If a function call can lead to a panic, it should be made very clear to the caller (either with an explicit function name (Must....) or, at the very least, with some documentation): I don't mind wrapping every call to coraza with a recover, but I need to be aware it's required.

For this specific case, it can only (AFAIK) be triggered by a configuration error, so this means it should be detected when parsing the configuration (and is now thanks to this PR), so the panic has become redundant.
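
To illustrate the convention I mean, here is a sketch with hypothetical names (`Rule`, `ParseRule`, `MustParseRule`, `safely`); none of these are Coraza APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// Rule is a hypothetical parsed rule.
type Rule struct{ Expr string }

// ParseRule reports invalid input through its error return.
func ParseRule(expr string) (*Rule, error) {
	if expr == "" {
		return nil, errors.New("empty rule")
	}
	return &Rule{Expr: expr}, nil
}

// MustParseRule panics on invalid input. The Must prefix is what tells
// the caller that a panic is possible.
func MustParseRule(expr string) *Rule {
	r, err := ParseRule(expr)
	if err != nil {
		panic(err)
	}
	return r
}

// safely runs fn and converts a panic into an error. An embedder can only
// know this wrapper is needed if the library documents that it may panic.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	_, err := ParseRule("")
	fmt.Println("error return:", err)

	err = safely(func() { MustParseRule("") })
	fmt.Println("wrapped panic:", err)
}
```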

@jcchavezs merged commit 1faa41d into corazawaf:main on May 30, 2025
70 of 72 checks passed
@jcchavezs
Member

Thanks a lot @blotus !
