Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve AR parser #177

Open
wants to merge 10 commits into
base: develop
Choose a base branch
from

Conversation

futatuki
Copy link

@futatuki futatuki commented Sep 13, 2024

openarc/openarc-ar.c (ares_parse):

  • strictly check the use of "none" (no-result) instead of one or more resinfo
  • record really "result comment" to result_comment field
  • overwrite the former result instead of dedup after writing to new result
  • do not record the results of unknown method (Fix issue OpenARC is generating invalid ARC headers #113)
  • even after the number of recorded result has reached to the limit, continue to parse for syntax checking (and update the former result if newer value is found for seen methods).

Others:

  • Remove dedup() in openarc/opearc-ar.c
  • Fix main() function for ARTEST enabled in openarc-ar.c for testing above
  • In mlfi_eom in openarc.c, add foolproof check for ignoring unknown results

* openarc/openarc-ar.c (main):
  explicitly cast pointer to signed char into pointer to u_char.
* openarc/openarc.c (ares_parse):
  Check prevstate if "none" is found on input token in state 3.
… field

"Result comment" is a comment just after "result" of the "methodspec".
No other comments that can be allowed in AR syntax should be ignored.

* openarc/openarc-ar.c (ares_parse):
   Check prevstate before storing result comment. If it stored once
   update prevstate so as not to store again.
…r adding

When the parser find a new result and its "method" is not "dkim",
overwriting the old result of same "method" if exists, instead of
check the duplicate of the latest result just finished recording
and move it. This may allow to parse longer header a bit.

* openarc/openarc-ar.c
  (ares_dedup): Removed
  (ares_method_seen): New.
  (ares_parse): As the description above.
futatuki added a commit to futatuki/OpenARC that referenced this pull request Sep 16, 2024
…parse

Improve AR parser

openarc/openarc-ar.c (ares_parse):
* strictly check the use of "none" (no-result) instead of one or more
  resinfo
* record really "result comment" to result_comment field
* overwrite the former result instead of dedup after writing to new
  result
* do not record the results of unknown method (Fix issue trusteddomainproject#113)
* even after the number of recorded result has reached to the limit,
  continue to parse for syntax checking (and update the former result
  if newer value is found for seen methods).

Others:
* Remove dedup() in openarc/opearc-ar.c
* Fix main() function for ARTEST enabled in openarc-ar.c for testing above
* In mlfi_eom in openarc.c, add foolproof check for ignoring unknown results

trusteddomainproject#177
@futatuki
Copy link
Author

To correct commit message, I'll rebase and force push it.

According to RFC 8601 Section 2.7.6[1], unknown authentication
methods in AR header should be ignored or deleted. With this
commit, ares_parse() does not store the result of authres
which method is 'unknown'.

[1] https://www.rfc-editor.org/rfc/rfc8601.html#section-2.7.6

* openarc/openarc-ar.c (ares_parse): As described above.
the number of auth results.

* openarc/openarc-ar.c (ares_parse):
   Do not return and continue parsing even if there is no space to
   record more results, for syntax checking, or overwrite the result
   of some auth method already seen.
Since we discard resinfo of which method is not known in AR header
parsing now, it should never happen. However we check it as a foolproof.

* openarc/openarc.c (mlfi_eom):
   Ignore unknown method in the result of parsing AR header, and
   log it.
@futatuki
Copy link
Author

To correct commit message, I'll rebase and force push it.

done.

futatuki added a commit to futatuki/OpenARC that referenced this pull request Sep 17, 2024
…parse

Improve AR parser

openarc/openarc-ar.c (ares_parse):
* strictly check the use of "none" (no-result) instead of one or more
  resinfo
* record really "result comment" to result_comment field
* overwrite the former result instead of dedup after writing to new
  result
* do not record the results of unknown method (Fix issue trusteddomainproject#113)
* even after the number of recorded result has reached to the limit,
  continue to parse for syntax checking (and update the former result
  if newer value is found for seen methods).

Others:
* Remove dedup() in openarc/opearc-ar.c
* Fix main() function for ARTEST enabled in openarc-ar.c for testing above
* In mlfi_eom in openarc.c, add foolproof check for ignoring unknown results

trusteddomainproject#177
@flowerysong
Copy link
Contributor

@futatuki This is next on my list to review, but it might take a couple of days. It looks reasonable on its face and compiles, but I want to write some tests specifically for this functionality and see if they turn up any edge cases.

@flowerysong
Copy link
Contributor

Okay, I've had a chance to look at this and while it's an improvement I was still unhappy overall with the results I was getting from my first couple of high-level tests.

flowerysong/OpenARC@da1fa93 is my initial work on top of your changes. This cleans up the parsing code by using names for the state, building the current resinfo in a temporary struct, and deduplicating method results across all parsed headers from the ADMD instead of just within a single header. This does move the limit of 16 stored results from per-header to overall, but I think that's reasonable.

The potentially controversial changes to things that you also touched:

Comments: Instead of only storing one comment and requiring that it immediately follow the methodspec, treat comments as properties so that we can have multiple per method and they're correctly located in relation to properties (e.g. iprev=fail (IP not in A record) iprev.policy=192.0.2.1 (PTR mail.protection.example.com);). Comments located before the methodspec or in the middle of parsing units (like smtp.mailfrom(authenticated) = [email protected]) will still be discarded, and I'm considering adding a milter config to discard all comments.

Method deduplication: instead of taking the last result for a given method and discarding previous ones, take the first one and then refuse to replace it. The choice of which one to respect is pretty arbitrary, and doing it this way means we don't have different rules for duplicates within a header (last one wins) and duplicates across multiple headers (first one wins.)

I still have more testing to do, but I wanted to check in and see if you have any feedback about what I've done so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants