While many regex dialects and implementations use similar symbols, they don't necessarily ascribe the same semantics to them. For example, `\d`, `\w`, `\s` and their negations (`\D`, `\W`, `\S`) may be ASCII-only, partially Unicode-aware, or fully Unicode-aware depending on the engine. Full Unicode matching would be a lot more expensive than ASCII-only matching, possibly unnecessarily.
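As a small illustration of how much the shorthand classes can differ, here is a sketch using Python's standard `re` module, where the same pattern is Unicode-aware by default for `str` patterns and ASCII-only under `re.ASCII`. The helper name `matches` and the sample characters are my own choices for the example, not something taken from the rule set.

```python
import re


def matches(pattern, text):
    """Return (unicode_mode_hit, ascii_mode_hit) for a full match of `text`.

    Hypothetical helper for illustration only.
    """
    unicode_hit = re.fullmatch(pattern, text) is not None
    ascii_hit = re.fullmatch(pattern, text, flags=re.ASCII) is not None
    return unicode_hit, ascii_hit


# ASCII digit: both modes agree.
print(matches(r"\d", "7"))        # (True, True)
# ARABIC-INDIC DIGIT SEVEN (U+0667): a digit only in Unicode mode.
print(matches(r"\d", "\u0667"))   # (True, False)
# LATIN SMALL LETTER E WITH ACUTE: a word character only in Unicode mode.
print(matches(r"\w", "\u00e9"))   # (True, False)
# NO-BREAK SPACE (U+00A0): whitespace only in Unicode mode.
print(matches(r"\s", "\u00a0"))   # (True, False)
```

Other engines make different choices for the same symbols, which is why a rule that relies on `\d` or `\w` may need to state which behaviour it expects.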
Furthermore, from a performance and memory standpoint, 6e65445 modified the regexes to limit ReDoS risk. However, it did so inconsistently, so it is not clear whether, and for which rules, non-backtracking engines that are not sensitive to catastrophic backtracking (e.g. re2, regex, regexp, ...) may convert the regexes back to unbounded repetition, since bounded repetitions are also used in semantically relevant contexts. Having a well-defined and consistent substitute for `*` and `+` (and maybe some rules ensuring new ones don't get added improperly) would allow engines to detect and substitute them on the fly, which can reduce their memory use and runtime as they no longer need to track the number of iterations.
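To make the proposal concrete, here is a minimal sketch of what such an on-the-fly substitution could look like if a convention existed. Everything specific in it is an assumption: the sentinel bound `100`, the forms `{0,100}` and `{1,100}`, and the function name `unbound` are hypothetical and do not reflect what the rule set currently uses.

```python
import re

# Hypothetical convention: {0,SENTINEL} stands in for "*" and
# {1,SENTINEL} stands in for "+". The value 100 is an assumption.
SENTINEL = 100

_TOKEN = re.compile(
    r"(\\.)"                     # an escaped character, kept verbatim
    r"|(\[(?:\\.|[^\]\\])*\])"   # a simple character class, kept verbatim
    r"|\{([01])," + str(SENTINEL) + r"\}",  # the sentinel repetition
    re.DOTALL,
)


def unbound(pattern):
    """Rewrite sentinel bounded repetitions back to * and +.

    Any other bounded repetition (e.g. {4} or {2,5}) is treated as
    semantically relevant and left untouched. Character-class handling
    is approximate (no POSIX classes, no leading "]").
    """
    def replace(match):
        if match.group(3) is None:
            return match.group(0)  # escape or character class: unchanged
        return "*" if match.group(3) == "0" else "+"

    return _TOKEN.sub(replace, pattern)


print(unbound(r"[a-z]{1,100}@[a-z]{0,100}\.com"))  # [a-z]+@[a-z]*\.com
print(unbound(r"\d{4}-\d{2}"))                     # \d{4}-\d{2}
```

The rewritten pattern matches longer runs than the bounded original, so this is only appropriate when the bound was a ReDoS mitigation rather than part of the rule's meaning, which is exactly the distinction a consistent convention would make visible.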