-
Notifications
You must be signed in to change notification settings - Fork 720
Customizable parser configuration #1881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## dev #1881 +/- ##
==========================================
+ Coverage 83.47% 83.50% +0.02%
==========================================
Files 298 300 +2
Lines 53940 54187 +247
Branches 11979 11784 -195
==========================================
+ Hits 45026 45247 +221
- Misses 7736 8111 +375
+ Partials 1178 829 -349
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@ Dimi1010 overall, it's a good solution that could work! Here are a few comments:
|
That is the default behaviour already. static inline ParserConfiguration& getDefault()
{
static ParserConfiguration defaultConfig = makeDefaultConfiguration();
return defaultConfig;
}
I don't agree with that. IMO, both should be available to the users. In the general use case, the user will configure / modify the default parser and use the parameter-less overload as is (
I suppose that might be possible but the default configuration mappings are created in
Can be done, but it wasn't a priority for the initial draft.
The way I imagine the system working is thus:
Example:
If wildcards are used in the input pair the exact match will be the same as the partial match of the non-wildcard port.
Generally symmetrical bindings are handled on the mapping level, when the rule is being registered. In @seladb Hope that answers the questions! |
Yes, I was referring to the configuration loaded by the user, not only the default one
Will the user configuration be on top of the default one or replace it? For example: HTTP default are port 80 and 8080. The user might want to replace that with port 1234, or add an additional port 1234. How can we support both?
I'd create a simpler interface for the users that can define a single port and have the port mapper create 2 rules behind the scenes |
The idea is that the user can either modify the default configuration, or make his own configuration instance. In the case of modifying the default, the user's changes essentially change the global default. If they want to have a separate instance, the can either make one from scratch, or make a copy of the default and store it at their convenience. A manually constructed config can even be assigned to the default via Ideally the user would be able to inject a configuration to the parsing logic of
Yes, it is mainly called internally. There was an idea to eventually allow Packets to be parsed with different configurations, specified at the packet parse time. So there is a potential use case for the This will be an additive change, so in the base use case, the packet fetches the default parser config static. I haven't profiled the solution yet, but the most likely performance factor would be the
For port mapping, each configuration instance is essentially a global port table. The user can add entries, remove entries or completely replace the entire table. For the example, if the user wants port 1234 to be HTTP, the user adds a rule (1234, *) -> HTTP. If the user wants to have 80 not be HTTP, he removes the rule from the map that is used for parsing. I am unsure by what you mean by keeping the default ports for SSL/TLS. Each port rule is independent of each other.
Wdym? Like a convenience method that takes I think it is better to only allow explicit port pairs for clarity when users are defining their own rules. void PortMapper::addPortMapping(PortPair port, ProtocolType protocol, bool symmetrical)
{
if (port == PortPair())
{
throw std::invalid_argument("PortPair cannot be empty (both src and dst ports are 0)");
}
auto insertResult = m_PortToProtocolMap.insert({ port, protocol });
insertResult.first->second = protocol; // Update the protocol if it already exists
if (!insertResult.second)
{
PCPP_LOG_WARN("Port " << port << " is already mapped to protocol " << insertResult.first->second
<< ", updating to " << protocol);
}
if (symmetrical && port.portSrc != port.portDst)
{
// Add the symmetrical mapping
PortPair symmetricalPort = { port.portDst, port.portSrc };
addPortMapping(symmetricalPort, protocol, false);
}
} |
@seladb Another case I found. FTP has both FTP default port (20) and FTP Data (port 21), but both map to ProtocolType::FTP. |
Use `getProtocolMappingsMatrixForPortPair` for partial matches.
@Dimi1010 I think I remember why I didn't implement this until now: I wasn't sure what the performance implications might be. Packet parsing is the fast path of PcapPlusPlus, and changing it from hard-coded values to using a map might have performance implications, especially in high throughput network traffic (for example: when using DPDK, or even when using libpcap with high traffic). Is it possible to measure it somehow? 🤔 |
It should be able to be measured. A large Pcap file that is loaded and then timed as it is fed into Packet parsing should suffice since we only need the comparative benchmarks for the parsing module. |
@seladb Check #1887 for a benchmark that evaluates only the pure parsing performance. And the current PR head does drop performance somewhat, although that was always gonna be the case. 🤔 Might test it with 3rd party map implementations, to see if the performance can get better.
|
Hmm it's a drop of ~4-6%. We can probably run it on different platforms to see if we get similar results (especially on Linux, which many of the high throughput applications run on). I guess such a drop is not ideal. If there was a way to "configure" what to use it'd be nice, but I'm not sure there's a clean way to do that 🤔 |
It probably can be done, with some compile time magic, where the |
I'm not sure if we want to get into it, it might make things even more complex than they are now... |
This PR adds a new class
ParserConfiguration
with the intent to enable user customization of parsing logic.The initial experimental implementation includes
PortMapper
that can be used to define custom port rule mappings.The default parser configuration
A default configuration singleton is provided by the runtime via
ParserConfiguration::getDefault()
containing all default parsing logic. This configuration is to be used in all instances when no custom configuration is provided.Modifying the default configuration
The reference returned by
ParserConfiguration::getDefault()
allows modification of the underlying configuration. This can be used to modify the default parsing behaviour or even override it completely by assigning a custom configuration.A modified default configuration can be reset to its initial state by calling
ParserConfiguration::resetDefault
.NB: The configuration object is currently not thread safe for modification. Do not modify the config if an active parsing is being performed.
Parser extension
Layer::parseNextLayer(ParserConfiguration const& config)
: The new overload is to be considered the main signature of the parsing system. The parsed configuration is used to determine parser behaviour.Layer::parseNextLayer()
to forward to the new overload: The parameter-less version ofparseNextLayer
has been converted to a non-virtual function that calls the new overload with the default parsing configuration.Limitations
Packet
class only utilizes the default parser config during parsing.TcpLayer
actually uses thePortMapper
, and only for mapping toHttpRequestLayer
andHttpResponseLayer
.