Skip to content

output-json/output-json-flow: Geoip enrichment [v2]#14567

Open
jayhardee9 wants to merge 2 commits intoOISF:mainfrom
jayhardee9:geoip-enrichment-v2
Open

output-json/output-json-flow: Geoip enrichment [v2]#14567
jayhardee9 wants to merge 2 commits intoOISF:mainfrom
jayhardee9:geoip-enrichment-v2

Conversation

@jayhardee9
Copy link

Contribution style:

Our Contribution agreements:

Changes (if applicable):

Link to ticket: https://redmine.openinfosecfoundation.org/issues/6999
Previous PR: #14558

Describe changes:

  • Fixes lint issues in previous PR
  • adds optional geoip enrichment into EVE log by setting the geoip-enrichment option under eve-log configuration in suricata.yaml
  • enrichment is available under the json-output and json-output-flow modules.

Eve output example with enrichment

{
    "timestamp": "2021-05-27T03:37:44.575843+0700",
    "flow_id": 2130735805455113,
    "pcap_cnt": 15756,
    "event_type": "fileinfo",
    "geoip_src": {
        "ip": "192.236.155.230",
        "geo": {
            "continent_code": "NA",
            "country_iso_code": "US",
            "city_name": "Seattle",
            "country_name": "United States",
            "continent_name": "North America",
            "timezone": "America/Los_Angeles",
            "location": {
                "lat": 47.4902,
                "lon": -122.3004
            }
        }
    },
    "geoip_dst": {},
    "src_ip": "192.236.155.230",
    "src_port": 80,
    "dest_ip": "10.5.26.4",
    "dest_port": 56042,
    "proto": "TCP",
    "pkt_src": "wire/pcap",
    "http": {
        "hostname": "192.236.155.230",
        "url": "/images/redbutton.png",
        "http_user_agent": "WinHTTP loader/1.0",
        "http_content_type": "Content-type: application/octet-stream",
        "http_method": "GET",
        "protocol": "HTTP/1.1",
        "status": 200,
        "length": 105556
    },
    "app_proto": "http",
    "fileinfo": {
        "filename": "/images/redbutton.png",
        "gaps": false,
        "state": "TRUNCATED",
        "stored": false,
        "size": 102400,
        "tx_id": 0
    }
}

SV_BRANCH=OISF/suricata-verify#2838

jayhardee9 and others added 2 commits December 31, 2025 11:11
Add GeoIP enrichment to EVE JSON output.

Ticket: 6999

Co-Authored-By: Fandi Gunawan <10239907+fandigunawan@users.noreply.github.com>
Add GeoIP enrichment to flow EVE logs.
@github-actions
Copy link

github-actions bot commented Jan 5, 2026

NOTE: This PR may contain new authors.

@codecov
Copy link

codecov bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.13%. Comparing base (f1b9669) to head (79f0ade).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14567      +/-   ##
==========================================
+ Coverage   82.12%   82.13%   +0.01%     
==========================================
  Files        1014     1014              
  Lines      262461   262445      -16     
==========================================
+ Hits       215551   215568      +17     
+ Misses      46910    46877      -33     
Flag Coverage Δ
fuzzcorpus 60.36% <100.00%> (+1.01%) ⬆️
livemode 18.74% <100.00%> (-0.01%) ⬇️
pcap 44.60% <100.00%> (-0.01%) ⬇️
suricata-verify 65.03% <100.00%> (-0.01%) ⬇️
unittests 59.24% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@jufajardini jufajardini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this work up!

This is a bit nitty, could you add an output sample that's a good example of what we actually get with your patch? (I say this because your changes should fix the empty geoip_dst object that we see here, for instance).

I think Jason's comment on the format isn't fully addressed, yet: #10703 (comment)

Comment on lines +20 to +28
},
"country_iso_code": {
"type": "string"
},
"city_name": {
"type": "string"
},
"country_name": {
"type": "string"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, the schema-check script will complain about this, due to non-alphabetical order (as the CI shows).

To check these locally, run: ./scripts/schema-sort.py --check ./etc/schema.json

Comment on lines +96 to +102
#ifdef HAVE_GEOIP
if (cfg != NULL && cfg->geoip_enabled) {
SCGeoIPGet(jb, srcip, "geoip_src");
SCGeoIPGet(jb, dstip, "geoip_dst");
}
#endif /* HAVE_GEOIP */

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we shouldn't change what we have in the flow header. Either we add this info further down in the header, or add this to EveAddFlow...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, so place those under the flow key instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking so... Let's see if we get more feedback. Any thoughts, @jasonish ?

@jayhardee9
Copy link
Author

Ok, after giving it some thought, I'm proposing that the schema isn't fixed at all, the reason being that the mmdb format is technically schema-less. Different mmdb products/custom ones will likely have different schemata.

I propose configuring the enrichment like so (also renamed geoip -> mmdb, except for the pre-existing geoip-database key):

geoip-database: /usr/share/GeoIP/GeoLite2-Country.mmdb
outputs:
  - eve-log:
      mmdb-enrichment:
        enabled: yes
        fields:
          country_iso_code: ["country", "iso_code"]
          country_name: ["country", "names", "en"]

So under fields, we'd have path mappings between the enrichment output key and the mmdb "path". This has the added benefit of flattening the enrichment data (a good thing in my book), since the output keys aren't nested. Then the output would look something like:

{
  "timestamp": "2025-01-15T10:30:00.000000+0000",
  "src_ip": "8.8.8.8",
  "dest_ip": "192.168.1.1",
  "flow": {
    "mmdb_src": {
      "country_iso_code": "US",
      "country_name": "United States"
    },
    ...
  }
}

Now, if I understand the geoip alert filter correctly, it only expects the mmdb to have country.iso_code -- perhaps we could default fields to:

fields:
  country_iso_code: ['country', 'iso_code']

If the end-user specifies fields, then the default gets completely replaced (avoiding any sort of confusing config merge semantics here).

I'm also of the opinion that a nice-to-have would be logging warnings on startup if the field mappings don't match up with the mmdb schema -- it's an uncommon format, and a little help goes a long way (especially for those of us shipping custom mmdbs).


A little bit of a pivot from the original ticket, but this approach makes more sense (to me at least).

@jasonish
Copy link
Member

Ok, after giving it some thought, I'm proposing that the schema isn't fixed at all, the reason being that the mmdb format is technically schema-less. Different mmdb products/custom ones will likely have different schemata.

I propose configuring the enrichment like so (also renamed geoip -> mmdb, except for the pre-existing geoip-database key):

geoip-database: /usr/share/GeoIP/GeoLite2-Country.mmdb
outputs:
  - eve-log:
      mmdb-enrichment:
        enabled: yes
        fields:
          country_iso_code: ["country", "iso_code"]
          country_name: ["country", "names", "en"]

So under fields, we'd have path mappings between the enrichment output key and the mmdb "path". This has the added benefit of flattening the enrichment data (a good thing in my book), since the output keys aren't nested. Then the output would look something like:

{
  "timestamp": "2025-01-15T10:30:00.000000+0000",
  "src_ip": "8.8.8.8",
  "dest_ip": "192.168.1.1",
  "flow": {
    "mmdb_src": {
      "country_iso_code": "US",
      "country_name": "United States"
    },
    ...
  }
}

Now, if I understand the geoip alert filter correctly, it only expects the mmdb to have country.iso_code -- perhaps we could default fields to:

fields:
  country_iso_code: ['country', 'iso_code']

If the end-user specifies fields, then the default gets completely replaced (avoiding any sort of confusing config merge semantics here).

I'm also of the opinion that a nice-to-have would be logging warnings on startup if the field mappings don't match up with the mmdb schema -- it's an uncommon format, and a little help goes a long way (especially for those of us shipping custom mmdbs).

A little bit of a pivot from the original ticket, but this approach makes more sense (to me at least).

I'm not so sure about naming it mmdb. We should try to be somewhat generic, for the case where a future geoip implementation, which might not be MaxMind can use the same schema.

I've started exploring what ECS (Elastic) and OCSF are doing for this. I have some of my own post-processing tooling that does this as well, which I based on ECS. But not sure thats the best 5 years later.

But the schema is an easy thing to change as well. So wouldn't let that block fixing other issues, if any.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants