Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initial support to validate the mappings in system tests #2214

Open
wants to merge 59 commits into
base: main
Choose a base branch
from

Conversation

mrodm
Copy link
Contributor

@mrodm mrodm commented Nov 8, 2024

Relates #2206

Add validation of the mappings when running system tests.

This process compares the mappings installed by Fleet (preview mappings) with the ones obtained after ingesting new documents into Elasticsearch.

This validation for the time being is behind an environment variable as a technical preview:

# Default value and same behavior as now (validation of fields)
ELASTIC_PACKAGE_FIELD_VALIDATION_TEST_METHOD=fields elastic-package test system -v

# Test just validation of mappings 
ELASTIC_PACKAGE_FIELD_VALIDATION_TEST_METHOD=mappings elastic-package test system -v

# Test both validation of mappings and fields
ELASTIC_PACKAGE_FIELD_VALIDATION_TEST_METHOD=all elastic-package test system -v

Assumptions considered while working on this PR: #2206 (comment)

As part of this PR:

  • CI pipeline has been updated to remove steps related to running system tests with Elastic Agent.
    • It will be kept one step to ensure that the feature is working.
  • CI pipeline has been updated to add steps to run mapping validation in system tests.

Author's checklist

  • Update CI steps to remove old stack agents
  • Update CI to add new steps for validating fields using mappings
  • Add environment variable to validate that fields are documented with different methods
  • Add new environment variables to docs.
  • Remove debug/testing log statements
  • Test with Elastic stack 8.x
    • Some packages are failing like prometheus failing since some fields depend on dynamic templates (not implemented yet)
  • Test with Elastic stack 7.x (f5 package in integrations repository)
  • Test with integration and input packages (sql_input test package in elastic-package repo).
  • Test with Elastic stacks with LogsDB enabled (synthetics).
    • Tested in CI with ti_anomali_logsdb and auth0_logsdb

@mrodm mrodm self-assigned this Nov 8, 2024
@mrodm
Copy link
Contributor Author

mrodm commented Nov 12, 2024

/test


dataStreamName string

LocalSchema []FieldDefinition
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New variable to hold just the Field Definitions from the local directory (package).

It is needed to check if a given mapping is related to a field definition with type array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using Schema?

@@ -295,6 +328,8 @@ func createValidatorForDirectoryAndPackageRoot(fieldsParentDir string, finder pa
}

v.Schema = append(fields, v.Schema...)

v.LocalSchema = fields
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added here for completeness, but it is not used in this validator.

Should we remove it from here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not used I would say to remove it.

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#11828

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

box_events, claroty_ctd, mimecast, ti_anomali:

  • failures like for threat.indicator.modified_at different types in ECS and in Elasticsearch
  • failures depend on the stack version

There is a difference in the ecs@mappings component template between 8.13.0 and 8.15.3 related to the dynamic template to detect dates causing these errors:

  • 8.13.0 (it is missing *.indicator.last_seen, *.indicate.first_seen and *.indicator.modified_at)
        {
          "ecs_date": {
            "path_match": [
              "*.timestamp",
              "*_timestamp",
              "*.not_after",
              "*.not_before",
              "*.accessed",
              "created",
              "*.created",
              "*.installed",
              "*.creation_date",
              "*.ctime",
              "*.mtime",
              "ingested",
              "*.ingested",
              "*.start",
              "*.end"
            ],
            "unmatch_mapping_type": "object",
            "mapping": {
              "type": "date"
            }
          }
        },
  • 8.15.3
        {
          "ecs_date": {
            "path_match": [
              "*.timestamp",
              "*_timestamp",
              "*.not_after",
              "*.not_before",
              "*.accessed",
              "created",
              "*.created",
              "*.installed",
              "*.creation_date",
              "*.ctime",
              "*.mtime",
              "ingested",
              "*.ingested",
              "*.start",
              "*.end",
              "*.indicator.first_seen",
              "*.indicator.last_seen",
              "*.indicator.modified_at",
              "*threat.enrichments.matched.occurred"
            ],
            "unmatch_mapping_type": "object",
            "mapping": {
              "type": "date"
            }
          }
        }

Not sure if it could be applied some kind of exception @jsoriano

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

For the cef package there is a similar error as in box_events and others:

  • 8.6.1 dynamic templates (this mapping does not match url.original):
    • These dynamic templates are added as part of the import_mappings setting.
        {
          "_embedded_ecs-url_original_to_multifield": {
            "path_match": "*.url.original",
            "mapping": {
              "fields": {
                "text": {
                  "type": "match_only_text"
                }
              },
              "type": "wildcard"
            }
          }
        },
  • 8.15.3, these dynamic templates belong to the ecs@mappings component template:
        {
          "ecs_path_match_wildcard_and_match_only_text": {
            "path_match": [
              "*.body.content",
              "*url.full",
              "*url.original"
            ],
            "unmatch_mapping_type": "object",
            "mapping": {
              "fields": {
                "text": {
                  "type": "match_only_text"
                }
              },
              "type": "wildcard"
            }
          }
        },

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

In the second build from integrations, teleport started to fail but I could not reproduce it yet locally:
https://buildkite.com/elastic/integrations/builds/18608#_
but in the third attempt it run successfully. I don't know why there could be differences in the mappings.

In the first build, this package was not failing:
https://buildkite.com/elastic/integrations/builds/18601#_

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

The error related to ibmmq it's about process.pid.
In this case, this field comes from the Ingest pipeline where it is set the type float (link).
This field is not defined in the package, and when checking with ECS it fails because ECS sets for this field long type (link).

This package fails with both 8.14.0 and 8.15.3 stack versions.

Mapping created for this field in this package during tests:

        "process": {
          "properties": {
            "pid": {
              "type": "float"
            },
            "title": {
              "type": "keyword",
              "fields": {
                "text": {
                  "type": "match_only_text"
                }
              }
            }
          }
        }

EDIT: It looks like that currently (with fields validation), this is accepted if the definition is long (ECS) and the value has type float:

case "float", "long", "double":
switch val := val.(type) {
case float64:
case json.Number:
case string:
if !slices.Contains(v.stringNumberFields, key) {
return invalidTypeError()
}
if _, err := strconv.ParseFloat(val, 64); err != nil {
return invalidTypeError()
}
default:
return invalidTypeError()
}

@mrodm
Copy link
Contributor Author

mrodm commented Nov 22, 2024

Triggered a new execution in integrations with the rest of packages:
https://buildkite.com/elastic/integrations/builds/18617

@mrodm
Copy link
Contributor Author

mrodm commented Nov 25, 2024

In the latest integrations build, there are some failures in network_traffic package. These look related to the same issue #2214 (comment)

There are fields undefined under a group field, that I think thy are not taken into account since until spec 3.0.1 they are not checked.

EDIT: dns.answers in this case it's an array of objects: https://github.com/elastic/integrations/blob/ef28635e4ed1fd875b50861e7bd783af7ef68968/packages/network_traffic/data_stream/dns/sample_event.json#L27

Not sure how to detect this via the actual or preview mappings

@mrodm
Copy link
Contributor Author

mrodm commented Dec 2, 2024

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#11828

@mrodm
Copy link
Contributor Author

mrodm commented Dec 3, 2024

Added an exception to validate mappings that define number type (long, float, double) in 5097aee

This change is related to the issue found in the comment: #2214 (comment)

@mrodm
Copy link
Contributor Author

mrodm commented Dec 3, 2024

Just updated the logic to show warnings when a field is defined as nested, and it could have just some fields under it with no definition.

Tested with gcp package:

Examples:

  • Field type nested (parent), but just one field under that field is not defined:
    • there are actual mappings under this field
    • just some fields are defined in the package
2024/12/03 19:23:34  WARN Validate mappings found (technical preview)
2024/12/03 19:23:34  WARN Skip validation of nested object (spec version gcp.firewall.rule_details.ip_port_info): 3.0.0
2024/12/03 19:23:34  WARN undefined path "gcp.firewall.rule_details.ip_port_info.port_range" (pending to check dynamic templates)
2024/12/03 19:23:34  WARN skip validation due to parent type nested:
[0] field "gcp.firewall.rule_details.ip_port_info.port_range" is undefined: missing definition for path
  • Field type nested (parent):
    • there are actual mappings under this field
    • no field is defined in the package
2024/12/03 19:27:42  WARN Validate mappings found (technical preview)
2024/12/03 19:27:42  WARN Skip validation of nested object (spec version gcp.firewall.rule_details.ip_port_info): 3.0.0
2024/12/03 19:27:42  WARN skipped due to field of type "nested": not found properties in preview mappings for path "gcp.firewall.rule_details.ip_port_info"

Tested updating gcp package to spec 3.0.1, this test fails:

FAILURE DETAILS:
gcp/firewall pubsub:
[0] not found properties in preview mappings for path: gcp.firewall.rule_details.ip_port_info


╭─────────┬─────────────┬───────────┬───────────┬──────────────────────────────────────────────────────────────────────────────────────────────────┬───────────────╮
│ PACKAGE │ DATA STREAM │ TEST TYPE │ TEST NAME │ RESULT                                                                                           │  TIME ELAPSED │
├─────────┼─────────────┼───────────┼───────────┼──────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────┤
│ gcp     │ firewall    │ system    │ pubsub    │ FAIL: test case failed: one or more errors found in mappings in logs-gcp.firewall index template │ 32.218456931s │
╰─────────┴─────────────┴───────────┴───────────┴──────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────╯
--- Test results for package: gcp - END   ---
Done
Error: failed to process results: one or more test cases failed

@mrodm
Copy link
Contributor Author

mrodm commented Dec 3, 2024

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#11828

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably go on with these changes and continue polishing in other PRs.

There seems to be good test coverage, I would only try to make them more real, specially on what we can expect on the actual mappings for a given preview.

internal/fields/mappings.go Show resolved Hide resolved
internal/fields/mappings.go Outdated Show resolved Hide resolved
internal/fields/mappings.go Outdated Show resolved Hide resolved
Comment on lines +498 to +503
if v.specVersion.LessThan(semver3_0_1) {
if mappingParameter("type", actual) == "nested" {
logger.Warnf("Skip validation of nested object (spec version %s): %s", path, v.specVersion)
isNestedParent = true
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation before 3.0.1 ignored everything under nested objects, right? Couldn't we return nil directly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not needed, but I didn't return here with nil to allow us to show warnings with all the missing definitions.

internal/fields/mappings.go Outdated Show resolved Hide resolved
"type": "keyword",
},
},
actual: map[string]any{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this case happen? Than an actual mapping doesn't have a field that is in the preview?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that maybe this could happen since in system testing it is appended all the ECS definitions depending on the stack version or import_mappings.

Maybe there could be a case (e.g. with the ecs@mappings component template without import_mappings) where some field does not match any dynamic template but it is in the ECS.

Would that make sense @jsoriano ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that in this actual mapping, the foo field should be also present, right?

},
},
expectedErrors: []string{
`field "user" is undefined: missing definition for path`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user is in the mapping, and is not external, it should be in the preview, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. As user is in the schema, it should have a mapping in the preview too. I'll fix this test case.

internal/fields/mappings_test.go Show resolved Hide resolved
internal/fields/mappings_test.go Show resolved Hide resolved
internal/fields/mappings_test.go Show resolved Hide resolved
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @mrodm

@mrodm
Copy link
Contributor Author

mrodm commented Dec 9, 2024

test integrations

@elastic-vault-github-plugin-prod

Created or updated PR in integrations repository to test this version. Check elastic/integrations#11828

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants