
Conversation

@nullifysecurity
Contributor

Overview

This pull request adds support for generating a JSON report from the artifacts verify command, along with several quality-of-life improvements to make artifact verification easier to integrate into CI/CD pipelines.

The following flags have been added to the artifacts verify command:

  • --soft_fail: returns a zero exit code even when verification failures occur, allowing consumers to determine the outcome programmatically
  • --format: specifies the output format (currently only json is supported)
  • --output: specifies the file path to which the output should be written

Previously, determining the result of artifacts verify required parsing stdout/stderr and extracting the verification outcome from log output. This approach provides a more structured and reliable way to consume results and also makes it easier to pass them to downstream tasks for further processing or conversion (for example, to JUnit or SARIF).

The implementation is intentionally designed to be extensible. Longer term, the goal is to support native SARIF output for artifact verification so results can be consumed directly by pipelines that support static analysis reporting. At present, differentiating error types would require parsing error messages and mapping them to rule identifiers, which is possible but fairly involved. This change provides a practical intermediate step while laying the groundwork for that future enhancement.

Please let me know if you have any recommendations or changes to help this feature fit into Velociraptor's design.

Example

The following snippet is example output, showing successful, warning, and failed artifact validations.

{
  "timestamp": "2026-01-04T08:44:42Z",
  "metadata": {
    "version": "0.75.6",
    "commit": "c068283e7",
    "build_time": "2026-01-04T19:44:26+11:00",
    "command": "./velociraptor artifacts verify -v --soft_fail --format json --output report.json ./artifacts/Artifact1.yaml ./artifacts/Artifact2.yaml ./artifacts/Artifact3.yaml ./artifacts/Artifact4.yaml"
  },
  "exit_code": 0,
  "summary": {
    "total": 4,
    "passed": 1,
    "warning": 1,
    "failed": 2
  },
  "artifacts": [
    "Generic.Test.Artifact1",
    "Generic.Test.Artifact2",
    "Generic.Test.Artifact3",
    "Unknown"
  ],
  "results": [
    {
      "name": "Generic.Test.Artifact1",
      "path": "./artifacts/Artifact1.yaml",
      "status": "pass",
      "errors": [],
      "warnings": []
    },
    {
      "name": "Generic.Test.Artifact2",
      "path": "./artifacts/Artifact2.yaml",
      "status": "fail",
      "errors": [
        "Generic.Test.Artifact2: query: Call to Artifact.Generic.Client.Info contains unknown parameter Foo",
        "Generic.Test.Artifact2: query: Unknown artifact reference Generic.Unknown.Artifact",
        "Generic.Test.Artifact2: query: Unknown plugin infoxxxx()"
      ],
      "warnings": []
    },
    {
      "name": "Generic.Test.Artifact3",
      "path": "./artifacts/Artifact3.yaml",
      "status": "warning",
      "errors": [],
      "warnings": [
        "\u003cyellow\u003eSuggestion\u003c/\u003e: Add EXECVE to artifact's required_permissions or implied_permissions fields"
      ]
    },
    {
      "name": "Unknown",
      "path": "./artifacts/Artifact4.yaml",
      "status": "fail",
      "errors": [
        "yaml: unmarshal errors:\n  line 7: field invalid not found in type proto.Artifact"
      ],
      "warnings": []
    }
  ]
}
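
For instance, a CI step could gate the pipeline on this report. Below is a minimal Python sketch, assuming the report.json shape shown above; `exit_code_from_report` is a hypothetical helper name, not part of Velociraptor:

```python
import json

def exit_code_from_report(path):
    """Return the exit code a CI step should use, given a report
    produced with --soft_fail (the command itself exits 0)."""
    with open(path) as f:
        report = json.load(f)

    # Print one line per failing artifact so the CI log stays readable.
    for result in report["results"]:
        if result["status"] == "fail":
            print(f"FAIL {result['name']} ({result['path']})")
            for error in result["errors"]:
                print(f"  {error}")

    return 1 if report["summary"]["failed"] > 0 else 0
```

A pipeline step could then run the verify command with `--soft_fail` and call `sys.exit(exit_code_from_report("report.json"))` to fail the build itself.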

@scudette
Contributor

scudette commented Jan 4, 2026

Thanks for the PR! This is an important use case - to be able to provide structured verification output for processing by other tools.

The way Velociraptor is designed, most of the functionality is exported via VQL and the commands are usually just thin wrappers around that functionality (or at least we aim to make all functionality available via VQL artifacts - it is a work in progress for some things).

This approach has the advantage that people can customize the output as they please by just modifying the artifacts if needed.

For example, in previous versions the velociraptor rpm client command built the client rpm itself, but now this is exported to Server.Utils.CreateLinuxPackages, which does all the heavy lifting, and the old command just runs that query.

VQL makes the artifact verifier available via the verify() function which can be used to build the basic functionality.

The output you presented can be produced by the following artifact.

name: Server.Utils.ArtifactVerifier
type: SERVER

parameters:
- name: SearchGlob
  default: /tmp/*.yaml
  description: A glob to capture all artifacts to verify

- name: ErrorIsFatal
  type: bool
  default: N
  description: If set, an error is produced if any artifact is failed.

sources:
- query: |
    -- Convert the array to a string
    LET _Stringify(X) = SELECT str(str=_value) AS value
      FROM foreach(row=X)
    LET Stringify(X) = _Stringify(X=X).value

    LET MaybeLogError(Verify, Path) = if(condition=ErrorIsFatal,
     then=Verify.Errors AND log(level="ERROR", dedup= -1, message="%v failed!", args=Path),
     else=NOT Verify.Errors)

    -- Extract the name of the artifact from the raw data - needed if
    -- the yaml can not be parsed at all then we need to fallback to a
    -- regex.
    LET GetName(Artifact) = parse_string_with_regex(
        regex='''^name:\s*(.+)''', string=Artifact).g1

    LET Files = SELECT OSPath,
                       read_file(filename=OSPath, length=10000) AS Data
      FROM glob(globs=SearchGlob)

    LET Results <= SELECT name,
           path,
           MaybeLogError(Verify=Verify, Path=path) AS passed,
           Stringify(X=Verify.Errors) AS errors,
           Stringify(X=Verify.Warnings) AS warnings
    FROM foreach(row=Files,
                 query={
        SELECT OSPath AS path,
               GetName(Artifact=Data) AS name,
               verify(artifact=Data) AS Verify
        FROM scope()
      })

    -- Add some metadata to the output and present in the same row.
    SELECT timestamp(epoch=now()) AS timestamp,
       config.version as metadata,
       dict(
         total=len(list=Results),
         passed=len(list=filter(list=Results, condition="x=>x.passed")),
         failed=len(list=filter(list=Results, condition="x=>NOT x.passed")),
         warnings=len(list=filter(list=Results, condition="x=>x.warnings"))
       ) AS summary,
       { SELECT name FROM Results } as artifacts,
       Results as results
    FROM scope()

I am not sure if this is the best output or some tweaking is required.

I think it is important to have that artifact built in, which is why I added it in #4608, but the artifact can also be used on older versions using the --definitions flag and the new CLI parser:

velociraptor --definitions path/to/artifactdir/ -v -r Server.Utils.ArtifactVerifier --SearchGlob '/path/to/artifacts/*.yaml' 

The idea is to allow artifacts to be used seamlessly as CLI commands, so you can extend this artifact by renaming or overriding it to suit your own requirements.

PS: Note that with bash you need to protect the * with single quotes so that it reaches the program literally, rather than being expanded by bash first.
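
The difference is easy to see in a throwaway directory (illustrative shell session, not Velociraptor-specific):

```shell
# Scratch demo: two artifact files in a temp directory.
demo=$(mktemp -d)
cd "$demo"
touch a.yaml b.yaml

# Quoted: the program receives the literal pattern.
printf '%s\n' '*.yaml'

# Unquoted: bash expands the glob before the program runs,
# so it receives the matching file names instead.
printf '%s\n' *.yaml
```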

@nullifysecurity
Contributor Author

nullifysecurity commented Jan 4, 2026

Thanks @scudette!

I hadn't thought of turning it into an artifact; that totally makes sense with Velociraptor's design and also makes it backwards compatible, which is great. Thanks for taking the time to put together this artifact, I look forward to utilizing it and hope that others will find it useful as well!

Did you want me to help refactor the artifacts verify command to call this artifact, or call the verify VQL function similarly to how you've done it with the debian and rpm commands, or leave it as is for backwards compatibility?

velociraptor/bin/debian.go

Lines 116 to 124 in 04400b5

query := `
LET _ <= log(message="Packaging binary %v to server Deb", args=BinaryToPackage)
SELECT OSPath
FROM deb_create(exe=BinaryToPackage, server=TRUE,
                directory_name=Output,
                release=Release)`
return runQueryWithEnv(query, builder, "json")

Thanks again! 😃

@scudette
Contributor

scudette commented Jan 5, 2026

Thanks - this is where we want to end up - most of the CLI commands should end up calling VQL internally.

The discussion now should be about what is the standard output. The JSON output in the new artifact was done to mimic the output in this PR but I am not sure if this is the best output.

Maybe a more reasonable output is a row oriented output where each row is a verification of a single artifact.

The other issue to consider is how the output should be presented - currently the verify command is used in many CI pipelines - which just rely on the return value. If the pipeline fails we have to manually look through the output to see which artifact failed and the output is just printed in an unstructured way. For example:

[screenshot: verify output, one line per artifact, including many "Verified ... OK" lines]

But if we now emit JSON then it might be too much output and hard to see for a human.

I think maybe we need to suppress verified artifacts and only show ones with warnings or errors, one per line. (So essentially hide all the OK lines in the above screenshot).

So maybe the VQL for the verify command should be slightly different from Server.Utils.ArtifactVerifier, or maybe the output format of Server.Utils.ArtifactVerifier should be changed.

Would love a PR, and also feedback as to what would be most useful.

@scudette
Contributor

scudette commented Jan 5, 2026

Maybe the artifact should log the "Verified ... OK" lines and emit the full row, so we can run it with the --output flag to redirect output to the CI artifacts; the logs would then look exactly the same as the current output.

@nullifysecurity
Contributor Author

The discussion now should be about what is the standard output. The JSON output in the new artifact was done to mimic the output in this PR but I am not sure if this is the best output.

Agreed with you here, the format I chose for the output is fairly arbitrary based on the existing output of the command and what would be useful for other downstream consumers.

Maybe a more reasonable output is a row oriented output where each row is a verification of a single artifact.

As in still returning an array, but having each element of the array contain the verification of a single artifact? Pretty much like the current results array?

[
  {
    "name": "Generic.Test.Artifact1",
    "path": "/home/user/github/velociraptor/output/artifacts/Artifact1.yaml",
    "passed": true,
    "errors": [],
    "warnings": []
  },
  {
    "name": "Generic.Test.Artifact2",
    "path": "/home/user/github/velociraptor/output/artifacts/Artifact2.yaml",
    "passed": false,
    "errors": [
      "Generic.Test.Artifact2: query: Call to Artifact.Generic.Client.Info contains unknown parameter Foo",
      "Generic.Test.Artifact2: query: Unknown artifact reference Generic.Unknown.Artifact",
      "Generic.Test.Artifact2: query: Unknown plugin infoxxxx()"
    ],
    "warnings": []
  }
]

I guess also like how other artifacts emit "rows" as normal? That could definitely work instead of creating a single object for the output, and still achieve the same result of being machine readable.
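
If the command emitted one JSON row per artifact (JSONL), downstream tools could stream it line by line. A rough sketch, with field names taken from the row example above and `failed_artifacts` as a hypothetical helper:

```python
import json

def failed_artifacts(jsonl_text):
    """Yield names of artifacts whose row reports passed == false.
    Assumes one JSON object per line, shaped like the rows above."""
    for line in jsonl_text.splitlines():
        if line.strip():
            row = json.loads(line)
            if not row["passed"]:
                yield row["name"]

rows = (
    '{"name": "Generic.Test.Artifact1", "passed": true, "errors": []}\n'
    '{"name": "Generic.Test.Artifact2", "passed": false, "errors": ["..."]}\n'
)
print(list(failed_artifacts(rows)))  # ['Generic.Test.Artifact2']
```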

The other issue to consider is how the output should be presented - currently the verify command is used in many CI pipelines - which just rely on the return value. If the pipeline fails we have to manually look through the output to see which artifact failed and the output is just printed in an unstructured way. For example:

But if we now emit JSON then it might be too much output and hard to see for a human.

I think maybe we need to suppress verified artifacts and only show ones with warnings or errors, one per line. (So essentially hide all the OK lines in the above screenshot).

Yeah, this is the difficulty of making it readable to both humans and machines. Maybe by default the verified artifacts are hidden and we create a flag that adds them back, or they are only output with verbosity?

I think this is handled nicely when the caller makes the intent clear by supplying a format flag (or some other method): standard output still logs the simple one-line statuses so a human can quickly identify which artifacts failed, while the verbose information remains available in the JSON output.

So maybe the VQL for the verify command should be slightly different from Server.Utils.ArtifactVerifier, or maybe the output format of Server.Utils.ArtifactVerifier should be changed.

Would love a PR, and also feedback as to what would be most useful.

Maybe the artifact should log the "Verified ... OK" lines and emit the full row, so we can run it with the --output flag to redirect output to the CI artifacts; the logs would then look exactly the same as the current output.

I think this is a good idea, is the intention kind of what I alluded to above?

@nullifysecurity
Contributor Author

nullifysecurity commented Jan 5, 2026

So I noticed something kind of funky: I was running the Server.Utils.ArtifactVerifier artifact across the artifacts from the exchange to test, and was getting errors reported by the artifact that weren't being reported by the verify VQL. For example, this error from the Server.PostProcess.FluentBit artifact:

./velociraptor -v -r Server.Utils.ArtifactVerifier --SearchGlob "$(pwd)/artifacts/exchange/FluentBit.yaml"

{
    "name": "Server.PostProcess.FluentBit",
    "path": "/home/user/github/velociraptor/output/artifacts/exchange/FluentBit.yaml",
    "passed": false,
    "errors": [
        "Invalid artifact  Server.PostProcess.FluentBit: All Queries in a source must be LET queries, except for the final one."
    ],
    "warnings": []
}

./velociraptor query "LET Data = SELECT * FROM read_file(filenames='/home/user/github/velociraptor/output/artifacts/exchange/FluentBit.yaml') LET _Verify = SELECT verify(artifact=Data) AS Result FROM Data SELECT Result.Errors, Result.Warnings FROM _Verify" -v

[
    {
        "Result": {
            "Artifact": "...",
            "Permissions": [
                "EXECVE",
                "FILESYSTEM_WRITE",
                "MACHINE_STATE",
                "READ_RESULTS"
            ],
            "Errors": null,
            "Warnings": null,
            "Definitions": "..."
        }
    }
]

I'll have a look into it and see if I can figure it out; just flagging it here (might also be user error).

Edit: Figured it out: the artifact gets truncated by the read_file length parameter. This will probably need increasing, maybe to a maximum of 4MB or so? I imagine it'd be rare for artifacts to exceed 4MB.

LET Files = SELECT OSPath,
                   read_file(filename=OSPath, length=10000) AS Data
  FROM glob(globs=SearchGlob)

@scudette
Contributor

scudette commented Jan 5, 2026

Yes, that's right - we probably need a much larger buffer.

Refactors the `artifacts verify` command to use the `verify` VQL function which better fits Velociraptor's VQL-first design.

This is only an initial implementation; it does not yet provide feature parity, because the `verify` function lacks control over overriding built-in artifacts.
Adds a `disable_override` argument to the `verify` function, which adds the ability to control the overriding of built-in artifacts. This has been added to maintain backwards compatibility, as previously built-in artifacts were overridden by default.
Removes creation of the local artifact repository during verification to ensure that we can check for overrides of built-in artifacts.
Re-implements the `--builtin` flag functionality for the `verify` command for feature parity and backwards compatibility.
@nullifysecurity
Contributor Author

How's this looking, @scudette?

I've refactored the verify command to now call the verify VQL function instead of performing the validation programmatically. I've also had to implement a new disable_override argument for the verify function to ensure feature parity of the command, which meant removing the local registry that was created to check for built-in artifacts (let me know if this unintentionally breaks anything).

This should now hopefully have feature parity and backwards compatibility, ensuring the behavior stays the same for existing consumers while allowing us to continue to modify the functionality. Do we still want to add similar JSON output functionality to the verify command (basically just outputting an array of the states), or leave that up to the Server.Utils.ArtifactVerifier artifact?

I'll probably also modify the Server.Utils.ArtifactVerifier artifact to emit a row per verified artifact instead of the single result output that we have now, I think that fits better with your vision?

Is it worth creating a new pull request for these features as we've deviated from the original pull request scope?

@scudette
Contributor

scudette commented Jan 6, 2026

Removing the local repository is actually a problem, and it exposes a potential bug.

When verifying an artifact we don't necessarily want to import it into the global repository; we just want to verify it. This is why we have a local repository: we can load the artifacts into it without really importing them into the main repository.

The issue is that we actually need to try to import all the artifacts before we verify them, because we don't know if any custom artifacts refer to other custom artifacts.

Say you have an artifact Custom1 which calls SELECT * FROM Artifact.Custom2(). If we call verify on the artifacts one at a time and Custom1 is fed in before Custom2, then at the point we verify Custom1 we don't yet know about Custom2, and this results in a verification error.

So the old code (in the artifact verify command) was doing it in two steps:

  1. First load all the artifacts into the local repository in the first pass
  2. Then in the second pass, verify them against the local repository

This actually is harder to do in VQL with the verify() function - if we glob all the files, we need a way to load them into a custom repository first, then verify against that one.

So I think we need:

  1. Add a repository parameter to artifact_set() so we can build a local repository and cache it with that name in the scope.
  2. Add a repository parameter to verify() so we can verify against a local repository in the scope

Then the VQL would do it in two passes, first import into the local repository and then verify against it.

The function can get the local repository out of the scope using CacheGet() for example:

cached_any := vql_subsystem.CacheGet(scope, lambdaCacheTag)
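
The ordering problem itself can be seen in a toy sketch (Python, purely illustrative, not Velociraptor code): a one-pass verifier fails when Custom1 is seen before Custom2, while a two-pass verifier first registers every name and only then checks references.

```python
def verify_one_pass(artifacts):
    """Verify as we go: references to not-yet-seen artifacts fail."""
    known, errors = set(), []
    for name, refs in artifacts:
        for ref in refs:
            if ref not in known:
                errors.append(f"{name}: unknown artifact reference {ref}")
        known.add(name)
    return errors

def verify_two_pass(artifacts):
    """Pass 1: register every name. Pass 2: check references."""
    known = {name for name, _ in artifacts}
    errors = []
    for name, refs in artifacts:
        for ref in refs:
            if ref not in known:
                errors.append(f"{name}: unknown artifact reference {ref}")
    return errors

# Custom1 depends on Custom2 but happens to be loaded first.
arts = [("Custom1", ["Custom2"]), ("Custom2", [])]
print(verify_one_pass(arts))  # spurious error for Custom1
print(verify_two_pass(arts))  # []
```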

@nullifysecurity
Contributor Author

Ah okay! That's a misunderstanding on my behalf of the original logic; I hadn't considered that case.

Let me have another crack at it, I'll see if I can come up with a solution that fixes the dependency issue and also respects the prevention of built-in artifact override.

@nullifysecurity
Contributor Author

@scudette,

I've made some initial implementations, although I suspect there are a lot of edge cases I haven't considered yet. Additionally, I still need to implement the built-in artifact functionality as it's not checked at the moment in the new changes.

@nullifysecurity
Contributor Author

nullifysecurity commented Jan 7, 2026

Just looking at solving this built-in configuration: should we add all of the existing artifact definitions from the global repository to the new local repository so that we can check if they already exist? Or simply query the global repository before adding to the local repository to see if it's actually a built-in?

Edit:

I've added this to the verify function; not sure if this is the best solution or not. I think this was done elsewhere, where we create a temporary repository with the global repository as its parent and try to load the artifact into it to determine whether it's built-in.

if arg.DisableOverride {
	tmp_repository := manager.NewRepository()
	tmp_repository.SetParent(repository, config_obj)

	_, err = tmp_repository.LoadYaml(arg.Artifact,
		services.ArtifactOptions{
			ValidateArtifact:  true,
			ArtifactIsBuiltIn: false,
		})
	if err != nil {
		state.SetError(err)
		return state
	}
}

Edit 2:

The above doesn't quite work, I think because there's a bit of complexity: the artifact argument can be the name of an artifact or it can be YAML/JSON. I'm still trying to work out how to resolve this.

1. Order of artifact evaluation was non-deterministic due to the use
   of a map.
2. If an artifact failed to load, it would not produce any
   error.
3. The Server.Utils.ArtifactVerifier did not account for dependencies
   on other custom artifacts.
@scudette
Contributor

scudette commented Jan 7, 2026

This looks reasonable 👍🏽 .

I played with this PR a bit and found a couple of bugs, so I pushed a fix in the latest commit.

Looking great!

@nullifysecurity
Contributor Author

This looks reasonable 👍🏽 .

I played with this PR a bit and found a couple of bugs, so I pushed a fix in the latest commit.

Looking great!

Thanks for fixing up those issues!

I think I've figured out how to re-implement the configurable built-in override functionality by adding the check into the initial call to artifact_set which now parses the artifact and checks whether it exists in the global repository and is a built-in artifact.

// Determine if this is a built-in artifact
tmp_repository := local_repository.Copy()
built_in := false
artifact, err := tmp_repository.LoadYaml(arg.Definition,
	services.ArtifactOptions{
		ValidateArtifact:  true,
		ArtifactIsBuiltIn: true,
	})
if err == nil {
	if global_artifact, pres := global_repository.Get(ctx, config_obj, artifact.Name); pres {
		built_in = global_artifact.BuiltIn
	}
}

I'm sure there are some weird edge-cases here that I've missed, but I think everything appears to be working now?

Removed `length` argument from call to `read_file` as it's implicitly set to 4 MiB. This should be enough to cover most artifacts.
Adds an `issues_only` flag to the `verify` command that suppresses output of successful artifact verifications. This can be used to reduce noise in the command output.
@scudette scudette merged commit f45b981 into Velocidex:master Jan 7, 2026
5 checks passed
@scudette
Contributor

scudette commented Jan 7, 2026

Thanks for working through this PR - it is really good!
