Skip to content

[RSPEED-2267] Convert service commands to return structured output#225

Open
subpop wants to merge 4 commits intorhel-lightspeed:mainfrom
subpop:convert-services-structured-output
Open

[RSPEED-2267] Convert service commands to return structured output#225
subpop wants to merge 4 commits intorhel-lightspeed:mainfrom
subpop:convert-services-structured-output

Conversation

@subpop
Copy link
Member

@subpop subpop commented Jan 21, 2026

Refactor service-related tools to return structured data instead of plain text. This includes converting the output of list_services, get_service_status, and get_service_logs into JSON. Convert get_service_status to use systemctl show instead of the human-readably focused systemctl status. Implement a parser for the systemctl show command to facilitate this structured output.

@subpop subpop force-pushed the convert-services-structured-output branch from 61c0825 to 980c7eb Compare January 21, 2026 15:14
@subpop subpop force-pushed the convert-services-structured-output branch from 980c7eb to f256b00 Compare January 28, 2026 14:11
Copy link
Contributor

@owtaylor owtaylor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure how to to approach turning systemctl status into structured objects:

A) Parse the output, turn that into objects
B) Use lower-level API to obtain the information, reimplemen* the "summarization" of systemctl status inside our code.
C) Expose the low-level data sources, let the model parse through what it needs

A) is ugly and fragile. B) is ugly. C) will definitely lead to underperformance.

B) is probably best ,but considering how well trained models are on actual systemctl status output, I'm a bit skeptical about structured objects at all as applied to this tool. (It's also possible to have a tool return structured content and freeform content ... but I don't think that really helps things here.)

"service_status": CommandGroup(
commands={
"default": CommandSpec(args=("systemctl", "status", "{service_name}", "--no-pager", "--full")),
"default": CommandSpec(args=("systemctl", "show", "{service_name}", "--no-pager")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really the same thing at all ... systemctl show returns the complete expanded list of properties which includes some status information, but systemctl status provides the most relevant information included information about things like cgroups, and journalctl logs that aren't in the systemctl show.

show also eats a lot more tokens of context.

$ systemctl show avahi-daemon.service --no-pager  | wc -c
8729
$ systemctl status avahi-daemon.service --no-pager  | wc -c
1830

(divide by 3 or so to get tokens.) From my perspective, returning structured output has some advantages we get extra validation, it works better if someone is trying to use the MCP in "code mode", and models are good at reading JSON - better than humans. but we still need to try and focus the result on the most relevant information or we are going to get bad perfomance.

[This is an advantage of guarded command execution - the model itself can figure out what information it needs and write commands to get just that.]

commands={
"default": CommandSpec(args=("journalctl", "-u", "{service_name}", "-n", "{lines}", "--no-pager")),
"default": CommandSpec(
args=("journalctl", "-u", "{service_name}", "-n", "{lines}", "--no-pager", "--output=json")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #275 - it uses entries: list[str] for log entries rather than JSON form, and that made sense to me - models are trained on human readable logs, and the JSON output from systemd logs have a lot of noise in them.

We don't want to expand:

Feb 22 15:56:09 fedora chronyd[1146]: Selected source 72.30.35.89 (2.fedora.pool.ntp.org)

to:

{
        "__SEQNUM_ID" : "9f9a06c9ff7f4d21aa0235bff47c87ad",
        "_SELINUX_CONTEXT" : "system_u:system_r:chronyd_t:s0",
        "_SYSTEMD_INVOCATION_ID" : "b62f1616c8b9453c980910f96f651610",
        "_HOSTNAME" : "fedora",
        "SYSLOG_FACILITY" : "3",
        "__REALTIME_TIMESTAMP" : "1771793899017757",
        "_COMM" : "chronyd",
        "SYSLOG_PID" : "1146",
        "_SYSTEMD_SLICE" : "system.slice",
        "_PID" : "1146",
        "SYSLOG_TIMESTAMP" : "Feb 22 15:58:19 ",
        "_BOOT_ID" : "60ef1101e7f34a23a1c486b7e72913ce",
        "__CURSOR" : "s=9f9a06c9ff7f4d21aa0235bff47c87ad;i=116ab3f;b=60ef1101e7f34a23a1c486b7e72913ce;m=728f9c938;t=64b6fe9f7561d;x=bfa5e635d89c531a",
        "_GID" : "992",
        "_EXE" : "/usr/bin/chronyd",
        "_MACHINE_ID" : "bbe3408bfdf24dae909eb91f836a87f4",
        "_CMDLINE" : "/usr/sbin/chronyd -n -F 2",
        "_RUNTIME_SCOPE" : "system",
        "_TRANSPORT" : "syslog",
        "__MONOTONIC_TIMESTAMP" : "30752229688",
        "_CAP_EFFECTIVE" : "2000400",
        "_SOURCE_REALTIME_TIMESTAMP" : "1771793899017664",
        "SYSLOG_IDENTIFIER" : "chronyd",
        "MESSAGE" : "Selected source 44.190.5.123 (2.fedora.pool.ntp.org)",
        "_UID" : "994",
        "PRIORITY" : "6",
        "_SYSTEMD_CGROUP" : "/system.slice/chronyd.service",
        "__SEQNUM" : "18262847",
        "_SYSTEMD_UNIT" : "chronyd.service"
}

@subpop
Copy link
Member Author

subpop commented Feb 25, 2026

I'm not quite sure how to to approach turning systemctl status into structured objects:

A) Parse the output, turn that into objects B) Use lower-level API to obtain the information, reimplemen* the "summarization" of systemctl status inside our code. C) Expose the low-level data sources, let the model parse through what it needs

A) is ugly and fragile. B) is ugly. C) will definitely lead to underperformance.

B) is probably best ,but considering how well trained models are on actual systemctl status output, I'm a bit skeptical about structured objects at all as applied to this tool. (It's also possible to have a tool return structured content and freeform content ... but I don't think that really helps things here.)

Should we bother returning structured content for this tool then? Maybe the best thing for this tool in particular is to return the true expected output of systemctl status instead of trying to find some sort of structured content. There is no rule that says every tool must return structured content.

@owtaylor
Copy link
Contributor

Should we bother returning structured content for this tool then? Maybe the best thing for this tool in particular is to return the true expected output of systemctl status instead of trying to find some sort of structured content. There is no rule that says every tool must return structured content.

@samdoran - what do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants