(old) feat: output security reports as JSON when requested #136

jajanet · 2026-02-06T04:24:46Z

(Closed in favor of #138 because of CLA check and merge issues that arose from applying a suggestion from the GitHub Actions bot.)

This PR adds the ability to output security report results in JSON. This enables programmatic parsing for accuracy checks, standardization, and integration with SCM tools and CI/CD pipelines (e.g., GitHub Actions, Jenkins).

Example of original markdown report and corresponding JSON:

- **Vulnerability:** Path Traversal and Command Injection
- **Vulnerability Type:** Security
- **Severity:** Critical
- **Source Location:** `lib/router.js`
- **Line Content:** `full_path = "" + dispatch.static_route + (unescape(pathname));`
- **Description:** The `pathname` variable, derived from the URL, is not sanitized before being used to construct a file path. An attacker can use URL-encoded characters like `../` to traverse the file system and access arbitrary files. This vulnerability is further escalated to command injection because the `full_path` is used in a `spawn` call, allowing an attacker to execute arbitrary commands on the system.
- **Recommendation:** Sanitize the `pathname` variable by removing any directory traversal characters before using it to construct a file path. Use `path.normalize()` or a similar function to resolve the path and ensure it stays within the intended directory.

turns into

[
    {
        "vulnerability": "Path Traversal and Command Injection",
        "vulnerabilityType": "Security",
        "severity": "Critical",
        "extension": {
            "sourceLocation": {
                "File": "lib/router.js",
                "startLine": null,
                "endLine": null
            },
            "sinkLocation": {
                "File": null,
                "startLine": null,
                "endLine": null
            },
            "dataType": null
        },
        "lineContent": "`full_path = \"\" + dispatch.static_route + (unescape(pathname));`",
        "description": "The `pathname` variable, derived from the URL, is not sanitized before being used to construct a file path. An attacker can use URL-encoded characters like `../` to traverse the file system and access arbitrary files. This vulnerability is further escalated to command injection because the `full_path` is used in a `spawn` call, allowing an attacker to execute arbitrary commands on the system.",
        "recommendation": "Sanitize the `pathname` variable by removing any directory traversal characters before using it to construct a file path. Use `path.normalize()` or a similar function to resolve the path and ensure it stays within the intended directory."
    },
...
]

Fields are optional and are written as null when not present, since the tool assumes that the DRAFT_SECURITY_REPORT.md file is well-formed and has the expected fields.

This is an initial implementation to help improve processes, and we may iterate using Vertex calls in the future. An upcoming PR will also add a subfield for the code changes that are sometimes present under the recommendation.
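As an illustration of the CI/CD use case, a pipeline step could read the JSON report and fail the build when blocking findings are present. The sketch below is only an assumption about how a consumer might do this: the `ReportEntry` type, the `blockingFindings` helper, and the default severity list are hypothetical and not part of this PR. It reads only the `severity` field, so it does not depend on how locations are nested.

```typescript
// Hypothetical consumer of the JSON report; not part of the PR.
interface ReportEntry {
  vulnerability: string | null;
  severity: string | null; // null when the markdown report omitted the field
}

// Returns the entries whose severity is in the blocking set (case-insensitive).
function blockingFindings(
  reportJson: string,
  blocking: string[] = ["Critical", "High"],
): ReportEntry[] {
  const parsed: unknown = JSON.parse(reportJson);
  if (!Array.isArray(parsed)) {
    throw new Error("expected the report to be a JSON array of findings");
  }
  const wanted = new Set(blocking.map((s) => s.toLowerCase()));
  return (parsed as ReportEntry[]).filter(
    (f) => typeof f.severity === "string" && wanted.has(f.severity.toLowerCase()),
  );
}

// Sample data for illustration only.
const sampleReport = JSON.stringify([
  { vulnerability: "Path Traversal and Command Injection", severity: "Critical" },
  { vulnerability: "Verbose error message", severity: "Low" },
  { vulnerability: "Unlabelled finding", severity: null },
]);
const blockers = blockingFindings(sampleReport);
console.log(blockers.map((f) => f.vulnerability));
```

A CI step could exit non-zero when the returned array is non-empty. Entries with a null severity are deliberately not treated as blocking here, which is a design choice a real pipeline may want to reverse.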

gemini-code-assist · 2026-02-06T04:25:01Z

Summary of Changes

Hello @jajanet, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the security analysis tool by introducing the ability to output findings in a structured JSON format. This change is crucial for enabling automated processing, integration with external systems like CI/CD pipelines, and improving the overall utility of the security reports. The implementation involves adding a new parsing utility, integrating it as a server tool, and updating the workflow configuration to support conditional JSON generation.

Highlights

- JSON Output for Security Reports: Introduced the capability to generate security analysis reports in JSON format, alongside the existing Markdown output. This feature is activated when explicitly requested by the user (e.g., via --json flag).
- New Markdown Parsing Logic: Added a new parser.ts module containing parseMarkdownToDict function, responsible for converting the structured Markdown security reports into a standardized JSON object array. This includes handling various field formats and extracting location details.
- Integration of JSON Conversion Tool: A new convert_report_to_json tool has been registered in mcp-server/src/index.ts. This tool reads the Markdown report, utilizes the new parsing logic, and writes the resulting JSON to .gemini_security/security_report.json.
- Updated Workflow Configuration: The analyze.toml configuration has been updated to conditionally invoke the convert_report_to_json tool during the final reporting phase and to include a 'Generate JSON report' task in the initial plan if JSON output is requested.
- Comprehensive Unit Tests for Parser: A new test file parser.test.ts has been added, providing extensive unit tests for the parseMarkdownToDict function to ensure accurate and robust parsing of various Markdown report structures, including handling missing data and different formatting.

Changelog

commands/security/analyze.toml
- Modified the 'Final Reporting & Cleanup' phase to include a conditional call to the convert_report_to_json tool if JSON output is requested.
- Added a new action to inform the user about the availability of the JSON report at .gemini_security/security_report.json.
- Updated the initial plan generation to include a 'Generate JSON report' task when JSON output is requested.
mcp-server/src/index.ts
- Imported the new parseMarkdownToDict function.
- Registered a new server.tool named convert_report_to_json to handle the conversion of Markdown security reports to JSON.
- The convert_report_to_json tool reads the Markdown report, parses it, and writes the JSON output to a file.
mcp-server/src/parser.test.ts
- Added a new file containing unit tests for the parseMarkdownToDict function.
- Tests cover various scenarios including standard vulnerabilities, privacy violations, multiple findings, markdown formatting, missing line numbers, and missing sink locations.
mcp-server/src/parser.ts
- Added a new file defining Location and Finding interfaces for the structured JSON output.
- Implemented parseLocation function to extract file paths and line numbers from location strings.
- Implemented parseMarkdownToDict function to parse Markdown security reports into an array of Finding objects, handling different field formats and markdown syntax.
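To make the location-parsing step concrete, here is a minimal sketch of what a function like parseLocation might do. It is not the PR's implementation: the names `ReportLocation` and `parseLocationSketch` are hypothetical, and the input format (`path`, `path:line`, or `path:start-end`, optionally wrapped in backticks) is an assumption based on the example reports in this thread.

```typescript
// Hypothetical shape, modeled on the sourceLocation objects shown in this thread.
interface ReportLocation {
  file: string | null;
  startLine: number | null;
  endLine: number | null;
}

// Assumed input formats: "path", "path:12" or "path:12-20", with optional backticks.
function parseLocationSketch(raw: string | null): ReportLocation {
  const empty: ReportLocation = { file: null, startLine: null, endLine: null };
  if (raw === null) return empty;

  // Strip surrounding whitespace and markdown backticks.
  const text = raw.trim().replace(/^`+|`+$/g, "").trim();
  if (text === "") return empty;

  // Optional ":start" or ":start-end" suffix at the end of the string.
  const match = /^(.+?)(?::(\d+)(?:-(\d+))?)?$/.exec(text);
  if (match === null) {
    // Unexpected input (e.g. embedded newlines): keep the text, leave lines null.
    return { file: text, startLine: null, endLine: null };
  }

  const startLine = match[2] !== undefined ? Number(match[2]) : null;
  // A single line number is treated as a one-line range.
  const endLine = match[3] !== undefined ? Number(match[3]) : startLine;
  return { file: match[1], startLine, endLine };
}

console.log(parseLocationSketch("`js/src/util.js:120`"));
console.log(parseLocationSketch("lib/router.js"));
```

Returning an all-null object instead of throwing matches the behavior described elsewhere in this thread, where missing locations appear as null fields in the JSON.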

Activity

- The pull request was created by jajanet with the clear intent of adding JSON output capabilities for security reports.
- New files mcp-server/src/parser.ts and mcp-server/src/parser.test.ts were added to implement and test the Markdown to JSON parsing logic.
- Existing files commands/security/analyze.toml and mcp-server/src/index.ts were modified to integrate the new JSON output feature into the existing workflow.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

github-actions · 2026-02-06T04:25:02Z

🤖 Hi @jajanet, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

gemini-code-assist

Code Review

This pull request introduces a valuable feature to output security reports in JSON format, which will greatly improve programmatic consumption of the results. The implementation includes a new markdown parser, tests, and a server tool to handle the conversion.

My review focuses on improving the robustness and correctness of the new parser and its integration. I've identified a critical bug in the parsing logic that could lead to incorrect JSON output, along with a few medium-severity issues related to error handling, a typo in a prompt, and code consistency. The provided code suggestions aim to resolve these issues.

mcp-server/src/parser.ts

commands/security/analyze.toml

mcp-server/src/index.ts

mcp-server/src/parser.ts

github-actions

📋 Review Summary

This pull request introduces a valuable feature to output security reports in JSON format. The implementation is well-executed, with a robust markdown parser and comprehensive test coverage. The code is clean, and the new functionality is a great addition for enabling programmatic use of the security reports.

🔍 General Feedback

I've left a few minor suggestions for improving type safety and code readability, but overall this is a solid contribution.

github-actions · 2026-02-06T04:26:30Z

mcp-server/src/index.ts

+    }
+  }) as any
+);
+


🟡 Casting the function to `any` using `as any` bypasses TypeScript's type safety checks. It would be more robust to define and use a specific type for the server tool implementation to ensure type safety and prevent potential runtime errors.
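One way to avoid the cast is to give the handler an explicit type and let the compiler check it. The sketch below is only illustrative: `ToolTextResult`, `ToolHandler`, `ConvertArgs`, and `summarizeReport` are hypothetical stand-ins, and the real handler and result types expected by the server SDK may differ.

```typescript
// Hypothetical stand-ins for the SDK's handler and result types;
// the real signatures may differ.
interface ToolTextResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}
type ToolHandler<Args> = (args: Args) => Promise<ToolTextResult>;

interface ConvertArgs {
  reportMarkdown: string; // hypothetical argument name
}

// Pure helper so the async wrapper stays trivial and easy to test.
function summarizeReport(reportMarkdown: string): ToolTextResult {
  // Count lines that begin a finding, e.g. "- **Vulnerability:** ...".
  const findingCount = reportMarkdown
    .split("\n")
    .filter((line) => /^[\s\-*]*Vulnerability\**:/i.test(line)).length;
  if (findingCount === 0) {
    return { content: [{ type: "text", text: "No findings parsed" }], isError: true };
  }
  return { content: [{ type: "text", text: `Parsed ${findingCount} finding(s)` }] };
}

// Annotating the constant gives the compiler the full signature,
// so no `as any` is needed where the handler is registered.
const convertReportHandler: ToolHandler<ConvertArgs> = async ({ reportMarkdown }) =>
  summarizeReport(reportMarkdown);
```

With the annotation in place, returning a malformed result (for example a `content` entry without `text`) becomes a compile-time error rather than a runtime surprise.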

mcp-server/src/parser.ts

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

github-actions

## 📋 Security Analysis Summary

This pull request introduces a new feature to output security reports in JSON format. The implementation adds a new tool to the MCP server and a parser to convert the markdown report to JSON. The code is well-structured and includes tests for the new parser.

🔍 General Feedback

The new feature is a valuable addition that will improve the usability of the security extension.
The code is clean and easy to understand.
One potential security vulnerability was identified in the new parser.

github-actions · 2026-02-06T04:28:46Z

mcp-server/src/parser.ts

+    const extract = (label: string): string | null => {
+      const fieldNames = 'Vulnerability Type|Severity|Source Location|Sink Location|Data Type|Line Content|Description|Recommendation';
+      const patternStr = `(?:-?\\s*\\**)?${label}\\**:\\s*([\\s\\S]*?)(?=\\n(?:-?\\s*\\**)?(?:${fieldNames})|$)`;
+      const pattern = new RegExp(patternStr, 'i');


High: ReDoS in parser
The regular expression used to parse the markdown report is vulnerable to Regular Expression Denial of Service (ReDoS). An attacker can craft a malicious markdown file that will cause the regex engine to backtrack excessively, leading to a denial of service. The vulnerable part is ([\\s\\S]*?) which can cause catastrophic backtracking.

Suggested change

const pattern = new RegExp(patternStr, 'i');

const fieldNames = 'Vulnerability Type|Severity|Source Location|Sink Location|Data Type|Line Content|Description|Recommendation';

const patternStr = `(?:-?\\\\s*\\\\**)?${label}\\\\**:\\\\s*([\\\\s\\\\S]*?)(?=\\\\n(?:-?\\\\s*\\\\**)?(?:${fieldNames})|$)`;

const pattern = new RegExp(patternStr, 'i');
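Independently of whether the pattern above is exploitable in practice, one way to sidestep the concern is to avoid the lazy `[\s\S]*?` plus lookahead entirely and scan the section line by line. The following is a sketch under assumptions, not the fix that was applied in this PR: `FIELD_LABELS`, `matchFieldLine`, and `extractFieldsLinear` are hypothetical names, and the field list is copied from the regex in the diff.

```typescript
// Labels copied from the diff above, plus "Vulnerability" itself.
const FIELD_LABELS = [
  "Vulnerability Type", "Vulnerability", "Severity", "Source Location",
  "Sink Location", "Data Type", "Line Content", "Description", "Recommendation",
];

// Returns the label and remaining text if `line` starts a field, else null.
function matchFieldLine(line: string): { label: string; rest: string } | null {
  // Drop the list dash, whitespace and bold markers at the start of the line.
  const stripped = line.replace(/^[\s\-*]+/, "");
  const lower = stripped.toLowerCase();
  for (const label of FIELD_LABELS) {
    if (!lower.startsWith(label.toLowerCase())) continue;
    // Allow "Label:" and "Label**:"; anything else is not this field.
    const after = stripped.slice(label.length).replace(/^\**/, "");
    if (!after.startsWith(":")) continue;
    return { label, rest: after.slice(1).replace(/^\**/, "").trim() };
  }
  return null;
}

// Single pass over the lines; lines without a label continue the current field.
function extractFieldsLinear(section: string): Map<string, string> {
  const fields = new Map<string, string>();
  let current: string | null = null;
  for (const line of section.split("\n")) {
    const hit = matchFieldLine(line);
    if (hit !== null) {
      current = hit.label;
      fields.set(current, hit.rest);
    } else if (current !== null) {
      fields.set(current, `${fields.get(current) ?? ""}\n${line}`.trim());
    }
  }
  return fields;
}
```

Each line is examined a bounded number of times, so the work grows roughly linearly with the size of the report. Like the regex version, this still treats any line that begins with a known label as a new field.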

can we fix this?

QuanZhang-William

Thank you for the PR!

Left some questions and I think we should also update the README.md

mcp-server/src/parser.ts

commands/security/analyze.toml

shrishabh

Thank you for the PR!

A question I was wondering about: here we are deterministically parsing the markdown into JSON using regexes. Another option could have been using LLMs to adjust their outputs so that they conform to this particular format. I think deterministic parsing is a great first step, but we might want to keep an eye out for failure cases and potentially move to the alternative if we see lots of errors.
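One lightweight way to watch for such failure cases is to count how often each field comes back as null across parsed reports. This is a hypothetical helper, not something in the PR; `nullFieldCounts` and the sample data are made up for illustration, and only top-level fields are inspected.

```typescript
// Counts, per top-level field, how many findings have that field set to null.
function nullFieldCounts(
  findings: Array<Record<string, unknown>>,
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const finding of findings) {
    for (const key of Object.keys(finding)) {
      if (finding[key] === null) {
        counts[key] = (counts[key] ?? 0) + 1;
      }
    }
  }
  return counts;
}

// Sample data for illustration only.
const parsedFindings = [
  { vulnerability: "Cross-Site Scripting (XSS)", severity: "High", dataType: null },
  { vulnerability: "Path Traversal", severity: null, dataType: null },
];
const missingCounts = nullFieldCounts(parsedFindings);
console.log(missingCounts);
```

A null `dataType` is expected for non-privacy findings, so a rising count for a field such as `severity` would be the more useful signal that the markdown format has drifted from what the parser expects.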

shrishabh · 2026-02-09T19:19:19Z

mcp-server/src/parser.ts

+    const extract = (label: string): string | null => {
+      const fieldNames = 'Vulnerability Type|Severity|Source Location|Sink Location|Data Type|Line Content|Description|Recommendation';
+      const patternStr = `(?:-?\\s*\\**)?${label}\\**:\\s*([\\s\\S]*?)(?=\\n(?:-?\\s*\\**)?(?:${fieldNames})|$)`;
+      const pattern = new RegExp(patternStr, 'i');


can we fix this?

shrishabh · 2026-02-09T19:20:32Z

mcp-server/src/parser.ts

+      vulnerabilityType: extract("Vulnerability Type"),
+      severity: extract("Severity"),
+      dataType: extract("Data Type"),
+      sourceLocation: parseLocation(rawSource),


What happens if there is an error in any of these extraction methods? It might be worth considering failing gracefully here as the rest of the extraction might still produce a reasonable report?

It returns null! Examples:

[
    {
        "vulnerability": "Cross-Site Scripting (XSS)",
        "vulnerabilityType": "Security",
        "severity": "High",
        "dataType": null, // <-- only relevant to privacy issues
        "sourceLocation": {
            "file": "js/src/util.js",
            "startLine": 120,
            "endLine": 120
        },
        "sinkLocation": {
            "file": null,
            "startLine": null,
            "endLine": null
        },
        "lineContent": "`const $selector = $(selector)`",
        "description": "The `getSelectorFromElement` function is vulnerable to DOM-based XSS. It constructs a jQuery selector from the `data-target` or `href` attributes of an element without proper sanitization. An attacker can inject malicious JavaScript code into these attributes, which will then be executed when the selector is processed by jQuery.",
        "recommendation": "Sanitize the selector before passing it to jQuery. For example, ensure that the selector only contains valid CSS selector characters. A simple fix is to escape the selector using a library like `CSS.escape()` if available, or to validate the selector against a regex for valid characters. Given the context of this file, a more robust solution would be to ensure that any user-provided input is properly encoded before being used in a selector."
    }
]

and

[
    {
        "vulnerability": "Cross-Site Scripting (XSS)",
        "vulnerabilityType": "Security",
        "severity": "High",
        "dataType": null,
        "sourceLocation": {
            "file": "bin/mk-server",
            "startLine": null, // sometimes missing
            "endLine": null
        },
        "sinkLocation": {
            "file": null,
            "startLine": null,
            "endLine": null
        },
        "lineContent": "`res.end \"<h1>User No: <span style='color: red;'>\" + req.params.id + \"</span></h1>\"`",
        "description": "The application is vulnerable to reflected XSS. The `id` parameter from the URL is directly embedded into the HTML response without any sanitization. This allows an attacker to inject malicious scripts into the response, which will be executed in the context of the user's browser.",
        "recommendation": "Sanitize the `req.params.id` variable before embedding it in the HTML response. Use a library like `he` to encode HTML entities."
    }
]

Also, I'm not sure what you mean by "can we fix this?" and I can't reply directly to that comment. Do you mean pulling it out of the function? I did add caching of the regex pattern (for performance) and small readability changes, like pulling it out of the function and using a list for field names, but that might not be what you're referring to:

const FIELD_NAMES = [
  'Vulnerability',
  'Severity',
  'Source',
  'Sink',
  'Data',
  'Line',
  'Description',
  'Recommendation',
].join('|'); // add more labels here

const patternCache = new Map<string, RegExp>();
const escapeRegExp = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

...

/**
 * Parses a markdown string containing security findings into a structured format.
 * The markdown should follow a specific format where each finding starts with "Vulnerability:" and includes fields like "Severity:", "Source Location:", etc.
 * The function uses regular expressions to extract the relevant information and returns an array of findings.
 *
 * @param content - The markdown string to parse.
 * @returns An array of structured findings extracted from the markdown.
 */
function extractFromSection(section: string, label: string): string | null {
  const pattern = buildPattern(label);
  const match = section.match(pattern);
  return match ? match[1].trim() : null;
};

...

const rawSource = extractFromSection(section, "Source Location");

...
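The `buildPattern` helper is not shown in the snippet above. As a guess at its general shape, the sketch below combines the pattern cache, the escaping helper, and the regex from the earlier diff. All names carry a `Sketch` suffix to mark them as hypothetical, and the real implementation may differ.

```typescript
// Hypothetical reconstruction; the PR's actual buildPattern is not shown in this thread.
const FIELD_NAMES_SKETCH = [
  'Vulnerability', 'Severity', 'Source', 'Sink',
  'Data', 'Line', 'Description', 'Recommendation',
].join('|');

const patternCacheSketch = new Map<string, RegExp>();
const escapeRegExpSketch = (s: string) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Builds the per-label pattern once and reuses it on later calls.
function buildPatternSketch(label: string): RegExp {
  const cached = patternCacheSketch.get(label);
  if (cached !== undefined) return cached;
  const source =
    `(?:-?\\s*\\**)?${escapeRegExpSketch(label)}\\**:\\s*([\\s\\S]*?)` +
    `(?=\\n(?:-?\\s*\\**)?(?:${FIELD_NAMES_SKETCH})|$)`;
  // No global flag, so the cached RegExp carries no lastIndex state between calls.
  const pattern = new RegExp(source, 'i');
  patternCacheSketch.set(label, pattern);
  return pattern;
}

// Returns the trimmed field value, or null when the label is absent.
function extractFromSectionSketch(section: string, label: string): string | null {
  const match = section.match(buildPatternSketch(label));
  return match ? match[1].trim() : null;
}

const plainSection = "Vulnerability: XSS\nSeverity: High\nDescription: Bad input handling";
console.log(extractFromSectionSketch(plainSection, "Severity"));
console.log(extractFromSectionSketch(plainSection, "Sink Location"));
```

Note that with bold labels such as `**Severity:**`, this pattern leaves the closing `**` at the start of the captured value, so some separate markdown cleanup would presumably still be needed.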

QuanZhang-William

/lgtm to me.

Please address the other feedback before merging :)

jajanet · 2026-02-11T18:17:02Z

@QuanZhang-William I really can't resolve the failing CLA check -- I tried a lot of reauthoring / fixing of commits but it doesn't work! I'm gonna open up another PR with these changes

jajanet · 2026-02-11T18:24:48Z

closing and replacing with #138 due to CLA issues

jajanet added 2 commits February 6, 2026 04:02

Add files / tests for outputting security reports as JSON when requested

50b72c0

tighten instructions, set to specific file name only, and update chec…

3faf18f

…klist

jajanet requested review from QuanZhang-William, QuinnDACollins, capachino, evanotero, heltonduarte and shrishabh as code owners February 6, 2026 04:24

gemini-code-assist bot reviewed Feb 6, 2026

github-actions bot reviewed Feb 6, 2026

jajanet and others added 2 commits February 5, 2026 20:27

fix regex

3493140

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

fix more regex

1e3a81f

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

github-actions bot reviewed Feb 6, 2026

jajanet added 2 commits February 6, 2026 04:34

add gemini suggestions

009798b

fix tests

9e2a85c

QuanZhang-William reviewed Feb 6, 2026

jajanet added 4 commits February 9, 2026 17:36

Fix JSON output types

eb9af7c

Add absolute pathing

04444c1

Explicitly tell extension to keep generated security file if generated

231ba24

Add json output usage instructions

8a94119

shrishabh reviewed Feb 9, 2026

QuanZhang-William approved these changes Feb 9, 2026

cache for performance + pull out/use list for readability

62228e8

jajanet changed the title ~~feat: output security reports as JSON when requested~~ (old) feat: output security reports as JSON when requested Feb 11, 2026

jajanet closed this Feb 11, 2026

jajanet mentioned this pull request Feb 11, 2026

feat: output security reports as JSON when requested #138

Open
