Skip to content
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion commands/security/analyze.toml
Original file line number Diff line number Diff line change
Expand Up @@ -72,7 +72,8 @@ For EVERY task, you MUST follow this procedure. This loop separates high-level s
5. **Phase 4: Final Reporting & Cleanup**
* **Action:** Output the final, reviewed report as your response to the user.
* **Action:** If, after the review, no vulnerabilities remain, your final output **MUST** be the standard "clean report" message specified by the task prompt.
* **Action:** Remove the temporary files (`SECURITY_ANALYSIS_TODO.md` and `DRAFT_SECURITY_REPORT.md`) from the `.gemini_security/` directory. Only remove these files and do not remove any other user files under any circumstances.
* **Action:** ONLY IF the user requested JSON output (e.g., via `--json` in context or natural language), call the `convert_report_to_json` tool. Inform the user that the JSON version of the report is available at .gemini_security/security_report.json.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give an example of how to use the --json in the prompt, together with the custom audit scope that we implemented in #131?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly! To use the --json flag with a custom audit scope, you would combine them in your command line prompt. The --json flag signals that you want the final security report to be output in JSON format, and the --scope flag allows you to specify which files or directories should be included in the security analysis.

Here's an example of how you might use them together:

gemini security analyze --json --scope "mcp-server/src/filesystem.ts mcp-server/src/security.ts"

In this example:

  • gemini security analyze is the command to initiate a security analysis.
  • --json is the new flag from this pull request, instructing the tool to generate a security_report.json file in the .gemini_security/ directory.
  • --scope "mcp-server/src/filesystem.ts mcp-server/src/security.ts" specifies that the security analysis should only be performed on the filesystem.ts and security.ts files within the mcp-server/src/ directory. You can list multiple files or directories, separated by spaces, within the quotes.

After the analysis is complete, you would find the JSON version of the report at .gemini_security/security_report.json.

* **Action:** After the final report is delivered and any requested JSON report is complete, remove the temporary files (`SECURITY_ANALYSIS_TODO.md` and `DRAFT_SECURITY_REPORT.md`) from the `.gemini_security/` directory. Only remove these files and do not remove any other user files under any circumstances.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also call out not remove .gemini_security/security_report.json if generated



### Example of the Workflow in `SECURITY_ANALYSIS_TODO.md`
Expand Down Expand Up @@ -116,6 +117,7 @@ You will now begin executing the plan. The following are your precise instructio
* You will rewrite the `SECURITY_ANALYSIS_TODO.md` file.
* Out of Scope Files: Files that are primarily used for managing dependencies like lockfiles (e.g., `package-lock.json`, `package.json` `yarn.lock`, `go.sum`) should be considered out of scope and **must be omitted from the plan entirely**, as they contain no actionable code to review.
* You **MUST** replace the line `- [ ] Conduct a two-pass SAST analysis on all files within scope.` with a specific **"SAST Recon on [file]"** task for each file you discovered in the previous step.
* Addtionally, if the user requested JSON output (e.g., via `--json` in context or natural language), add a final task: - [ ] Generate JSON report.

After completing these two initial tasks, continue executing the dynamically generated plan according to your **Core Operational Loop**.

Expand Down
30 changes: 30 additions & 0 deletions mcp-server/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ import { promises as fs } from 'fs';
import path from 'path';
import { getAuditScope } from './filesystem.js';
import { findLineNumbers } from './security.js';
import { parseMarkdownToDict } from './parser.js';

import { runPoc } from './poc.js';

Expand Down Expand Up @@ -64,6 +65,35 @@ server.tool(
(input: { filePath: string }) => runPoc(input)
);

server.tool(
'convert_report_to_json',
'Converts the Markdown security report into a JSON file named security_report.json in the .gemini_security folder.',
{} as any,
(async () => {
try {
const reportPath = '.gemini_security/DRAFT_SECURITY_REPORT.md';
const outputPath = '.gemini_security/security_report.json';

const content = await fs.readFile(reportPath, 'utf-8');
const results = parseMarkdownToDict(content);

await fs.writeFile(outputPath, JSON.stringify(results, null, 2));

return {
content: [{
type: 'text',
text: `Successfully created JSON report at ${outputPath}`
}]
};
} catch (error: any) {
return {
content: [{ type: 'text', text: `Error converting to JSON: ${error.message}` }],
isError: true
};
}
}) as any
);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Casting the function to `any` using `as any` bypasses TypeScript's type safety checks. It would be more robust to define and use a specific type for the server tool implementation to ensure type safety and prevent potential runtime errors.

server.registerPrompt(
'security:note-adder',
{
Expand Down
186 changes: 186 additions & 0 deletions mcp-server/src/parser.test.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,186 @@
/**
* @license
* Copyright 2025 Google LLC
* SPDX-License-Identifier: Apache-2.0
*/

import { describe, it, expect } from 'vitest';
import { parseMarkdownToDict } from './parser.js';

describe('parseMarkdownToDict', () => {
it('should parse a standard security vulnerability correctly', () => {
const mdContent = `
Vulnerability: Hardcoded API Key
Vulnerability Type: Security
Severity: Critical
Source Location: config/settings.js:15-15
Line Content: const KEY = "sk_live_12345";
Description: A production secret was found hardcoded in the source.
Recommendation: Move the secret to an environment variable.
`;

const results = parseMarkdownToDict(mdContent);

expect(results).toHaveLength(1);
expect(results[0]).toMatchObject({
vulnerability: 'Hardcoded API Key',
vulnerabilityType: 'Security',
severity: 'Critical',
lineContent: 'const KEY = "sk_live_12345";',
extension: {
sourceLocation: {
File: 'config/settings.js',
startLine: 15,
endLine: 15
}
}
});
});

it('should parse a privacy violation with Sink and Data Type', () => {
const mdContent = `
Vulnerability: PII Leak in Logs
Vulnerability Type: Privacy
Severity: Medium
Source Location: src/auth.ts:22
Sink Location: console.log:45
Data Type: Email Address
Line Content: logger.info("User logged in: " + user.email);
Description: Unmasked email addresses are being written to application logs.
Recommendation: Redact the email address before logging.
`;

const results = parseMarkdownToDict(mdContent);

expect(results).toHaveLength(1);
expect(results[0].extension).toMatchObject({
sinkLocation: {
File: 'console.log',
startLine: 45,
endLine: 45
},
dataType: 'Email Address'
});
});

it('should handle multiple vulnerabilities in one file', () => {
const mdContent = `
Vulnerability: SQL Injection
Vulnerability Type: Security
Severity: High
Source Location: db.js:10
Line Content: query = "SELECT * FROM users WHERE id = " + id;
Description: Raw input used in query.
Recommendation: Use parameterized queries.

Vulnerability: Reflected XSS
Vulnerability Type: Security
Severity: Medium
Source Location: app.js:100
Line Content: res.send("Hello " + req.query.name);
Description: User input rendered without escaping.
Recommendation: Use a templating engine with auto-escaping.
`;

const results = parseMarkdownToDict(mdContent);
expect(results).toHaveLength(2);
expect(results[0].vulnerability).toBe('SQL Injection');
expect(results[1].vulnerability).toBe('Reflected XSS');
});

it('should handle markdown formatting like bolding and bullets', () => {
const mdContent = `
* **Vulnerability:** Hardcoded Secret
- **Severity:** High
* **Source Location:** \`index.js:5-10\`
- **Line Content:** \`\`\`javascript
const secret = "password";
\`\`\`
`;

const results = parseMarkdownToDict(mdContent);

expect(results[0].vulnerability).toBe('Hardcoded Secret');
expect(results[0].severity).toBe('High');
expect(results[0].extension.sourceLocation.File).toBe('index.js');
expect(results[0].lineContent).toBe('const secret = "password";');
});

it('should return empty array if no "Vulnerability:" trigger is found', () => {
const mdContent = "This is a summary report with no specific findings.";
const results = parseMarkdownToDict(mdContent);
expect(results).toHaveLength(0);
});

it('should handle missing line numbers and sink location', () => {
const mdContent = `
Vulnerability: Missing Line Numbers
Vulnerability Type: Security
Severity: High
Source Location: src/index.ts
Line Content: const apiKey = process.env.API_KEY;
Description: Source location without line numbers.
Recommendation: Verify the vulnerability details.
`;

const results = parseMarkdownToDict(mdContent);

expect(results).toHaveLength(1);
expect(results[0]).toMatchObject({
vulnerability: 'Missing Line Numbers',
vulnerabilityType: 'Security',
severity: 'High',
lineContent: 'const apiKey = process.env.API_KEY;'
});
expect(results[0].extension.sourceLocation.File).toBe('src/index.ts');
});

it('should handle missing end line number', () => {
const mdContent = `
Vulnerability: No End Line
Vulnerability Type: Security
Severity: Medium
Source Location: app.js:42
Line Content: res.send(userInput);
Description: Source location with only start line number.
Recommendation: Check this line.
`;

const results = parseMarkdownToDict(mdContent);

expect(results).toHaveLength(1);
expect(results[0].extension.sourceLocation).toMatchObject({
File: 'app.js',
startLine: 42
});
});

it('should handle missing sink location', () => {
const mdContent = `
Vulnerability: No Sink Info
Vulnerability Type: Privacy
Severity: Low
Source Location: logger.ts:15
Data Type: User ID
Line Content: console.log(user.id);
Description: Vulnerability without sink location details.
Recommendation: Use proper logging.
`;

const results = parseMarkdownToDict(mdContent);

expect(results).toHaveLength(1);
expect(results[0]).toMatchObject({
vulnerability: 'No Sink Info',
vulnerabilityType: 'Privacy',
severity: 'Low'
});
expect(results[0].extension.dataType).toBe('User ID');
expect(
results[0].extension.sinkLocation === undefined ||
(results[0].extension.sinkLocation?.File === null &&
results[0].extension.sinkLocation?.startLine === null &&
results[0].extension.sinkLocation?.endLine === null)
).toBe(true);
});
});
115 changes: 115 additions & 0 deletions mcp-server/src/parser.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
/**
* @license
* Copyright 2025 Google LLC
* SPDX-License-Identifier: Apache-2.0
*/

export interface Location {
File: string | null;
startLine: number | null;
endLine: number | null;
}

export interface Finding {
vulnerability: string | null;
vulnerabilityType: string | null;
severity: string | null;
extension: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this called extension?

sourceLocation: Location;
sinkLocation: Location;
dataType: string | null;
};
lineContent: string | null;
description: string | null;
recommendation: string | null;
}

/**
* Parses a markdown string containing security findings into a structured format.
* The markdown should follow a specific format where each finding starts with "Vulnerability:" and includes fields like "Severity:", "Source Location:", etc.
* The function uses regular expressions to extract the relevant information and returns an array of findings.
*
* @param content - The markdown string to parse.
* @returns An array of structured findings extracted from the markdown.
*/
function parseLocation(locationStr: string | null): Location {
if (!locationStr) {
return { File: null, startLine: null, endLine: null };
}

const cleanStr = locationStr.replace(/`/g, '').trim();
// Regex: path/file.ext:start-end or path/file.ext:line
// Matches: file.ext:12-34 OR file.ext:12 OR file.ext
const match = cleanStr.match(/^([^:]+)(?::(\d+)(?:-(\d+))?)?$/);

if (match) {
const filePath = match[1].trim();
let start: number | null = null;
let end: number | null = null;
if (match[2] && match[3]) {
start = parseInt(match[2], 10);
end = parseInt(match[3], 10);
} else if (match[2]) {
start = parseInt(match[2], 10);
end = start;
}
return { File: filePath, startLine: start, endLine: end };
}

return { File: cleanStr, startLine: null, endLine: null };
}

/**
* Parses a markdown string containing security findings into a structured format.
* The markdown should follow a specific format where each finding starts with "Vulnerability:" and includes fields like "Severity:", "Source Location:", etc.
* The function uses regular expressions to extract the relevant information and returns an array of findings.
*
* @param content - The markdown string to parse.
* @returns An array of structured findings extracted from the markdown.
*/
export function parseMarkdownToDict(content: string): Finding[] {
const findings: Finding[] = [];

// Remove markdown formatting (bullet points at line start, ** markers), preserve hyphens in text
const cleanContent = content.replace(/^\s*[\*\-]\s*/gm, '').replace(/\*\*/g, '');

// Split each finding by "Vulnerability:" preceded by newline
const sections = cleanContent.split(/\n(?=#{1,6} |\s*Vulnerability:)/);

for (let section of sections) {
section = section.trim();
if (!section || !section.includes("Vulnerability:")) continue;

const extract = (label: string): string | null => {
const fieldNames = 'Vulnerability|Severity|Source|Sink|Data|Line|Description|Recommendation';
const patternStr = `(?:-?\\s*\\**)?${label}\\**:\\s*([\\s\\S]*?)(?=\\n(?:-?\\s*\\**)?(?:${fieldNames})|$)`;
const pattern = new RegExp(patternStr, 'i');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High ReDoS in parser

The regular expression used to parse the markdown report is vulnerable to Regular Expression Denial of Service (ReDoS). An attacker can craft a malicious markdown file that will cause the regex engine to backtrack excessively, leading to a denial of service. The vulnerable part is ([\\s\\S]*?) which can cause catastrophic backtracking.

Suggested change
const pattern = new RegExp(patternStr, 'i');
const fieldNames = 'Vulnerability Type|Severity|Source Location|Sink Location|Data Type|Line Content|Description|Recommendation';
const patternStr = `(?:-?\\\\s*\\\\**)?${label}\\\\**:\\\\s*([\\\\s\\\\S]*?)(?=\\\\n(?:-?\\\\s*\\\\**)?(?:${fieldNames})|$)`;
const pattern = new RegExp(patternStr, 'i');

const match = section.match(pattern);
return match ? match[1].trim() : null;
};

const rawSource = extract("Source Location");
const rawSink = extract("Sink Location");

let lineContent = extract("Line Content");
if (lineContent) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 This is a well-crafted regular expression for splitting the report into individual vulnerability sections. The use of a positive lookahead `(?=...)` to split by "Vulnerability:" while keeping the delimiter is a great approach. It also handles optional markdown headings, which makes the parsing more robust.

lineContent = lineContent.replace(/^```[a-z]*\n|```$/gm, '').trim();
}

findings.push({
vulnerability: extract("Vulnerability"),
vulnerabilityType: extract("Vulnerability Type"),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 The regular expression for extracting fields is quite complex. Adding a comment to explain the different parts of the regex would improve readability and make it easier for future maintainers to understand and modify the code. For example, a brief explanation of the lookahead used to delimit fields would be very helpful.

severity: extract("Severity"),
extension: {
sourceLocation: parseLocation(rawSource),
sinkLocation: parseLocation(rawSink),
dataType: extract("Data Type")
},
lineContent,
description: extract("Description"),
recommendation: extract("Recommendation")
});
}

return findings;
}
Loading