CLI-627 Analyze change sets as multi-file requests with chunking by nquinquenel · Pull Request #446 · SonarSource/sonarqube-cli

nquinquenel · 2026-06-12T13:41:44Z

Summary by Gitar

Core Analysis Engine:
- Switched from concurrent worker-pool to a sequential chunked execution engine for SQAA analysis.
- Added chunking logic to aggregate change-set files based on SQAA_MAX_FILE_BYTES and SQAA_MAX_FILES_PER_REQUEST limits.
- Implemented fetchChunkWith413Split to handle API 413 Request Payload Too Large errors by automatically splitting chunks.
Progress & Reporting:
- Replaced file-by-file progress with an animated live status line for better UX during chunked requests.
- Consolidated text reports with printSqaaTextReport, including path collapsing for clean files (SQAA_COLLAPSE_CLEAN_THRESHOLD).
API Integration:
- Updated SonarQubeClient to handle 413 errors and implemented DEEP analysis depth for all SQAA requests.
Hooks & Context:
- Refactored format-sqaa-hook-context.ts to use shared reporting logic for IDE-style issue display and summary footers.

_{This will update automatically on new commits.}

netlify · 2026-06-12T13:41:49Z

✅ Deploy Preview for sonarqube-cli canceled.

Name	Link
🔨 Latest commit	`80aedde`
🔍 Latest deploy log	https://app.netlify.com/projects/sonarqube-cli/deploys/6a2c0c9b2e10080008da4dcf

hashicorp-vault-sonar-prod · 2026-06-12T13:42:55Z

CLI-627

gitar-bot · 2026-06-12T13:44:37Z

+  for (const item of items) {
+    const candidate = [...current, item];
+    const bytes = estimateRequestBytes({
+      organizationKey: '',
+      projectKey: '',
+      files: candidate.map(toAnalysisFile),
+    });
+
+    if (current.length > 0 && (bytes > maxRequestBytes || candidate.length > maxFilesPerRequest)) {
+      flush();
+      current = [item];
+      continue;
+    }
+
+    current = candidate;


⚠️ Performance: packFilesIntoChunks re-serializes whole candidate per file (O(n²))

For every file added to the current chunk, packFilesIntoChunks builds candidate = [...current, item] and calls estimateRequestBytes, which runs JSON.stringify over the entire accumulated candidate array (and Buffer.byteLength over the whole serialized string). This makes packing O(n²) in the number of files per chunk and, worse, O(n²) in total content bytes — each of up to SQAA_MAX_FILES_PER_REQUEST (200) files triggers a full re-serialization of all previously packed file contents. For change sets with many sizeable files this repeatedly stringifies multi-megabyte payloads on the hot path of every multi-file analysis.

Track the byte size incrementally instead: compute a per-file byte cost once and a fixed envelope overhead, accumulating a running total rather than re-stringifying the growing array each iteration.

Accumulate byte size per file instead of re-serializing the entire candidate array each iteration.:

// Precompute per-file byte cost once, accumulate incrementally. function fileBytes(file: SqaaChunkFile): number { return Buffer.byteLength(JSON.stringify(toAnalysisFile(file)), 'utf8'); } // In packFilesIntoChunks: let currentBytes = ENVELOPE_BYTES; // fixed overhead of the request wrapper for (const item of items) { const itemBytes = fileBytes(item) + 1; // +1 for array comma separator if (current.length > 0 && (currentBytes + itemBytes > maxRequestBytes || current.length + 1 > maxFilesPerRequest)) { flush(); currentBytes = ENVELOPE_BYTES; } current.push(item); currentBytes += itemBytes; }

Apply fix

_{Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎}

gitar-bot · 2026-06-12T13:44:38Z

+export function distributeChunkResponse(
+  issues: SqaaIssue[],
+  errors: Array<{ code: string; message: string }> | null | undefined,
+  chunkPaths: ChunkPath[],
+): FileSuccess[] {
+  const issuesByPath = new Map<string, SqaaIssue[]>();
+  for (const issue of issues) {
+    const path = issue.filePath ?? chunkPaths[0]?.filePath;
+    if (!path) continue;
+    const list = issuesByPath.get(path) ?? [];
+    list.push(issue);
+    issuesByPath.set(path, list);
+  }
+
+  return chunkPaths.map(({ file, filePath }, index) => ({


💡 Edge Case: distributeChunkResponse silently drops issues with unmatched filePath

distributeChunkResponse groups issues by issue.filePath ?? chunkPaths[0]?.filePath, then rebuilds results by reading issuesByPath.get(filePath) for each chunk path. If the server returns an issue whose filePath does not exactly match any of the relative paths sent for the chunk (e.g. a different path normalization, a leading ./, or an absolute path), that issue is stored in the map under an unmatched key and is never read back — it is silently dropped and never counted in the tally. This would under-report issues to the user with no error or warning. The fake test server echoes the exact sent path so tests pass, but a real backend may not.

Consider attributing issues with an unrecognized filePath to the first file of the chunk (as already done for issues with no filePath) so they are never lost, or at minimum logging a debug warning when an issue cannot be matched.

Fall back to the first chunk file when an issue's filePath doesn't match any sent path, so issues are never dropped.:

const knownPaths = new Set(chunkPaths.map((p) => p.filePath)); for (const issue of issues) { const path = issue.filePath && knownPaths.has(issue.filePath) ? issue.filePath : chunkPaths[0]?.filePath; if (!path) continue; const list = issuesByPath.get(path) ?? []; list.push(issue); issuesByPath.set(path, list); }

Apply fix

_{Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎}

gitar-bot · 2026-06-12T13:44:39Z

+function prepareChunks(ctx: RunContext): { chunks: SqaaChunk[]; chunkFileIndices: number[][] } {
+  const chunkFiles: SqaaChunkFile[] = ctx.files.map((file, idx) => ({
+    absolutePath: file,
+    relativePath: ctx.allPaths[idx],
+    content: readSqaaFileContent(file),
+  }));
+  const chunks = packFilesIntoChunks(chunkFiles);
+  const chunkFileIndices = chunks.map((chunk) =>
+    chunk.files.map((f) => ctx.files.indexOf(f.absolutePath)),
  );
+  return { chunks, chunkFileIndices };


💡 Performance: Repeated ctx.files.indexOf inside loops is O(n²)

prepareChunks builds chunkFileIndices via chunk.files.map((f) => ctx.files.indexOf(f.absolutePath)), and processChunk again calls ctx.files.indexOf(chunkFile.absolutePath) for every file in every part and every failed file. Each indexOf is a linear scan, so across a full change set this is O(n²) in the number of files. For large change sets this duplicates the per-file cost. Build a single Map<absolutePath, index> once and reuse it everywhere indices are resolved. (Also note: if indexOf ever returns -1, ctx.allPaths[-1] is undefined and progress.update(-1, ...) writes to a negative index — a Map lookup with an explicit guard would make this safer.)

Precompute an absolutePath→index Map and reuse it instead of repeated linear indexOf scans.:

// Build once, e.g. in RunContext setup or prepareChunks: const fileIndexByAbsolutePath = new Map(ctx.files.map((f, i) => [f, i])); // Then replace ctx.files.indexOf(absPath) with: const idx = fileIndexByAbsolutePath.get(absPath); if (idx === undefined) continue; // guard against unmatched paths

Apply fix

_{Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎}

gitar-bot · 2026-06-12T13:44:41Z

Code Review ⚠️ Changes requested 0 resolved / 3 findings

Implements chunked multi-file request processing for SQAA analysis, but introduces O(n²) performance bottlenecks in file serialization and indexing while silently dropping unmatched issues.

⚠️

Performance: packFilesIntoChunks re-serializes whole candidate per file (O(n²))

📄 src/cli/commands/analyze/sqaa-chunking.ts:94-108 📄 src/cli/commands/analyze/sqaa-chunking.ts:65-67

For every file added to the current chunk, packFilesIntoChunks builds candidate = [...current, item] and calls estimateRequestBytes, which runs JSON.stringify over the entire accumulated candidate array (and Buffer.byteLength over the whole serialized string). This makes packing O(n²) in the number of files per chunk and, worse, O(n²) in total content bytes — each of up to SQAA_MAX_FILES_PER_REQUEST (200) files triggers a full re-serialization of all previously packed file contents. For change sets with many sizeable files this repeatedly stringifies multi-megabyte payloads on the hot path of every multi-file analysis.

Track the byte size incrementally instead: compute a per-file byte cost once and a fixed envelope overhead, accumulating a running total rather than re-stringifying the growing array each iteration.

Accumulate byte size per file instead of re-serializing the entire candidate array each iteration.

// Precompute per-file byte cost once, accumulate incrementally.
function fileBytes(file: SqaaChunkFile): number {
  return Buffer.byteLength(JSON.stringify(toAnalysisFile(file)), 'utf8');
}

// In packFilesIntoChunks:
let currentBytes = ENVELOPE_BYTES; // fixed overhead of the request wrapper
for (const item of items) {
  const itemBytes = fileBytes(item) + 1; // +1 for array comma separator
  if (current.length > 0 && (currentBytes + itemBytes > maxRequestBytes || current.length + 1 > maxFilesPerRequest)) {
    flush();
    currentBytes = ENVELOPE_BYTES;
  }
  current.push(item);
  currentBytes += itemBytes;
}

💡 Edge Case: distributeChunkResponse silently drops issues with unmatched filePath

📄 src/cli/commands/analyze/sqaa-analysis.ts:259-273

distributeChunkResponse groups issues by issue.filePath ?? chunkPaths[0]?.filePath, then rebuilds results by reading issuesByPath.get(filePath) for each chunk path. If the server returns an issue whose filePath does not exactly match any of the relative paths sent for the chunk (e.g. a different path normalization, a leading ./, or an absolute path), that issue is stored in the map under an unmatched key and is never read back — it is silently dropped and never counted in the tally. This would under-report issues to the user with no error or warning. The fake test server echoes the exact sent path so tests pass, but a real backend may not.

Consider attributing issues with an unrecognized filePath to the first file of the chunk (as already done for issues with no filePath) so they are never lost, or at minimum logging a debug warning when an issue cannot be matched.

Fall back to the first chunk file when an issue's filePath doesn't match any sent path, so issues are never dropped.

const knownPaths = new Set(chunkPaths.map((p) => p.filePath));
for (const issue of issues) {
  const path =
    issue.filePath && knownPaths.has(issue.filePath)
      ? issue.filePath
      : chunkPaths[0]?.filePath;
  if (!path) continue;
  const list = issuesByPath.get(path) ?? [];
  list.push(issue);
  issuesByPath.set(path, list);
}

💡 Performance: Repeated ctx.files.indexOf inside loops is O(n²)

📄 src/cli/commands/analyze/sqaa-analysis.ts:101-111 📄 src/cli/commands/analyze/sqaa-analysis.ts:151-165

prepareChunks builds chunkFileIndices via chunk.files.map((f) => ctx.files.indexOf(f.absolutePath)), and processChunk again calls ctx.files.indexOf(chunkFile.absolutePath) for every file in every part and every failed file. Each indexOf is a linear scan, so across a full change set this is O(n²) in the number of files. For large change sets this duplicates the per-file cost. Build a single Map<absolutePath, index> once and reuse it everywhere indices are resolved. (Also note: if indexOf ever returns -1, ctx.allPaths[-1] is undefined and progress.update(-1, ...) writes to a negative index — a Map lookup with an explicit guard would make this safer.)

Precompute an absolutePath→index Map and reuse it instead of repeated linear indexOf scans.

// Build once, e.g. in RunContext setup or prepareChunks:
const fileIndexByAbsolutePath = new Map(ctx.files.map((f, i) => [f, i]));
// Then replace ctx.files.indexOf(absPath) with:
const idx = fileIndexByAbsolutePath.get(absPath);
if (idx === undefined) continue; // guard against unmatched paths

🤖 Prompt for agents

Code Review: Implements chunked multi-file request processing for SQAA analysis, but introduces O(n²) performance bottlenecks in file serialization and indexing while silently dropping unmatched issues.

1. ⚠️ Performance: packFilesIntoChunks re-serializes whole candidate per file (O(n²))
   Files: src/cli/commands/analyze/sqaa-chunking.ts:94-108, src/cli/commands/analyze/sqaa-chunking.ts:65-67

   For every file added to the current chunk, `packFilesIntoChunks` builds `candidate = [...current, item]` and calls `estimateRequestBytes`, which runs `JSON.stringify` over the *entire* accumulated candidate array (and `Buffer.byteLength` over the whole serialized string). This makes packing O(n²) in the number of files per chunk and, worse, O(n²) in total content bytes — each of up to `SQAA_MAX_FILES_PER_REQUEST` (200) files triggers a full re-serialization of all previously packed file contents. For change sets with many sizeable files this repeatedly stringifies multi-megabyte payloads on the hot path of every multi-file analysis.
   
   Track the byte size incrementally instead: compute a per-file byte cost once and a fixed envelope overhead, accumulating a running total rather than re-stringifying the growing array each iteration.

   Fix (Accumulate byte size per file instead of re-serializing the entire candidate array each iteration.):
   // Precompute per-file byte cost once, accumulate incrementally.
   function fileBytes(file: SqaaChunkFile): number {
     return Buffer.byteLength(JSON.stringify(toAnalysisFile(file)), 'utf8');
   }
   
   // In packFilesIntoChunks:
   let currentBytes = ENVELOPE_BYTES; // fixed overhead of the request wrapper
   for (const item of items) {
     const itemBytes = fileBytes(item) + 1; // +1 for array comma separator
     if (current.length > 0 && (currentBytes + itemBytes > maxRequestBytes || current.length + 1 > maxFilesPerRequest)) {
       flush();
       currentBytes = ENVELOPE_BYTES;
     }
     current.push(item);
     currentBytes += itemBytes;
   }

2. 💡 Edge Case: distributeChunkResponse silently drops issues with unmatched filePath
   Files: src/cli/commands/analyze/sqaa-analysis.ts:259-273

   `distributeChunkResponse` groups issues by `issue.filePath ?? chunkPaths[0]?.filePath`, then rebuilds results by reading `issuesByPath.get(filePath)` for each chunk path. If the server returns an issue whose `filePath` does not exactly match any of the relative paths sent for the chunk (e.g. a different path normalization, a leading `./`, or an absolute path), that issue is stored in the map under an unmatched key and is never read back — it is silently dropped and never counted in the tally. This would under-report issues to the user with no error or warning. The fake test server echoes the exact sent path so tests pass, but a real backend may not.
   
   Consider attributing issues with an unrecognized `filePath` to the first file of the chunk (as already done for issues with no `filePath`) so they are never lost, or at minimum logging a debug warning when an issue cannot be matched.

   Fix (Fall back to the first chunk file when an issue's filePath doesn't match any sent path, so issues are never dropped.):
   const knownPaths = new Set(chunkPaths.map((p) => p.filePath));
   for (const issue of issues) {
     const path =
       issue.filePath && knownPaths.has(issue.filePath)
         ? issue.filePath
         : chunkPaths[0]?.filePath;
     if (!path) continue;
     const list = issuesByPath.get(path) ?? [];
     list.push(issue);
     issuesByPath.set(path, list);
   }

3. 💡 Performance: Repeated ctx.files.indexOf inside loops is O(n²)
   Files: src/cli/commands/analyze/sqaa-analysis.ts:101-111, src/cli/commands/analyze/sqaa-analysis.ts:151-165

   `prepareChunks` builds `chunkFileIndices` via `chunk.files.map((f) => ctx.files.indexOf(f.absolutePath))`, and `processChunk` again calls `ctx.files.indexOf(chunkFile.absolutePath)` for every file in every part and every failed file. Each `indexOf` is a linear scan, so across a full change set this is O(n²) in the number of files. For large change sets this duplicates the per-file cost. Build a single `Map<absolutePath, index>` once and reuse it everywhere indices are resolved. (Also note: if `indexOf` ever returns -1, `ctx.allPaths[-1]` is `undefined` and `progress.update(-1, ...)` writes to a negative index — a Map lookup with an explicit guard would make this safer.)

   Fix (Precompute an absolutePath→index Map and reuse it instead of repeated linear indexOf scans.):
   // Build once, e.g. in RunContext setup or prepareChunks:
   const fileIndexByAbsolutePath = new Map(ctx.files.map((f, i) => [f, i]));
   // Then replace ctx.files.indexOf(absPath) with:
   const idx = fileIndexByAbsolutePath.get(absPath);
   if (idx === undefined) continue; // guard against unmatched paths

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.
Unblock → Override a blocking verdict and allow merging.

Comment with these commands to change:

`Auto-apply`	`Compact`	`Unblock`
`gitar auto-apply:on`	`gitar display:verbose`	`gitar unblock`

_{Was this helpful? React with 👍 / 👎 | Gitar}

sonarqubecloud · 2026-06-12T13:54:42Z

Quality Gate failed

Failed conditions
5 New issues

🛠️ Remediation Agent ready

Fix automatically
Creates a separate PR with fixes for eligible issues

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

CLI-627 Analyze change sets as multi-file requests with chunking

80aedde

nquinquenel temporarily deployed to sca-checking June 12, 2026 13:41 — with GitHub Actions Inactive

gitar-bot Bot reviewed Jun 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLI-627 Analyze change sets as multi-file requests with chunking#446

CLI-627 Analyze change sets as multi-file requests with chunking#446
nquinquenel wants to merge 1 commit into
masterfrom
feature/nq/CLI-627-sqaa-multi-file-ux

nquinquenel commented Jun 12, 2026 •

edited by gitar-bot Bot

Loading

Uh oh!

netlify Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

hashicorp-vault-sonar-prod Bot commented Jun 12, 2026 •

edited by atlassian Bot

Loading

Uh oh!

gitar-bot Bot Jun 12, 2026

Uh oh!

gitar-bot Bot Jun 12, 2026

Uh oh!

gitar-bot Bot Jun 12, 2026

Uh oh!

gitar-bot Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

sonarqubecloud Bot commented Jun 12, 2026 •

edited by SonarQube Agent AI

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nquinquenel commented Jun 12, 2026 • edited by gitar-bot Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Gitar

Uh oh!

netlify Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for sonarqube-cli canceled.

Uh oh!

hashicorp-vault-sonar-prod Bot commented Jun 12, 2026 • edited by atlassian Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gitar-bot Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

gitar-bot Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sonarqubecloud Bot commented Jun 12, 2026 • edited by SonarQube Agent AI Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Quality Gate failed

🛠️ Remediation Agent ready

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

nquinquenel commented Jun 12, 2026 •

edited by gitar-bot Bot

Loading

netlify Bot commented Jun 12, 2026 •

edited

Loading

hashicorp-vault-sonar-prod Bot commented Jun 12, 2026 •

edited by atlassian Bot

Loading

gitar-bot Bot commented Jun 12, 2026 •

edited

Loading

sonarqubecloud Bot commented Jun 12, 2026 •

edited by SonarQube Agent AI

Loading