CLI-627 Analyze change sets as multi-file requests with chunking#446
CLI-627 Analyze change sets as multi-file requests with chunking#446nquinquenel wants to merge 1 commit into
Conversation
✅ Deploy Preview for sonarqube-cli canceled.
|
| for (const item of items) { | ||
| const candidate = [...current, item]; | ||
| const bytes = estimateRequestBytes({ | ||
| organizationKey: '', | ||
| projectKey: '', | ||
| files: candidate.map(toAnalysisFile), | ||
| }); | ||
|
|
||
| if (current.length > 0 && (bytes > maxRequestBytes || candidate.length > maxFilesPerRequest)) { | ||
| flush(); | ||
| current = [item]; | ||
| continue; | ||
| } | ||
|
|
||
| current = candidate; |
There was a problem hiding this comment.
⚠️ Performance: packFilesIntoChunks re-serializes whole candidate per file (O(n²))
For every file added to the current chunk, packFilesIntoChunks builds candidate = [...current, item] and calls estimateRequestBytes, which runs JSON.stringify over the entire accumulated candidate array (and Buffer.byteLength over the whole serialized string). This makes packing O(n²) in the number of files per chunk and, worse, O(n²) in total content bytes — each of up to SQAA_MAX_FILES_PER_REQUEST (200) files triggers a full re-serialization of all previously packed file contents. For change sets with many sizeable files this repeatedly stringifies multi-megabyte payloads on the hot path of every multi-file analysis.
Track the byte size incrementally instead: compute a per-file byte cost once and a fixed envelope overhead, accumulating a running total rather than re-stringifying the growing array each iteration.
Accumulate byte size per file instead of re-serializing the entire candidate array each iteration.:
// Precompute per-file byte cost once, accumulate incrementally.
function fileBytes(file: SqaaChunkFile): number {
return Buffer.byteLength(JSON.stringify(toAnalysisFile(file)), 'utf8');
}
// In packFilesIntoChunks:
let currentBytes = ENVELOPE_BYTES; // fixed overhead of the request wrapper
for (const item of items) {
const itemBytes = fileBytes(item) + 1; // +1 for array comma separator
if (current.length > 0 && (currentBytes + itemBytes > maxRequestBytes || current.length + 1 > maxFilesPerRequest)) {
flush();
currentBytes = ENVELOPE_BYTES;
}
current.push(item);
currentBytes += itemBytes;
}
- Apply fix
Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎
| export function distributeChunkResponse( | ||
| issues: SqaaIssue[], | ||
| errors: Array<{ code: string; message: string }> | null | undefined, | ||
| chunkPaths: ChunkPath[], | ||
| ): FileSuccess[] { | ||
| const issuesByPath = new Map<string, SqaaIssue[]>(); | ||
| for (const issue of issues) { | ||
| const path = issue.filePath ?? chunkPaths[0]?.filePath; | ||
| if (!path) continue; | ||
| const list = issuesByPath.get(path) ?? []; | ||
| list.push(issue); | ||
| issuesByPath.set(path, list); | ||
| } | ||
|
|
||
| return chunkPaths.map(({ file, filePath }, index) => ({ |
There was a problem hiding this comment.
💡 Edge Case: distributeChunkResponse silently drops issues with unmatched filePath
distributeChunkResponse groups issues by issue.filePath ?? chunkPaths[0]?.filePath, then rebuilds results by reading issuesByPath.get(filePath) for each chunk path. If the server returns an issue whose filePath does not exactly match any of the relative paths sent for the chunk (e.g. a different path normalization, a leading ./, or an absolute path), that issue is stored in the map under an unmatched key and is never read back — it is silently dropped and never counted in the tally. This would under-report issues to the user with no error or warning. The fake test server echoes the exact sent path so tests pass, but a real backend may not.
Consider attributing issues with an unrecognized filePath to the first file of the chunk (as already done for issues with no filePath) so they are never lost, or at minimum logging a debug warning when an issue cannot be matched.
Fall back to the first chunk file when an issue's filePath doesn't match any sent path, so issues are never dropped.:
const knownPaths = new Set(chunkPaths.map((p) => p.filePath));
for (const issue of issues) {
const path =
issue.filePath && knownPaths.has(issue.filePath)
? issue.filePath
: chunkPaths[0]?.filePath;
if (!path) continue;
const list = issuesByPath.get(path) ?? [];
list.push(issue);
issuesByPath.set(path, list);
}
- Apply fix
Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎
| function prepareChunks(ctx: RunContext): { chunks: SqaaChunk[]; chunkFileIndices: number[][] } { | ||
| const chunkFiles: SqaaChunkFile[] = ctx.files.map((file, idx) => ({ | ||
| absolutePath: file, | ||
| relativePath: ctx.allPaths[idx], | ||
| content: readSqaaFileContent(file), | ||
| })); | ||
| const chunks = packFilesIntoChunks(chunkFiles); | ||
| const chunkFileIndices = chunks.map((chunk) => | ||
| chunk.files.map((f) => ctx.files.indexOf(f.absolutePath)), | ||
| ); | ||
| return { chunks, chunkFileIndices }; |
There was a problem hiding this comment.
💡 Performance: Repeated ctx.files.indexOf inside loops is O(n²)
prepareChunks builds chunkFileIndices via chunk.files.map((f) => ctx.files.indexOf(f.absolutePath)), and processChunk again calls ctx.files.indexOf(chunkFile.absolutePath) for every file in every part and every failed file. Each indexOf is a linear scan, so across a full change set this is O(n²) in the number of files. For large change sets this duplicates the per-file cost. Build a single Map<absolutePath, index> once and reuse it everywhere indices are resolved. (Also note: if indexOf ever returns -1, ctx.allPaths[-1] is undefined and progress.update(-1, ...) writes to a negative index — a Map lookup with an explicit guard would make this safer.)
Precompute an absolutePath→index Map and reuse it instead of repeated linear indexOf scans.:
// Build once, e.g. in RunContext setup or prepareChunks:
const fileIndexByAbsolutePath = new Map(ctx.files.map((f, i) => [f, i]));
// Then replace ctx.files.indexOf(absPath) with:
const idx = fileIndexByAbsolutePath.get(absPath);
if (idx === undefined) continue; // guard against unmatched paths
- Apply fix
Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎
Code Review
|
| Auto-apply | Compact | Unblock |
|
|
|
Was this helpful? React with 👍 / 👎 | Gitar
|




Summary by Gitar
SQAAanalysis.SQAA_MAX_FILE_BYTESandSQAA_MAX_FILES_PER_REQUESTlimits.fetchChunkWith413Splitto handle API413 Request Payload Too Largeerrors by automatically splitting chunks.printSqaaTextReport, including path collapsing for clean files (SQAA_COLLAPSE_CLEAN_THRESHOLD).SonarQubeClientto handle413errors and implementedDEEPanalysis depth for allSQAArequests.format-sqaa-hook-context.tsto use shared reporting logic for IDE-style issue display and summary footers.This will update automatically on new commits.