Skip to content

Conversation

@allsmog
Copy link

@allsmog allsmog commented Oct 22, 2025

Summary

Fixes silent failures in the Go extractor where OOM errors and file extraction failures would cause the entire extraction process to terminate without proper error logging.

Problem

The extractor was using log.Fatal() when file extraction failed, which caused:

  1. Silent termination - entire extraction process stops on first error
  2. Poor error visibility - OOM errors only visible in build-tracer logs, not extractor output
  3. No partial extraction - remaining files not processed even if only one fails

This was particularly problematic for:

  • Large codebases running into memory constraints
  • Projects with a few problematic files that shouldn't block extraction of other files
  • Debugging issues since errors weren't logged to extractor output

Solution

Minimal changes to improve error resilience:

  1. Replace log.Fatal() with log.Printf() - log errors but continue extraction
  2. Add panic recovery - catch OOM panics and other runtime errors in file extraction goroutines
  3. Move cleanup to defer block - ensure semaphore release and WaitGroup completion even on panic

Benefits

  • Partial extraction succeeds - continues processing remaining files after errors
  • Better error visibility - errors logged to extractor output, not just build-tracer
  • Graceful degradation - extract as much as possible instead of all-or-nothing
  • Minimal code change - only 9 lines added, 3 modified

Testing

Tested on large codebase (700+ packages) with memory constraints. Before fix: extraction would fail silently on OOM. After fix: extraction continues and logs specific files that failed.

Impact

  • Low risk - only changes error handling, doesn't affect extraction logic
  • High value - makes extractor more robust for large/problematic codebases
  • Backward compatible - successful extractions behave identically

Files Changed

  • go/extractor/extractor.go: Updated extractPackage() error handling (lines 694-707)

The extractor was calling log.Fatal() when file extraction failed, causing
the entire extraction process to terminate silently on the first error. This
was particularly problematic for OOM errors which would only appear in
build-tracer logs, not in the extractor output.

Changes:
- Replace log.Fatal() with log.Printf() to log errors without terminating
- Add panic recovery in file extraction goroutines to catch OOM errors
- Move cleanup (semaphore release, WaitGroup.Done) into defer block to
  ensure proper cleanup even when panics occur

This allows extraction to continue processing remaining files when individual
files fail, and ensures errors are visible in extractor logs.
@allsmog allsmog requested a review from a team as a code owner October 22, 2025 20:33
@github-actions github-actions bot added the Go label Oct 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant