Skip to content

Conversation

ienugr
Copy link
Contributor

@ienugr ienugr commented Aug 7, 2025

Summary

This PR fixes issue #603 where Squawk fails to upload large linting reports to GitHub with a 422 "Unprocessable Entity" error.

Problem

When performing linting on large SQL migrations (14,000+ lines across multiple files), Squawk generates GitHub comments that exceed the API's 65,536 character limit, causing upload failures.

Solution

1. Size Limit Enforcement

  • Added GITHUB_COMMENT_MAX_SIZE constant (65,000 chars) with safety margin
  • Pre-check comment size before API calls to provide better error messages

2. Smart Comment Truncation

  • Limit SQL preview to 50 lines per file with truncation notice
  • Preserve all violation information while reducing content size

3. Summary Mode Fallback

  • For very large reports, switch to summary mode that omits SQL content
  • Display file statistics, line counts, and violation details
  • Clear notice about content omission

4. Enhanced Error Handling

  • New CommentTooLarge error variant with descriptive messages
  • Better user feedback when size limits are exceeded

Changes Made

  • crates/squawk/src/github.rs: Main comment generation logic with size checking and fallback modes
  • crates/squawk_github/src/app.rs: Enhanced error handling for comment size limits
  • crates/squawk_github/src/lib.rs: New error variant for size-related failures

Testing

Successful Tests

Normal comment generation (6 violations)

image

Reference: #608 (comment)

SQL truncation (33 violations)

image

Reference: #608 (comment)

Multiple files handling

image

Reference: #608 (comment)

Error handling for oversized content

Error: Upload to GitHub failed

Caused by:
    Comment size error: Comment body is too large (1165116 characters). GitHub API limit is 65,536 characters.

Large File Tests:

  • File: 60,000+ lines
  • Violations: 30,003 issues
  • Comment size: 1,165,116 characters (18x over limit)
  • Error caught: 'Comment body is too large'

Analysis:

  • The fix is working for normal use cases
  • Error handling prevents GitHub API failures
  • Summary mode logic may need adjustment for extreme cases

All existing tests continue to pass.

Fixes #603

Copy link

netlify bot commented Aug 7, 2025

👷 Deploy request for squawkhq pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 16ab036

Copy link
Owner

@sbdchd sbdchd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Left some minor comments!

ienugr added 2 commits August 21, 2025 15:08
- Add size checking to prevent 422 errors when comment exceeds GitHub's 65,536 character limit
- Implement smart comment truncation with SQL preview limited to 50 lines
- Add fallback to summary mode for very large reports that omits SQL content but preserves all violation information
- Include clear notices when content is truncated or summarized
- Add comprehensive tests for comment size handling

Fixes sbdchd#603
…rings

- Extract shared comment formatting logic into format_comment() function
- Remove duplicate markdown template from get_comment_body() and get_summary_comment_body()
- Improve code maintainability by centralizing template structure
- Update test snapshots to reflect cleaner output (removed extra blank line)
- Fix truncate_sql_if_needed() function signature to match actual usage
@ienugr ienugr force-pushed the fix-large-github-comment-upload branch from 757daf2 to 81be829 Compare August 21, 2025 08:10
@ienugr
Copy link
Contributor Author

ienugr commented Aug 26, 2025

Hi @sbdchd, thanks for the feedback, I have updated the PR, kindly review it again. Thanks 🙏

Copy link
Owner

@sbdchd sbdchd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@sbdchd sbdchd added the automerge automerge with kodiak label Sep 2, 2025
@kodiakhq kodiakhq bot removed the automerge automerge with kodiak label Sep 2, 2025
Copy link

kodiakhq bot commented Sep 2, 2025

This PR currently has a merge conflict. Please resolve this and then re-add the automerge label.

@sbdchd sbdchd added the automerge automerge with kodiak label Sep 2, 2025
@kodiakhq kodiakhq bot merged commit b95e6f0 into sbdchd:master Sep 2, 2025
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge automerge with kodiak
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Squawk report fails to upload to GitHub for large SQL linting jobs
2 participants