@hweawer (Collaborator) commented Jul 16, 2025

  1. Comprehensive Nginx Timeout Implementation
  • Robust timeout handling across all Kraken services
  • Service-specific timeout configurations tailored to each component's needs
  • Improved request reliability with appropriate timeout values
  • Better error handling for long-running operations
  2. Enhanced System Reliability
  • Prevents hanging requests with appropriate timeout boundaries
  • Better resource management through controlled request lifecycles
  • Improved user experience with predictable response times
    🛠 Technical Implementation
  • Service-wide timeout policies implemented across all nginx configurations
  • Consistent timeout patterns for different operation types
  • Graceful degradation when timeouts are reached
  • Proper error reporting for timeout scenarios
    🎁 Benefits
  • Improved system stability through proper timeout management
  • Better operational visibility with predictable request boundaries
  • Enhanced debugging capabilities for timeout-related issues
  • Cleaner codebase with unnecessary configuration files removed

@hweawer self-assigned this Jul 16, 2025
@hweawer force-pushed the improve-kraken-origin-logging branch from 4c3f228 to 58bfdd6 July 16, 2025 12:36
@hweawer force-pushed the nginx-timeouts-large-blobs branch from 3d729cb to 100f8d0 July 16, 2025 12:36
@hweawer force-pushed the improve-kraken-origin-logging branch from 58bfdd6 to 28f4b6f July 17, 2025 08:14
@hweawer force-pushed the nginx-timeouts-large-blobs branch from dc6a30d to 0441c6d July 17, 2025 08:15
@gkeesh7 (Collaborator) left a comment

Please address the review comments and explain in detail what each config change does.


agentserver:
# Timeout configurations (also used by nginx)
download_timeout: 5m # nginx proxy_read_timeout for downloads
Collaborator

Let's keep it to 15 minutes

Collaborator Author

Done

addr: /tmp/kraken-origin.sock

# Timeout configurations (also used by nginx)
download_timeout: 5m # nginx proxy_read_timeout for downloads
Collaborator

15 minutes here as well

Collaborator Author

Done

proxy_send_timeout {{.upload_timeout}};
proxy_read_timeout {{.download_timeout}};
# Keepalive settings
Collaborator

Can you describe what these keepalive settings do and why they are being added?

Collaborator Author

Added comments
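
The `{{.upload_timeout}}` and `{{.download_timeout}}` placeholders in the snippet under review look like Go template fields. A minimal sketch of how such values could be rendered with `text/template` follows. The function name `renderProxyTimeouts`, the template constant, and the `900s` values are illustrative assumptions, not code from this PR.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// proxyTimeoutsTmpl mirrors the two directives quoted in the review.
// The surrounding nginx config is omitted.
const proxyTimeoutsTmpl = `proxy_send_timeout {{.upload_timeout}};
proxy_read_timeout {{.download_timeout}};
`

// renderProxyTimeouts (hypothetical helper) fills the template from a map.
// missingkey=error makes a forgotten parameter fail loudly instead of
// silently rendering "<no value>" into the nginx config.
func renderProxyTimeouts(params map[string]string) (string, error) {
	t, err := template.New("timeouts").Option("missingkey=error").Parse(proxyTimeoutsTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, params); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Illustrative values only; 900s corresponds to 15 minutes.
	out, err := renderProxyTimeouts(map[string]string{
		"upload_timeout":   "900s",
		"download_timeout": "900s",
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Failing on a missing key is a design choice for this sketch: a timeout directive rendered with an empty or placeholder value would otherwise only surface when nginx rejects the config.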

proxy_set_header X-Forwarded-Proto $scheme;
}
# Special handling for upload operations with longer timeout
Collaborator

How will these additional settings help?

Collaborator Author

As I understand it, an upload goes ubuild -> kraken -> GCS, whereas replication happens between origin instances inside our network, so I thought it might make sense to give them separate timeouts. If you think there is not much value in splitting them, I can use a single timeout.

nginx/nginx.go Outdated
return fmt.Sprintf("%ds", seconds)
}
// Fallback to milliseconds for very short durations
return fmt.Sprintf("%dms", bufferedDuration.Milliseconds())
Collaborator

On line 256 we are already adding 30 seconds. Will we ever reach this code path?

Collaborator Author

Made it convert to seconds only

access_log {{.access_log_path}};
error_log {{.error_log_path}};
Collaborator

Same here as well; please describe in more detail.

Collaborator Author

Added comments

return addr
}

func FormatDurationForNginx(d time.Duration) string {
Collaborator

Please explain the purpose of this function.

Collaborator Author

Added documentation

@hweawer requested a review from gkeesh7 July 29, 2025 11:05
@gkeesh7 (Collaborator) commented Aug 21, 2025

@hweawer can you rebase the changes and run them against the updated CI?
