I'm experiencing an issue where the RetryHandler in HttpUtil.cs catches an SSL certificate exception thrown by HttpClient, ignores it, and keeps retrying for 2 minutes until the timeout expires, even though the error is non-recoverable. This is true for any exception thrown by HttpClient: RetryHandler only breaks out of its loop when the cancellation tokens are canceled or the timeout expires; otherwise every exception is swallowed.
The exception happens because our company's traffic passes through a gateway that appends to the certificate chain, and HttpClient is unable to verify the CRL status for these certs (so it throws, as it should). Obviously this is not a Snowflake problem, but the only way I was able to figure it out was to clone SnowflakeConnector to my machine, add the project to our solution, and step through the debugger until we found it. This is less than ideal since we are paying to use Snowflake. Setting InsecureMode = true is also not an option, since we want to validate the CRL.
What we'd really like is to have the details of exceptions like this thrown to us so we can handle them, or at least be aware of what's going on. Storing them in a list and throwing them all as an AggregateException at the end would be better than throwing a timeout exception, since it's not really a timeout. We don't want to disable the retries, but we do need some control over the retry logic when the library cannot decide what to do; a rough sketch of what this could look like follows the retry code below.
Example code
One way to reproduce the endless HttpException is to add a ServerCertificateCustomValidationCallback to the HttpClientHandler that always returns false.
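A minimal, self-contained sketch of that repro is below. It is not connector code; the class name, client setup, and URL are illustrative only, and it assumes you can point a plain HttpClient at any HTTPS endpoint.

using System;
using System.Net.Http;
using System.Threading.Tasks;

class CertFailureRepro
{
    static async Task Main()
    {
        // Reject every server certificate, mimicking a failed CRL check.
        var handler = new HttpClientHandler
        {
            ServerCertificateCustomValidationCallback = (message, cert, chain, errors) => false
        };

        using var client = new HttpClient(handler);
        try
        {
            // The request fails with an HttpRequestException whose inner exception
            // reports the certificate validation failure. Inside RetryHandler this
            // would be swallowed and retried until the REST timeout expires.
            await client.GetAsync("https://myaccount.snowflakecomputing.com"); // hypothetical URL
        }
        catch (HttpRequestException ex)
        {
            Console.WriteLine(ex.InnerException?.Message ?? ex.Message);
        }
    }
}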
Below is the current retry code. The line of interest is the TODO comment in the catch block: //TODO: Should probably check to see if the error is recoverable or transient.
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage requestMessage,
    CancellationToken cancellationToken)
{
    HttpResponseMessage response = null;
    int backOffInSec = 1;
    int totalRetryTime = 0;
    int maxDefaultBackoff = 16;

    ServicePoint p = ServicePointManager.FindServicePoint(requestMessage.RequestUri);
    p.Expect100Continue = false; // Saves about 100 ms per request
    p.UseNagleAlgorithm = false; // Saves about 200 ms per request
    p.ConnectionLimit = 20;      // Default value is 2, we need more connections for performing multiple parallel queries

    TimeSpan httpTimeout = (TimeSpan)requestMessage.Properties[SFRestRequest.HTTP_REQUEST_TIMEOUT_KEY];
    TimeSpan restTimeout = (TimeSpan)requestMessage.Properties[SFRestRequest.REST_REQUEST_TIMEOUT_KEY];

    if (logger.IsDebugEnabled())
    {
        logger.Debug("Http request timeout : " + httpTimeout);
        logger.Debug("Rest request timeout : " + restTimeout);
    }

    CancellationTokenSource childCts = null;
    UriUpdater updater = new UriUpdater(requestMessage.RequestUri);

    while (true)
    {
        try
        {
            childCts = null;
            if (!httpTimeout.Equals(Timeout.InfiniteTimeSpan))
            {
                childCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
                childCts.CancelAfter(httpTimeout);
            }

            response = await base.SendAsync(requestMessage, childCts == null ?
                cancellationToken : childCts.Token).ConfigureAwait(false);
        }
        catch (Exception e)
        {
            if (cancellationToken.IsCancellationRequested)
            {
                logger.Debug("SF rest request timeout or explicit cancel called.");
                cancellationToken.ThrowIfCancellationRequested();
            }
            else if (childCts != null && childCts.Token.IsCancellationRequested)
            {
                logger.Warn("Http request timeout. Retry the request");
                totalRetryTime += (int)httpTimeout.TotalSeconds;
            }
            else
            {
                //TODO: Should probably check to see if the error is recoverable or transient.
                logger.Warn("Error occurred during request, retrying...", e);
            }
        }

        if (childCts != null)
        {
            childCts.Dispose();
        }

        if (response != null)
        {
            if (response.IsSuccessStatusCode)
            {
                logger.Debug($"Success Response: StatusCode: {(int)response.StatusCode}, ReasonPhrase: '{response.ReasonPhrase}'");
                return response;
            }
            else
            {
                logger.Debug($"Failed Response: StatusCode: {(int)response.StatusCode}, ReasonPhrase: '{response.ReasonPhrase}'");
                bool isRetryable = isRetryableHTTPCode((int)response.StatusCode);
                if (!isRetryable)
                {
                    // No need to keep retrying, stop here
                    return response;
                }
            }
        }
        else
        {
            logger.Info("Response returned was null.");
        }

        // Disposing of the response if not null now that we don't need it anymore
        response?.Dispose();

        requestMessage.RequestUri = updater.Update();

        logger.Debug($"Sleep {backOffInSec} seconds and then retry the request");
        await Task.Delay(TimeSpan.FromSeconds(backOffInSec), cancellationToken).ConfigureAwait(false);
        totalRetryTime += backOffInSec;

        // Set next backoff time
        backOffInSec = backOffInSec >= maxDefaultBackoff ?
            maxDefaultBackoff : backOffInSec * 2;

        if ((restTimeout.TotalSeconds > 0) && (totalRetryTime + backOffInSec > restTimeout.TotalSeconds))
        {
            // No need to wait more than necessary if it can be avoided.
            // If the rest timeout will be reached before the next back-off,
            // use a smaller one to give the Rest request a chance to timeout early
            backOffInSec = Math.Max(1, (int)restTimeout.TotalSeconds - totalRetryTime - 1);
        }
    }
}
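For illustration only, here is a rough standalone sketch of the behavior we're asking for: collect the exceptions that would otherwise be swallowed, fail fast on ones that are clearly non-recoverable, and throw an AggregateException when retries are exhausted. None of these names (SendWithRetriesAsync, IsNonRecoverable, the request factory) exist in the connector; this is a hedged sketch of the request, not a proposed patch, and it assumes the certificate failure surfaces as an HttpRequestException wrapping an AuthenticationException, which is typical on modern .NET.

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Security.Authentication;
using System.Threading;
using System.Threading.Tasks;

static class RetryWithAggregateSketch
{
    // Assumption: SSL/authentication failures will never succeed on retry;
    // everything else is treated as transient for this sketch.
    static bool IsNonRecoverable(Exception e) =>
        e is HttpRequestException hre && hre.InnerException is AuthenticationException;

    public static async Task<HttpResponseMessage> SendWithRetriesAsync(
        HttpClient client,
        Func<HttpRequestMessage> requestFactory, // fresh message per attempt; HttpClient refuses to resend the same HttpRequestMessage
        TimeSpan restTimeout,
        CancellationToken cancellationToken)
    {
        var swallowed = new List<Exception>();
        var backOff = TimeSpan.FromSeconds(1);
        var deadline = DateTime.UtcNow + restTimeout;

        while (true)
        {
            try
            {
                return await client.SendAsync(requestFactory(), cancellationToken).ConfigureAwait(false);
            }
            catch (Exception e) when (!cancellationToken.IsCancellationRequested)
            {
                swallowed.Add(e);
                // Fail fast instead of silently retrying an error that cannot recover.
                if (IsNonRecoverable(e))
                {
                    throw new AggregateException("Non-recoverable error during request.", swallowed);
                }
            }

            if (DateTime.UtcNow + backOff > deadline)
            {
                // Surface everything that happened instead of a bare timeout.
                throw new AggregateException("Retries exhausted before the REST timeout.", swallowed);
            }

            await Task.Delay(backOff, cancellationToken).ConfigureAwait(false);
            backOff = TimeSpan.FromSeconds(Math.Min(backOff.TotalSeconds * 2, 16)); // same 16 s cap as the connector's backoff
        }
    }
}

The caller could then inspect the AggregateException's InnerExceptions and decide how to react, which is the level of control we're asking for without disabling the retries entirely.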
Hello, and thank you for submitting this issue, and also for the detailed description and for aggregating the pain points from the other tickets here; really appreciated! Also, sorry for the long period without any response. RetryHandler definitely needs a review and we'll take a look.
sfc-gh-dprzybysz changed the title to "SNOW-835593: RetryHandler needs better handling of non-recoverable exceptions, or option to disable/provide custom exception handler" on Dec 16, 2024.
Related Issues
#213
#323