Retry feature added for exceptions while listening for client connections #137

Open · wants to merge 3 commits into master

Conversation

@aliulug commented Apr 7, 2015

When there is an exception while accepting connections, WebSocketServer immediately stops listening for new client connections. A retry feature has been added.

An example exception stack trace from a production environment follows:

```
Listener socket is closed
Fleck Exception: Message: One or more errors occurred.
Stack Trace:
Inner Exception Message: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
Inner Exception Stack Trace:
   at System.Net.Sockets.NetworkStream.BeginRead(Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)
   at System.Net.FixedSizeReader.StartReading()
   at System.Net.Security.SslState.StartReceiveBlob(Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ForceAuthentication(Boolean receiveFirst, Byte[] buffer, AsyncProtocolRequest asyncRequest)
   at System.Net.Security.SslState.ProcessAuthentication(LazyAsyncResult lazyResult)
   at System.Net.Security.SslStream.BeginAuthenticateAsServer(X509Certificate serverCertificate, Boolean clientCertificateRequired, SslProtocols enabledSslProtocols, Boolean checkCertificateRevocation, AsyncCallback asyncCallback, Object asyncState)
   at Fleck.SocketWrapper.<>c__DisplayClass4.b__0(AsyncCallback cb, Object s)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncImpl(Func`3 beginMethod, Func`2 endFunction, Action`1 endAction, Object state, TaskCreationOptions creationOptions)
   at System.Threading.Tasks.TaskFactory.FromAsync(Func`3 beginMethod, Action`1 endMethod, Object state, TaskCreationOptions creationOptions)
   at System.Threading.Tasks.TaskFactory.FromAsync(Func`3 beginMethod, Action`1 endMethod, Object state)
   at Fleck.SocketWrapper.Authenticate(X509Certificate2 certificate, Action callback, Action`1 error)
   at Fleck.WebSocketServer.OnClientConnect(ISocket clientSocket)
   at Fleck.SocketWrapper.<>c__DisplayClass14.<Accept>b__12(Task`1 t)
   at System.Threading.Tasks.Task.Execute()
```
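For context, the change amounts to retrying the accept instead of letting one faulted accept end the listen loop. A minimal standalone sketch of the idea (not Fleck's actual code; Fleck's accept path is callback-based, and the port and retry cap here are arbitrary):

```csharp
using System;
using System.Net;
using System.Net.Sockets;

class RetryingAcceptLoop
{
    const int MaxRetries = 1000; // arbitrary cap, mirroring the PR's retry limit

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Any, 8181); // port is arbitrary
        listener.Start();
        var retries = 0;

        while (true)
        {
            TcpClient client;
            try
            {
                client = listener.AcceptTcpClient();
            }
            catch (SocketException e)
            {
                // Before this change, a single faulted accept ended the loop.
                if (++retries > MaxRetries) throw;
                Console.Error.WriteLine($"Accept failed ({e.SocketErrorCode}); retry {retries}");
                continue;
            }
            // hand the accepted socket off to the WebSocket handshake here
            client.Close();
        }
    }
}
```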

@statianzo (Owner) commented Apr 9, 2015

If retries are set to some arbitrary limit like 1000, then it becomes a ticking time bomb. For example, if you're seeing ~10 socket-closed exceptions per day, the server could just die after 3 months.
However, if we kill the limit, then it's possible to get into a spin of retrying. Maybe replacing the retry limit with a short (100ms?) delay could prevent pegging the CPU.

Also, how does this play with invoking dispose() on the server? Would the async Accept just loop over an endless number of ObjectDisposedExceptions? A retry shouldn't happen if the server is disposed.
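A sketch of that alternative, assuming the callback-shaped Accept(onClient, onError) implied by the stack trace above; the `_isDisposed` flag and the exact signatures are hypothetical, not Fleck's actual internals:

```csharp
// Hypothetical sketch, not Fleck's actual code.
private volatile bool _isDisposed; // assumed flag, set by Dispose()

private void ListenForClients()
{
    ListenerSocket.Accept(
        OnClientConnect,
        e =>
        {
            // Don't retry once the server is disposed; otherwise the async
            // Accept would loop over endless ObjectDisposedExceptions.
            if (_isDisposed || e is ObjectDisposedException) return;
            FleckLog.Error("Listener socket error", e);
            System.Threading.Thread.Sleep(100); // short delay instead of a retry limit
            ListenForClients();
        });
}
```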

@aliulug (Author) commented Apr 10, 2015

In order to solve the ticking time bomb problem, we can reset the counter in the OnClientConnect method. So, whenever a client connects successfully, the counter will be reset.
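A sketch of that reset; `_retryCount` is a hypothetical field name, while OnClientConnect is the method named in the stack trace above:

```csharp
private int _retryCount; // hypothetical counter used by the retry logic

private void OnClientConnect(ISocket clientSocket)
{
    _retryCount = 0; // a successful accept means the listener is healthy again
    // ... existing per-connection setup continues as before ...
}
```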


@darkl (Collaborator) commented Aug 5, 2015

I've encountered this issue while writing a test client for #145.
I think this is critical and should be handled somehow, even if only by notifying the user (via an event) that the server has terminated; ideally, auto-retry would be the better approach.
