Multiple WiFi creation errors lead to MDNSResponder crash and memory leak #223
Comments
Thank you for reporting this. Decoding the stack traces...
All the crashes are inside the MDNSResponder code. This is a fairly common crash which we have not yet got to the bottom of. It is usually infrequent and caused by a lot of mDNS activity on the network. #211 and #215 look to be similar. In your case it looks specific to problems connecting to WiFi (sample from your last log below), so we need to look at how to make the code more tolerant of failed WiFi connections. There also looks to be a memory leak in the WiFi code when it fails to connect.
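One common way to make reconnect logic more tolerant of repeated failures is to space out the attempts so that each failed attempt has time to release its resources before the next one starts. The sketch below is only an illustration of capped exponential backoff. The function name `reconnectDelayMs` and its default values are assumptions made for this example and are not taken from the ratgdo source or from any fix discussed in this thread.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not from the ratgdo firmware): returns how long to
// wait before the next WiFi reconnect attempt. The delay doubles with each
// consecutive failure and is capped at maxMs.
inline uint32_t reconnectDelayMs(uint32_t failures,
                                 uint32_t baseMs = 1000,
                                 uint32_t maxMs = 60000) {
    uint32_t delayMs = baseMs;
    for (uint32_t i = 0; i < failures && delayMs < maxMs; ++i) {
        // Jump straight to the cap when doubling would exceed it,
        // which also avoids integer overflow.
        delayMs = (delayMs > maxMs / 2) ? maxMs : delayMs * 2;
    }
    return std::min(delayMs, maxMs);
}
```

With the assumed defaults, the first retry waits 1 second, the fourth waits 8 seconds, and anything past the sixth consecutive failure waits the full 60 seconds.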
Not sure if this is much help, but here's my router log from the time of the last crash.
Just dropping this one here because it's slightly different from the rest, with exception 29 and restart 2:
@nyjklein thank you for the new log. The stack dump decodes to...
But I think the root cause is the same. Multiple WiFi creation errors leading to out-of-memory. Is there anything going on with your network that may cause these errors?
Nothing I'm aware of, and this one device seems to be the only one that has issues. Everything else is extremely stable. That's why it's so frustrating, with the crashes happening so randomly, even in the middle of the night with clearly no other activity.
Let me ask another question here. Here's what my heap and stack look like after running for about 25 hours: In all of the crashes I've seen, the heap is always well on its way in its downward spiral (well below 6000). So it appears to me that whatever event precipitated the crash began before any of the messages captured in the crashlog. Is there any way to preserve a longer history of these messages?
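To catch the start of a downward heap trend rather than only its end, one option is to record the lowest free-heap value seen and when it occurred. The following is a minimal sketch under stated assumptions: `HeapWatermark` is a hypothetical name, not part of the ratgdo firmware, and the 6000-byte threshold is simply the figure mentioned above. On an ESP8266 the samples would typically come from `ESP.getFreeHeap()` and `millis()`, but here they are passed in as plain numbers so the logic can be compiled and checked on a desktop.

```cpp
#include <cstdint>

// Hypothetical tracker (not from the ratgdo firmware): remembers the lowest
// free-heap sample seen and the uptime at which it was observed.
struct HeapWatermark {
    uint32_t minFree = UINT32_MAX; // lowest free-heap value seen so far
    uint32_t minAtMs = 0;          // uptime in ms when that minimum was seen
    uint32_t threshold;            // samples below this count as "low heap"

    explicit HeapWatermark(uint32_t warnBelow) : threshold(warnBelow) {}

    // Feed one sample. Returns true while the sample is below the threshold,
    // so the caller can log a warning as soon as the decline starts.
    bool sample(uint32_t freeBytes, uint32_t nowMs) {
        if (freeBytes < minFree) {
            minFree = freeBytes;
            minAtMs = nowMs;
        }
        return freeBytes < threshold;
    }
};
```

Logging a single line the first time `sample` returns true would put a timestamped marker in the message history at the point where the heap began to run low.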
You can capture the logs over the network, but that might not be helpful in your case since your network connection died. We are limited by RAM and flash wear, so it's not easy to save a longer history on the device. Your best option is to capture the logs using a USB cable, which may be hard to do depending on your setup.
@nyjklein I could do a build that increases the space that holds the log messages... This would be a special build for you that enables the IRAM heap... which gives us a much larger RAM area for log messages. I could do 10KB, which is 5x normal. Some folks have hardware that did not work well with the IRAM heap enabled (which is why it is not enabled by default). But assuming you don't run into that, this would give you a version to test that hopefully has a large enough message buffer.
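For readers unfamiliar with why the buffer size limits how far back the history goes, here is a minimal sketch of a fixed-size ring buffer for log text. It is an illustration only: the class name `LogRing` and its interface are assumptions for this example and do not describe how the ratgdo firmware actually stores its messages.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical byte ring buffer for log text (not the ratgdo implementation).
// Once full, each new byte overwrites the oldest one, so the buffer always
// holds the most recent `capacity` bytes of log output.
class LogRing {
public:
    explicit LogRing(std::size_t capacity)
        : buf_(capacity), head_(0), size_(0) {}

    // Append a message, dropping the oldest bytes when space runs out.
    void write(const std::string& msg) {
        if (buf_.empty()) return;
        for (char c : msg) {
            buf_[(head_ + size_) % buf_.size()] = c;
            if (size_ < buf_.size()) {
                ++size_;                            // still filling up
            } else {
                head_ = (head_ + 1) % buf_.size();  // oldest byte was replaced
            }
        }
    }

    // Return the retained bytes, oldest first.
    std::string dump() const {
        std::string out;
        out.reserve(size_);
        for (std::size_t i = 0; i < size_; ++i) {
            out.push_back(buf_[(head_ + i) % buf_.size()]);
        }
        return out;
    }

private:
    std::vector<char> buf_;
    std::size_t head_;  // index of the oldest retained byte
    std::size_t size_;  // number of bytes currently retained
};
```

In this sketch, a capacity of `10 * 1024` would correspond to the 10KB figure mentioned above. The larger the capacity, the further back the retained messages reach before a crash.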
@dkerr64 Thanks. Before we try that, I disconnected the device, erased it and re-flashed it. I positioned it slightly differently (away from the LED bulb in the opener, just in case). Let's let it run a while and see if that makes any difference.
Thanks for the report. Do let us know how it goes.
Possible fix in PR #242, as we use the IRAM heap to make more memory available.
Believed to be fixed in release 1.8.0.
crashlog.txt
crashlog3.txt
crashlog4.txt
I get seemingly random, infrequent (every day or two) crashes that look like the attached. From what I can tell, something happens to the WiFi connection, and in the process of trying to recover, RATGDO runs out of heap space and crashes. I'm at a loss as to what to try. My network is a non-mesh, dedicated 2.4GHz network with its own SSID, and other than these crashes it seems stable in every other way.