Error: This Fiber is a zombie #131
OK, I figured out what we're doing. We're calling The effect for us (on OSX at least) is that the main fiber actually gets re-run from the top! Here's a reproduction:
For me, with Node 0.8.24 and Fibers 1.0.0, this prints:
and so forth. |
So I guess the short answer is, I shouldn't be letting a fiber that I care about get GCed. And in fact, the strategy of using But it certainly was surprising that the fiber got re-run from the top! |
Also occurs with Node 0.10.13, Fibers 1.0.1. |
At the very least, I think doing this should cause your code to crash, not re-run the fiber from the top. We would have figured out what was going on much earlier if that had happened. |
Ah, OK. The second call to Now, for me, the |
Actually if you grab a reference to the fiber in the
Yeah the overloading of |
Right, I agree that grabbing a reference to the Fiber in I don't think I was actually using |
…t to yield to. For more info look here: laverdet/node-fibers#131
I'm curious whether you discovered a trick to prevent the zombie fiber issue? This issue is periodically taking down our production app (not Meteor). It does seem correlated with garbage collection, but I'm uncertain of the exact relationship. I'm starting to dig into the fibers source to better understand what causes the zombie condition.

If garbage collection is the source of the problem, I'm unclear why the fiber, or its associated function, would be garbage collected: it looks like the function is stored in a Persistent in fibers.cc which, if I understand correctly, takes it out of garbage collection. I'm also unclear on why storing the fiber in a local variable would prevent this, since the local variable is no longer reachable after the function returns, unless we store the fiber in a global structure of some sort.

I appreciate any insights, whatever they might be, as we begin our debug hunt!

Chris

Could this code cause a problem if the fiber isn't reachable on the heap?

function handleSomeRequest() {
  Fiber(function() { ... }).run();
} |
If the fiber isn't accessible on the heap then that means there is no way of the fiber ever waking up, because you can't ever call Edit: There is nothing wrong with |
Edit: Not so fast! Still debugging. Will post conclusions when we have them.

We resolved this issue, thanks to the clues you guys provided above. For posterity, in case it helps anyone else:

Reproduction:

> node --expose-gc index.js

var Fiber = require("fibers");
// Run this every so often so that GC and DestroyOrphans happens.
setInterval(function() {
  console.log("Running GC!");
  Fiber(function() {
    global.gc();
  }).run();
}, 1000);

Fiber(function() {
  try {
    Fiber.yield();
  } catch (e) {
    console.log(e);
  }
}).run();

In the C++ class backing Fiber, the fiber is stored in a The fix, as @glasser mentions above, is to add a local reference to the fiber. The trick is to store this reference inside the fiber callback function itself; otherwise we'd have to somehow keep track of when the fiber has finished running (a meta GC!).

Fiber(function() {
  // Add a ref to the current fiber, tricking the V8 GC so that the weak handle callback isn't called.
  var f = Fiber.current;
  try {
    Fiber.yield();
  } catch (e) {
    console.log(e);
  }
}).run(); |
I must warn you that your fix here replaces an error with a memory leak. |
@laverdet - Thanks for your response and this extra hint. Your explanation makes sense. It looks like we must still be missing something in our app. If we didn't have a handle to the fiber somewhere then yeah we wouldn't be able to call Edit: I hadn't seen your response before I wrote mine. |
For anyone else reaching this issue from Google and worrying that something is wrong with node-fibers: it may very well be a bug in your code or another library, where a callback that the fibre is waiting on is never invoked.

I had this error as well: "This Fiber is a zombie." It turned out to be caused by a bug in my application code. I had implemented a re-entrant lock (mutex) which allows the first caller to acquire the lock, then requires successive callers to wait until the first caller calls release(). The problem was that my rather hastily written lock implementation only stored the first callback for waiting (blocked) fibres! In fact, it wasn't storing any reference to successive waiting (yielded) fibres. So, naturally, the garbage collector collected those fibres, as there were no references to them.

If you are dealing with a web application, look for requests which are timing out or finishing with an error. Hope this helps someone avoid wasting time, like I did, worrying about a bug in node-fibers before checking their own code. |
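The lock bug described in the comment above can be sketched roughly as follows. This is a hypothetical illustration, not the commenter's code: it uses plain callbacks in place of yielded fibres so that it runs without node-fibers, and the names `BuggyLock`, `QueueLock`, and `runThreeCallers` are invented for the example. The point is only that a waiter whose callback is dropped has nothing left referencing it and can never be resumed.

```javascript
// Hypothetical sketch; none of these names come from node-fibers.

// Buggy lock: it has a single waiter slot, so only the first waiting
// callback is stored and every later waiter is silently dropped.
function BuggyLock() {
  this.locked = false;
  this.waiter = null;
}
BuggyLock.prototype.acquire = function (cb) {
  if (!this.locked) {
    this.locked = true;
    cb();
  } else if (this.waiter === null) {
    this.waiter = cb;
  }
  // else: the callback is lost, and nothing references its owner any more
};
BuggyLock.prototype.release = function () {
  var next = this.waiter;
  this.waiter = null;
  if (next) {
    next(); // hand the lock directly to the stored waiter
  } else {
    this.locked = false;
  }
};

// Fixed lock: every waiter is kept in a queue until it is resumed.
function QueueLock() {
  this.locked = false;
  this.waiters = [];
}
QueueLock.prototype.acquire = function (cb) {
  if (!this.locked) {
    this.locked = true;
    cb();
  } else {
    this.waiters.push(cb);
  }
};
QueueLock.prototype.release = function () {
  var next = this.waiters.shift();
  if (next) {
    next(); // hand the lock directly to the oldest waiter
  } else {
    this.locked = false;
  }
};

// Three callers contend for the lock, then each holder releases once.
// Returns the names of the callers that actually got to run.
function runThreeCallers(lock) {
  var ran = [];
  ["A", "B", "C"].forEach(function (name) {
    lock.acquire(function () { ran.push(name); });
  });
  lock.release();
  lock.release();
  lock.release();
  return ran;
}

console.log(runThreeCallers(new BuggyLock())); // [ 'A', 'B' ]  (C never runs)
console.log(runThreeCallers(new QueueLock())); // [ 'A', 'B', 'C' ]
```

In a fibers-based version, each stored callback would presumably close over the waiting fibre and call its run(), so the dropped waiter "C" corresponds to a yielded fibre that nothing references any more.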
If I'm able to catch an exception with the text "Error: this Fiber is a zombie", then that's a fibers bug, right? Not user error?

Working on finding a minimal reproduction, but while we're working on that, curious if we're correct that this is definitely a Fibers bug... (In 1.0.0; going to try 1.0.1 next.)