Impact on observability tools relying on stack traces #27

misterdjules · 2019-09-25T17:24:02Z

Hi!

I'm trying to understand the impact of this proposal on the use cases of operators of applications that rely on accurate stack traces to observe the behavior of their applications.

Since this proposal would allow developers to opt into a mode where stack traces would have some stack frames removed, I'm wondering whether the proposal is intended to provide an opt-in behavior in rare specific cases (e.g. a few functions out of all the dependencies of an application whose implementation details would reveal sensitive information) or in the common case (most if not all library code).

The following excerpt from the proposal text:

This has benefits for authors of library code who would like to refactor without fear of breaking consumers relying on their implementation details and authors of encapsulated code who would like to provide certain encapsulation guarantees.

seems to indicate the latter, since "lik[ing] to refactor without fear of breaking consumers relying on their implementation details" seems like a common goal for developers.

If this proposal indeed intends to provide an opt-in behavior in the common case, what are your thoughts on its impact on tools like Sentry or any other stack trace collection tool? Would that potentially make them a lot less usable once this feature ships, or are there other mechanisms through which those tools would be able to collect complete stack traces even when developers opt into this behavior?

/cc @domenic @michaelficarra

bathos · 2019-09-25T19:51:15Z

It’s intended for specialized use cases. It would impact traces in tools like Bugsnag and Sentry. If used correctly, it could improve them; it would reduce noise from internals. The tricky thing of course is that this is only true if the library code in question really doesn’t have bugs!

Consider for example a polyfill for a constructable IDL interface Foo. Say this constructor’s signature accepts an arg, one of the IDL integer types. Coercing user input to that type might involve a deep stack of internal frames, and it could result in (deliberately) throwing a TypeError, or it could result in an error being thrown from invocation of arbitrary user code (e.g. obj.valueOf() call when casting to number). The frames concerned with this coercion aren’t where the bug is — the bug is in a line like new Foo(invalidArg). Normally the frame with that line could end up buried in the middle of a stack with library internals piled on top and externals down below. But with stack redaction, the correct line would end up at the top of the stack, just as it would have if Foo had been implemented natively.

(external 1:34 here points at new Event, not an internal arg type casting operation)

So that’s a pretty solid benefit, but it’s true that it could go south pretty badly if people used this in inappropriate situations.

I wonder if stack redaction would be better if it had two parts: one, the implementation hiding directive, but two, a requirement that the error in question gets expressly flagged as having been anticipated. This would reduce the odds of library authors accidentally making their own bugs untraceable, since they will know at what junctures they are invoking potentially untrusted code or code which expressly creates and throws new errors based on input.

try {
  invokeUntrustedUserCode();
} catch (err) {
  throw Error.wasAnticipated(err);
}

(Spitballing ... probably not practical, I dunno, since it would probably need to occur in every single implementation-hidden function in the stack — ‘yes, I should be redacted’ — to work.)

misterdjules · 2019-09-26T17:19:42Z

I actually just realized that my question regarding whether the new directive is intended for common vs exceptional use cases is covered by https://github.com/tc39/proposal-function-implementation-hiding#will-everyone-just-end-up-using-this-everywhere.

misterdjules · 2019-09-26T21:51:54Z

@bathos The use cases you are describing seem to be the ones where this proposal would be helpful, and I understand those, but it seems that saying:

The tricky thing of course is that this is only true if the library code in question really doesn’t have bugs!

is making a very significant assumption that I don't think is valid in practice.

BridgeAR · 2019-09-27T01:27:19Z

Node.js is already kind of redacting some error frames. It's done by using the non standard Error.captureStackTrace() function. The full stack of the error is probably still available when using the C++ layer and V8 internal functionality but it's not visible by default for the user. We only do that for validation functions that would end up as obsolete frames for the end user.

But after working with that for a while I feel the need for a way to opt out of that behavior, since it's otherwise hard to debug the functionality during development. And if Node.js would not have a bug in the code, it would also not surface properly...

I think it would be ideal if the consumer could decide if they want to see the full stack or the "recommended" one by the author of the code.

bathos · 2019-09-27T12:46:57Z

@misterdjules Yep, I agree — that’s why I suggested an (admittedly under-researched) alternative where redaction requires express ‘blessing’. Only the author can potentially distinguish here, we anticipate a possible error from user input/calling into user code from oh noooo. Even with an explicit model, the author would need a (perhaps surprising) amount of specialized knowledge to get it right, so I don’t think it’s off-base for you to have concerns about this.

With the current blanket approach to stack redaction, I personally would not be comfortable using implementation hiding in contexts where I would like to for the sake of .toString(). I can identify points where user code/coercion is expected, but I cannot quite say I never write bugs :)

bathos · 2019-09-27T12:49:45Z

@BridgeAR The visibility of the full stack in REPL or other tooling is not mandated by this proposal, fwiw. Only the error.stack property. That is, this is about what’s introspectable within JS, but it would not require stuff like consoles to not show the uncensored stack.

mmarchini · 2019-09-27T17:32:28Z

The visibility of the full stack in REPL or other tooling is not mandated by this proposal, fwiw. Only the error.stack property. That is, this is about what’s introspectable within JS, but it would not require stuff like consoles to not show the uncensored stack.

This is true when working on development environments, but it's not necessarily true for servers in production, since many tracing and logging libraries won't use console.log. Here's a few examples of popular libraries focused on production logging which will output the stack trace of Error objects without using console.log:

pino, a popular logging library (388,066 downloads/week) uses streams instead of console.log. By default it will stream to the process stdout, but it can easily be changed to stream to a file, to a socket, http requests, etc:

https://github.com/pinojs/pino/blob/master/pino.js#L124-L125

Diagnostic Report is a new API in Node.js core, which outputs a JSON-serialized report with information about the environment in case of a crash. By default it writes directly to a file (without using console.log). It also serializes the stack trace of error objects, and it relies on Error.prototype.stack to get the stack trace.

https://github.com/nodejs/node/blob/master/lib/internal/process/report.js#L22-L30

@elastic/apm-agent-nodejs (48,205 downloads/week) relies on captureStackTrace (which wasn't mentioned in the proposal so far) for tracing-level logging. The transporters might also receive an Error object with a .stack property, in which case it will be serialized and sent using the transporter (could be to stdout, to a file, http request, etc.)

https://github.com/elastic/apm-agent-nodejs/blob/master/lib/agent.js#L154

Other APM agents (for example, Dynatrace and Google Stackdrive) also rely on Error.prototype.stack or Error.captureStackTrace

As @bathos mentioned, hiding some frames from the stack could improve those tools in some cases. But this could also be achieved by post-processing the errors, and many of these tools (if not all) have hooks which can be used to redact/transform stack traces from error objects already.

But there will be times when 3rd party libraries will have bugs (assuming otherwise is unrealistic). These bugs might only happen in production, which means they will be reported by logging libraries mentioned above. If there's not way to see the real stack trace (without hidden frames), debugging these issues would be much harder, and reporting issues could be frustrating for the reporter and for the library maintainers. Having a way to opt-out from the hide implementation details behavior would solve that though.

jorendorff mentioned this issue Dec 5, 2019

Examples of stack position information restricting refactoring #40

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Impact on observability tools relying on stack traces #27

Impact on observability tools relying on stack traces #27

misterdjules commented Sep 25, 2019

bathos commented Sep 25, 2019 •

edited

Loading

misterdjules commented Sep 26, 2019

misterdjules commented Sep 26, 2019

BridgeAR commented Sep 27, 2019

bathos commented Sep 27, 2019

bathos commented Sep 27, 2019

mmarchini commented Sep 27, 2019

Impact on observability tools relying on stack traces #27

Impact on observability tools relying on stack traces #27

Comments

misterdjules commented Sep 25, 2019

bathos commented Sep 25, 2019 • edited Loading

misterdjules commented Sep 26, 2019

misterdjules commented Sep 26, 2019

BridgeAR commented Sep 27, 2019

bathos commented Sep 27, 2019

bathos commented Sep 27, 2019

mmarchini commented Sep 27, 2019

bathos commented Sep 25, 2019 •

edited

Loading