Why is JSON.rawJSON limited to primitives only? #46

Open
airhorns opened this issue Jun 15, 2024 · 2 comments
Comments

airhorns commented Jun 15, 2024

Forgive me if this is the wrong spot to put this.

I think JSON.rawJSON is a really powerful API for performance-optimizing JSON serialization. But, because it is limited to only producing valid primitive JSON, it can't be used for "inline"-ing existing JSON.

I've got a couple of use cases that require feeding pre-serialized objects and arrays into the serialization of outer object trees. For example, in a typical REST API, you might retrieve 10 records from the database and reply with one big JSON array of all of them. Each record might have a big JSON value on it, and if those values are large, it performs poorly to deserialize each record's JSON object only to serialize it again for the REST API response holding all 10 records. Instead, it'd be great to leave the data as a string when fetching from the database, and then insert it into the final JSON string produced by JSON.stringify, using JSON.rawJSON to wrap each of these strings.

Without this capability, one has to resort either to manually cobbling together JSON strings, which is far less performant and correct than using the engine's built-in capabilities, or to always deserializing just to serialize again. Userland implementations like json-stream-stringify are far, far slower, and at least in my case, the JSON objects are really big, so deserializing and reserializing is a major performance issue.
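For readers skimming the thread, here is a minimal sketch of the two workarounds described above: parsing each payload only to re-serialize it, and splicing strings together by hand. The `rows` array and both function names are hypothetical and only stand in for whatever a real database driver returns.

```javascript
// Hypothetical rows, shaped as a database driver might return them when the
// large JSON column is left as an unparsed string.
const rows = [
  { id: 1, payload: '{"name":"a","tags":["x","y"]}' },
  { id: 2, payload: '{"name":"b","tags":[]}' },
];

// Option 1: deserialize every payload only to serialize it again.
function roundTrip(rows) {
  return JSON.stringify(
    rows.map((row) => ({ id: row.id, payload: JSON.parse(row.payload) }))
  );
}

// Option 2: splice the pre-serialized payload into the output by hand.
// This skips the parse, but nothing checks that row.payload is valid JSON.
function spliceByHand(rows) {
  const items = rows.map(
    (row) => `{"id":${JSON.stringify(row.id)},"payload":${row.payload}}`
  );
  return `[${items.join(",")}]`;
}

const viaRoundTrip = roundTrip(rows);
const viaSplice = spliceByHand(rows);
console.log(viaRoundTrip === viaSplice);
```

The two outputs match in this sketch only because the sample payloads are already compactly serialized; with differently formatted payloads the spliced text would differ while still representing the same data.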

I presume there is a justification for limiting what can go through .rawJSON, but what is it? And could there ever be a trusted mode, or some sort of escape hatch where, for very performance-sensitive use cases, any ol' string could be sent along?

One other note: it seems that this low-level API could really help with performance optimization by avoiding re-serializing values you already have the source JSON string for, but as currently specified it can't, because it does the safety check by parsing the string anyway. That seems correct but inefficient, again suggesting that it'd be great to have some sort of escape hatch for the brave. Notably, [[IsRawJSON]] being an internal slot means that userland can't create its own raw JSON objects and pay the complexity / reliability price.
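A small sketch of the behavior under discussion, assuming a runtime that ships `JSON.rawJSON` and `JSON.isRawJSON` (the code feature-detects and produces `null` otherwise). The `accepts` helper and the `report` object are illustrative names, not part of the proposal.

```javascript
// Feature-detect: JSON.rawJSON and JSON.isRawJSON come from the
// JSON.parse source text access proposal and are missing in older runtimes.
const supported =
  typeof JSON.rawJSON === "function" && typeof JSON.isRawJSON === "function";

// Returns true if JSON.rawJSON accepts the text, false if it throws.
function accepts(text) {
  try {
    JSON.rawJSON(text);
    return true;
  } catch {
    return false;
  }
}

const report = supported
  ? {
      bigNumber: accepts("12345678901234567890"), // primitive text
      string: accepts('"hello"'), // primitive text
      object: accepts('{"a":1}'), // object text
      array: accepts("[1,2]"), // array text
      // A plain object that merely looks like a raw JSON object lacks the
      // [[IsRawJSON]] internal slot, so it is not recognized as one.
      lookalike: JSON.isRawJSON({ rawJSON: "1" }),
      // The digits are emitted verbatim instead of being rounded to a double.
      output: JSON.stringify({ n: JSON.rawJSON("12345678901234567890") }),
    }
  : null;

console.log(report);
```

Where the API is available, the primitive texts should be accepted and the object and array texts rejected, which is the limitation this issue asks about.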

airhorns commented Aug 23, 2024

@gibson042 apologies for the direct ping but it'd be super helpful to understand this and/or collaborate on widening the applicability!

I ended up open sourcing the thing I would want to use rawJSON for here: https://github.com/gadget-inc/deferredjson

gibson042 (Collaborator) commented

Thanks for the ping. The reason for limiting to primitive values is cutting off what would otherwise be a bigger opportunity for surreptitious communication by varying representation details within JSON text representing the same data. See #12 (comment), #19 (comment), and also the extensive discussion at the October 2021 plenary that ultimately resulted in global availability with primitive-only constraints as a balance of convenience vs. integrity (the latter being a concern about the ability for an untrusted data-only input object to encode itself as arbitrary JSON text, originally raised in July 2020).
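As an illustration of "varying representation details" (an example constructed for this thread, not taken from the linked discussions), the sketch below shows several distinct JSON texts that all parse to the same data. The `variants` array is hypothetical.

```javascript
// Three different JSON texts: whitespace, number formatting, and string
// escapes vary, but the data they represent does not.
const variants = [
  '{"a":1,"b":"x"}',
  '{ "a" : 1.0 , "b" : "\\u0078" }',
  '{"\\u0061":1e0,"b":"x"}',
];

// Parsing and re-serializing normalizes every variant to one canonical text.
const normalized = variants.map((text) => JSON.stringify(JSON.parse(text)));

const allSameData = new Set(normalized).size === 1;
const allSameText = new Set(variants).size === 1;

console.log({ allSameData, allSameText, canonical: normalized[0] });
```

If arbitrary object or array text could pass through serialization verbatim, those differences would survive into the output and could carry information beyond the data itself, which appears to be the concern described above.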
