Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/json: allow encoding dicts with int keys #468

Closed
wants to merge 1 commit into from

Conversation

andreimatei
Copy link
Contributor

Before this patch, only dicts with string keys were supported. Now, integer keys are also permitted. Dicts with int keys should be common. This matches Go's json.Marshal() and Python's json.dumps(), which also permit int keys (in addition to other things).

Before this patch, only dicts with string keys were supported. Now,
integer keys are also permitted. Dicts with int keys should be common.
This matches Go's json.Marshal() and Python's json.dumps(), which also
permit int keys (in addition to other things).
stringKeys = stringKeys || isString
intKeys = intKeys || isInt
}
// Sort the keys. This is useful if they come from a dict, which presents
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering what this key sorting business is about, and whether it should be pushed into the different implementations of the IterableMapping that want it?
I'm thinking that by default a random impl does not want it (i.e. it wants the keys rendered in the order it produces them in Items()), but notable starlark.Dict probably wants it (because I think it currently produces its items in a random order).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very little is random in Starlark, as it was originally designed for a build system where reproducibility of computation is paramount. For that reason, Items() yields dict elements in insertion order.

This does mean that sorting isn't necessary for determinism, and nor is it part of the contract specified by the doc comment, but it is the existing behavior in starlark-go, and is the documented behavior of starlark-java, so I think this is in spirit a breaking change, and thus warrants an issue in the bazelbuild/starlark repo before we implement any change. Supporting int keys is also a behavior change that needs discussion.

Personally I'm loathe to remove it, even if it is a minor optimization. Even Go, where performance is generally more important than in Starlark, sorts keys in its json.Marshal implementation.

I suspect that using Go's sort.Strings instead of sorting a slice of abstract starlark.Values is a significant optimization that makes this change unnecessary. I suggest you evaluate that first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense about the key sorting. I wasn't particularly interested in doing anything about it.
As you can see, the patch kept the sorting, although not for the case of mixed int/string keys. I'm happy to do whatever in that case - sort as strings, or even reject the whole marshalling.

Supporting int keys is also a behavior change that needs discussion.

So you're saying I should necessarily discuss the int keys support on https://github.com/bazelbuild/starlark/ for this patch to move forward (i.e. "extras" in the Go impl are not kosher)? In the happy scenario that the proposal gets accepted, would a Java implementation also be required? Because that might be beyond personal interest :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should necessarily discuss the int keys support on https://github.com/bazelbuild/starlark/ for this patch to move forward...?

Yes, the two implementations should be consistent in the behavior of all their core functions. (This is not a theoretical concern: at Google we have tools in Go and Java that process Bazel/Blaze BUILD files, and any divergence in their behaviors is problematic.)

would a Java implementation also be required?

No. Someone else will implement that if the proposal is accepted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to discuss the int keys as a wider Starlark proposal, assuming it has your support in principle.
But just to check - I've found this recent comment on the
starlark repo that seems to say that the JSON library is not under their purview -
bazelbuild/starlark#253 (comment)
Should I not read too much into it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to discuss the int keys as a wider Starlark proposal, assuming it has your support in principle.

Personally I don't see much value in json.encode supporting dicts with int keys, since you can't round-trip encode+decode them: since JSON doesn't support objects with int keys, they end up being coerced to strings. And while it's true the Go's json.Marshal supports this behavior (again by coercing to strings), I've never seen anyone use it, and I can't really imagine why you'd want to. Plus it complicates the sorting. But feel free to propose it if you like.

Should I not read too much into it?

Yep, don't worry about that comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't care about it, I probably will not propose anything; I'm not blocked on this.
But,

since you can't round-trip encode+decode

They do round-trip, though, at least in Go:
https://go.dev/play/p/v07cIJJNZLA
Let me know if this changes your interest; if it doesn't, I'll close this PR.

Copy link
Collaborator

@adonovan adonovan May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I had no idea. Thanks for the correction. Reading threads golang/go#12146 and golang/go#12529, I'm honestly surprised the feature was accepted with so little resistance.

If you want this feature, by all means propose it. I'm curious though, what do you need it for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm extending Delve with an RPC that takes a Starlark script as an argument, and returns the result of running that script. All my JSON interest comes from figuring out how exactly to return those results. My current thinking is that I'll mandate that the script return a string - in particular, a JSON string. So it's up to the script to do the json encoding. On the client side, I want to easily map the result onto a struct - a Go struct since the client is also a Go program.
In the one script that I'm playing with, I was naturally constructing a Starlark dict with int keys (goroutine IDs, in my case), and I was upset that it didn't "just work" when I attempted to json.encode() it. Otherwise, things work pretty well, partially courtesy of your struct package that gives me nice json matching go-side structs.

I'm not married to constructing the result JSON inside the script - particularly since I still have the issue with the fallible attribute accessors that Delve implements (I'm running with a fork of starlark-go to handle that for the moment). Delve uses the stdlib rpc system with the json codec, so I could even conceivably have my rpc talk in terms of starlark.Value and let the rpc codec deal with encoding/decoding. I very briefly tried building the json outside of the script, in Delve. But the problem I ran into immediately was that the implementation of the starlark-go hashtable is not marshallable.

I've also discovered that you've made a proto library for starlark-go, so that's another option. But for now I'm interested in prototyping things more quickly, so I didn't jump at the opportunity to add protos to the mix.

So, anyway, I don't really know what I'm doing yet, and I don't particularly need this patch. It just struck me as something that oughta "just work".

Copy link
Collaborator

@adonovan adonovan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this contribution, Andrei.

stringKeys = stringKeys || isString
intKeys = intKeys || isInt
}
// Sort the keys. This is useful if they come from a dict, which presents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very little is random in Starlark, as it was originally designed for a build system where reproducibility of computation is paramount. For that reason, Items() yields dict elements in insertion order.

This does mean that sorting isn't necessary for determinism, and nor is it part of the contract specified by the doc comment, but it is the existing behavior in starlark-go, and is the documented behavior of starlark-java, so I think this is in spirit a breaking change, and thus warrants an issue in the bazelbuild/starlark repo before we implement any change. Supporting int keys is also a behavior change that needs discussion.

Personally I'm loathe to remove it, even if it is a minor optimization. Even Go, where performance is generally more important than in Starlark, sorts keys in its json.Marshal implementation.

I suspect that using Go's sort.Strings instead of sorting a slice of abstract starlark.Values is a significant optimization that makes this change unnecessary. I suggest you evaluate that first.

@andreimatei
Copy link
Contributor Author

Closing this for now as it does not look like an immediate accept. Might reopen later in the context of a broader Starlark proposal.

@andreimatei andreimatei closed this Jun 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants