-
cc @captbaritone & @benjie
-
Open questions here:
-
This feels really, really good to me. The summary here does a really good job explaining why “nullable” is the right default. I love that this proposal not only preserves that, but clarifies it (by indicating that the default is “present or error”, which is much more intuitive as a default than the current “present or null, where null might be an error”).
When thinking about null and errors I like thinking in terms of pseudo-Rust types, since I find that type system’s distinction between The right default for a GraphQL field is to leave open the possibility that it might error in the future, which is
In the new behavior,
The new behavior is much much better.
The migration plan here feels really good to me. One open question is whether this directive will live forever. Concretely, would we ever want (in a far future version) to make this behavior the default, and switch it from being opt-in to having the old behavior require an opt-out? That change would be painful, but assuming this change has the anticipated benefits, it would be nice to get this behavior as the default at the limit. In 10 years, is every schema going to need this directive? That feels unwieldy. I’m very torn here.
Is this always unergonomic? When I think about Rust, returning For languages that don’t have the equivalent construct or idiom, they can either choose between a richer-but-less-ergonomic representation of the full sum type, or they can “fall back” to representing the error as null, losing some information but adding ergonomics. The latter option is the only option today… so with this change, we’d be doing no harm, since clients can always choose to maintain their current behavior if they want to prioritize ergonomics over richness.
-
This is a great write-up, and I love the definition of the terminology. Alas, I think it explicitly does not solve the static type generation problem for no-knowledge clients. Even under The null-only-on-error RFC eliminates ambiguous nulls for error-handling clients, even if they have no knowledge of the GraphQL schema. Let me address this more fully.
Definitions
These definitions are only valid for this comment, not valid in the wider ecosystem!
No-knowledge client
A client that does not know anything about the GraphQL schema and determines all behaviors from the server's response only. A no-knowledge client may or may not understand the GraphQL document that has been issued to the server - in the case of persisted operations, perhaps all it knows is the hash. For the purpose of this comment, knowledge of the outgoing GraphQL document is irrelevant.
I believe most non-IDE GraphQL clients are no-knowledge clients (or very close to no-knowledge) - they are not given a runtime representation of the GraphQL schema they are working with, nor do they automatically introspect the schema, nor do they have specific runtime code generated for them based on this schema and their query documents (only static types which are used for type checking but don't impact runtime behavior).
Error-handling client
A client that handles errors locally, for example by throwing an error when an errored GraphQL response field is accessed. A React-based client might be an error-handling client if it throws an error whenever a field is accessed that has an associated error in the Relay wants to become an error-handling client, and I think Apollo Client would like this too. (I'm uncertain of the intent of other GraphQL clients, but it seems to me that many will want to head in this direction.)
Generated types
Static types (that have no runtime behavior; such as TypeScript types) that accurately model the GraphQL data that will be seen during rendering in the given client. Note that these types may be specific to the combination of the schema, document and the client (e.g. error-handling clients may have different types generated versus non-error-handling clients).
Semantic null
A legitimate
The problems
There's two problems that we're trying to solve (each of which may have separate but potentially related solutions):
1. Clients with normalized stores cannot safely update the store if an error occurs
Due to null bubbling, it is not safe for a no-knowledge client to update a normalized store when any errors are present in a GraphQL response where the For example, in this response
{
"data": {
"me": {
"username": "Benjie",
"favouritePet": null
}
},
"errors": [
{
"message": "Failed to retrieve vet",
"path": ["me", "favouritePet", "vet"]
}
]
}
Potential solution: disable null bubbling (e.g.
{
"data": {
"me": {
"username": "Benjie",
"favouritePet": {
"name": "Brontie",
"age": 13,
"vet": null
}
}
},
"errors": [
{
"message": "Failed to retrieve vet",
"path": ["me", "favouritePet", "vet"]
}
]
}
See, for example, the
2. Generated types for error-handling clients cannot correctly represent semantic nullability
In the current GraphQL specification we have Since it's not safe to mark most types as strict non-nullable (due to error bubbling and future schema compatibility), the generated types for most clients yield
The "semantically-non-null" proposed solution
I posit that what we lack is a "semantically non-nullable" type ( The addition of a semantically non-nullable type would allow error-handling clients to have significantly improved generated types; since we know that any errors met will be thrown, we can safely generate static types for both strict non-nullable and semantically non-nullable as non-null in our language of choice, and avoid the need for null checks in related positions in our code. A nullable type would retain the For non-error-handling clients, type generation for a semantically non-nullable
No nulls from semantically-non-nullable types!
Critically, the "semantically non-nullable" type I propose would raise an error during coercion if a To tell the difference between an "error null" and a "semantic null" under "Strict Semantic Nullability"'s
(The FAQ above indicates why a
A note on syntax
Effectively the "semantically non-null" proposal introduces a middle state between the current "nullable" and "strictly non-nullable" types we all understand. We could represent this with a number of different syntaxes; here are two proposals:
Note: the Aesthetically and forgetting all current usage of GraphQL, it would make most sense to use syntax A; We could use syntax A with the "semantic non-null" proposal and retain the type generation benefits for no-knowledge error-handling clients, but it would face many of the issues that this strict semantic nullability proposal would face:
Syntax B is entirely non-breaking since
A note on nullable-by-default
I believe nullable by default is still the right choice, because schema designers would have to put extra effort in to "narrow" the type from
Show us the RFC!
Here's the first draft of the "null-only-on-error", or "semantically-non-null", RFC. Note that the names and symbols used are open to workshopping!
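To make the "error-handling client" definition above concrete, here is a minimal TypeScript sketch of a field read that throws when the response carries an error at exactly that path. The `GraphQLResponse` shape and the `readField` helper are hypothetical names invented for this illustration; they are not Relay or Apollo Client APIs, and a real client would do considerably more (fragments, normalization, error boundaries).

```typescript
// Hypothetical response shapes for illustration only; not a real client API.
type PathSegment = string | number;

interface GraphQLErrorEntry {
  message: string;
  path?: PathSegment[];
}

interface GraphQLResponse {
  data?: unknown;
  errors?: GraphQLErrorEntry[];
}

// Reads the value at `path`, throwing if the "errors" list has an entry for that exact path.
function readField(response: GraphQLResponse, path: PathSegment[]): unknown {
  const fieldError = (response.errors ?? []).find(
    (e) =>
      e.path !== undefined &&
      e.path.length === path.length &&
      e.path.every((segment, i) => segment === path[i]),
  );
  if (fieldError) {
    throw new Error(fieldError.message);
  }
  // Walk the data tree one path segment at a time.
  let value: unknown = response.data;
  for (const segment of path) {
    if (value === null || typeof value !== "object") {
      return undefined;
    }
    value = (value as Record<string, unknown>)[String(segment)];
  }
  return value;
}

// The non-bubbled response from the example above.
const nonBubbled: GraphQLResponse = {
  data: {
    me: { username: "Benjie", favouritePet: { name: "Brontie", age: 13, vet: null } },
  },
  errors: [{ message: "Failed to retrieve vet", path: ["me", "favouritePet", "vet"] }],
};

const petName = readField(nonBubbled, ["me", "favouritePet", "name"]);
```

Because every errored position throws on access, code that receives a value from `readField` for a semantically non-null field would not need its own null check, which is the type generation benefit described in this comment.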
-
Cool stuff! Quick note that as a mobile dev, I love this proposal because both
"we don't really know if it is nullable or not but we'll let you use it as if it were non-nullable if you know what you're doing ™️". See their definition: But maybe it's ok if we're saying that Overall looking forward to making it easier to work with GraphQL nulls and errors!
-
In terms of any changes that have a migration path, it might make sense to look at the input side in tandem: graphql/graphql-spec#872
-
"How to adopt this incrementally"
I particularly like the "alternative" incremental adoption strategy you propose. In a single step append a A few other reflections on incremental adoption:
For Relay
You mention that Relay would want to adopt strict semantic nullability incrementally on a fragment by fragment basis, but I’m not sure that’s true. Relay, and other clients will need some incremental/thoughtful way to adopt error handling since that will change the runtime behavior of the app, but I think that the adoption of error handling (while a dependency) can be thought of orthogonally to this proposal.
For implementation first servers
Once a client implements explicit handling of errors I believe it can immediately start to respect semantic nullability, meaning: generate non-nullable types for unadorned fields if Another point which is potentially worth calling out is that implementation first servers, where the SDL is derived from the actual resolver implementation, should be able to adopt strict semantic nullability in a single step by simply modifying their code-gen to add
For federated schemas
Finally, one aspect of adoption which I don’t see discussed here is how this would be incrementally adopted in situations where schemas get composed, for example federation or schema stitching. It would probably be good to clarify how we imagine those scenarios working.
-
Incorporating points from the above discussion, I have written a new proposal for a Semantic-Non-Null type wrapper: graphql/graphql-spec#1065
I propose using an exclamation point prefix as the syntactic representation of this type:
Critically, I feel this proposal has strong backwards compatibility and discoverability:
I did consider a few other syntaxes, but this one is the one that felt the most right to me. With this change, non-null types should be used even more sparingly, and we'll start to see nice clean schemas that avoid the problems of null bubbling, such as:
type Query {
user(id: ID!): User
}
type User {
id: ID!
username: !String
avatarUrl: String
bio: String
friends: ![!User]
}
-
Regarding the "Preserve option value" guiding principle; if we were to introduce a distinction between "optional" and "nullable" in GraphQL inputs,
## Traditional types:
# optional, nullable
arg: String 👉 string | null | undefined
# required, non-nullable
arg: String! 👉 string
## New types
# optional, nullable
arg?: String 👉 string | null | undefined
# optional, non-nullable
arg?: String! 👉 string | undefined
# required, nullable
arg!: String 👉 string | null
# required, non-nullable
arg!: String! 👉 string
-
This is a follow up to #1394 based on a discussion in the Oct 2023 WG meeting.
Future of nullability in GraphQL is strict semantic nullability.
High level overview:
`?` which describes a type as strictly allowing return of semantic null values.
`@strictNullability` to resolve how to interpret `null` values.
GraphQL nullability historical rationale
GraphQL field types default to being nullable with a modifier `!` to indicate non-nullability. Why?
First, we want to preserve future evolution of schema. It’s often the case when first designing schemas that nullability and changes to it over time aren’t deeply considered. It turns out that it’s safe to convert a nullable field to a non-nullable one, but not the other way around. Thus the default is nullable. Defaults matter, and GraphQL’s default prioritizes allowing for future change.
Second, we want to assume that anything can fail anywhere, and minimize disruption. A GraphQL field may be resolved by connecting to a service, and if that fails, a null is returned in the result (and the error is included alongside the data in the response). Using the non-null modifier demands that the field never returns null, so if an error occurs during resolution it “bubbles” and the parent field returns null instead. This is nice in that it provides a strict guarantee of non-nullability, but not nice in that it’s destructive: sibling fields which may have resolved normally are discarded as a result. Consequently we provide guidance to use non-null `!` types sparingly.
A very specific example covering both of these reasons is considering what happens as a system evolves. Perhaps at first you have a simple application monolith with a single DB. A table column is non nullable so you imagine the resulting GraphQL field isn’t nullable either. However in the future you build a dedicated service for a subset of that table, and now resolving that field could fail to reach the service and result in null. A future change to architecture created the possibility for error, and thus null.
Implicit in this understanding of nullability is that a field type does not make it possible to differentiate between interpreting a null value as “this field is actually the value `null`” or “this field encountered an error and we have no data to return”. Ideally we can differentiate this both in the Schema, to describe which of these two interpretations are possible, and in the response, to describe which of the two interpretations has occurred for that specific resolution.
Or put more candidly: a GraphQL field is not actually "nullable", it is "ambiguously nullable". Ambiguity hurts!
The specific way this hurts is that clients must be able to differentiate between these two cases. First (schema) to generate useful type definitions, where the ambiguity requires us to generate nullable types everywhere, which is awful ergonomics. Then (result) to know whether to interpret a null value as a semantic null or handle it as an error null. Today clients must look in the `"errors"` part of the result to see if an error exists at that field, but how to interpret the absence of an error isn’t clear if it isn’t known if semantic null type was allowed in the first place.
So where do we go from here? How do we resolve this ambiguity?
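The `"errors"` lookup that clients perform today can be sketched in TypeScript as follows. This is a minimal illustration, not any particular client's implementation: the `GraphQLResponse` shape and the `hasErrorAt` helper are names made up for this example. It matches error paths exactly, so it only identifies the field that actually failed; when a null has bubbled up to a parent, the recorded path points below the null that the client sees.

```typescript
// Hypothetical minimal shapes for a GraphQL response; not a real client API.
type PathSegment = string | number;

interface GraphQLErrorEntry {
  message: string;
  path?: PathSegment[];
}

interface GraphQLResponse {
  data?: unknown;
  errors?: GraphQLErrorEntry[];
}

// Returns true when the "errors" list has an entry whose path is exactly `path`.
function hasErrorAt(response: GraphQLResponse, path: PathSegment[]): boolean {
  return (response.errors ?? []).some(
    (e) =>
      e.path !== undefined &&
      e.path.length === path.length &&
      e.path.every((segment, i) => segment === path[i]),
  );
}

const response: GraphQLResponse = {
  data: {
    me: { username: "Benjie", favouritePet: { name: "Brontie", age: 13, vet: null } },
  },
  errors: [{ message: "Failed to retrieve vet", path: ["me", "favouritePet", "vet"] }],
};

const vetErrored = hasErrorAt(response, ["me", "favouritePet", "vet"]);
const usernameErrored = hasErrorAt(response, ["me", "username"]);
```

Here `vetErrored` tells the client that the null at `vet` is an error null. A `false` result for a null field is the ambiguous case this post is about: without schema knowledge the client cannot tell whether that null is semantic.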
Annotate semantic nullability: `?`
Today we can describe a field’s type normally `field: String` or use a non-nullable type modifier, `field: String!`.
I propose introducing a "semantically nullable" modifier: `field: String?` (referring to this now as "nullable" to be terse).
If a field type is nullable (`String?`), that means that null values are in fact semantically allowed. For a client to know the difference between semantic null vs an error, they can now confidently look to the errors result. If an error exists in the array for this field then the null was the result of an error, and if not then it is in fact a semantic null.
This leaves an unmodified type (`String`) remaining as “ambiguously null”.
Now we have a way to describe some fields as specifically allowing semantic null and we have a mechanism (errors result) to differentiate that from an error null.
Now that a nullable modifier exists, to make this truly useful, we would next want to interpret unmodified `field: Type` as “null only on error” (related RFC) and resolve the ambiguity. How can we do this safely, in a backwards compatible way?
A strict nullability schema
The schema can next include a directive (exposed as a new boolean in introspection) called `@strictNullability`. This directive tells clients that they should interpret unmodified field types (`field: String`) as semantic null not being a valid value and that any null value in the data result should be interpreted as a field error, regardless of whether the error portion of the result includes an entry for that field.
With both changes in effect, a schema has removed ambiguous null as a potential result from the service overall. Clients know the types possible in the schema and can interpret and differentiate the result accordingly.
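The interpretation rules described so far can be summarized as a small decision function. This TypeScript sketch is only an illustration of the rules as stated in this post; `FieldNullability`, `NullMeaning` and `interpretNull` are hypothetical names, and `errorAtPath` is assumed to come from a lookup in the `"errors"` list for the field in question.

```typescript
// Hypothetical names for the three field type shapes discussed in this post:
// "nullable" is Type?, "unmodified" is Type, "nonNull" is Type!.
type FieldNullability = "nullable" | "unmodified" | "nonNull";

type NullMeaning = "semantic-null" | "error-null" | "ambiguous-null" | "not-possible";

// Decides how a client should read a null found at a field's position.
function interpretNull(
  kind: FieldNullability,
  strictSchema: boolean, // true when the schema applies @strictNullability
  errorAtPath: boolean, // true when the "errors" list has an entry for this field
): NullMeaning {
  if (kind === "nonNull") {
    // Type! never yields null at its own position; the null bubbles to a parent.
    return "not-possible";
  }
  if (errorAtPath) {
    return "error-null";
  }
  if (kind === "nullable") {
    // Type? with no matching error: the null is a real value.
    return "semantic-null";
  }
  // Unmodified Type with no matching error: strict schemas define this as an
  // error, schemas without the directive leave it ambiguous.
  return strictSchema ? "error-null" : "ambiguous-null";
}
```

For example, `interpretNull("unmodified", false, false)` yields `"ambiguous-null"`, the status quo, while the same call with `strictSchema` set to `true` yields `"error-null"`.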
Edit: added after @benjie's feedback below
Additionally, the introduction of `@strictNullability` now requires that an error is included in the error list if an unmodified `field: Type` returns null. It will do this by changing the execution behavior through the same mechanism as NonNull types in Value Completion. Importantly, these errors would not bubble.
Execution behavior (value completion) does not change for nullable types (`Type?`) since null continues to be allowed.
This means that execution behavior could change in a subtle manner. The result of the `"data"` field will remain unchanged (what was a null, remains a null), however the `"errors"` list could appear in some responses it previously did not. This could potentially be breaking when sending responses to a client which discards responses that include any error (unfortunately common for older clients).
Here is the specific case of this scenario explained via an example:
A field returns a value which is not meant to be semantically nullable, however the resolver is known to fail often. This service knows it has a client which throws out responses that include field errors, so it does not raise a field error from the resolver even though that would have been the semantically correct thing to do. Because the field is known to fail often and the service decides that failure is not a big deal and they would like the client to use the rest of the data, they simply return null to indicate failure instead. While this is semantically incorrect, it produced the outcome they were looking for.
When migrating an existing service to `@strictNullability` that also needs to preserve backwards compatibility for clients which discard full responses if there are any errors, fields that return `null` to indicate an error should be typed as `Type?` instead of `Type` - they should be declared nullable, since that is an accurate typing of the schema design choice that was made.
End Edit
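As a rough illustration of the value completion change described in this edit, the TypeScript sketch below handles only the case where a resolver returned `null` without raising an error. It is not the reference executor: `completeNull`, `NonNullViolation` and the error messages are invented for this example, and a real executor would catch the thrown violation at the nearest nullable parent.

```typescript
type PathSegment = string | number;

// "nullable" is Type?, "unmodified" is Type, "nonNull" is Type!.
type FieldNullability = "nullable" | "unmodified" | "nonNull";

interface FieldError {
  message: string;
  path: PathSegment[];
}

// Thrown for Type! so the caller can null out the parent (today's bubbling behavior).
class NonNullViolation extends Error {}

// Completes a null returned by a resolver, recording errors in `errors`.
function completeNull(
  kind: FieldNullability,
  strictSchema: boolean, // true when the schema applies @strictNullability
  path: PathSegment[],
  errors: FieldError[],
): null {
  if (kind === "nullable") {
    // Type?: null is a valid value, nothing changes.
    return null;
  }
  if (kind === "nonNull") {
    // Type!: record the error and bubble, as executors already do.
    errors.push({ message: "Cannot return null for non-nullable field", path });
    throw new NonNullViolation("null bubbles to the parent field");
  }
  if (strictSchema) {
    // Unmodified Type in strict mode: record an error but do NOT bubble,
    // so the "data" part of the response is unchanged.
    errors.push({ message: "Field unexpectedly returned null", path });
  }
  return null;
}
```

The only observable difference in strict mode is the extra entry in the `"errors"` list for unmodified fields, which is exactly what can surprise clients that discard any response containing errors.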
How to adopt this incrementally?
For existing schemas adopting this feature, they will be in an incremental state where "semantically nullable" modifiers (`?`) are incrementally added to resolve some ambiguity, and in this state the schema does not yet apply `@strictNullability`.
Once this migration is complete and a service has added all true semantically nullable modifiers to field types, then the `@strictNullability` directive is added.
Alternative incremental migration strategy
First, convert all field types to Nullable and apply `@strictNullability` at the same time, then incrementally remove the Nullable types from fields which are known to never be nullable.
While uglier, this would be safer for avoiding breaking changes if a service is unsure what values are possibly returned and concerned about the impact of introducing new field errors.
In the duration between a client beginning to use nullable type modifiers but before applying `@strictNullability`, clients can decide how to use code generation and result interpretation. Either:
Most will do A, and that's fine - it's the preferred path if the migration will be quick and they prefer to just look ahead. Some will do B, and that's fine for small or high-communication teams where you can trust the wrongness risk. Relay and other sophisticated clients will do C, where they allow large teams to adopt this over time.
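One conservative code generation policy a client might adopt, once the schema is strict, can be sketched in TypeScript as below. This is an assumption-laden illustration rather than one of the options listed above: `CodegenOptions`, `tsTypeFor` and the flag names are hypothetical, and the policy shown only drops `null` from unmodified fields when the schema is strict and the client throws on field errors.

```typescript
// "nullable" is Type?, "unmodified" is Type, "nonNull" is Type!.
type FieldNullability = "nullable" | "unmodified" | "nonNull";

interface CodegenOptions {
  strictSchema: boolean; // the schema applies @strictNullability
  throwsOnFieldError: boolean; // the client throws when an errored field is read
}

// Maps a GraphQL field type to the TypeScript type a generator could emit.
function tsTypeFor(base: string, kind: FieldNullability, opts: CodegenOptions): string {
  if (kind === "nonNull") {
    return base;
  }
  if (kind === "nullable") {
    // Semantic null is a legitimate value, so it must stay in the type.
    return `${base} | null`;
  }
  // Unmodified: null can only mean "error" in a strict schema, and an
  // error-throwing client never hands that null to application code.
  return opts.strictSchema && opts.throwsOnFieldError ? base : `${base} | null`;
}

const strictThrowing: CodegenOptions = { strictSchema: true, throwsOnFieldError: true };
const legacy: CodegenOptions = { strictSchema: false, throwsOnFieldError: false };

const strictUsername = tsTypeFor("string", "unmodified", strictThrowing); // "string"
const legacyUsername = tsTypeFor("string", "unmodified", legacy); // "string | null"
```

A client willing to accept more risk during the migration could relax the `strictSchema` condition; the sketch deliberately picks the safest reading.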
Let’s look at the effects. Does this break things?
Say a historical schema with many clients has now adopted nullable types and the `@strictNullability` directive, what happens to backwards and forwards compatibility?
First of all, new clients no longer see “ambiguous” nulls. The schema now describes if a null is or is not semantically a valid value from the schema’s field type, and we know how to differentiate semantic null from error null (either because `Type` where null definitionally indicates an error, or `Type?` where if an error result for the field exists it is an error null, otherwise it is semantic null).
Edit after @benjie's comment
Even without knowledge of the schema, a client can accurately use the `"errors"` list in the response to know which null values represent errors and which are values, since an error null is always accompanied by an error in the list.
The application of `@strictNullability` is potentially breaking in an edge case that can be mitigated by use of Nullable types. Execution results are always unchanged for the `"data"` response, so any client which exclusively looks at this part of the response will see no change at all. After applying `@strictNullability` unmodified types must include an error in the list for a null value. Clients which consider `"errors"` in the response could see new errors if a service was invalidly returning null from a field not marked nullable.
Historical clients are unchanged because critically this has not changed the way the executor works in any way. No field which used to return a null value no longer does or vice versa. No new errors are being emitted in the errors result. Error handling behavior is unchanged. This has exclusively changed the schema to be more descriptive in how to interpret existing results.
An important subtle point is that a `@strictNullability` service may return a null value from an unmodified field type without a resulting error payload. Modern clients now know to interpret this as the field failing to resolve an error (error null) and not a semantic null. Historical clients will continue to interpret this as ambiguous null. Introducing a new error payload where there wasn't one previously would have been unsafe. Some clients throw out any result payload with any error. (Wat?! See the FAQ below)
End edit
What about forward compatibility?
In a `@strictNullability` service/schema, you might still begin by introducing a field with an unmodified type `field: Type`, and while it's still true that later changing this to `field: Type!` remains safe, once a schema is strict, later changing this to `field: Type?` is in fact not safe.
However, I am less concerned about this for two reasons:
The primary reason schema designers are tripped up by this forward compat issue is not missing semantic null, it's missing error null. They fail to anticipate future changes in their underlying architecture introducing new places for errors to occur, and this proposal includes error nulls as a possibility in the default unmodified type.
Given the proliferation of type-safe languages today (not the case in 2012) it's likely that strict nullability is a first class design consideration for anyone with this directive enabled. If it's not, well then this is an opt-in directive and this schema design "footgun" is at least one that schema owners are opting themselves into rather than being surprised by. The default without-directive state will remain `Type → Type | AmbiguousNull`, which remains fine for less sophisticated services and clients.
FAQ: Should we then continue to suggest use of NonNull (`!`)?
Yes, but far less often. It's still used sparingly but it implies something which the service guarantees will never produce a null, including an error null. That's still useful in some scenarios (obj identifiers).
But generally most will use this a lot less with a more familiar `?` available to them.
FAQ: How is it okay for a `@strictNullability` field to return `null` without a matching error in the `"errors"` array?
EDIT: This section no longer applies, but leaving here for posterity
Currently, a field returning an ambiguous null could mean one of three things:
`"errors"` array response, therefore it is certainly the result of an error.
Wait, what? How is a missing matching error possibly spec compliant?
According to the section on Handling Field Errors if a field error occurs then an error must be added to the errors list. This could happen because the resolver simply failed (threw Exception, return Result, etc), it could also happen because it returned a value that failed to coerce (was the wrong type, null for a Non-Null modifier, etc). This all implies that if a field returned the wrong type of value or failed to return at all, it is a field error and thus must have an error entry.
So how could a field returning an error null not have a matching error in the list? Well, the field resolver happened to simply return `null`, which is totally allowed by the executor and schema. It did this not because semantic null was the right value, but just because services are weird sometimes and this is how they decided to represent a failure condition. And this is allowed... and ambiguous 🤷
So what do we do about this? We have two options:
Option A: A `@strictNullability` service always produces an error for nulls
We amend Value Completion so that in strict mode, if a resolver returns `null` and it isn't explicitly a `Nullable` type, then we throw a field error.
Pros:
`null` we assume is semantic null and not valid for a strict unmodified `field: Type`)
Cons:
This introduces a new error which didn't exist before. Since lots of historical clients decided to simply reject any result which had `"errors"` and try again, it's entirely possible that the service had made this strange choice not because they didn't know better, but because they considered the failure non-fatal and safe to omit the value. If they had thrown an error instead the client would have treated it too seriously and thrown out the whole thing. This was unfortunately a common pattern for a long time.
This breaking change can be mitigated, but only with careful guidance! Since the directive isn't applied by default, adding this to the spec is definitely not breaking. BUT you can't simply add the directive and expect no breaking changes! You must first move every field resolver that returns null to be a Nullable type! If that is true, then adding the directive introduces no change and thus no breakage.
Option B: Do nothing.
No changes to the executor at all. Existing behavior persists.
Pros:
Cons:
Had we been starting from scratch, I'd definitely do option A (and I'd also not make strict mode, I'd just have done this from the start - agreeing with @dschafer's comment below). The guarantee of having error info is strictly better, and we'd just have built better clients.
But alas, I think our Guiding Principles point us to option B.
Also, while the spec can choose to do nothing, GraphQL libraries and services can always choose to be stricter than the spec itself. We've left plenty of room in allowing resolvers to be an "internal function" for GraphQL libraries to decide what is best.
I would be totally comfortable with a non-normative note in the spec suggesting that GraphQL libraries may choose option A, but for historical reasons we don't enforce it and it's still spec compliant to not.
Also, I suspect the cost of not having an error in the list guarantee is quite low. In `@strictNullability` we don't need it to know that a field has in fact failed. If a client wanted to get this guarantee back they could always fill in the gaps and produce a generic error locally that says something akin to "this field unexpectedly returned null".