InputUnion.md

RFC: GraphQL Input Union

The addition of an Input Union type has been discussed in the GraphQL community for many years now. The value of this feature has largely been agreed upon, but the implementation has not.

This document attempts to bring together all the various solutions and perspectives that have been discussed with the goal of reaching a shared understanding of the problem space.

From that shared understanding, the GraphQL Working Group aims to reach a consensus on how to address the proposal.

Notes from the 2020/5/28 meeting: https://gist.github.com/leebyron/f7f9d81c7ca5259357fab5d82a4c0621

Contributing

To help bring this idea to reality, you can contribute PRs to this RFC document.

📜 Problem Statement

GraphQL currently provides polymorphic types that enable schema authors to model complex Object types that have multiple shapes while remaining type-safe, but lacks an equivalent capability for Input types.

Over the years there have been numerous proposals from the community to add a polymorphic input type. Without such a type, schema authors have resorted to a handful of work-arounds to model their domains. These work-arounds have led to schemas that aren't as expressive as they could be, and schemas where mutations that ideally mirror queries are forced to be modeled differently.

🐕 Problem Sketch

To understand the problem space a little more, we'll sketch out an example that explores a domain from the perspective of a Query and a Mutation. However, it's important to note that the problem is not limited to mutations, since Input types are used in field arguments for any GraphQL operation type.

Let's imagine an animal shelter for our example. When querying for a list of the animals, it's easy to see how abstract types are useful - we can get data specific to the type of the animal easily.

{
  animalShelter(location: "Portland, OR") {
    animals {
      __typename
      name
      age
      ... on Cat { livesLeft }
      ... on Dog { breed }
      ... on Snake { venom }
    }
  }
}

However, when we want to submit data, we can't use an interface or union, so we must model around that.

One commonly used technique is the tagged union pattern. This essentially boils down to a "wrapper" input that isolates each type into its own field. By convention, the field name represents the type.

mutation {
  logAnimalDropOff(
    location: "Portland, OR"
    animals: [
      {cat: {name: "Buster", age: 3, livesLeft: 7}}
    ]
  )
}

Unfortunately, this opens up a set of problems, since the Tagged union input type actually contains many fields, any of which could be submitted.

input AnimalDropOffInput {
  cat: CatInput
  dog: DogInput
  snake: SnakeInput
}

This allows nonsensical mutations to pass GraphQL validation, for example representing an animal that is both a Cat and a Dog.

mutation {
  logAnimalDropOff(
    location: "Portland, OR"
    animals: [
      {
        cat: {name: "Buster", age: 3, livesLeft: 7},
        dog: {name: "Ripple", age: 2, breed: WHIPPET}
      }
    ]
  )
}

In addition, relying on this layer of abstraction means that this domain must be modelled differently across input & output. This can put a larger burden on the developer interacting with the schema, both in terms of lines of code and complexity.

// JSON structure returned from a query
{
  "animals": [
    {"__typename": "Cat", "name": "Ruby", "age": 2, "livesLeft": 9},
    {"__typename": "Snake", "name": "Monty", "age": 13, "venom": "POISON"}
  ]
}
// JSON structure submitted to a mutation
{
  "animals": [
    {"cat": {"name": "Ruby", "age": 2, "livesLeft": 9}},
    {"snake": {"name": "Monty", "age": 13, "venom": "POISON"}}
  ]
}
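
The extra client-side work can be made concrete with a small sketch. It converts objects shaped like the query result into the tagged shape the mutation expects. Python is used purely for illustration; the helper name `to_tagged_input` and the rule that the tag is the lowercased `__typename` are assumptions made for this example, not part of any proposal.

```python
# Hypothetical helper: reshape a queried object into the tagged-union
# input shape used by the workaround above.
def to_tagged_input(animal):
    # Drop the output-only "__typename" key and keep the remaining fields.
    fields = {k: v for k, v in animal.items() if k != "__typename"}
    # Assumption for this sketch: the wrapper field is the lowercased type name.
    tag = animal["__typename"].lower()
    return {tag: fields}


queried = [
    {"__typename": "Cat", "name": "Ruby", "age": 2, "livesLeft": 9},
    {"__typename": "Snake", "name": "Monty", "age": 13, "venom": "POISON"},
]
submitted = [to_tagged_input(a) for a in queried]
```

Every client that round-trips data has to carry some version of this mapping, and it has to be kept in sync with the schema by hand.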

Another approach is to use an input type with a discriminator and input fields for all possible member types.

mutation {
  logAnimalDropOff(
    location: "Portland, OR"
    animals: [
      {type: CAT, name: "Buster", age: 3, livesLeft: 7},
      {type: DOG, name: "Ripple", age: 2, breed: WHIPPET}
    ]
  )
}

input AnimalDropOffInput {
  type: AnimalType!
  name: String!
  age: Int!
  breed: DogBreed # only applies when type = DOG
  livesLeft: Int # only applies when type = CAT
  venom: VenomType # only applies when type = SNAKE
}

This results in more consistent modeling between input & output but still allows nonsensical inputs to pass GraphQL validation.
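
Because GraphQL validation cannot express the "only applies when" comments, a server using this pattern typically has to enforce them at run time. The following is a minimal sketch of such a check, assuming the field-to-type mapping implied by the comments in `AnimalDropOffInput`; the names `TYPE_SPECIFIC_FIELDS` and `check_animal` are hypothetical.

```python
# Assumed mapping, read off the comments in AnimalDropOffInput above.
TYPE_SPECIFIC_FIELDS = {
    "DOG": {"breed"},
    "CAT": {"livesLeft"},
    "SNAKE": {"venom"},
}


def check_animal(animal):
    """Return error messages for fields that do not apply to animal['type']."""
    allowed = TYPE_SPECIFIC_FIELDS[animal["type"]]
    all_specific = set().union(*TYPE_SPECIFIC_FIELDS.values())
    errors = []
    # Any type-specific field that belongs to a different type is an error.
    for field in sorted(all_specific - allowed):
        if animal.get(field) is not None:
            errors.append(
                f"'{field}' does not apply when type = {animal['type']}"
            )
    return errors
```

For example, `check_animal({"type": "CAT", "name": "Buster", "age": 3, "breed": "WHIPPET"})` reports that `breed` does not apply to a cat, an error that only surfaces after the operation has already passed GraphQL validation.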

Another common approach is to provide a unique mutation for every type. A schema employing this technique might have logCatDropOff, logDogDropOff and logSnakeDropOff mutations. This removes the potential for modeling nonsensical situations, but it explodes the number of mutations in a schema, making the schema less accessible. If the type is nested inside other inputs, this approach simply isn't feasible.

These workarounds only get worse at scale. Real world GraphQL schemas can have dozens if not hundreds of possible types for a single Interface or Union.

The goal of the Input Union is to bring a polymorphic type to Inputs. This would enable us to model situations where an input may be of different types in a type-safe and elegant manner, like we can with outputs.

mutation {
  logAnimalDropOff(
    location: "Portland, OR"

    # Problem: we need to determine the type of each Animal
    animals: [
      # This is meant to be a CatInput
      {name: "Buster", age: 3, livesLeft: 7},

      # This is meant to be a DogInput
      {name: "Ripple", age: 2}
    ]
  )
}

In this mutation, we encounter the main challenge of the Input Union - we need to determine the correct type of the data submitted.

A wide variety of solutions have been explored by the community, and they are outlined in detail in this document under Possible Solutions.

🎨 Prior Art

Many other technologies provide polymorphic types, and have done so using a variety of techniques.

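| Tech | Type | Read | Write |
|---|---|---|---|
| GraphQL | Union | | |
| Protocol Buffers | Oneof | | |
| FlatBuffers | Union | | |
| Cap'n Proto | Union | | |
| Thrift | Union | | |
| Avro | Union | | |
| OpenAPI 3 | oneOf | | |
| JSON Schema | oneOf | | |
| TypeScript | Union | | |
| TypeScript | Discriminated Union | | |
| Rust | Enum | | |
| Swift | Enumeration | | |
| Haskell | Algebraic data types | | |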

The topic has also been extensively explored in Computer Science more generally.

There are also libraries that mimic this functionality in GraphQL:

🛠 Use Cases

There have been a variety of use cases described by users asking for an abstract input type.

📋 Solution Criteria

This section sketches out the potential goals that a solution might attempt to fulfill. These goals will be evaluated with the GraphQL Spec Guiding Principles in mind:

  • Backwards compatibility
  • Performance is a feature
  • Favor no change
  • Enable new capabilities motivated by real use cases
  • Simplicity and consistency over expressiveness and terseness
  • Preserve option value
  • Understandability is just as important as correctness

Each criterion is identified with a letter so it can be referenced in the rest of the document. New criteria must be added to the end of the list.

Solutions are evaluated and scored using a simple 3 part scale. A solution may have multiple evaluations based on variations present in the solution.

  • ✅ Pass. The solution clearly meets the criterion
  • ⚠️ Warning. The solution doesn't clearly meet or fail the criterion, or there is an important caveat to passing the criterion
  • 🚫 Fail. The solution clearly fails the criterion
  • ❔ The criterion hasn't been evaluated yet

Passing or failing a specific criterion is NOT the final word. Both the Criteria and the Solutions are up for debate.

Criteria have been given a "score" according to their relative importance in solving the problem laid out in this RFC while adhering to the GraphQL Spec Guiding Principles. The scores are:

  • 🥇 Gold - A must-have
  • 🥈 Silver - A nice-to-have
  • 🥉 Bronze - Not necessary

🎯 A. GraphQL should contain a polymorphic Input type

The premise of this RFC - GraphQL should contain a polymorphic Input type.

1 2 3 4 5 6 7

Criteria score: 🥇

🎯 B. Input polymorphism matches output polymorphism

Any data structure that can be modeled with output type polymorphism should be able to be mirrored with Input polymorphism. Minimal transformation of outputs should be required to send a data structure back as inputs.

  • ✂️ Objection: composite input types and composite output types are distinct. Fields on composite output types support aliases and arguments whereas fields on composite input types do not. Marking an output field as non-nullable is a non-breaking change, but marking an input field as non-nullable is a breaking change.
1 2 3 4 5 6 7
⚠️ ⚠️ 🚫 ⚠️

Criteria score: 🥇

🎯 C. Doesn't inhibit schema evolution

The GraphQL specification mentions the ability to evolve your schema as one of its core values: https://graphql.github.io/graphql-spec/draft/#sec-Validation.Type-system-evolution

Adding a new member type to an Input Union, or making any non-breaking change to an existing member type, does not result in a breaking change. For example, adding a new optional field to a member type or changing a field from non-nullable to nullable does not break previously valid client operations.

1 2 3 4 5 6 7
⚠️ 🚫 ⚠️

Criteria score: 🥇

🎯 D. Any member type restrictions are validated in schema

If a solution places any restrictions on member types, compliance with these restrictions should be fully validated during schema building (analogous to how interfaces enforce restrictions on member types).

1 2 3 4 5 6 7

Criteria score: 🥇

🎯 E. A member type may be a Leaf type

In addition to Input Object types, the member types of an Input Union may also include Leaf types such as Scalars or Enums.

  • ✂️ Objection: multiple Leaf types serialize the same way, making it impossible to distinguish the type without additional information. For example, a String, ID and Enum.
    • Potential solution: only allow a single built-in leaf type per input union.
  • ✂️ Objection: Output polymorphism is restricted to Object types only. Supporting Leaf types in Input polymorphism would create a new inconsistency.
1 2 3 4 5 6 7
🚫 🚫 ⚠️ 🚫

Criteria score: 🥉

🎯 F. Migrating a field to a polymorphic input type is non-breaking

Since the input object type is now a member of the input union, existing input objects being sent through should remain valid.

Example: Relay Mutation

# From
input I { x: String }
# To (pseudocode)
input union IU = I | { y: Int }
  • ✂️ Objection: achieving this by indicating the default in the union (either explicitly or implicitly via the order) is undesirable as it may require multiple equivalent unions being created where only the default differs.
  • ✂️ Objection: Numerous changes to a schema currently introduce breaking changes. The possibility of a breaking change isn't a breaking change and shouldn't prevent a polymorphic input type from existing.
1 2 3 4 5 6 7
⚠️ ⚠️ ⚠️ 🚫 🚫

Criteria score: 🥉

🎯 G. Input unions may include other input unions

To ease development.

  • ✂️ Objection: Adds complexity without enabling any new use cases.
1 2 3 4 5 6 7

Criteria score: X (not considered)

🎯 H. Input unions should accept plain data

Clients should be able to pass "natural" input data to unions without specially formatting it or adding extra metadata.

In other words: data should require minimal or no transformation and metadata over the wire.

  • ✂️ Objection: This is a matter of taste - legitimate Prior Art exists that requires formatting / extra metadata.
1 2 3 4 5 6 7
⚠️ ⚠️ ⚠️ ⚠️

Criteria score: 🥉

🎯 I. Input unions should be easy to upgrade from existing solutions

Many people in the wild are solving the need for input unions with validation at run-time (e.g. using the "tagged union" pattern). Formalising support for these existing patterns in a non-breaking way would enable existing schemas to become retroactively more type-safe.

Note: This criterion is similar to F. Migrating a field to a polymorphic input type is non-breaking

# From
input I { x: String, y: Int }
# To (pseudocode)
input union IU = { x: String } | { y: Int }
  • ✂️ Objection: The addition of a polymorphic input type shouldn't depend on the ability to change the type of an existing field or an existing usage pattern. One can always add new fields that leverage new features.
  • ✂️ Objection: May break variable names? Only avoided with care
  • ✂️ Objection: There are different ways people are working around the lack of input unions so it likely won't be feasible to come up with a non-breaking migration path for all of them.
1 2 3 4 5 6 7
⚠️

Criteria score: 🥉

🎯 J. A GraphQL schema that supports input unions can be queried by older GraphQL clients

Preferably without a loss of or change in previously supported functionality.

1 2 3 4 5 6 7

Criteria score: 🥇

🎯 K. Input unions should be expressed efficiently in the query and on the wire

The less typing and fewer bytes transmitted, the better.

(Not Related to B/H)

  • ✂️ Objection: The quantity of "typing" isn't a worthwhile metric, most interactions with an API are programmatic.
  • ✂️ Objection: Simply compressing an HTTP request will reduce the bytes transmitted more than anything having to do with the structure of a Schema.
1 2 3 4 5 6 7
⚠️

Criteria score: 🥉

🎯 L. Input unions should be performant for servers

Ideally a server does not have to do much computation to determine which concrete type is represented by an input.

1 2 3 4 5 6 7
⚠️ ⚠️

Criteria score: 🥉

🎯 M. Existing SDL parsers are backwards compatible with SDL additions

Common tools that parse GraphQL SDL should not fail when pointed at a schema which supports polymorphic input types.

  • ✂️ Objection: Evolution of the SDL is expected with new features.
  • ✂️ Objection: SDL syntax error can be a positive as a "fail fast" if a system doesn't know about input unions.
1 2 3 4 5 6 7
🚫 🚫 🚫 🚫 🚫

Criteria score: X (rejected)

🎯 N. Existing code generated tooling is backwards compatible with Introspection additions

For example, GraphiQL should successfully render when pointed at a schema which contains polymorphic input types. It should continue to function even if it can't support the polymorphic input type.

1 2 3 4 5 6 7
⚠️ ⚠️ ⚠️ ⚠️ ⚠️

Criteria score: 🥈

🎯 O. Unconstrained combination of input types to unions

It should be possible to combine existing or new input types to unions freely and with ease. Adding an input to one or more unions should not require extraneous changes, constrain or be constrained by schema design.

1 2 3 4 5 6 7
✅️ 🚫️ 🚫

Criteria score: 🥇

🎯 P. Error states and messages should be clear and helpful

Complex algorithms can make it difficult to write error messages that are helpful and clear. When an invalid schema or invalid query is used, it should be obvious what went wrong and how to fix it.

1 2 3 4 5 6 7
✅️ ✅️ ⚠️ 🚫

Criteria score: 🥉

🎯 Q. No new polymorphic type construct should be introduced

The lack of polymorphism on input is only a side effect of having two different type systems for input and output, a somewhat confusing GraphQL peculiarity (all mainstream programming languages and API protocols use the same types for input and output). Adding a new construct for polymorphism support on input 'smells' like increasing confusion, and would widen the gap between the input and output type systems rather than reduce it.

1 2 3 4 5 6 7
🚫️ 🚫️ 🚫️ 🚫️

Criteria score: 🥇

🎯 P. Validation rule should produce easy to understand error message

An implementation of the validation rules should be able to produce an easy-to-understand error for a value that is invalid according to the definition of the input union. This is critical for developer experience, since GraphiQL, IDEs and similar tools will surface this error during development.

1 2 3 4 5
✅️ ✅️ 🚫 🚫

🚧 Possible Solutions

The community has imagined a variety of possible solutions, synthesized here.

Each solution is identified with a number so it can be referenced in the rest of the document. New solutions must be added to the end of the list.

💡 1. Explicit __typename Discriminator field

Champion: @eapache

This solution was discussed in graphql/graphql-spec#395

input CatInput {
  name: String!
  age: Int
  livesLeft: Int
}
input DogInput {
  name: String!
  age: Int
  breed: DogBreed
}

inputunion AnimalInput = CatInput | DogInput

type Mutation {
  logAnimalDropOff(location: String, animals: [AnimalInput!]!): Int
}

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      __typename: "CatInput",
      name: "Buster",
      livesLeft: 7
    }
  ]
}

🎲 Variations

  • A default type may be defined, for which specifying the __typename is not required. This enables a field to migrate from an Input to an Input Union.

  • The discriminator field may be __inputname to differentiate from an Output's __typename
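
As a rough illustration of how a server might pick the member type under this solution, including the default-type variation, consider the sketch below. It is written in Python purely for illustration; `MEMBER_TYPES`, `resolve_member_type` and `default_type` are hypothetical names, and no existing GraphQL library is assumed to expose such a function.

```python
# Members of the AnimalInput union from the example above.
MEMBER_TYPES = {"CatInput", "DogInput"}


def resolve_member_type(value, default_type=None):
    """Pick the concrete member type for one input union value.

    `default_type` models the variation where __typename may be omitted.
    """
    type_name = value.get("__typename", default_type)
    if type_name is None:
        raise ValueError("__typename is required for AnimalInput")
    if type_name not in MEMBER_TYPES:
        raise ValueError(f"{type_name!r} is not a member of AnimalInput")
    return type_name
```

With a default of `"DogInput"`, a value such as `{"name": "Ripple"}` that was valid before the field became an input union would still resolve, which is the migration path the first variation describes.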

⚖️ Evaluation

💡 2. Explicit configurable Discriminator field

Champion: @binaryseed

A configurable discriminator field enables schema authors to model type discrimination into their schema more naturally.

A schema author may choose to add their chosen type discriminator field to output types as well to completely mirror the structure in a way that enables sending data back and forth through input & output with no transformations.

The mechanism for configuring the discriminator field is open to debate; in this example it is represented by a schema directive.

🎲 Variations

  • Value is an Enum literal

This variation is derived from discussions in graphql/graphql-spec#488

enum AnimalSpecies {
  CAT
  DOG
}

input CatInput {
  species: AnimalSpecies::CAT
  # ...
}
input DogInput {
  species: AnimalSpecies::DOG
  # ...
}

inputunion AnimalInput @discriminator(field: "species") =
  | CatInput
  | DogInput

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      species: "CAT",
      name: "Buster",
      livesLeft: 7
    }
  ]
}
  • Value is a String literal
input CatInput {
  species: "Cat"
  # ...
}
input DogInput {
  species: "Dog"
  # ...
}

inputunion AnimalInput @discriminator(field: "species") =
  | CatInput
  | DogInput

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      species: "Cat",
      name: "Buster",
      livesLeft: 7
    }
  ]
}
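
One consequence of a configurable discriminator is that the literal values must be unique across member types, which is the kind of restriction criterion D asks to be checked at schema-build time. The sketch below shows one way that check and the subsequent lookup could work. It is an illustrative Python model only: `build_discriminator_index` and `resolve` are hypothetical names, and the member types are reduced to their discriminator literal.

```python
def build_discriminator_index(members):
    """Map each discriminator literal to its member type.

    `members` maps type name -> literal value of the discriminator field.
    Raises if two member types declare the same literal.
    """
    index = {}
    for type_name, literal in members.items():
        if literal in index:
            raise ValueError(
                f"{type_name} and {index[literal]} share discriminator {literal!r}"
            )
        index[literal] = type_name
    return index


# Built once, when the schema is constructed.
INDEX = build_discriminator_index({"CatInput": "CAT", "DogInput": "DOG"})


def resolve(value, field="species"):
    """Look up the member type from the configured discriminator field."""
    return INDEX[value[field]]
```

Resolving a value is then a single dictionary lookup on the `species` field, regardless of how many member types the union has.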

⚖️ Evaluation

💡 3. Order based discrimination

Champion: @leebyron

The concrete type is the first type in the input union definition that matches.

input CatInput {
  name: String!
  age: Int
  livesLeft: Int
}
input DogInput {
  name: String!
  age: Int
  breed: DogBreed
  owner: ID
}

inputunion AnimalInput = CatInput | DogInput

type Mutation {
  logAnimalDropOff(location: String, animals: [AnimalInput!]!): Int
}

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      name: "Buster",
      age: 3
      # => CatInput
    },
    {
      name: "Buster",
      age: 3,
      breed: "WHIPPET"
      # => DogInput
    }
  ]
}
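
The proposal does not define "matches" precisely in this document, so the following sketch makes an assumption: a value matches a member type when every required field is present and every provided field is declared on that type (field value types are ignored). Under that assumption, order-based discrimination could look like this in Python; `ANIMAL_INPUT` and `first_match` are hypothetical names.

```python
# Each member, in union order: (type name, required fields, optional fields).
ANIMAL_INPUT = [
    ("CatInput", {"name"}, {"age", "livesLeft"}),
    ("DogInput", {"name"}, {"age", "breed", "owner"}),
]


def first_match(value, members=ANIMAL_INPUT):
    """Return the first member type the value matches, or None."""
    keys = set(value)
    for type_name, required, optional in members:
        # Assumed matching rule: all required fields present,
        # and no field that the type does not declare.
        if required <= keys and keys <= required | optional:
            return type_name
    return None
```

A value with only `name` and `age` matches both types and resolves to `CatInput` because it is listed first; adding `breed` rules out `CatInput`, so the value resolves to `DogInput`. This also shows why reordering members or adding fields to an earlier member can change how existing inputs are interpreted.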

⚖️ Evaluation

💡 4. Structural uniqueness

Schema Rule: Each type in the union must have a unique set of required field names

input CatInput {
  name: String!
  age: Int
  livesLeft: Int!
}
input DogInput {
  name: String!
  age: Int
  breed: DogBreed!
}

inputunion AnimalInput = CatInput | DogInput

type Mutation {
  logAnimalDropOff(location: String, animals: [AnimalInput!]!): Int
}

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      name: "Buster",
      age: 3,
      livesLeft: 7
      # => CatInput
    },
    {
      name: "Buster",
      breed: "WHIPPET"
      # => DogInput
    }
  ]
}

An invalid schema:

input CatInput {
  name: String!
  age: Int!
  livesLeft: Int
}
input DogInput {
  name: String!
  age: Int!
  breed: DogBreed
}
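
A literal reading of the schema rule can be checked mechanically when the schema is built. The sketch below is a minimal Python model of that check, assuming each member type is reduced to its set of required field names; `check_structural_uniqueness` is a hypothetical name and the field types are not considered.

```python
def check_structural_uniqueness(members):
    """Return pairs of member types whose required field names are identical.

    `members` maps type name -> set of required field names.
    """
    seen = {}
    conflicts = []
    for type_name, required in members.items():
        key = frozenset(required)
        if key in seen:
            conflicts.append((seen[key], type_name))
        else:
            seen[key] = type_name
    return conflicts


# The valid schema above: the required sets differ.
valid = {"CatInput": {"name", "livesLeft"}, "DogInput": {"name", "breed"}}
# The invalid schema above: both types require exactly name and age.
invalid = {"CatInput": {"name", "age"}, "DogInput": {"name", "age"}}
```

Here `check_structural_uniqueness(valid)` returns an empty list, while the invalid schema yields the conflicting pair `("CatInput", "DogInput")`.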

🎲 Variations

  • Consider the field type along with the field name when determining uniqueness.

⚖️ Evaluation

💡 5. One Of (Tagged Union)

Champion: @benjie

This solution was presented in:

The type is discriminated using features already available in GraphQL, with an intermediate input type that acts to "tag" the field.

A proposed directive would specify that only one of the fields in an input type may be provided. This provides schema-level validation instead of relying on a runtime error to express the restriction.

input CatInput {
  name: String!
  age: Int!
  livesLeft: Int
}
input DogInput {
  name: String!
  age: Int!
  breed: DogBreed
}

input AnimalInput @oneOf {
  cat: CatInput
  dog: DogInput
}

type Mutation {
  logAnimalDropOff(location: String, animals: [AnimalInput!]!): Int
}

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      cat: {
        name: "Buster",
        livesLeft: 7
      }
    }
  ]
}

⚖️ Evaluation

Summary of spec changes

  • SDL: enable use of @oneOf directive on input object type definitions
  • Introspection: add requiresExactlyOneField: Boolean field to __Type type
  • Schema validation: all fields on a @oneOf input type must be nullable, and must not have defaults
  • Operation validation: when validating a @oneOf input object, assert that exactly one field was specified

The full spec changes can be seen here.
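
The two validation rules in the summary can be sketched as follows. This is an illustrative Python model, not the proposed spec text: the dictionary representation of fields and the names `validate_one_of_schema` and `validate_one_of_value` are assumptions, and the handling of explicit `null` values is deliberately left out.

```python
def validate_one_of_schema(fields):
    """Schema validation: every field must be nullable with no default.

    `fields` maps field name -> {"type": str, "has_default": bool}.
    """
    errors = []
    for name, spec in fields.items():
        if spec["type"].endswith("!"):
            errors.append(f"{name}: must be nullable")
        if spec.get("has_default", False):
            errors.append(f"{name}: must not have a default")
    return errors


def validate_one_of_value(value):
    """Operation validation: exactly one field must be specified."""
    return len(value) == 1
```

Applied to the `AnimalInput` example, a value containing only `cat` passes, while a value containing both `cat` and `dog` (the nonsensical case from the Problem Sketch) is rejected before execution.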

💡 6. Pending

Calls within the Input Union working group proposed a new solution, solution 6, which is a combination of features from solutions 1-4. It has not been fully formalized yet as the working group felt that the Tagged Type was more promising at this stage. This section is left as a placeholder for solution 6 to be formally evaluated at a later time.

For some of the notes we took during the calls, see: #426 (comment)

For the calls themselves, see: https://www.youtube.com/watch?v=u2dnnpKEHZM&list=PLP1igyLx8foH4M0YAbVqpSo2fS1ElvNVD

💡 7. Tagged Type

Champion: @benjie

This solution was presented in:

It is the spiritual successor to Solution 5 (the @oneOf directive), following extensive feedback from the Input Unions working group.

In this solution, a new type is introduced to the GraphQL type system: the tagged type. The tagged type has two forms: a tagged input (valid only in inputs) and a tagged output (valid only in outputs), but the definitions look identical otherwise.

These tagged types define a list of member fields, exactly one of which must be present.

input CatInput {
  name: String!
  age: Int!
  livesLeft: Int
}
input DogInput {
  name: String!
  age: Int!
  breed: DogBreed
}

tagged input AnimalInput {
  cat: CatInput!
  dog: DogInput!
}

type Mutation {
  logAnimalDropOff(location: String, animals: [AnimalInput!]!): Int
}

# Variables:
{
  location: "Portland, OR",
  animals: [
    {
      cat: {
        name: "Buster",
        livesLeft: 7
      }
    }
  ]
}

There is controversy over whether the tagged output should be introduced; more details can be found in graphql/graphql-spec#733.

⚖️ Evaluation

Summary of spec changes

  • SDL: introduce new tagged input and tagged output definitions, including member fields
  • Introspection: add new type and __Type.memberFields field to relate to these fields, and __Type.isInputType/__Type.isOutputType fields to differentiate input versus output tagged types
  • Schema validation: tagged types must contain only types that are compatible (matching input or output) and must contain at least one field
  • Operation validation: when validating a tagged input type, assert that exactly one field was specified

The full spec changes can be seen here.

🏆 Evaluation Overview

A quick glance at the evaluation results. Remember that passing or failing a specific criterion is NOT the final word.

1 2 3 4 5 6 7
A 🥇 ?
B 🥇 ⚠️ ⚠️ 🚫 ? ⚠️
C 🥇 ⚠️ 🚫 ⚠️ ?
D 🥇 ?
E 🥉 🚫 🚫 ⚠️ 🚫 ?
F 🥉 ⚠️ ⚠️ ⚠️ 🚫 ? 🚫
G 🥉 ?
H 🥉 ⚠️ ⚠️ ⚠️ ? ⚠️
I 🥉 ⚠️ ?
J 🥇 ?
K 🥉 ⚠️ ?
L 🥉 ⚠️ ⚠️ ?
M 🥈 🚫 🚫 🚫 🚫 ? 🚫
N 🥈 ⚠️ ⚠️ ⚠️ ⚠️ ? ⚠️
O 🥈 ✅️ 🚫️ 🚫 ?
P 🥉 ✅️ ✅️ ⚠️ 🚫
Q 🥉 🚫 🚫 ✅️ 🚫 🚫

☑️ Decision Time!

Following meetings of the GraphQL Input Unions working group, Solution 7 was proposed as an evolution of Solution 5, and is currently the leading solution.

According to a simple weight ranking, here are the solutions in order: