Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defer loading types until they're needed #4974

Closed
wants to merge 16 commits into from
Closed

Conversation

rmosolgo
Copy link
Owner

@rmosolgo rmosolgo commented May 27, 2024

GraphQL-Ruby could reduce application boot time by not loading the whole schema upfront. Instead, it could load the schema as-needed.

To opt into lazy-loaded types:

  • Add load_types_as_needed to the top of your Schema class (before any types are configured), for example:

    class MySchema < GraphQL::Schema
      load_types_as_needed if Rails.env.development? 
      # ... 
    end 
  • Use strings to identify your root types (query, mutation, and/or subscription) and orphan_types, using fully-qualified Ruby constant names. For example:

    - query Types::Query 
    + query "Types::Query" 

    (If you use Ruby constant literals instead of strings, then Rails will load the constants right away. When you use strings, GraphQL-Ruby will Object.const_get(...) the type definition when it needs it.)


At a minimum, it could load the whole schema the first time it's used. Ideally, it would load only the bits of schema needed for a given query. (Some queries would require the whole schema to be loaded.)

On the other hand, it's better for a production server to have everything loaded once, and before any requests are received (and before any forking, to avoid duplicates in child process memory).

TODO:

  • Find tests that broke and fix them
  • Add tests for internal Schema state not being loaded until necessary (maybe add an unload-type method for testing)
  • Migration path: make lazy-loading opt-in in development to avoid nasty surprises
  • Identify calls to Schema. methods which require build_types_hash, reduce usage of them I think this can be a second step
  • Add some APIs to GraphQL::Schema so that root types and orphan types can be registered without causing those files to be loaded in Rails
  • Add documentation
  • Update possible_types to use type classes, not strings, as keys will address in Schema type system clean-up #4975
  • Check benchmarks to make sure that checking whether types have been loaded isn't too slow

Fixes #4919

Copy link
Contributor

@gmcgibbon gmcgibbon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, but I think there might be some use-cases we should consider adding support for:

  • I'd like to support procs as well as strings for those using type checking in their projects. This makes it harder for to derive the name of a type before it is loaded, but it is somewhat easy to implement on top of this patch.

  • This patch currently eager loads all types when a query is executed, and I think we can do better. The current approach doesn't offer much benefit over plain old autoloading of a schema in a Rails app since you reference the constant when you make a query. It would be interesting to see if we could selectively eager load a specific root type depending on what field is being asked for in a query. I tried implementing a lightweight patch for this here but it doesn't cover type loading cases that don't have query contexts and seems a bit brittle.

  • When a schema is defined, you can use the use method to add an extension that might call the query, mutation, or subscription method of the schema. With this patch, they are still strings (or procs if you apply my patch above), so I wonder if it would make sense to implement some kind of callback mechanism to wait until the appropriate root type is loaded and types added.

  • I tried taking selective root type loading a step further by decoupling root type loading from type adding and traversal in gmcgibbon@ea86211, but we have internal API cases where we call schema.query and schema.mutation in places like warden initialization. I'm not sure if these can be easily refactored, I'll have to look into it.

@rmosolgo
Copy link
Owner Author

Thanks for sharing all this thoughtful feedback ... Still quite a few things to figure out 😅

Procs are definitely 👍 from me. I'll make sure to include those in the final implementation.

doesn't offer much benefit over plain old autoloading of a schema

I guess I was thinking that it would improve cases when you boot the app in development and use non-GraphQL features. But in that case, if your schema is autoloaded, you wouldn't've loaded it anyways. I guess I was thinking back to my old GitHub days, where the schema was always loaded at boot, regardless, but I think that was an application setup issue anyways 😖 . Yes, I can see how this would have to be pushed a little further to be useful for the normal case.

use will be tough. I definitely want everything to keep Just Works ™️-ing . One possibility -- I almost hate to mention it -- is to capture the inputs to use and only actually install those plugins when the types are loaded. We'd probably need incrementally-loaded root types for that to be useful too: if the plugin calls schema.mutation, then it has to return the real type -- but maybe it could somehow not have loaded its whole dependency tree... (Would that mean replacing all type/resolver references with Procs? Yuck.)

Maybe Just Works ™️ isn't an option -- maybe future-compat via callbacks is the way to go, like you suggested.

Decoupling traversal would be huge -- I don't quite have a vision for how that'd really work yet, because given a query like { node(id: 5) { ... on Thing { name } } }, I have to somehow find Thing, which may be attached to the schema in many different ways. (The "easiest" solution is a new hook, Schema.load_type(name), and make applications define it, but adding that hiccup to adoption seems bad.)

@gmcgibbon
Copy link
Contributor

Maybe Just Works ™️ isn't an option -- maybe future-compat via callbacks is the way to go, like you suggested.

The way I've always approached large refactors like this is to make it work first, and then make the backwards compatibility happen later. Once we have a better idea of all the changes we need to make, we can worry about compat IMO.

Further on the warden type traversal problem I was mentioning, I found that the way we call this method makes it difficult to lazily load root types in general because we have to ask every type we query if it is a root type here.

I'm switching gears and trying to make shallow schema type additions work like I mentioned here. This way, we can keep the way type resolution works and eliminate the root problem of a root type needing to know about all of its nested types when it is loaded.

Thank you so much for your help with this, I really appreciate it!

@rmosolgo
Copy link
Owner Author

That root_type? method could definitely be refactored somehow ... it was done that way because it's easy with the current implementation, but that might change!

@rmosolgo
Copy link
Owner Author

This was a worthwhile exercise (for me anyways) but not the right step forward, so I'm closing it 🍻

@rmosolgo rmosolgo closed this Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lazily load types and fields in a schema
2 participants