This approach is often sufficient; however, it suffers from two major problems:

1. The recursive nature of GraphQL means that a single request can lead to many database requests.
For example, if a request pulls in 100 results, and each of those results calls a resolver that
requires a further lookup in another collection, then that single query will result in 101 database fetches.
It's easy to see how we can quickly reach hundreds of lookups for a single Apollo query.

2. The second problem, and the most difficult one to solve, occurs when we need to fetch data from a
primary collection, join with a secondary collection and then sort (or filter) the results based on a
field in that *second* collection.

A typical scenario would occur when fetching data from multiple collections to display in a table where
the user can click a column to change the sort order. In this case it's not sufficient to perform a simple
Mongo `find` on the top-level collection (since the sub-fields won't be available to sort on), and as a
result it just isn't possible to use the simplified GraphQL approach of fetching the joins using resolvers.
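
To make the first problem concrete, here's the naive per-field approach (a sketch, assuming mongoose-style
collections): every fetched user triggers its own extra query.

```
// Naive resolver map: `company` runs once per fetched user,
// so a page of 100 users costs 1 + 100 database round trips.
const User = {
  company: ({ companyId }) => CompaniesCollection.findOne(companyId),
};
```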

Both of these issues can be solved by performing a *single* Mongo aggregation that fetches all the data in one go,
performing lookups on the related collections, so that we can then sort or filter on any field in the result.

*Apongo* does all the heavy lifting for you:

1. It analyses the `resolveInfo` data passed to the top-level resolver in order to extract the hierarchy of
fields that have been requested. It does this to ensure that it only performs the joins required for the
actual query.

2. From this information it builds a __single__ Mongo aggregation pipeline that recursively performs joins
for the other tables used in the request (using `$lookup`).
2. From this information it builds a __single__ Mongo aggregation pipeline that recursively performs lookups
for the other collections used in the request.

You can then include the pipeline as part of a larger aggregation pipeline that sorts and filters the result.
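
For example, sorting on a field from a joined collection becomes an ordinary `$sort` stage (a sketch —
`company.name` assumes the `company` lookup shown later in this README and a `name` field on the joined document):

```
const pipeline = [
  // Apongo's generated $lookup/$unwind stages come first...
  ...createPipeline(null, resolveInfo, context),
  // ...so later stages can reference joined fields directly.
  { $sort: { 'company.name': 1 } },
];
```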


## Specifying the Joins

Apongo needs to know which fields are joins, and how to join them. In order to make this both easy to specify and declarative,
a custom GraphQL directive, `@apongo`, is used to specify this information directly in the types declaration. Here's an example:

```
type User {
  _id: String!
  company: Company @apongo(lookup: { collection: "companies", localField: "companyId", foreignField: "_id" })
}

type Query {
  users(limit: Int): [User]
}
```

## Writing the Resolvers

In your resolvers you'll call `createPipeline` to create the aggregation pipeline:

```
import { createPipeline } from 'apongo';

...

const users = (_, { limit = 20 }, context, resolveInfo) => {
  // Create a pipeline to first perform any initial matching, then do the lookups, and finally limit the results
  const pipeline = [
    // Perform any initial matching that you need.
    // This would typically depend on the parameters passed to the query.
    { $match: { type: 'client' } },

    // Include all the pipeline stages generated by Apongo to do the lookups.
    // We pass `null` since the `users` query is mapped directly to the result
    // of an aggregation on the Users collection.
    ...createPipeline(null, resolveInfo, context),

    // Filter, sort or limit the result.
    { $limit: limit },
  ];

  // How you call Mongo will depend on your code base. You'll need to pass your pipeline to Mongo's aggregate.
  // This is how you'd do it using `mongoose`:
  return UsersCollection.aggregate(pipeline);
};
```

`createPipeline` is called with three parameters:

| Parameter | Description
| --------------- | -----------
| _mainFieldName_ | The name of the field containing the result of the aggregation, or `null` if the entire query is the result of an aggregation over a specific collection. See below.
| _resolveInfo_ | The `resolveInfo` passed to your resolver
| _context_ | The `context` passed to your resolver

This function will analyse the query and construct an aggregation pipeline containing the required lookups.

In the example above, the `users` query directly returns the result of an aggregation over the `Users` collection.
If the GraphQL request includes the `company` field then Apongo will fetch the data from the `Companies` collection
using `$lookup`, adding stages like these to the pipeline:

```
[
  {
    '$lookup': {
      from: 'companies',
      localField: 'companyId',
      foreignField: '_id',
      as: 'company'
    }
  },
  {
    '$unwind': { path: '$company', preserveNullAndEmptyArrays: true }
  },
]
```

By default `createPipeline` assumes that the fields in the current GraphQL request map directly to the result of
the aggregation. Sometimes, however, the aggregation's result needs to be returned in a *field* of the response.
A typical example is pagination, where a query returns the total count alongside the results:

```
type PaginatedUsers {
  users: [User]
  count: Int
}

type Query {
  paginatedUsers: PaginatedUsers!
}
```

Here, the `paginatedUsers` resolver should return two fields, `count` and `users`. `users` needs to be the result of
an aggregation on the `Users` collection, so we need to tell `createPipeline` this by passing it the field name:


```
...createPipeline('users', resolveInfo, context),
```

See below for more information about handling pagination.

### The *lookup* request

The `lookup` request accepts a number of fields:

| Parameter | Description
| --------------------------- | -----------
| _collection_ | The name of the collection to look up
| _localField_ | The name of the local field used by the $lookup
| _foreignField_ | The name of the foreign field used by the $lookup
| _preserveIfNull_ (Optional) | Boolean to determine if the parent should be kept if no join is found (default: `true`)
| _conds_ (Optional) | A *stringified* JSON array of additional conditions used by the lookup
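
For example, to drop users whose company can't be found, rather than keeping them with a missing `company`,
a sketch reusing the lookup from above could set `preserveIfNull` to `false`:

```
type User {
  company: Company @apongo(lookup: {
    collection: "companies",
    localField: "companyId",
    foreignField: "_id",
    preserveIfNull: false
  })
}
```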

Sometimes your lookup will need extra conditions to perform the join between the two collections. Mongo's `$lookup`
command has an advanced feature that allows us to use a sub-pipeline within the primary lookup. Apongo uses this feature to
allow us to supply an array of extra conditions that are used when matching the collection.

Internally, this is what gets added to the sub-pipeline within the `$lookup`:

```
$and: [
  { $eq: [`$${foreignField}`, '$$localField'] }, // Match on the keys
  ...JSON.parse(apongo.lookup.conds), // Extra conditions specified in the directive
],
```

The `conds` needs to be a JSON array, but we have to stringify it in order to pass it to the directive in the types file.

Here's an example:
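
For illustration, suppose each user has tasks stored in a hypothetical `tasks` collection keyed by `userId` with a
`status` field, and we only want to join the active ones:

```
type User {
  activeTasks: [Task] @apongo(lookup: {
    collection: "tasks",
    localField: "_id",
    foreignField: "userId",
    conds: "[{ \"$eq\": [\"$status\", \"active\"] }]"
  })
}
```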

### The *compose* request

The `compose` request lets you build a field by composing strings and fields from the joined collections.

This is useful when you need to sort or filter on a composed field as part of your pipeline.

Note that Apongo takes care of replacing fields accessed by `$` with the full path to that field following any lookups.

### The *expr* request

Wherever you need to access a field using `$` you should include the token `@path`, which Apongo replaces with the
full path to that field within the pipeline.

## Notes

1. The `@apongo` directives are only processed by resolvers that use `createPipeline` to build an
aggregation pipeline. They are ignored by all other resolvers.

2. It's very important to understand that resolvers are __always__ called, even for fields which have already
been fetched by `createPipeline`. In our example above, if we provide a `company` resolver for the User type
then it will be called for each fetched user, even though it would have already been fetched by the aggregation.

It would be very costly to allow the server to refetch all of these fields unnecessarily, so the resolvers
need to be written to only fetch the field if it doesn't already exist in the root object that's passed to the
resolver.

Our User resolver might look like this:

```
const User = {
  // We only fetch fields that haven't been fetched by createPipeline.
  // companyId comes from the database collection, company is the result fetched via the pipeline
  company: ({ companyId, company }) => company || CompaniesCollection.findOne(companyId),
  ...
```

In the above example we simply test if `company` has already been fetched into the root object
(via the `$lookup` stage created by Apongo), and if it hasn't we perform the lookup in the traditional way.

There's a slight performance limitation that occurs if the `$lookup` returns a null value.
In that case the resolver receives `null` for that field, and it can't know that an attempt was already made to
do the join, so we end up __unnecessarily__ calling the database (which will again return `null`).
Such is life.

## Recipes

### Pagination

Suppose we want to return a page of users together with the total count of matching documents.
By enhancing the aggregation pipeline we can do this quite easily. The types might look like this:

```
type PaginatedUsers {
  users: [User]
  count: Int
}

type Query {
  paginatedUsers: PaginatedUsers!
}
```

And the resolver:

```
const paginatedUsers = (_, { limit = 20, offset = 0 }, context, resolveInfo) => {
  // Create a main pipeline to first perform the match and lookups
  const pipeline = [
    // Perform any initial matching that you need
    { $match: { type: 'client' } },

    // Include all the pipeline stages generated by Apongo to do the lookups.
    // Note that we *must* specify the field for which we're creating the pipeline.
    // The `users` field contains the result of aggregating over the `Users` collection.
    ...createPipeline('users', resolveInfo, context),
  ];

  // Create a secondary pipeline to paginate the results
  const resultsPipeline = [
    { $skip: offset },
    { $limit: limit },
  ];

  // Split the main pipeline into two facets, one to return the paginated result using the pipeline
  // above, and the other to get the total count of matched documents.
  pipeline.push(
    {
      $facet: {
        users: resultsPipeline,
        count: [{ $count: 'count' }],
      },
    },
  );

  // Call the aggregation function. Here's how we could do that using mongoose.
  return UsersCollection.aggregate(pipeline).exec().then(([{ users, count }]) => {
    return { users, count: count.length === 0 ? 0 : count[0].count };
  });
};
```

## FAQ

### Will this work with Meteor?

Yes! Meteor doesn't natively provide access to Mongo's aggregation command. Fortunately this oversight can be
rectified with this [tiny meteor package](https://github.com/meteorhacks/meteor-aggregate).
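
With that package installed, server-side collections gain an `aggregate` method, so the resolvers above work
essentially unchanged. A sketch (assuming the `UsersCollection` from the earlier examples):

```
// Server-side Meteor code: meteorhacks:aggregate adds `aggregate`
// to Mongo.Collection instances.
const users = (_, { limit = 20 }, context, resolveInfo) => {
  const pipeline = [
    ...createPipeline(null, resolveInfo, context),
    { $limit: limit },
  ];
  return UsersCollection.aggregate(pipeline);
};
```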
