RFC - a place to discuss and refine the API #20

DesignByOnyx · 2020-04-20T16:20:16Z

The "RFC" (request for comments) is intended to provide a place to discuss features and such for this library prior to v1. The goal of this is to collect insight from different developers and discuss use cases we have all encountered in the wild. In particular, I'd like to tap into developer experiences with multiple schema and validation libraries such as yup, joi, AJV, JSON Schema, sequelize, mongoose, et al.

Remember to be kind and focus on providing constructive feedback.

daffl · 2020-04-20T17:41:49Z

I would say that Joi is to @feathersjs/schema what Express is to Feathers. The underlying foundation that can be replaced if necessary but solid enough to cover most needs while only requiring a small wrapper for the integration. Joi is especially interesting because of the chainable interface and large ecosystem of resources and available validations.

The difference to the lower level validation library is what schema definitions add:

An abstraction in the schema definition format that allows to swap out the validation library
TypeScript annotations natively built in
Most importantly: Asynchronous resolvers and value setters to declare and resolve associations. No validation library supports this and ORMs/ODMs add a lot of unnecessary database specific complexity while not supporting more complex validation (and data protection) use cases. In addition this allows to:
- Protect properties based on the context (e.g. only admins can see or edit certain properties)
- "Massage" queries to be safe based on the context (e.g. non-admins can only query their own entries)
- Target schema based protocols like GraphQL and GRPC
- Just like for Feathers and Express, allow fall-throughs to the low level target format as well. E.g. @feathersjs/schema-sequelize allows to easily add Sequelize specific property definitions and constraints like you would in a normal model

A lot of this can already be done in individual hooks (in fact, in a Feathers app itself, @featherjs/schema integration is of course done via hooks) but putting it together with your data model makes a lot more sense. Hooks are great for workflows like sending a notification but using them for schema constraints (often split up across different hooks and mixed in with other workflows) turned out to not always be ideal.

There are now npm packages for almost every schema format combination (JSON schema to SQL, GraphQL to SQL, SQL to GraphQL, JSON schema to GraphQL, Joi to JSON schema) but so far there does not seem to be anything that addresses the actual underlying problem of providing a JavaScript way to define a schema that is not tied to an ORM, database or specific validation library.

DesignByOnyx · 2020-04-21T04:21:33Z

Thanks @daffl, that helps a lot and I'm glad we are kind of piggybacking on an established library, especially Joi now that it works in the browser (since September '19). Here are my initial comments (I tried to keep it short as possible :/):

The resolver function is awesome - sooo useful and easy.
Let's not allow (or show in docs) using primitive constructors in type definitions:
```
// let's not do/show this:
name: String,
isValid: { type: Boolean },
```
It creates ambiguity between primitive values and instance values... and many don't know the difference. I'm even inclined to say that constructors shouldn't be allowed period. If we want to allow it, then let's create Type.constructor(SomeCtor).
For decorators can we support an api like this?:
```
@Type.string().email().required();
email;
email: string; // typescript (is this even necessary??)
```
It's more terse and slightly more "copy and paste-able" with non-decorator schemas.

Can we leverage Joi's link() function for describing relational data, or make our own keyword. Let me describe:

// instead of the current API:
todos: { 
    type: [ 'todo' ]
    resolve: (ctx)  => ctx.service('todo').getList(...)
}

// maybe something like this established Joi syntax
// feathers should be able to recognize this as a relationship mapping (see next bullet):
todos: Type.array().items(Type.link('#todo'))

// Or our own custom "relationship" function (see next bullet):
todos: Type.array().items(
     // note: I like the pound sign for referencing other schemas:
    Type.$ref('#todo', (TodoModel, context) => TodoModel.getList(...))
);

Provide a query operator for relational data. In conjunction with the previous item ^^, this would allow feathers to load additional data based on a query syntax. The following are just conversation starters:

// load all related todos
todos.getList({ query: {
    $populate: ['todos']
}});

// or this
todos.getList({ query: {
    $populate: {
        todos: { paginate: false, query: { foo: 'bar' })
    }
}});

// or this
todos.getList({ query: {
    $populate: async (context) => { ... }
}});

DaddyWarbucks · 2020-04-21T17:40:58Z

I like @DesignByOnyx notes on $populate. That is something I do regularly. But I do see some caveats with it being baked into the schema, namely that it will require the DB adapters to handle that $populate keyword and allow it to "pass through" the methods w/o affecting the underlying DB call. I believe this can already be handled in user land with the current resolve syntax.

user: {
    type: User,
    resolve: async (todo, context) => {

      // The dev needs to bypass the DB adapter by plucking off invalid query params and stashing
      // theme somewhere for use in after hooks, here it would have been done in a before hook
      // and placed on a prop called stashedQuery
      const { $populate  } = context.params.stashedQuery

      const { params, app } = context;
    
      if ($populate && $populate.includes('user')) {
        return app.service('users').get(todo.userId, params);
      }
    }
  }

Having that baked in would be cool, but it may be quite difficult for it to play nicely with DB adapters unless I am missing something?

I am also curious what happens when the user does not return anything from the resolve function, either because they used an if-block like above or if the resolve function just doesn't return anything. I assume that the result would just be { ...data, user: undefined }. This is probably fine for the context.result. I only bring it up because when working with withData and withQuery, returning undefined had adverse affects with different DBs. When making those hooks, I had been using fastJoin quite a bit, but was not a fan of how I had to mutate the record directly. For example,

resolver () => (todo, context) => {
  todo.user = await context.app.users.get(todouser_.id);
  return todo;
}

So, I made the API such that you returned the actual result and the hook took care of placing it onto todo.user similar to how this resolve function works. Like I said, this probably isn't an issue with context.result as much as it would be context.data or context.params.query, but I just wanted to point out some obstacles I have encountered with similar "resolver" pattern. In the case of the with* hooks, if the result is undefined, then the property is not added to the result at all rather than being added as undefined. Actually, now that I think about the value setters rather than just the resolvers, this may be important.

daffl · 2020-04-21T17:48:24Z

👍
Makes sense for primitive types and keeps it consistent. I guess that's why TypeScript uses indicators like lowercase boolean and string. It would still be nice to have the same constructor references as you would in a TypeScript class but having only one way of doing things is definitely always the better choice especially since Constructor references do not allow for circular dependencies. This would require every schema to have a name though (it creates a unique internal one at the moment if it does not)
That's how it originally was but TypeScript requires the type and you really don't want to duplicate the type indicator. Even I messed it up a few time and had a different type than the TS class property definition and things broke in really weird ways. The callback basically gets the correct type validator from the property definition:

  // Here `validator` is a string validator
  @property<Joi.StringValidator> (validator => validator.email().required())
  email: string;

I think type linking that way (as mentioned in 2) makes sense but I'm not sure I follow how you would tell it what to query for. Say I have a user and want their todos I need some kind of resolver to tell it what to fetch (all todos with query: { userId: user.id }). Doing this declaratively (via a JS object syntax instead of a function that can return anything) is something I'd be very hesitant to support natively. It might make for a good plugin that returns standard relationship resolver functions but it does get into ORM complexity territory (my theory is that ORMs are so difficult not because of what they do but because of the DSL each one has to develop to declare dependencies).
Yes, a populate (or $select) should definitely be in there. It basically defines which resolvers should run. It would not be passed to the database adapter.

@DaddyWarbucks These were my thoughts on resolvers (getters) and value (setters):

const User = schema({
  name: 'user'
}, {
  password: {
    type: Type.string().required(),
    async value (value, user, context) {
      const hashedPassword = await bcrypt(value);

      return hashedPassword;
    },
    async resolve (value, user, context) {
      // Hide password for external call
      if (context.params.provider) {
        return undefined;
      }
      
      return value;
    }
  },
  permission: {
    type: Type.string(),
    async value (value, user, context) {
      const { user } = context.params;

      // Only an admin can change the user permission
      // With authentication there always will be a user for external requests
      if (user && user.permission !== 'admin') {
        return undefined;
      }

      return value;
    }
  }
});

If you return undefined it will not be included in data (from the value setter) or result (from the resolve getter)

DaddyWarbucks · 2020-04-21T17:57:29Z

Another feature I think we may want to explore is some "partial" validation. Often we need to run some hooks that only require part of the data in order to get the rest of the data, before being validated. For example this following hook is used to validate some data before sending it off to Cloudinary. Cloudinary then sends back its results which are merged with the original data and then validated and persisted. The following code is from very similar setup to feathers-schema where I store the schema on the service, etc.

const sendToCloudinary = async context => {
  const schema = context.service.schema.clonePartial([
    'entity_id',
    'entity_type',
    'document_type'
  ]);

  // I want to throw the same ValidationError here as what would be thrown from the
  // feathers-schema validation hook, but I am not ready to validate the whole thing yet
  const cloudinaryStuff = await schema.validateData(context.data);

  const cloudinaryRes = await Cloudinary.doTheThing(cloudinaryStuff);

  context.data = { ...context.data, cloudinaryRes };

  // Now the context.data is ready to be validated against the whole schema

  return context;
};

Note the clonePartial method on the schema. This method returns a new Schema that is a subset of the total schema. Also note the validateData method. That method simply calls the validateAsync method and wraps the error in ValidationError. I also have a validateQuery method that also simply calls the validateAsync but wraps the error in a QueryError. I am not sure if that is helpful, just throwing out some experience from a similar setup.

This could be handled manually, but it requires the developer to create an additional Schema and also manually throw the ValidationError.

Another suggestion, we may also want to let the developer pass in the schema via props. For example change this line: https://github.com/feathersjs/schema/blob/c9f345b23263c157efc32224f82eb1fc58e714ff/packages/schema-hooks/src/index.ts#L19
To something like

const schema = getSchema(context.params.schema || service?.options?.Schema || service.Schema);

Perhaps that is a little too close to a validation hook rather than a "This schema describes this service" type of thing, but just a suggestion.

KidkArolis · 2020-04-27T14:56:57Z

Hi, I'm not very familiar with the scope and goals of @featherjs/schema, but wanted to share a couple of learnings from using ajv with Feathers.

Ajv async validators

Ajv does support async validators, I use that feature to check things based on context or database. E.g. I've build custom validation keywords such as:

foreignKey (checks DB for that key)
internal (checks if provider is set before allowing to modify that field)
permission (for some field based authorization based on permissions in DB)

This isn't to try and sway you away from yup, just wanted to point out that ajv supports this. The main difference is that it doesn't support inline validation functions (that I know of). For those cases I tend to do extra work in my hooks.

Read vs create vs update

For all my resources, I tend to need 3 things:

what's the schema for creating
what's the schema for updating
what fields can the requesting user read

For that I actually export each of these definitions separately. I was wondering how you were thinking about handling that here.

A typical schema file (e.g. foo.schemas.js) ends up looking like this:

const fields = [
  'id',
  'personId',
  'name',
  'email',
  'phoneNumber',
  'formattedPhoneNumber', // derived in a hook
  'relationship',
  'primary',
  'createdAt',
  'updatedAt',
]

const patch = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    email: { type: 'string', nullable: true },
    phoneNumber: { type: 'string', nullable: true },
    relationship: { type: 'string', nullable: true },
  },
  additionalProperties: false,
}

const create = {
  ...patch,
  properties: {
    ...patch.properties,
    personId: { type: 'string' },
    primary: { type: 'boolean' },
  },
  required: ['personId', 'name'],
}

module.exports.create = create
module.exports.patch = patch

module.exports.read = {
  owner: fields,
  admin: fields,
  user fields,
  self: fields,
}

DesignByOnyx · 2020-04-28T18:07:56Z

@daffl - I totally agree with you on this point:

........ I need some kind of resolver to tell it what to fetch (all todos with query: { userId: user.id }). Doing this declaratively (via a JS object syntax) ... does get into ORM complexity.

I agree and would like to avoid that at all costs. I guess my concern was more in describing the relationship of the data rather than how to hydrate the data. These are two separate things, and if the schema focused on describing the relationship, then tools/extensions/plugins can be built to take care of "how" the data is hydrated. This ties into my comment about AJV below.

@DaddyWarbucks - partial schema validation is possible with Joi, though the way they recommend doing it is kind of wonky to me. I second your recommendations about providing a validateData method which wraps the error in a standard ValidationError.

@KidkArolis - I am with you on AJV, which (for those who don't know) is a tool that compiles JSON Schema with the ability to provide custom extensions. My biggest problem with Joi/Yup is that they are so rich that it's not shareable with different languages. This is particularly painful if you have a non-JavaScript backend service with which you need to stay in sync. @daffl mentioned earlier that there are tools to convert in every direction, but in practice I have found that these tools are basically useless. JSON Schema is the "least common denominator" and the most interoperable due to its static nature. Individual teams can share and extend these static schemas in their own native languages. If done thoughtfully, these language-specific extensions can be tested in a uniform way with uniform error messaging. The same cannot be said for sharing Joi/Yup schemas.

If I could have my druthers, I would make the core of feathers-schema based on AJV, encouraging users to describe their models with static JSON Schema, writing custom "resolver" functions and other extensions separate from the static schemas. However, I'm totally fine with sticking with Joi due to the ecosystem surrounding it.

gustojs · 2020-05-17T23:30:51Z

Just wondering, is the RFC taking into consideration the plans to make the core of Feathersjs compatible with deno? Will Feathersjs + deno users have to wait for Joi/Ajv to be released on deno too?

daffl · 2020-05-26T16:43:58Z

I am working on making the core libraries Deno compatible and seeing how much work that already is (I basically have to create our own Deno and Node compatible replacements for every module that's being used) this is probably not going to happen right away for schema definitions.

As mentioned in feathersjs/feathers#1964 initial Deno support will focus on core functionality only and not include any module that has significant third party dependencies.

daffl · 2020-07-06T00:57:35Z

I retract my previous statement. Given that Joi's future is somewhat uncertain I opted for making schema core validation library independent and have it stay flexible enough to use it with whichever validation library you choose.

The core concept here are schema functions (async (value, object context, metadata)) that can be used for types, values (validation) and resolvers:

import { schema, string, number, boolean } from '@feathersjs/schema';

const User = schema({
  name: 'users'
}, {
  email: {
    type: string(),
    resolve: async (email, user, context, metadata) {
      if (!context.params.user.isAdmin) {
        // Non admins can not see the email
        return undefined;
      }

      return email;
    }
  },
  age: {
    type: number(),
    value: async (age, user, context, metadata) {
      if (age < 0) {
        throw new Error('Age can not be negative');
      }

      return age;
    }
  },
  enabled: {
    type: boolean(),
    value: async (enabled, user, context, metadata) {
      if (context.params.provider) {
        throw new Error('Can not be set externally');
      }

      return enabled;
    }
  }
});

It is possible to make chainable schema functions (e.g. value: required(email())) and add a description to it so the schema can be introspected (e.g. required(email()) would return something like { required: true, email: true }.

DesignByOnyx · 2020-07-06T22:50:12Z

I trust that you don't make any decision lightly, but this starts to tread into dangerous maintenance territory. It also feels kind of can-define'y in a not-so-good way: What's the difference between resolve and value? What about synchronous variants on both of these functions? Can users handle the difference between sync/async? Is there "pending" and "error" states around the async functions? Is the "value" a promise or the resolved value? [insert all of the asynchrony troubles from can-define here].

In an effort to not be redundant, my previous comment highlights my thoughts around AJV and the separation of 1) static schema descriptions (JSON Schema) and 2) how instances get hydrated/validated. I would greatly encourage that feathers-schema keeps from conflating those concepts (which is what can-define suffered from - not trying to pick on that tool, I just know you're intimate with it).

daffl · 2020-07-07T05:23:21Z

To answer the (easier) first two questions:

There will be no synchronous anything. All functions async, always.
The difference is that value converts (sets) the (incoming) value and resolve resolves (outgoing) value (same thing that is done now - often quite awkwardly - in hooks). This could also be done with two different schemas (good point, thank you for bringing that up).

I really don't feel static schema descriptions are the best way to do this. JSON schema isn't bad but it's one of those things that developers (sometimes) use but don't actually like using:

It's much more limited in its extensibility and flexibility than plain JS
After quite a bit of research around the JSON schema to anything else (Sequelize, GraphQL, Joi etc.) converters, none of them appeared to work well (or were unmaintained, assuming the authors gave up on that approach)
There is no widely used ORM using JSON schema for its definitions. There must be a reason for this other than that everybody wanted to reinvent the wheel.
As I mentioned before, to me the whole point of this project is that we already have a DSL for doing all of this and it's called JavaScript (or TypeScript if you want). If I have a converter function that does something wouldn't it be the best place to add metadata to that describes what it does instead of having to put it somewhere else?

Do you have an example why it is bad to conflate those two things? To me they really belong together and it is basically what e.g. GraphQLJS is doing (one might say successfully). I think can-define's weakness was more that it wasn't really universally usable and not targeting anything useful (you basically had to re-define your server schema in a custom DSL again on the client).

daffl · 2020-07-07T08:39:16Z

Maybe value and resolve is not super aptly named (set and get doesn't really fit though either) but the user schema is a good example for unifying things that currently have to be done with (imo clunky) hashPassword and protect hooks:

const UserSchema = schema({
  name: 'user'
}, {
  email: {
    type: string()
  },
  password: {
    type: string(),
    value: async password => {
      // Hash the password before saving to the database
      return bcrypt(password);
    },
    resolve: async (password, user, context) {
      // Never return the actual value for external requests
      if (context.params.provider) {
        return undefined;
      }

      return password;
    }
  }
});

This also allows assembling queries with more complex conditions in one place:

const MessageQuerySchema = schema({
  name: 'message-query'
}, {
  userId: {
    type: string(),
    value: async (userId, query, context) => {
      const { user } = context.params;
      
      // Non admins can only query their own messages
      return user.isAdmin ? userId : user.id;
    }
  },

  private: {
    type: boolean(),
    value: async (private, query, context) => {
      const { user } = context.params;

      // Free plan members can never query private messages
      return user.plan === 'free' ? false : private;
    }
  }
});

KidkArolis · 2020-07-07T08:55:28Z

value -> write resolve -> read Everything async might be slower for validation purposes.. (especially per field). Fastify uses json schemas: https://www.fastify.io/docs/latest/Validation-and-Serialization/, and its swappable with other libs. Although I haven't used fastify / explored enough to comment further. I just had a lot of success with ajv in my own app. It did sometimes feel like “I didn’t enjoy” using it, things like conditional logic was tricky, sometimes i moved parts of validation to a hook. But overall a lot of it made sense, and created consistency.

…

On Tue, 7 Jul 2020 at 09:39, David Luecke ***@***.***> wrote: Maybe value and resolve is not super aptly named (set and get doesn't really fit though either) but the user schema is a good example for unifying things that currently have to be done with (imo clunky) hashPassword and protect hooks: const UserSchema = schema({ name: 'user'}, { email: { type: string() }, password: { type: string(), value: async password => { // Hash the password before saving to the database return bcrypt(password); }, resolve: async (password, user, context) { // Never return the actual value for external requests if (context.params.provider) { return undefined; } return password; } }}); This also allows assembling queries with more complex conditions in one place possible: const MessageQuerySchema = schema({ name: 'message-query'}, { userId: { type: string(), value: async (userId, query, context) => { const { user } = context.params; // Non admins can only query their own messages return user.isAdmin ? userId : user.id; } }, private: { type: boolean(), value: async (private, query, context) => { const { user } = context.params; // Free plan members can never query private messages return user.plan === 'free' ? false : private; } }}); — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#20 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AACPGWB4VLADNKPOIBYN5MLR2LNMFANCNFSM4MMSPTCA> .

daffl · 2020-08-22T20:58:03Z

Sorry for being a little slow but I just wanted to let you know that you fine folks are right 😄 The thing I'd like to solve here is context based resolving of properties, not re-inventing all of validation and coercion. I think a pluggable solution (similar to Feathers transport adapters) with JSON schema as the internal format should be doable here.

mrfrase3 · 2020-10-26T03:52:28Z

So a bit of a distraction from value/resolve, I was just pondering this and thought it might be a useful fit into feathers-schema.

Let's say we have a database with three services:

students
classes
rooms
classes has a studentIds and roomId field.

How would one easily/cleanly query all the students that have classes in one room, or what rooms a student has classes in? (assuming we don't care about what classes for this query)

You could query the classes, and then pull the students/rooms, but this can lead to a lot of boilerplate code in the client. I've kind of gotten around this by implementing hooks on find that will detect a field and swap it out, e.g. detect a roomId field in a find on the students service, do a find for all classes with that roomId, and then modify the student query to be like { _id: { $in: classes.reduce((a, class) => [...a, ...class.studentIds], []) } }, throwing theses hooks everywhere is in no-way elegant, doesn't account for nested queries, and doesn't get documented in the schema so the other developers on my team don't even know they can do it. The other solution is to save roomIds/studentIds to the students/rooms services, but this creates magic values/two sources of truth that have to be updated when the other changes.

What would be cool is the ability to define a "virtual ID" of sorts in the schema, that has a definition of where the relationship is actually stored. Say on the students service there is a roomVIds field (or similar) which isn't actually stored in the database but queriable and populates when resolved.

Food for thought.

daffl · 2021-04-22T00:34:01Z

Thank you again for the feedback everybody! I moved the discussion with the latest proposal to feathersjs/feathers#2312 and development to https://github.com/feathersjs/feathers/tree/schema since it currently makes more sense to have it in core vs. in a separate repository. I will archive this repository for now but we may bring it back later if it turns out to be something that could be generally useful.

marshallswain pinned this issue Aug 28, 2020

marshallswain unpinned this issue Aug 28, 2020

daffl mentioned this issue Apr 22, 2021

Feathers Schema: A common data model definition format feathersjs/feathers#2312

Closed

daffl closed this as completed Apr 22, 2021

daffl mentioned this issue Sep 24, 2022

Dove: We want to cheer... but have some mixed feelings feathersjs/feathers#2760

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC - a place to discuss and refine the API #20

RFC - a place to discuss and refine the API #20

DesignByOnyx commented Apr 20, 2020

daffl commented Apr 20, 2020

DesignByOnyx commented Apr 21, 2020 •

edited

Loading

DaddyWarbucks commented Apr 21, 2020

daffl commented Apr 21, 2020

DaddyWarbucks commented Apr 21, 2020 •

edited

Loading

KidkArolis commented Apr 27, 2020

DesignByOnyx commented Apr 28, 2020 •

edited

Loading

gustojs commented May 17, 2020 •

edited

Loading

daffl commented May 26, 2020 •

edited

Loading

daffl commented Jul 6, 2020

DesignByOnyx commented Jul 6, 2020 •

edited

Loading

daffl commented Jul 7, 2020

daffl commented Jul 7, 2020 •

edited

Loading

KidkArolis commented Jul 7, 2020 via email

daffl commented Aug 22, 2020

mrfrase3 commented Oct 26, 2020

daffl commented Apr 22, 2021

RFC - a place to discuss and refine the API #20

RFC - a place to discuss and refine the API #20

Comments

DesignByOnyx commented Apr 20, 2020

daffl commented Apr 20, 2020

DesignByOnyx commented Apr 21, 2020 • edited Loading

DaddyWarbucks commented Apr 21, 2020

daffl commented Apr 21, 2020

DaddyWarbucks commented Apr 21, 2020 • edited Loading

KidkArolis commented Apr 27, 2020

Ajv async validators

Read vs create vs update

DesignByOnyx commented Apr 28, 2020 • edited Loading

gustojs commented May 17, 2020 • edited Loading

daffl commented May 26, 2020 • edited Loading

daffl commented Jul 6, 2020

DesignByOnyx commented Jul 6, 2020 • edited Loading

daffl commented Jul 7, 2020

daffl commented Jul 7, 2020 • edited Loading

KidkArolis commented Jul 7, 2020 via email

daffl commented Aug 22, 2020

mrfrase3 commented Oct 26, 2020

daffl commented Apr 22, 2021

DesignByOnyx commented Apr 21, 2020 •

edited

Loading

DaddyWarbucks commented Apr 21, 2020 •

edited

Loading

DesignByOnyx commented Apr 28, 2020 •

edited

Loading

gustojs commented May 17, 2020 •

edited

Loading

daffl commented May 26, 2020 •

edited

Loading

DesignByOnyx commented Jul 6, 2020 •

edited

Loading

daffl commented Jul 7, 2020 •

edited

Loading