Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

request: add support for parsing and matching suffixes #50

Open
rafalkrupinski opened this issue Aug 23, 2024 · 9 comments
Open

request: add support for parsing and matching suffixes #50

rafalkrupinski opened this issue Aug 23, 2024 · 9 comments

Comments

@rafalkrupinski
Copy link
Contributor

Since the times of XML media often have a suffix, for example application/vnd.oai.openapi+json. This conveys separate information on syntax and semantics.

Ability to match by group and suffix (in the example: application, json) would let applications to communicate the exact media type, and libraries to process such content in a generic way.

@rafalkrupinski
Copy link
Contributor Author

matching by (group and (subtype or suffix)) to be exact

@CaselIT
Copy link
Member

CaselIT commented Aug 27, 2024

Hi,

is your objective to add a function that given the result of parse_mime_type('application/vnd.oai.openapi+json') that would be ('application', 'vnd.oai.openapi+json', {}), can take the second index and split it into ('vnd.oai.openapi', 'json')?

I guess we could add it.

Changing the result of parse_mime_type could also be an option, but it would be in a backward compatible way

@rafalkrupinski
Copy link
Contributor Author

I was thinking of:

>>> parse_mime_type('application/vnd.oai.openapi+json', suffix=True)
('application', 'json', {})
>>> parse_mime_type('application/json', suffix=True)
('application', 'json', {})

This way keeps compatibility and best_match wouldn't have to handle two cases (split and non-split), but I'm assuming that no caller would ever want both full type and suffix.

Also, is the part between '/' and '+' useful for anything? I thought you either work with full media type application/vnd.oai.openapi+json or just the serialization media type application/json. That part doesn't even have a name, AFAIK.

@CaselIT
Copy link
Member

CaselIT commented Aug 28, 2024

If we introduce a new flag we can also return a 4-tuple when it's true, since it's a new feature.

Or have the second item of the 3-tuple be a 2-tuple (or a named tuple) instead of a string. This would work better if the flag is called split_siffix

@vytas7 what your opinion on this?

@vytas7
Copy link
Member

vytas7 commented Aug 28, 2024

To be honest, I'm not a big fan of adding this to the current interface, because it is not really in the spirit of RFC 9110, Section 12 (Content Negotiation). The RFC does not really define a way to provide wildcards (*) to only a segment of media type or subtype.

Although I do understand the value proposition in some scenarios, could you tell more about your use case @rafalkrupinski?

Could we maybe somehow extend these matching functions by adding an optional regex parameter to match things? Dunno how exactly it could look like.

@rafalkrupinski
Copy link
Contributor Author

rafalkrupinski commented Aug 28, 2024

httpx.Response.json() can deserialize the body if it's JSON. The current implementation doesn't check content-type header, but imagine similar scenario

def python(self: 'Request')->dict|list|str:
  deserializers = {
    'application/json': deserialize_json,
    'application/yaml': deserialize_yaml,
   ...
  }
  match = mimeparse.best_match(deserializers.keys(), self.headers['content-type'])
  return deserializers[match](self.body)

In the above example best_match doesn't currently work reliably for suffixed types, because none of the types matches, even though we know json serializer can handle for example 'application/geo+json'.

edit: I've hopefully clarified my explanation with a piece of code.


So that's the client side, lets look at the server side.

Suppose the server can handle application/json and application/yaml resources.
Client sends request with accept header application/geo+json.
best_match should match it to application/json and not just because type ('application') matches.

@vytas7
Copy link
Member

vytas7 commented Nov 10, 2024

To be fair, Falcon itself has a hack in one place to support +json and +xml, so it is not an unreasonable request.

Right now, Falcon's media handling doesn't support this scenario either for request media, but maybe we should have an option too 🤔 (Falcon's media is very similar to your serializers and deserializers, just on the server side.)

@rafalkrupinski
Copy link
Contributor Author

@vytas7 so how do you want to continue?

@vytas7
Copy link
Member

vytas7 commented Nov 20, 2024

Just to be honest, myself, I don't have much incentive to introduce any new major features to this library, but we can accept your suggestions or contributions :)

The first step, I think, is to agree on a good API design.

Your proposal, as I understand, is to have an option to parse_mime_type(...) to completely disregard the subtype, and replace it with the suffix, if any is found by splitting on +?

How would that be propagated from the matching functions (are you using best_match())?
And how would you handle scenarios when there is a dedicated handler for the specific media type, e.g., application/geo+json, and also for the generic application/json? Just bluntly stripping the subtype might broadcast them to the same thing, making these handlers weigh equal 🤔
Or were you thinking of a two-pass approach where you would try to match with +suffix in the case nothing could be matched without?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants