
Consider an alternative solution in marking @container in context generator #214

Open
candleindark opened this issue Dec 19, 2023 · 3 comments

@candleindark
Member

This issue originates from this post and the subsequent discussion. In short, the current way of marking @container, which is based on stringification, works for now but is imprecise. It would be better to have a more reliable solution.

@candleindark
Member Author

candleindark commented Dec 19, 2023

I think we can consider attaching metadata to a type (and thus indirectly to a field) for context generation or other purposes. That way, the information being attached is explicit. With Pydantic V2, metadata can easily be attached to a type.

from typing import Type

from pydantic_core import CoreSchema
from typing_extensions import Annotated

from pydantic import BaseModel, GetCoreSchemaHandler


class Metadata(BaseModel):
    foo: str = 'metadata!'
    bar: int = 100

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Type[BaseModel], handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        if cls is not source_type:
            return handler(source_type)
        return super().__get_pydantic_core_schema__(source_type, handler)


class Model(BaseModel):
    state: Annotated[int, Metadata()]


m = Model.model_validate({'state': 2})
print(repr(m))
#> Model(state=2)
print(m.model_fields)
"""
{
    'state': FieldInfo(
        annotation=int,
        required=True,
        metadata=[Metadata(foo='metadata!', bar=100)],
    )
}
"""

As you can see, you can even have a Pydantic model to represent the metadata you want to attach to a type, and this metadata will not interfere with validation and serialization if you don't want it to.

You can find out more about this example at https://docs.pydantic.dev/latest/concepts/json_schema/#modifying-the-schema.
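To connect this back to the @container question, here is a minimal sketch of how a context generator might read such metadata instead of relying on stringification. `JsonLdInfo`, `Example`, and `containers` are hypothetical names invented for illustration, not existing code in this project. The sketch also assumes a plain frozen dataclass is enough as a metadata object, relying on Pydantic keeping unrecognized `Annotated` metadata in `FieldInfo.metadata` without acting on it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Type

from pydantic import BaseModel
from typing_extensions import Annotated


@dataclass(frozen=True)
class JsonLdInfo:
    """Hypothetical metadata object carrying JSON-LD hints for a field."""

    container: Optional[str] = None  # e.g. "@set" or "@list"


class Example(BaseModel):
    """Hypothetical model used only for this illustration."""

    name: str
    keywords: Annotated[List[str], JsonLdInfo(container="@set")]


def containers(model: Type[BaseModel]) -> Dict[str, str]:
    """Map field names to the @container value declared in their metadata."""
    result: Dict[str, str] = {}
    for field_name, field_info in model.model_fields.items():
        for item in field_info.metadata:
            # Only look at our own metadata type; ignore anything else attached
            if isinstance(item, JsonLdInfo) and item.container is not None:
                result[field_name] = item.container
    return result


print(containers(Example))
#> {'keywords': '@set'}
```

With something along these lines, the generator would look up an explicit value rather than inspect the string form of the annotation.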

@satra
Member

satra commented Dec 19, 2023

ooh that's nice - we could move all the jsonld related stuff into metadata instead of having nskey for example. can metadata also be added at the model level?

@candleindark
Member Author

candleindark commented Dec 20, 2023

> ooh that's nice - we could move all the jsonld related stuff into metadata instead of having nskey for example. can metadata also be added at the model level?

Do you mean adding metadata to a field of a Pydantic model type? If that's your question, the answer is yes. Metadata can be attached to essentially any type using the Annotated typing form. In fact, multiple pieces of metadata bundled in different objects can be attached to a type. Please take a look at the example below for some of these possible usages.

from typing import Any
from enum import Enum
import json
from pprint import pprint

from pydantic_core import CoreSchema
from typing_extensions import Annotated

from pydantic import BaseModel, GetCoreSchemaHandler


class AccessType(Enum):
    """An enumeration of access status options"""

    #: The dandiset is openly accessible
    OpenAccess = "dandi:OpenAccess"

    #: The dandiset is embargoed
    EmbargoedAccess = "dandi:EmbargoedAccess"


class Metadata1(BaseModel):
    foo: str = 'metadata!'
    bar: int = 100

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        if cls is not source_type:
            return handler(source_type)
        return super().__get_pydantic_core_schema__(source_type, handler)


class Metadata2(BaseModel):
    x: int = 0
    y: int = 42

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        if cls is not source_type:
            return handler(source_type)
        return super().__get_pydantic_core_schema__(source_type, handler)
    

class SubModel(BaseModel):
    a: int = 100
    b: str = "Hello, world!"


class Model(BaseModel):
    state: Annotated[int, Metadata1()]  # metadata on int
    access_type: Annotated[AccessType, Metadata2()]  # metadata on enum
    f: Annotated[SubModel, Metadata1(), Metadata2()]  # multiple metadata objects


json_schema = Model.model_json_schema()

print(json.dumps(json_schema, indent=2))
"""
{
  "$defs": {
    "AccessType": {
      "description": "An enumeration of access status options",
      "enum": [
        "dandi:OpenAccess",
        "dandi:EmbargoedAccess"
      ],
      "title": "AccessType",
      "type": "string"
    },
    "SubModel": {
      "properties": {
        "a": {
          "default": 100,
          "title": "A",
          "type": "integer"
        },
        "b": {
          "default": "Hello, world!",
          "title": "B",
          "type": "string"
        }
      },
      "title": "SubModel",
      "type": "object"
    }
  },
  "properties": {
    "state": {
      "title": "State",
      "type": "integer"
    },
    "access_type": {
      "$ref": "#/$defs/AccessType"
    },
    "f": {
      "$ref": "#/$defs/SubModel"
    }
  },
  "required": [
    "state",
    "access_type",
    "f"
  ],
  "title": "Model",
  "type": "object"
}
"""

m = Model(state=42, access_type=AccessType.EmbargoedAccess, f=SubModel(a=1, b="hi"))
print(m.model_dump_json(indent=2))
"""
{
  "state": 42,
  "access_type": "dandi:EmbargoedAccess",
  "f": {
    "a": 1,
    "b": "hi"
  }
}
"""

pprint(m.model_fields, indent=2)
"""
{ 'access_type': FieldInfo(annotation=AccessType, required=True, metadata=[Metadata2(x=0, y=42)]),
  'f': FieldInfo(annotation=SubModel, required=True, metadata=[Metadata1(foo='metadata!', bar=100), Metadata2(x=0, y=42)]),
  'state': FieldInfo(annotation=int, required=True, metadata=[Metadata1(foo='metadata!', bar=100)])}
"""

As you can see, metadata can be attached to different types (and thus indirectly to fields of different types), multiple metadata objects can be attached to one type, and none of the metadata affects JSON schema generation or Pydantic model validation.
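A quick way to check the "does not affect schema or validation" claim is to compare an annotated field with an unannotated one. This is only a sketch: `Note`, `Plain`, and `Tagged` are hypothetical names, and the marker here is a plain dataclass rather than a Pydantic model.

```python
from dataclasses import dataclass

from pydantic import BaseModel
from typing_extensions import Annotated


@dataclass(frozen=True)
class Note:
    """Hypothetical inert marker used only for this comparison."""

    text: str


class Plain(BaseModel):
    state: int


class Tagged(BaseModel):
    state: Annotated[int, Note("extra info")]


# The per-field JSON schema is the same with and without the marker
same_schema = (
    Plain.model_json_schema()["properties"]
    == Tagged.model_json_schema()["properties"]
)
print(same_schema)
#> True

# Validation behaves the same way as well (the string "7" is coerced to int)
print(Tagged(state="7").state == Plain(state="7").state)
#> True

# The marker is still available for inspection
print(Tagged.model_fields["state"].metadata)
#> [Note(text='extra info')]
```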

If you choose to, you can also attach metadata that does affect JSON schema generation and validation of a type. You can see examples of this at #203 (comment) and in the following snippet:

# This file is for defining types that extend existing types through the use of
# `typing.Annotated`.

from typing import Type

from pydantic import ByteSize, GetCoreSchemaHandler, GetJsonSchemaHandler
from pydantic.json_schema import JsonSchemaValue
from pydantic_core import CoreSchema, core_schema
from typing_extensions import Annotated


class _ByteSizeJsonSchemaAnnotation:
    """
    An annotation for `pydantic.ByteSize` that provides a JSON schema

    Note: Pydantic V2 doesn't provide a JSON schema for `pydantic.ByteSize`.
        This annotation provides a JSON schema that is the same JSON schema
        used for `pydantic.ByteSize` in Pydantic V1, which is simply the JSON
        schema for `int`.
    """

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Type[ByteSize], handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        assert source is ByteSize
        return handler(source)

    @classmethod
    def __get_pydantic_json_schema__(
        cls,
        _core_schema: CoreSchema,
        handler: GetJsonSchemaHandler,
    ) -> JsonSchemaValue:
        return handler(core_schema.int_schema())


# An extension of `pydantic.ByteSize` that uses the JSON schema provided by
# `_ByteSizeJsonSchemaAnnotation`
ByteSizeJsonSchema = Annotated[ByteSize, _ByteSizeJsonSchemaAnnotation()]
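For illustration, the annotated type above could be used in a model roughly as follows. This is a sketch, not code from the project: the `Asset` model and its `content_size` field are hypothetical, and the annotation class is repeated in condensed form (without type hints) so that the snippet runs on its own.

```python
from pydantic import BaseModel, ByteSize
from pydantic_core import core_schema
from typing_extensions import Annotated


class _ByteSizeJsonSchemaAnnotation:
    """Condensed copy of the annotation class shown above."""

    @classmethod
    def __get_pydantic_core_schema__(cls, source, handler):
        assert source is ByteSize
        # Keep ByteSize's own validation untouched
        return handler(source)

    @classmethod
    def __get_pydantic_json_schema__(cls, _core_schema, handler):
        # Report the field as a plain integer in the JSON schema
        return handler(core_schema.int_schema())


ByteSizeJsonSchema = Annotated[ByteSize, _ByteSizeJsonSchemaAnnotation()]


class Asset(BaseModel):
    """Hypothetical model used only for this illustration."""

    content_size: ByteSizeJsonSchema


field_schema = Asset.model_json_schema()["properties"]["content_size"]
print(field_schema["type"])
#> integer

# Validation is still handled by ByteSize, so size strings are accepted
asset = Asset(content_size="3000 KiB")
print(int(asset.content_size))
#> 3072000
```

Here the annotation changes only how the type is described in the JSON schema, while parsing stays with `ByteSize`.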

All in all, I think we can benefit a lot in this project from some of the new features in Pydantic V2.
