Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Two tuples as dictionary keys report a false alarm of SchemaMissingKeyError #312

Open
tjyuyao opened this issue Apr 24, 2024 · 2 comments · May be fixed by #327
Open

Two tuples as dictionary keys report a false alarm of SchemaMissingKeyError #312

tjyuyao opened this issue Apr 24, 2024 · 2 comments · May be fixed by #327

Comments

@tjyuyao
Copy link

tjyuyao commented Apr 24, 2024

I am using schema 0.7.5 and python 3.11.

Following is a minimal code to reproduce

from schema import *

data_schema = Schema({
    ('map_point', 'to', 'map_polygon'): {},
    ('map_polygon', 'to', 'map_polygon'): {},
}, ignore_extra_keys=True).validate(
    {
        ('map_point', 'to', 'map_polygon'): {},
        ('map_polygon', 'to', 'map_polygon'): {},
    }
)

It will give the error:

Traceback (most recent call last):
  File "/home/hyuyao/2024/trackpred/qcnet_plug_in/qcnet_official_study/data_schema.py", line 7, in <module>
    }, ignore_extra_keys=True).validate(
                               ^^^^^^^^^
  File "/home/hyuyao/miniconda3/envs/RAMoPred/lib/python3.11/site-packages/schema.py", line 420, in validate
    raise SchemaMissingKeyError(message, e.format(data) if e else None)
schema.SchemaMissingKeyError: Missing key: ('map_polygon', 'to', 'map_polygon')

The expected behavior is to pass the validation without raise the error.

@mutricyl
Copy link

when checking a dict, keys from schema are checked against keys of data dict. This means the codes execute at some stage:

Schema(('map_point', 'to', 'map_polygon')).validate(('map_point', 'to', 'map_polygon'))
# and
Schema(('map_point', 'to', 'map_polygon')).validate(('map_polygon', 'to', 'map_polygon'))

which both validate considering the way Schema deals with list like.
It would be needed to check keys hash and not keys values:

class Schema(object):
(...)
    def validate(self, data: Any, **kwargs: Dict[str, Any]) -> Any:
(...)
                for key, value in data_items:
                    for skey in sorted_skeys:
                        svalue = s[skey]
                        try:
-                            nkey = Schema(skey, error=e).validate(key, **kwargs)
+                           Schema(hash(skey), error=e).validate(hash(key), **kwargs)
+                           nkey = skey
                        except SchemaError:
                            pass
                        else:
                            if isinstance(skey, Hook):
(...)

it works for you case @tjyuyao but sadly it breaks a lot of tests 😞

@mutricyl
Copy link

restricting this to set and frozensets only seams to be fine for all pytests

proposed solution:

            with exitstack:
                # Evaluate dictionaries last
                data_items = sorted(data.items(), key=lambda value: isinstance(value[1], dict))
                for key, value in data_items:
                    for skey in sorted_skeys:
                        svalue = s[skey]
                        try:
                            if isinstance(skey, (tuple, frozenset)):
                                Schema(hash(skey), error=e).validate(hash(key), **kwargs)
                                nkey = skey
                            else:
                                nkey = Schema(skey, error=e).validate(key, **kwargs)
                        except SchemaError:
                            pass
                        else:

tests to add:

def test_tuple_key_of_dict():
    # this is a simplified regression test of the bug in github issue #312
    assert Schema({('map_point', 'to', 'map_polygon'): {}}).validate(
        {('map_point', 'to', 'map_polygon'): {}}
    ) == {('map_point', 'to', 'map_polygon'): {}}
    with SE:
        assert Schema({('map_point', 'to', 'map_polygon'): {}}).validate(
            {('map_polygon', 'to', 'map_polygon'): {}}
        ) == {('map_polygon', 'to', 'map_polygon'): {}}


def test_frozenset_key_of_dict():
    # this is a simplified regression test of the bug in github issue #312
    assert Schema({frozenset(('map_point', 'to', 'map_polygon')): {}}).validate(
        {frozenset(('map_point', 'to', 'map_polygon')): {}}
    ) == {frozenset(('map_point', 'to', 'map_polygon')): {}}
    with SE:
        assert Schema({frozenset(('map_point', 'to', 'map_polygon')): {}}).validate(
            {frozenset(('map_polygon', 'to', 'map_polygon')): {}}
        ) == {frozenset(('map_polygon', 'to', 'map_polygon')): {}}

@keleshev @skorokithakis @sjakobi @dblanchette or other maintainer, are you fine with this solution ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants