Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec and eval incorrectly accepting subclass of dict as globals #131811

Open
luigig44 opened this issue Mar 27, 2025 · 8 comments
Open

exec and eval incorrectly accepting subclass of dict as globals #131811

luigig44 opened this issue Mar 27, 2025 · 8 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error

Comments

@luigig44
Copy link

luigig44 commented Mar 27, 2025

Bug report

Bug description:

According to Python docs, exec and eval take as their globals argument an exact dict, and not a subclass of it.

If only globals is provided, it must be a dictionary (and not a subclass of dictionary), which will be used for both the global and the local variables. If globals and locals are given, they are used for the global and local variables, respectively. If provided, locals can be any mapping object.

However, builtin_exec_impl allows subclasses of dict because it checks the argument with PyDict_Check instead of PyDict_CheckExact (see here).

builtin_eval_impl makes the same, wrong, validation, with a slightly different error message. (see here).

This is a problem because it allows developers to pass a dictionary with a custom __getitem__ method which will never actually get called. It will not get called because when opcode LOAD_NAME is executed, it assumes globals is a dict, and accesses it using PyDict_GetItemRef instead of PyMapping_GetOptionalItem (see here).

Furthermore, opcode LOAD_GLOBAL behaves slightly different, first checking if globals is an exact dict and then choosing how to access it.

Here is an example of the inconsistent and unexpected behavior:

class mydict(dict):
  def __getitem__(self, key): 
    print('accessing', key)
    if key == 'f': return "hi!"
    return super().__getitem__(key)

exec('global f; print(f)', mydict(), {}) # This works, it finds f
exec('print(f)', mydict(), {}) # This, surprisingly, does not work
exec('print(f)', mydict()) # But this does work, because locals is a copy of globals and it supports the custom __getitem__

There even is a test (test_builtin.py, see here) that says "custom globals or builtins can raise errors on item access". The test passes only because a locals is not provided to exec, so exec copies it from globals. A custom globals cannot raise errors on item access.

If one applies the change I think must be made (Changing PyDict_Check to PyDict_CheckExact in exec and eval's implementations) 4 more tests break.

  • test_getpath passes custom dictionaries to exec (which, remember, the documentation forbids), and should be rewritten but does not seem to be very complex to do.
  • test_descrtut should change exec("x = 3; print(x)", a) to exec("x = 3; print(x)", {}, a), and then not expect 'builtins' to appear in a.keys()
  • test_dynamic's test_load_global_specialization_failure_keeps_oparg should be rewritten to, instead of creating a class with __missing__, create a dictionary with data (for i in range(variables): my_globals[f"_number_{i}"] = i)
  • test_type_aliases. test_exec_with_unusual_globals fails, but I do not understand how to fix it.

Alternatively, _PyEval_LoadName could be modified (and the documentation updated) so subclasses of dictionaries are supported as globals, but this is a hot path and I have not measured any possible performance implications. Modifying exec to comply with the docs makes more sense, IMHO.

CPython versions tested on:

CPython main branch, 3.12, 3.10, 3.9

Operating systems tested on:

Linux

@luigig44 luigig44 added the type-bug An unexpected behavior, bug, or error label Mar 27, 2025
@skirpichev skirpichev added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Mar 28, 2025
@gaogaotiantian
Copy link
Member

gaogaotiantian commented Mar 28, 2025

I don't think we should change the behavior. I agree the documentation might suggest that passing a subclass of dict would raise an error - but I believe it means that it's not guaranteed to work. If we are designing it from scratch, I agree we should make it throw an error if global is not a strict dict - that's definitely a less ambiguous behavior. However, this is a change on a very commonly used function, and it might break code out there for nothing. Like you mentioned, even CPython itself has tests (not just one test) that would be broken. I think the sudden change would frustrate the community. The gain is not nothing, but not super significant either. Users who try to use subclass of dict for globals will get an error instead of non-working code, which is not bad, but not great enough to justify this change, in my opinion.

@tomasr8
Copy link
Member

tomasr8 commented Mar 28, 2025

but I believe it means that it's not guaranteed to work.

Maybe we should make the documentation clearer on that point?

@luigig44
Copy link
Author

luigig44 commented Mar 28, 2025

Updating documentation (and removing or improving test_builtin's test_exec_globals_error_on_get and test_exec_globals_dict_subclass) would be acceptable.

I personally favor updating exec and eval. As this change would certainly break stuff, a good idea (that I missed to mention in my initial message) would be to start by adding something like this:

/* Current code */
    if (!PyDict_Check(globals)) {
        PyErr_Format(PyExc_TypeError, "exec() globals must be a dict, not %.100s",
                     Py_TYPE(globals)->tp_name);
        goto error;
    }
/* Added code */
    if (!PyDict_CheckExact(globals)) {
        if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
                "globals should be an exact dict, not %.200s. "
                "The ability to pass a subclass of dict is deprecated, "
                "and may be removed in a future version of Python.",
                Py_TYPE(globals)->tp_name)) {
            goto error;
        }
    }

Edit: See this commit

@gaogaotiantian
Copy link
Member

If we decided to change the behavior, a deprecation phase would probably be required. If we kept the behavior, I would keep the "not-fully-legit" tests to remind us of this in the future (I vaguely remember that I found this issue before, but can't find the PR). I'm leaning towards keeping things as it is but I'm not super against deprecating it. This is a builtin function that is widely used, we probably need inputs from a few other core devs.

@luigig44
Copy link
Author

I tried to search for similar issues before, but failed. I have now found this previous dicussion:
#76149

This also seems relevant:
#4315 (comment)

@tomasr8
Copy link
Member

tomasr8 commented Mar 28, 2025

cc @rhettinger

@skirpichev
Copy link
Member

would be to start by adding something like this:

If I'm not misread docs, strict type checking should be enforced iff only globals provided.

@rhettinger rhettinger self-assigned this Mar 31, 2025
@rhettinger
Copy link
Contributor

rhettinger commented Mar 31, 2025

Research notes: The global argument type check goes back to before 1997 (I stopped looking at 79f25d9). The dict subtype check goes back to at least 2001 (I stopped looking at cbfc855). The eval/exec type check code wasn't updated when builtin types became subclassable in Python 2.2. In 2011, I was the first to report that the C-API would bypass methods defined in subclasses of builtin types. This current report is the first to focus on exec and eval in particular.

There was a surprised user in 2017. But after the resulting doc edit, 059b9ea, the eval/exec issue doesn't appear to have been a problem in practice. Likely this is either because testing promptly reveals that dict subclass methods have been bypassed or because users have read the docs which tell them specifically, "it must be a dictionary (and not a subclass of dictionary)"

Related PRs but not a direct cause: In 2006 7cae87c , exec became a function instead of a statement. In 2004 214b1c3, eval and exec permitted the locals to be any mapping types not just a dict. The PyDict_CheckExact function was added to the C-API in Python 2.4.

Recommendation: I don't have a strong opinion on what if anything should be done. But I'm reluctant to make a breaking change by raising an exception — we're here to fix bugs that people have rather than to cause bugs in their long stable working code. The current docs are helpful and only wrong in using the word "must" instead of "should". The docs do not cover the specific case what happens if you don't follow the rules, leaving this as an undefined behavior. We could elaborate, "it must be a dictionary (and not a subclass of dictionary because subclass methods are bypassed)".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

5 participants