How many orthogonal patchsets could this reasonably be broken into (and what would their interdependencies be)? #17

ApproximateIdentity · 2021-10-20T11:04:02Z

I mean this less from a perspective of "pick and choose parts of the project" and more from the perspective of trying to understand how everything is logically holding together.

For example, unless I'm missing something, mimalloc should be usable without any other changes. So a (fairly trivial) patchset could be to simply add that in as the default (or maybe required) allocator. Next you could start making use of mimalloc by changing the GC algorithms and/or the collection lookup procedure. Maybe this would allow you to move the GIL calls down the tree for the various collections, making them more and more concurrent. Essentially, what I'm trying to understand is which of these logically separate patchsets would be independent of one another and which would require other patchsets to already be in place. Hopefully my question makes sense.

Here are the separate pieces from your doc as well as my own ignorant musings:

mimalloc - presumably could be added in entirely independently
reference count changes (biased/deferred/immortalized) - Does this require mimalloc? It seems independent to me.
garbage collection changes - If it walks the mimalloc heap, then it obviously requires mimalloc. This requires some of the GIL changes as well I guess.
method resolution changes - Is this entirely independent of the other work or does it actively require the GIL/mimalloc changes?
collections thread-safety - The constraints on the allocator don't seem to explicitly require mimalloc as long as the constraints are satisfied, but I guess Python's current allocators wouldn't satisfy them, so this either requires mimalloc or restrictions on allocators (which might as well be mimalloc).
thread-states and GIL api - It's a bit hard for me to conceptualize this at the moment honestly.
Interpreter/bytecode changes/optimized calls - This seems entirely independent of everything else, right? Also, I guess some of the ideas are already being implemented (bpo-44590: Lazily allocate frame objects, python/cpython#27077, and bpo-45256: Remove the usage of the C stack in Python to Python calls, python/cpython#28488).

Thanks for any clarity you can provide! This is all really great work!

colesbury (Maintainer) · 2021-10-20T15:41:19Z

This is a difficult question to answer in general. I'm not sure how many logical patchsets there are, let alone their interdependencies. I'm in the process of rebasing and refactoring the changes into logical commits on the nogil-3.9 branch. Once I'm done, I'll have a better sense of how many logical commits there are.

It's easier to answer the questions about specific features.

mimalloc - presumably could be added in entirely independently

Yes, you can add mimalloc independently. There are changes to mimalloc necessary to support other features, though.

reference count changes (biased/deferred/immortalized)...

They do not require mimalloc. Immortalization is completely independent of the other features. Biased reference counting is almost independent of the other features, but it's convenient to have implemented the inter-thread signaling changes first.

Deferred reference counting is more tricky. Doing it in a way that's actually useful requires substantial changes to the interpreter.
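
To make the biased reference counting idea a little more concrete, here is a toy Python model. It is only a sketch of the general technique, not the nogil implementation: the class name `BiasedRefCount` and its fields are hypothetical, a `threading.Lock` stands in for atomic operations, and the merging of the two counters (the part that would need inter-thread signaling) is left out entirely.

```python
import threading


class BiasedRefCount:
    """Toy model of a biased reference count (illustrative only)."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local = 1                      # updated only by the owner, no lock
        self.shared = 0                     # updated by all other threads
        self._lock = threading.Lock()       # stand-in for atomic operations

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                 # fast path: owner thread
        else:
            with self._lock:                # slow path: any other thread
                self.shared += 1

    def decref(self):
        if threading.get_ident() == self.owner:
            self.local -= 1
        else:
            with self._lock:
                self.shared -= 1

    def total(self):
        # Combined count; a real implementation would merge the counters
        # only at well-defined points.
        with self._lock:
            return self.local + self.shared
```

The point of the split is that the owning thread, which usually performs most of the reference count traffic, never pays for synchronization.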

garbage collection changes...

It requires mimalloc, but does not require the GIL changes. Currently, I've split the GC changes into three commits. First, I remove some features that will no longer be helpful without the GIL ("gc: make the garbage collector non-generational"). Second, I change it to use mimalloc instead of maintaining the GC linked list ("gc: Traverese mimalloc heaps to find all objects."). Finally, I change the GC to use a stop-the-world mechanism ("Implement stop-the-world GC").

method resolution changes

"typeobject: make method_cache thread-local". It's independent of the other changes, except that like many of the other features it depends on the new internal locking API ("Add mutexes and one-time notifications").

collections thread-safety

It effectively requires mimalloc (and some changes to mimalloc).

thread-states and GIL api

These are relatively simple and independent of the other changes. "pystate: keep track of attached vs. detached state". The important part of this change is the modification to _PyThreadState_Swap, which is called by the public thread state APIs.

Interpreter/bytecode changes/optimized calls...

The interpreter changes are dependent on many of the other changes, such as changes to "dict". The motivation is to efficiently implement deferred reference counting (for scalability) and be able to (re-)implement other interpreter optimizations in a thread-safe way (e.g. LOAD_GLOBAL caching).
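
The dependencies described in this reply can be written down as a small graph and sorted, which gives one possible order in which the patchsets could land. This is only a sketch: the feature names are shorthand, the edge from "interpreter changes" to "collections thread-safety" is an assumption standing in for the "changes to dict", treating "inter-thread signaling" as having no prerequisites is also an assumption, and the biased reference counting edge is described above as convenient rather than strictly required. It uses `graphlib` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# feature -> set of features it depends on, as read from the reply above.
DEPS = {
    "mimalloc": set(),
    "locking API": set(),
    "immortalization": set(),
    "thread states / GIL API": set(),
    "inter-thread signaling": set(),      # assumed to have no prerequisites
    "biased refcounting": {"inter-thread signaling"},  # convenience, not a hard requirement
    "GC changes": {"mimalloc"},
    "method cache thread-local": {"locking API"},
    "collections thread-safety": {"mimalloc"},
    "interpreter changes": {"collections thread-safety"},  # assumption: "changes to dict"
    "deferred refcounting": {"interpreter changes"},
}


def patchset_order(deps):
    """Return one ordering in which every feature follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for feature in patchset_order(DEPS):
        print(feature)
```

Features with an empty dependency set are the ones that could be proposed on their own; the relative order among them is arbitrary.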
