
Keep thread pool executors around. #306

Draft · wants to merge 1 commit into main
Conversation

apollo13 (Member)

I'd love to hear your thoughts on this one @carltongibson and @andrewgodwin. It just seems so wasteful to recreate new threads all the time when we could simply reuse them.

With this admittedly somewhat artificial change, the timing goes down from 15 seconds to 8 seconds:

#!/usr/bin/env python3

import asyncio
import time

from asgiref.sync import ThreadSensitiveContext, sync_to_async


def sync():
    pass


async def test():
    # Each iteration enters a fresh ThreadSensitiveContext, which (without
    # this change) spins up and tears down a new single-thread executor.
    for _ in range(100_000):
        async with ThreadSensitiveContext():
            await sync_to_async(sync)()


async def main():
    await asyncio.gather(test())


if __name__ == "__main__":
    s = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.2f} seconds.")

Completely unscientifically, this means that the time to create and destroy 100,000 threads is roughly 7 seconds on my machine. I am not really sure what a good thread count would be; it might make sense to copy the default max_workers from the thread pool executor.
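For reference, the stdlib's `concurrent.futures.ThreadPoolExecutor` has derived its default `max_workers` from the CPU count since Python 3.8, and a long-lived shared pool could borrow the same heuristic. A minimal sketch (the `get_executor` helper is illustrative, not part of asgiref):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# ThreadPoolExecutor's default since Python 3.8: min(32, cpu_count + 4).
# A reusable pool could copy this heuristic instead of hard-coding a size.
default_workers = min(32, (os.cpu_count() or 1) + 4)

# Create the executor once at module level and hand out the same instance,
# rather than spinning up a fresh single-thread executor per context.
_shared_executor = ThreadPoolExecutor(max_workers=default_workers)


def get_executor() -> ThreadPoolExecutor:
    return _shared_executor
```

The cap at 32 exists because these pools are meant for I/O-bound work, where more threads than cores is useful but unbounded growth is not.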

apollo13 marked this pull request as draft on December 12, 2021 at 10:45
andrewgodwin (Member)

The irony of pooling a thread pool is not lost on me, but if it provides a significant enough performance benefit, maybe it's worth it. Any idea what the percentage saving is on a more traditional use case?

apollo13 (Member, Author)

Most likely negligible when looking at the raw numbers: 7 seconds for 100,000 threads is well below a millisecond per thread, if I did the math correctly ;). That said, I do not know how much thread state Python/Django can accumulate that would require cleaning up and could increase that number. There is also the question of how well threads perform on other systems; I do not know much about Windows or macOS in that regard.
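The per-thread overhead can be eyeballed with a hedged micro-benchmark along these lines (names are illustrative; absolute figures will vary by machine and OS), comparing a fresh single-thread executor per call against one reused executor:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def noop():
    pass


N = 1_000  # keep the loop small; scale mentally to 100,000

# A fresh executor (and hence a fresh thread) for every call.
start = time.perf_counter()
for _ in range(N):
    with ThreadPoolExecutor(max_workers=1) as ex:
        ex.submit(noop).result()
fresh = time.perf_counter() - start

# One executor reused for every call.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=1) as ex:
    for _ in range(N):
        ex.submit(noop).result()
reused = time.perf_counter() - start

print(f"fresh: {fresh:.3f}s  reused: {reused:.3f}s")
```

The gap between the two timings, divided by N, gives a rough per-thread create-and-destroy cost, analogous to the ~70 microseconds implied by the numbers above.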

What it would allow for in theory (it currently breaks because django.db.connection is context-aware and not just a thread-local) is running with a pool of, let's say, 10 threads and reusing connections (i.e. running with persistent connections).

andrewgodwin (Member)

Hm, yeah, the extra complexity for such a small performance gain does give me a little pause. If there's another convincing reason to add it, like the connections, that might change things - but my understanding of the connection-management problem is that the same request needs to stay on the same thread/transaction, and that's harder.
