Lazy pydantic import #6275

danielhollas · 2024-02-04T02:44:41Z

pydantic adds a lot of time to aiida startup, which is especially detrimental to tab completion.
Here I test the approach of deferring import of aiida.manage.configuration.config until really needed, which in turn defers pydantic.
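
For readers unfamiliar with the pattern, here is a minimal sketch of deferring an import until it is needed. It is not the code from this PR: `get_config` and `cheap_command` are hypothetical stand-ins, and the stdlib module `colorsys` plays the role of the slow dependency (pydantic here).

```python
import importlib
import sys

# Stand-in for a slow third-party dependency (pydantic in this PR).
# `colorsys` is only used because it ships with Python; it is an assumption
# made for the demo, not something aiida uses.
HEAVY_MODULE = "colorsys"
sys.modules.pop(HEAVY_MODULE, None)  # start from a clean state for the demo


def get_config():
    """Hypothetical helper: import the heavy module only on first use."""
    return importlib.import_module(HEAVY_MODULE)


def cheap_command():
    """A code path (e.g. printing a version) that never needs the config."""
    return "1.0.0"


cheap_command()
loaded_after_cheap = HEAVY_MODULE in sys.modules

get_config()
loaded_after_config = HEAVY_MODULE in sys.modules

print(loaded_after_cheap, loaded_after_config)
```

The cheap code path leaves the heavy module out of `sys.modules`; only the call that actually needs it pays the import cost.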

Upcoming usage of pydantic from #6255 seems to be centered in aiida.orm, which is not imported by default in verdi, so this PR still seems worth it.

This is in principle a breaking change, since we no longer export Config from the aiida.manage.configuration module. But I guess it is not meant to be manipulated directly by users anyway.

danielhollas · 2024-02-04T03:03:43Z

Some benchmarks using verdi --version as a proxy for tab-completion, with pydantic v2.6.0.
I am seeing ~35ms speedup on my new fancy NVMe drive. On my old laptop the difference would likely be much bigger.

main

$ hyperfine -w 5 'verdi --version' 
Benchmark 1: verdi --version
  Time (mean ± σ):     101.4 ms ±   2.3 ms    [User: 83.6 ms, System: 16.5 ms]
  Range (min … max):    97.6 ms … 105.6 ms    30 runs

this PR

Benchmark 1: verdi --version
  Time (mean ± σ):      65.2 ms ±   2.1 ms    [User: 50.9 ms, System: 13.2 ms]
  Range (min … max):    62.2 ms …  69.8 ms    45 runs
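
Where hyperfine is not available, a rough comparison can be made from Python itself. The sketch below is an assumption-laden stand-in, not the benchmark used above: `time_startup` is a hypothetical helper, and `import json` is only a placeholder for the import under investigation.

```python
import statistics
import subprocess
import sys
import time


def time_startup(code: str, runs: int = 5) -> float:
    """Mean wall-clock seconds to run `python -c code` in a fresh interpreter."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


# Compare a bare interpreter start with one that performs an extra import.
baseline = time_startup("pass")
with_import = time_startup("import json")
print(f"baseline {baseline * 1e3:.1f} ms, with import {with_import * 1e3:.1f} ms")
```

Numbers from such a loop are noisy and system specific. CPython's `-X importtime` option gives a per-module breakdown and is usually the better tool for finding which import is slow.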

edan-bainglass · 2024-02-06T05:15:10Z

Thank you @danielhollas for addressing this. I have a question - how much does pydantic load-time affect AiiDAlab? Is this one of the sources of lag in app load time?

danielhollas · 2024-02-06T12:02:37Z

@edan-bainglass it is not, since aiida-core has only used pydantic since version 2.5, and we haven't even released an aiidalab image with that version.

That being said, version 2.5 contains a lot of work that improved import time so it should result in noticeable improvement. I'll let you know once we release the image, but it might take some time since we need to resolve some package version incompatibilities.

danielhollas · 2024-02-06T13:59:42Z

I'll also note that this particular PR will not help with most aiida operations (or AiiDAlab load times), since we will need to load pydantic as soon as we load the AiiDA profile.

CC @sphuber this is ready for review. Happy to hear your thoughts. This PR kind of assumes that the use of pydantic will stay somewhat localized to the aiida.orm module. If you have other plans then perhaps this is not worth it.

sphuber · 2024-02-07T20:12:53Z

> CC @sphuber this is ready for review. Happy to hear your thoughts. This PR kind of assumes that the use of pydantic will stay somewhat localized to the aiida.orm module. If you have other plans then perhaps this is not worth it.

Well, it already is used outside of aiida.orm, e.g.:

aiida-core/src/aiida/storage/psql_dos/backend.py

Line 16 in b7e59a0

from pydantic import BaseModel, Field

And it is in aiida.cmdline

aiida-core/src/aiida/cmdline/groups/dynamic.py

Line 130 in b7e59a0

from pydantic_core import PydanticUndefined

but I made sure to keep imports inside methods whenever possible. However, I am not 100% sure that this is not still evaluated during tab-completion, because one of the main motivations for adopting pydantic is to have verdi add subcommands dynamically based on which plugins are installed. See #6190

In #6255 we go even further and essentially use pydantic in almost every aiida.orm submodule. Would those automatically be handled by the changes in this PR? Or would special care still have to be taken? Is there any way we can test this for regressions reliably? There currently is a very basic test in the CI that we added years back, which simply measures the run time of verdi in a loop. But it is quite fragile if we set the limit too close to the ideal time (which is also strongly system specific), and so we risk false positives.

> This is in principle a breaking change since we no longer export Config in aiida.manage.configuration module. But I guess it is not meant to be manipulated directly by users anyway.

Maybe not users, but third-party applications may very well be using it directly. For example, would AiiDAlab maybe not be affected? Surely they will operate on the Config, for example to set options etc and get profile information. Is this the only way of deferring the import of pydantic?

danielhollas · 2024-02-07T23:51:53Z

Thanks for taking a look @sphuber!

> Well, it already is used outside of aiida.orm, e.g.:

Right, but I indeed double checked that in those cases the imports are inlined so I assumed it will be fine.
I have now verified this assumption manually: pydantic is indeed not being imported during tab-completion (checked by adding raise ValueError in pydantic/__init__.py).
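
A less invasive alternative to editing the installed package is to ask a fresh interpreter whether a module ended up in `sys.modules`. This is only a sketch of that idea, not what was done in this PR; `is_imported_by` is a hypothetical helper, and it checks a plain Python statement rather than the real tab-completion code path.

```python
import subprocess
import sys


def is_imported_by(statement: str, module: str) -> bool:
    """Run `statement` in a fresh interpreter and report whether `module` got loaded."""
    probe = f"{statement}\nimport sys\nprint({module!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # The probe prints its verdict last, so ignore any earlier output.
    return result.stdout.strip().splitlines()[-1] == "True"


# Demonstration with stdlib modules only.
print(is_imported_by("import colorsys", "colorsys"))
print(is_imported_by("pass", "colorsys"))
```

With aiida installed, a call such as `is_imported_by("import aiida.cmdline", "pydantic")` would be the analogous check, under the assumption that importing that package is representative of what happens at startup.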

> Would those automatically be handled by the changes in this PR?

We already verify that aiida.orm is not imported during verdi startup by running verdi devel check-load-time, so provided that you didn't add a pydantic import outside of aiida.orm, that PR should be fine.

> Is there any way we can test this for regressions reliably?

Indeed we can, that's what verdi devel check-undesired-imports is for! Unfortunately, it does not work in this case, because actually running verdi devel loads the configuration. In the tab-completion case, we have a special monkey-patch that prevents evaluation of dynamic default values that I added in #6144.

So instead I've added a regression test that calls tab-completion internally.
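
To illustrate why tab-completion and actual invocation can differ in what they import, here is a minimal sketch of a default value that is only computed on demand. It is not the monkey-patch from #6144 and does not use click; `LazyDefaultOption`, its `completing` flag, and `load_default_profile` are all hypothetical names.

```python
class LazyDefaultOption:
    """Hypothetical option whose default is a callable evaluated only on demand."""

    def __init__(self, name, default_factory):
        self.name = name
        self._default_factory = default_factory

    def get_default(self, completing=False):
        # While completing, skip the potentially expensive factory entirely.
        if completing:
            return None
        return self._default_factory()


calls = []


def load_default_profile():
    """Stand-in for an expensive lookup (e.g. one that loads the configuration)."""
    calls.append("loaded")
    return "default-profile"


option = LazyDefaultOption("profile", load_default_profile)

# Completion path: the factory is never called.
option.get_default(completing=True)
calls_during_completion = list(calls)

# Invocation path: the factory runs and the expensive work happens.
value = option.get_default()
print(calls_during_completion, value, calls)
```

Because the expensive factory only runs on the invocation path, a check that executes a real command sees imports that a completion-only check does not.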

> Maybe not users, but third-party applications may very well be using it directly. For example, would AiiDAlab maybe not be affected?

I don't think so? In AiiDAlab we mostly only use load_profile. I would assume third-party code would similarly only use the helper functions defined in aiida/manage/configuration/__init__.py such as get_config() / load_config().

> Is this the only way of deferring the import of pydantic?

Not sure how to answer this. 😅 The Config class is derived from the pydantic BaseModel, and so pydantic needs to be imported when the Config class itself (not its instances) is built. I don't see a way to make the class Config available in aiida.manage while also not importing pydantic.

Quoting from https://docs.python.org/3/library/sys.html: "[sys.modules] is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. However, replacing the dictionary will not necessarily work as expected and deleting essential items from the dictionary may cause Python to fail..." Let's just ignore the last sentence :-)

danielhollas · 2024-02-08T00:19:04Z

By the way, I've just found out about verdi devel play, what a beautiful easter egg. 😂 🎶

sphuber · 2024-02-08T13:09:06Z

> By the way, I've just found out about verdi devel play, what a beautiful easter egg. 😂 🎶

It's not the only easter egg in verdi (hint, hint) ;)

sphuber

Thanks @danielhollas

> Not sure how to answer this. 😅 The Config class is derived from the pydantic BaseModel, and so pydantic needs to be imported when the Config class itself (not its instances) is built. I don't see a way to make the class Config available in aiida.manage while also not importing pydantic.

I now remember looking into this and trying but coming up short. I think indeed that it is not possible. So I guess your solution is the best we can do for now.

I also think the breaking of the import is acceptable, so let's continue with these changes.

> Indeed we can, that's what verdi devel check-undesired-imports is for! Unfortunately, it does not work in this case, because actually running verdi devel loads the configuration. In the tab-completion case, we have a special monkey-patch that prevents evaluation of dynamic default values that I added in #6144. So instead I've added a regression test that calls tab-completion internally.

Very nice. I am wondering if the verdi devel check-undesired-imports is now superfluous as it is a subset of the unit test you added? I think technically there is still a difference in code path between merely tab-completing, and actually invoking a command. As you say, some of the command parameters were changed to lazily evaluate their default value, such that they are only executed when the command is actually called. What we are really interested in is that the tab-complete is responsive. So I think we can just add the modules from verdi devel check-undesired-imports to your unit test and get rid of that command (and invocation in the CI scripts). Right?

tests/conftest.py

tests/cmdline/params/options/test_callable.py


danielhollas · 2024-02-08T15:40:55Z

> Very nice. I am wondering if the verdi devel check-undesired-imports is now superfluous as it is a subset of the unit test you added? I think technically there is still a difference in code path between merely tab-completing, and actually invoking a command. As you say, some of the command parameters were changed to lazily evaluate their default value, such that they are only executed when the command is actually called. What we are really interested in is that the tab-complete is responsive. So I think we can just add the modules from verdi devel check-undesired-imports to your unit test and get rid of that command (and invocation in the CI scripts). Right?

I've been thinking about this as well. As you mention, verdi devel check-undesired-imports provides stronger guarantees. I've looked at the list of the blacklisted modules, and for most of them I think this stronger guarantee is warranted, since many of them incur a significant import cost. In other words, I think it's reasonable to expect that verdi commands that can be fast are actually fast (e.g. asyncio should be loaded only when really needed).

danielhollas added 3 commits February 4, 2024 02:13

Lazy import pydantic by not importing aiida.manage.configuration.config

18ed546

Fixup imports

3175eaa

Fixup conftest

cd5400d

danielhollas marked this pull request as ready for review February 4, 2024 03:03

Add a regression test

c0e1d60

danielhollas added 2 commits February 7, 2024 23:55

Ugh, this is going to be tricky

ca81430

sphuber requested changes Feb 8, 2024

danielhollas and others added 2 commits February 8, 2024 15:22

Apply suggestions from code review

ca7d0ac

Co-authored-by: Sebastiaan Huber <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

bb63ab2

for more information, see https://pre-commit.ci

Merge branch 'main' into lazy-pydantic

9c44d48

danielhollas requested a review from sphuber February 8, 2024 17:15

Merge branch 'main' into lazy-pydantic

aa0cad1

sphuber approved these changes Feb 8, 2024

sphuber merged commit 9524cda into aiidateam:main Feb 8, 2024
19 checks passed

danielhollas deleted the lazy-pydantic branch February 8, 2024 23:11
