AiiDA 2.0 plugin migration guide
This page will contain a summary of the backwards incompatible changes going from v1.0 to v2.0 of aiida-core
with, where applicable, more detailed guides on how to migrate existing plugin code or scripts.
The library click, which is what verdi is built with, was upgraded. It now comes with tab-completion built-in, which means we could drop the additional dependency click-completion. The completion works the same, except that the string that should be put in the activation script to enable it is now shell-dependent. See the documentation to find out what string you should use for your shell. See this PR for more details.
There is a small change in verdi code setup where the order of prompts has changed. If you have scripts that use the interactive mode for this command, they might start to fail, since values are now passed to the wrong prompts. However, it is in general not advisable to use the interactive (prompting) mode for automated scripts. Please use the --non-interactive flag to ensure the command doesn't prompt, and use the various parameter flags to specify the values, e.g.:
verdi code setup --non-interactive -L label -D "description" .....
The entry point system allows external packages to extend the functionality of aiida-core.
This concept was formally introduced in v1.0, and since then there have been unwritten guidelines and naming conventions for entry points.
In particular, entry points defined by a plugin package are encouraged to be prefixed with the name of the plugin package.
For example, the entry points of aiida-quantumespresso all start with the prefix "quantumespresso.".
This ensures that entry points are properly namespaced and there is minimal risk that the entry points of different plugin packages overlap and therefore cannot be uniquely resolved, rendering them unusable.
Until now, however, aiida-core itself has not respected this guideline and provides many entry points that are not namespaced with "core.".
This not only blocks many namespaces for use by potential plugin packages, it also makes it unclear where certain entry points come from.
Therefore, the decision was made to change the entry points in aiida-core in v2.0 and properly prefix them with "core.".
The change was implemented in PR #5073.
This change has been made largely backward compatible, by updating the various plugin factories (imported from the aiida.plugins module) with a special condition that detects the old entry point names.
If an old name is detected, the factory emits a deprecation warning and then proceeds to load the new entry point.
For example, the following code:
from aiida.plugins import DataFactory
Int = DataFactory('int')
will emit the following warning in v2.0:
In [1]: Int = DataFactory('int')
aiida/plugins/factories.py:40: AiidaDeprecationWarning: The entry point `int` is deprecated. Please replace it with `core.int`.
To get rid of the deprecation warning, simply update the entry point by prefixing it with "core.":
from aiida.plugins import DataFactory
Int = DataFactory('core.int')
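The fallback behaviour described above can be pictured with a small self-contained sketch. This is not the aiida-core implementation: the function name resolve_entry_point_name and the set of legacy names are hypothetical, chosen only to illustrate the idea of detecting an old name, warning, and continuing with the prefixed one.

```python
import warnings

# Hypothetical set of legacy (un-prefixed) names, for illustration only.
LEGACY_NAMES = {'int', 'local', 'direct'}


def resolve_entry_point_name(name):
    """Return the ``core.``-prefixed name, warning if a legacy name was passed."""
    if name in LEGACY_NAMES:
        new_name = f'core.{name}'
        warnings.warn(
            f'The entry point `{name}` is deprecated. Please replace it with `{new_name}`.',
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    # Names that are already namespaced (by core or by a plugin) pass through unchanged.
    return name
```

Names that already carry a prefix, such as core.int or quantumespresso.pw, are returned as-is and trigger no warning in this sketch.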
Note that entry point names are also used on the command line.
For example, when creating a new computer, let's say the localhost configured with the DirectScheduler, this used to be done with
verdi computer setup -L localhost -T local -S direct
which should now become
verdi computer setup -L localhost -T core.local -S core.direct
The old entry points will continue to work for v2.0, but will also cause the deprecation warning to be printed since the CLI goes through the plugin factories to load the entry points behind the scenes.
Given that entry point names are also stored in the database in certain places (for example the node_type attribute of Data nodes, and the scheduler_type of Computer instances), the data of existing databases will be automatically migrated.
Note that the new entry point names apply not only when they are used as command arguments: if the entry point is itself a command, the full entry point name needs to be used as well. Good examples are the subcommands of the verdi data command, which are themselves entry points. For example, what used to be:
verdi data bands list
has now become:
verdi data core.bands list
Remember that you can always use tab-completion to automatically discover the subcommands that are available.
The file repository arguably underwent the greatest change of all components of AiiDA in v2.0 and as such various backwards incompatible changes had to be introduced.
- FileType: moved from aiida.orm.utils.repository to aiida.repository.common
- File: moved from aiida.orm.utils.repository to aiida.repository.common
- File: changed from namedtuple to class
- File: can no longer be iterated over
- File: type attribute was renamed to file_type
- Node.put_object_from_tree: path argument was renamed to filepath
- Node.put_object_from_file: path argument was renamed to filepath
- Node.put_object_from_tree: key argument was renamed to path
- Node.put_object_from_file: key argument was renamed to path
- Node.put_object_from_filelike: key argument was renamed to path
- Node.get_object: key argument was renamed to path
- Node.get_object_content: key argument was renamed to path
- Node.open: key argument was renamed to path
- Node.list_objects: key argument was renamed to path
- Node.list_object_names: key argument was renamed to path
- SinglefileData.open: key argument was renamed to path
- Node.open: can no longer be called without context manager
- Node.open: only modes r and rb are supported, use the put_object_from_ methods instead
- Node.get_object_content: only modes r and rb are supported
- Node.put_object_from_tree: argument contents_only was removed
- Node.put_object_from_tree: argument force was removed
- Node.put_object_from_file: argument force was removed
- Node.put_object_from_filelike: argument force was removed
- Node.delete_object: argument force was removed
In AiiDA v1.0 it was possible to call Node.open without a context manager, for example:
handle = node.open('filename.txt')
content = handle.read()
handle.close()
In AiiDA v2.0, this will raise an exception; instead, Node.open should be used as a context manager:
with node.open('filename.txt') as handle:
    content = handle.read()
This is good practice in any case, because the file handle will be properly closed even if the read call raises an exception for some reason. In normal Python, although ill-advised, it is possible to call open on a file on the file system without a context manager, but in AiiDA v2.0 this raises. The reason is that by requiring a context manager, the file repository can be implemented in a more efficient manner, making the reading of files faster.
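To make the behaviour concrete, here is a minimal sketch of an open method that only works as a context manager. It is a hypothetical stand-in built on contextlib, not the aiida-core implementation; the class name NodeSketch and its in-memory file store are assumptions made for illustration. In this sketch, using the return value without a with statement fails with an AttributeError; the exact exception raised by AiiDA may differ.

```python
import io
from contextlib import contextmanager


class NodeSketch:
    """Hypothetical stand-in for a node; not the aiida-core class."""

    def __init__(self, files):
        self._files = files  # maps a relative path to its bytes content

    @contextmanager
    def open(self, path, mode='r'):
        if mode not in ('r', 'rb'):
            raise ValueError(f'unsupported mode: {mode}')
        raw = io.BytesIO(self._files[path])
        handle = raw if mode == 'rb' else io.TextIOWrapper(raw, encoding='utf-8')
        try:
            yield handle
        finally:
            handle.close()  # runs even if the body of the with block raises


node = NodeSketch({'filename.txt': b'hello'})
with node.open('filename.txt') as handle:
    content = handle.read()
```

Because the handle only exists inside the with block, the provider is free to decide when and how the underlying resource is opened and released.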
Despite the changes listed above, it should be possible to write code that is compatible with both AiiDA 1.x and 2.x. The most important things to consider are:
1. Always use .open() with a context manager (as detailed above).
2. Use key or path as positional arguments, not keyword arguments. For example, write
       with node.open('filename.txt') as in_f: <...>
   instead of
       with node.open(key='filename.txt') as in_f: <...>
3. Use try / except clauses to handle imports that have moved. For example:
       try:
           from aiida.orm.utils.repository import FileType
       except ImportError:
           from aiida.repository.common import FileType
4. To access the type / file_type attribute of a File, you can again use try / except clauses:
       some_file = File(<...>)
       try:
           file_type = some_file.file_type
       except AttributeError:
           file_type = some_file.type
   Or alternatively, getattr chaining:
       some_file = File(<...>)
       file_type = getattr(some_file, 'file_type', getattr(some_file, 'type', None))
   The inner getattr is given a None default because it is evaluated eagerly, and would otherwise raise AttributeError on objects that have no type attribute.
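The attribute fallback described above can be wrapped in a small helper so that the workaround lives in one place. The sketch below is self-contained: get_file_type, OldFile and NewFile are hypothetical names, and the two stand-in classes only mimic the old namedtuple and the new class far enough to show the lookup.

```python
from collections import namedtuple


def get_file_type(file_obj):
    """Return ``file_type`` if present, else fall back to the older ``type``."""
    # Workaround for compatibility with AiiDA version < 1.4
    try:
        return file_obj.file_type
    except AttributeError:
        return file_obj.type


# Hypothetical stand-ins for the two generations of the File object.
OldFile = namedtuple('OldFile', ['name', 'type'])


class NewFile:
    def __init__(self, name, file_type):
        self.name = name
        self.file_type = file_type
```

Calling get_file_type on either stand-in returns the stored value, so calling code does not need to know which version produced the object.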
Points 3 & 4 are needed only for cross-compatibility between AiiDA versions <=1.3 and >=2.0. The 1.4 release is compatible with both the old and new syntax, but will show a DeprecationWarning if the old syntax is used.
When using these workarounds (3 & 4), we recommend placing a comment into your code. For example:
# Workaround for compatibility with AiiDA version < 1.4
This will let you know to remove the workaround once your code no longer needs to be compatible with older AiiDA versions. Make sure the comment is always exactly the same, to simplify searching for it.
For the Computer class, the attribute name was already deprecated in AiiDA v1.0 and was replaced by label. However, the attribute name remained in the database table. This meant that in the QueryBuilder one had to continue using name. In AiiDA v2.0, the database table is now updated to match the ORM. If before you did the following:
QueryBuilder().append(Computer, filters={'name': 'localhost'}, project=['name']).all()
now you have to use
QueryBuilder().append(Computer, filters={'label': 'localhost'}, project=['label']).all()
The attribute name for the entity Computer was renamed to label.
In PR #3787 a change to the API of transport plugins has been introduced, to support also transferring bytes (rather than only Unicode strings) in the stdout/stderr of "remote" commands (via the transport).
The required changes in your plugin (if you wrote a transport plugin) are:
- Rename the exec_command_wait function in your plugin implementation to exec_command_wait_bytes.
- Ensure that you have a stdin in the parameters (the signature should be exec_command_wait_bytes(self, command, stdin=None, **kwargs)) and that you (also) accept bytes as input in the stdin parameter. Ideally, if you get bytes, you shouldn't do any encoding/decoding, to ensure your plugin also works if the stdin contains binary data.
- Return bytes for stdout and stderr (most probably you are already getting bytes internally; just do not decode them to strings).
See e.g. the changes to the local transport plugin for an example of what needs to be changed.
Note that one can still call exec_command_wait, which is now defined in the parent Transport class (it now has an optional encoding parameter with default utf8, as it used to be) and takes care of the decoding.
More details can be found in the PR and in the corresponding commit message, including how to support both v1.6 and v2.0 of AiiDA (by still also defining exec_command_wait in your plugin during the transition period).
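The split between a bytes-returning method in the plugin and a decoding wrapper in the parent class can be sketched as follows. The classes TransportSketch and LocalSketch are hypothetical stand-ins, not the aiida-core classes. As a further simplifying assumption, the command is passed here as an argument list to subprocess rather than as a shell string, and a str passed as stdin is encoded as UTF-8.

```python
import subprocess
import sys


class TransportSketch:
    """Stand-in for the parent class; only the decoding wrapper is shown."""

    def exec_command_wait_bytes(self, command, stdin=None, **kwargs):
        raise NotImplementedError

    def exec_command_wait(self, command, stdin=None, encoding='utf-8', **kwargs):
        # Delegate to the bytes variant and decode the output for the caller.
        retval, stdout, stderr = self.exec_command_wait_bytes(command, stdin=stdin, **kwargs)
        return retval, stdout.decode(encoding), stderr.decode(encoding)


class LocalSketch(TransportSketch):
    """Stand-in plugin: returns raw bytes and never decodes them."""

    def exec_command_wait_bytes(self, command, stdin=None, **kwargs):
        if isinstance(stdin, str):
            stdin = stdin.encode('utf-8')  # bytes are passed through untouched
        proc = subprocess.run(command, input=stdin, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        return proc.returncode, proc.stdout, proc.stderr


transport = LocalSketch()
echo_stdin = [sys.executable, '-c', 'import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())']
retval, stdout, stderr = transport.exec_command_wait_bytes(echo_stdin, stdin=b'\x00\xff')
```

Binary input such as b'\x00\xff' survives the round trip because nothing in the plugin method attempts to decode it; callers that want text use exec_command_wait instead.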
Since AiiDA v1.6.0, nodes of all types compare equal when they have the same UUID (see PR #4753).
However, most of the Pythonic base data types (Bool, Int, Float, Str and List) already went one step further and also compared equal to other nodes based on the node content.
The only base type that was an exception was Dict.
After some discussion (see #5187 for a summary), it was decided to make equality comparison consistent among the base types and hence make Dict nodes compare equal when they have the same content (see PR #5251).
In case your code relies on Dict nodes only comparing equal when they are strictly the same node, you can compare the uuid property of the nodes instead.
For example, when you define two different Dict nodes based on the same dictionary:
In [1]: d1 = Dict({'a': 1})
In [2]: d2 = Dict({'a': 1})
They will now be equal according to the == operator:
In [3]: d1 == d2
Out[3]: True
However, you can still see if they are the same node using the uuid property:
In [4]: d1.uuid == d2.uuid
Out[4]: False
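If identity checks appear in several places, they can be collected in a helper. The following sketch uses a hypothetical stand-in class, DictSketch, that mimics only the two properties relevant here (content-based equality and a unique uuid); the helper name is_same_node is likewise an assumption.

```python
import uuid


class DictSketch:
    """Hypothetical stand-in for a Dict node; not the aiida-core class."""

    def __init__(self, value):
        self.value = dict(value)
        self.uuid = str(uuid.uuid4())

    def __eq__(self, other):
        # Content-based comparison, mirroring the behaviour described above.
        if isinstance(other, DictSketch):
            return self.value == other.value
        return self.value == other


def is_same_node(left, right):
    """Return True only if both arguments refer to the same node."""
    return left.uuid == right.uuid


d1 = DictSketch({'a': 1})
d2 = DictSketch({'a': 1})
```

Here d1 == d2 holds because the contents match, while is_same_node(d1, d2) is false because each instance carries its own uuid.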
Scheduler plugins implementing the Scheduler class had to implement the _get_submit_script_header method, which was also responsible for writing the environment variable declarations if the job_environment variable was set on the job template.
This functionality has now been factored out to the method _get_submit_script_environment_variables (see PR #5283).
Instead of formatting the environment variables themselves, it is advised that plugins simply call this method from _get_submit_script_header and include the generated string in the returned string.
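The delegation pattern can be sketched without AiiDA as follows. Everything except the two method names is an assumption: SchedulerSketch is a stand-in class, the export format produced here is illustrative and may differ from what aiida-core generates, the #EXAMPLE directive is made up, and the job template is replaced by a SimpleNamespace.

```python
import shlex
from types import SimpleNamespace


class SchedulerSketch:
    """Stand-in scheduler; the real base class provides the first method."""

    def _get_submit_script_environment_variables(self, template):
        # Assumed format: one shell export line per variable.
        lines = [
            f'export {key}={shlex.quote(str(value))}'
            for key, value in template.job_environment.items()
        ]
        return '\n'.join(lines)

    def _get_submit_script_header(self, job_tmpl):
        lines = [f'#EXAMPLE --job-name={job_tmpl.job_name}']  # made-up directive
        if job_tmpl.job_environment:
            # Delegate instead of formatting the variables in the plugin itself.
            lines.append(self._get_submit_script_environment_variables(job_tmpl))
        return '\n'.join(lines)


job_tmpl = SimpleNamespace(job_name='test', job_environment={'OMP_NUM_THREADS': 4})
header = SchedulerSketch()._get_submit_script_header(job_tmpl)
```

The plugin's header method stays responsible only for scheduler directives, while quoting and formatting of the variables are handled in a single shared place.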
- The Transport.get_valid_transports() method has been removed, use get_entry_point_names('aiida.transports') instead, where get_entry_point_names is imported from aiida.plugins.entry_point.
- The Scheduler.get_valid_schedulers() method has been removed, use get_entry_point_names('aiida.schedulers') instead, where get_entry_point_names is imported from aiida.plugins.entry_point.
This affects only plugins still using the PluginTestCase class.
Since 2017 (v0.11.0), AiiDA offered a PluginTestCase class that made it easy for plugin developers to set up a fully functioning test environment.
The test class was originally designed to work with the unittest package, but testing in aiida-core (as well as most plugins) moved to pytest.
The PluginTestCase class could still be run through pytest (and the aiida-plugin-cutter included an example of this), but as testing through unittest is being deprecated, the PluginTestCase only adds extra code to maintain and will be removed.
The canonical way of writing tests in pytest is through simple test functions and pytest fixtures. See the pytest documentation for details.
However, pytest also offers support for test classes with unittest-style setup methods.
For a minimalist approach to removing the dependency on the PluginTestCase, see this migration diff from the aiida-plugin-cutter.
This is a temporary section with instructions for developers to have them test the database migrations that will be released with v2.0. The instructions below hopefully make it as easy as possible to test this.
- Checkout the latest develop branch: git checkout develop && git pull
- Install latest dependencies: pip install -U -e .[tests,pre-commit]
- Run verdi status: this will update your configuration to the latest schema version
- Create a clone of the PostgreSQL database you want to migrate
  - Login as the postgres user: sudo su - postgres
  - Load the postgres program: psql
  - If the database is already loaded in postgres, you can clone it in psql directly: CREATE DATABASE aiida_clone WITH TEMPLATE aiida_original_db OWNER aiida; Make sure to change the names of the databases and the owner of course.
  - If the database is on another machine and you want to test the migration on your workstation:
    - Go to the remote machine and dump the database: pg_dump -h localhost -d aiida_original_db -U aiida -W > aiida_original_db.psql
    - Copy over the aiida_original_db.psql file to your workstation
    - Create a new database in psql: CREATE DATABASE aiida_clone OWNER aiida;
    - Load the database dump into the new database: psql -h localhost -d aiida_clone -U aiida -W < aiida_original_db.psql
- Check statistics of the database (this information should be kept for reporting):
  - Note whether it is Django or SqlAlchemy. If you don't know, run SELECT * FROM alembic_version; in psql. If it returns a value, it is SqlAlchemy; if it errors with ERROR: relation "alembic_version" does not exist, it is Django.
  - Get database node count: SELECT count(*) FROM db_dbnode;
  - Get database size: SELECT pg_size_pretty(pg_database_size('aiida_clone'));
  - Get database revision:
    - For SqlAlchemy: SELECT * FROM alembic_version;
    - For Django: SELECT name FROM django_migrations WHERE app = 'db' ORDER BY id DESC LIMIT 1;
- Create a clone of the repository (Note: this is only necessary if your database revision is below a certain revision; the migrations above it will not affect the repository, including the repository migration itself, as it will leave the original repo intact and simply write the new disk object store in parallel.)
  - Django: if you have revision 0027 or above, there is no need to clone the repo
  - SqlAlchemy: if your revision is in the following list, there is no need to clone the repo:
    ['535039300e4a', '1feaea71bd5a', '7536a82b2cc4', '0edcdd5a30f0', 'bf591f31dd12', '118349c10896', '91b573400be5', '7b38a9e783e7', 'e734dd5e50d7', 'e797afa09270', '26d561acd560', '07fac78e6209', 'de2eaf6978b4', '1830c8430131', '1b8ed3425af9', '3d6190594e19', '5a49629f0d45', '5ddd24e52864', 'd254fdfed416', '61fc0913fae9', 'ce56d84bcc35']
- Create a profile with the correct database and repository configured
  - Easiest is to open config.json, clone an entry and simply update the name of the database and the location of the repository
  - IMPORTANT: if the database has an old schema version (see the point above) you should have made a clone of the repository and you should make sure that the storage.config.repository_uri key points to the correct path
- Make sure the daemon is not running
- Run time verdi -p aiida-profile storage migrate -f. IMPORTANT: do not forget the time in front. We would like to gather this information to get an idea of how long the migrations typically take.
- Copy the log messages from the migrations printed to stdout.
- Rerun the statistics (database size and node count) in psql:
  SELECT count(*) FROM db_dbnode;
  SELECT pg_size_pretty(pg_database_size('aiida_clone'));
- Run verdi status and check that the storage connection is green
- Open verdi shell and do some tests: queries, opening repository files of some nodes etc.
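The rule from the steps above for deciding whether the repository needs to be cloned can be written down as a small check. This is only a sketch: the function name needs_repository_clone is hypothetical, and it assumes that the Django revision string starts with its zero-padded number (as in the name returned by the django_migrations query). The SqlAlchemy revisions are copied from the list above.

```python
# SqlAlchemy revisions for which no repository clone is needed (from the list above).
SQLALCHEMY_NO_CLONE = {
    '535039300e4a', '1feaea71bd5a', '7536a82b2cc4', '0edcdd5a30f0', 'bf591f31dd12',
    '118349c10896', '91b573400be5', '7b38a9e783e7', 'e734dd5e50d7', 'e797afa09270',
    '26d561acd560', '07fac78e6209', 'de2eaf6978b4', '1830c8430131', '1b8ed3425af9',
    '3d6190594e19', '5a49629f0d45', '5ddd24e52864', 'd254fdfed416', '61fc0913fae9',
    'ce56d84bcc35',
}


def needs_repository_clone(backend, revision):
    """Return True if the repository should be cloned before migrating."""
    backend = backend.lower()
    if backend == 'django':
        # Assumption: the revision starts with its number, e.g. '0027' or '0027_some_name'.
        return int(revision.split('_', 1)[0]) < 27
    if backend == 'sqlalchemy':
        return revision not in SQLALCHEMY_NO_CLONE
    raise ValueError(f'unknown backend: {backend}')
```

When in doubt, cloning the repository is the conservative choice, since it costs only disk space and time.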
For each database for which you test the migration, please report the following:
- Database backend (Django or SqlAlchemy)
- Starting revision
- Node count before migration
- Node count after migration
- Database size before migration
- Database size after migration
- Time taken for the actual migration
- Messages printed to stdout by the migrations
- Any errors you encountered or problems you noticed afterwards when manually inspecting the data