AiiDA 2.0 plugin migration guide

Leopold Talirz edited this page Sep 22, 2021 · 63 revisions

This page summarizes the backwards-incompatible changes going from v1.0 to v2.0 of aiida-core and, where applicable, gives more detailed guides on how to migrate existing plugin code or scripts.

Tab-completion

The library click, which is what verdi is built with, was upgraded. It now comes with tab-completion built-in, which means we could drop the additional dependency click-completion. The completion works the same, except that the string that should be put in the activation script to enable it is now shell-dependent. See the documentation to find out what string you should use for your shell. See this PR for more details.

Entry points

The entry point system allows external packages to extend the functionality of aiida-core. This concept was formally introduced in v1.0, and since then there have been unwritten guidelines and naming conventions for entry points. In particular, entry points defined by a plugin package are encouraged to be prefixed with the name of the plugin package. For example, the entry points of aiida-quantumespresso all start with the prefix quantumespresso.. This ensures that entry points are properly namespaced and minimizes the risk that the entry points of different plugin packages overlap and therefore cannot be uniquely resolved, which would render them unusable.

To this day, however, aiida-core itself has not been respecting this guideline and provides many entry points that are not namespaced with core.. This not only blocks many namespaces for use by potential plugin packages, it also makes it unclear where certain entry points come from. Therefore, the decision was made to change the entry points in aiida-core in v2.0 and properly prefix them with core.. The change was implemented in PR #5073.

This change has been made largely backward compatible, by updating the various plugin factories (imported from the aiida.plugins module) with a special condition that detects the old entry point names. If detected, it emits a deprecation warning and then proceeds to actually load the new entry point. For example, the following code:

from aiida.plugins import DataFactory
Int = DataFactory('int')

will emit the following warning in v2.0:

In [1]: Int = DataFactory('int')
aiida/plugins/factories.py:40: AiidaDeprecationWarning: The entry point `int` is deprecated. Please replace it with `core.int`.

To get rid of the deprecation warning, simply update the entry point by prefixing it with core.:

from aiida.plugins import DataFactory
Int = DataFactory('core.int')

Note that entry point names are also used on the command line. For example, when creating a new computer, let's say the localhost configured with the DirectScheduler, this used to be done with

verdi computer setup -L localhost -T local -S direct

which should now become

verdi computer setup -L localhost -T core.local -S core.direct

The old entry points will continue to work for v2.0, but will also cause the deprecation warning to be printed since the CLI goes through the plugin factories to load the entry points behind the scenes.

Given that entry point names are also stored in the database in certain places (for example the node_type attribute of Data nodes, and the scheduler_type of Computer instances), the data of existing databases will be automatically migrated.
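When updating plugin code or scripts, the renaming rule described above can be sketched as a small helper. This is only an illustration: the function name migrate_entry_point_name and the set LEGACY_CORE_NAMES are hypothetical, and the set contains just the three names used as examples on this page, so it would have to be extended with whatever un-prefixed aiida-core entry points your code actually uses.

```python
# Legacy names taken from the examples on this page only (assumption: extend
# this set with the other un-prefixed aiida-core entry points you rely on).
LEGACY_CORE_NAMES = {'int', 'local', 'direct'}


def migrate_entry_point_name(name, legacy_names=LEGACY_CORE_NAMES):
    """Return `name` prefixed with `core.` if it is a known legacy core name."""
    if name in legacy_names:
        return f'core.{name}'
    # Already namespaced (e.g. `core.int` or a plugin prefix) or unknown: keep as is.
    return name


print(migrate_entry_point_name('int'))
print(migrate_entry_point_name('core.int'))
```

Names that already carry a prefix are returned unchanged, so the helper can be applied repeatedly without producing core.core.int.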

Repository

The file repository arguably underwent the greatest change of all components of AiiDA in v2.0 and as such various backwards incompatible changes had to be introduced.

  • FileType: moved from aiida.orm.utils.repository to aiida.repository.common
  • File: moved from aiida.orm.utils.repository to aiida.repository.common
  • File: changed from namedtuple to class
  • File: can no longer be iterated over
  • File: type attribute was renamed to file_type
  • Node.put_object_from_tree: path argument was renamed to filepath
  • Node.put_object_from_file: path argument was renamed to filepath
  • Node.put_object_from_tree: key argument was renamed to path
  • Node.put_object_from_file: key argument was renamed to path
  • Node.put_object_from_filelike: key argument was renamed to path
  • Node.get_object: key argument was renamed to path
  • Node.get_object_content: key argument was renamed to path
  • Node.open: key argument was renamed to path
  • Node.list_objects: key argument was renamed to path
  • Node.list_object_names: key argument was renamed to path
  • SinglefileData.open: key argument was renamed to path
  • Node.open: can no longer be called without context manager
  • Node.open: only mode r and rb are supported, use put_object_from_ methods instead
  • Node.get_object_content: only mode r and rb are supported
  • Node.put_object_from_tree: argument contents_only was removed
  • Node.put_object_from_tree: argument force was removed
  • Node.put_object_from_file: argument force was removed
  • Node.put_object_from_filelike: argument force was removed
  • Node.delete_object: argument force was removed

Using open in a context manager

In AiiDA v1.0 it was possible to call Node.open without a context manager, for example:

handle = node.open('filename.txt')
content = handle.read()
handle.close()

In AiiDA v2.0, this will raise an exception. Instead, Node.open should be used in a context manager:

with node.open('filename.txt') as handle:
    content = handle.read()

This is good practice in any case, because the file handle will be properly closed even if the read call raises an exception for some reason. In plain Python it is possible, although ill-advised, to call open on a file on the file system without a context manager, but in AiiDA v2.0 this raises. The reason is that by requiring a context manager, the file repository can be implemented in a more efficient manner, making the reading of files faster.
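The guarantee that the handle is closed even when the body fails can be illustrated with a self-contained sketch. FakeNode below is a hypothetical stand-in written only for this illustration; it is not how aiida-core implements Node.open.

```python
import io
from contextlib import contextmanager


class FakeNode:
    """Hypothetical stand-in for a node; NOT the aiida-core implementation."""

    def __init__(self, files):
        self._files = files
        self.last_handle = None  # kept only so the example can inspect it

    @contextmanager
    def open(self, path):
        handle = io.StringIO(self._files[path])
        self.last_handle = handle
        try:
            yield handle
        finally:
            handle.close()  # runs even if the body of the `with` block raises


node = FakeNode({'filename.txt': 'some content'})

try:
    with node.open('filename.txt') as handle:
        content = handle.read()
        raise ValueError('simulated failure after reading')
except ValueError:
    pass

print(node.last_handle.closed)
```

In this stand-in, node.open('filename.txt') returns a context manager object rather than a file handle, so calling .read() on it directly fails, which mirrors the v1.0-style usage no longer working.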

Writing cross-compatible code

Despite the changes listed above, it should be possible to write code that is compatible with both AiiDA 1.x and 2.x. The most important things to consider are:

  1. Always use .open() with a context manager (as detailed above).

  2. Use key or path as positional arguments, not keyword arguments. For example, write

    with node.open('filename.txt') as in_f:
        <...>

    instead of

    with node.open(key='filename.txt') as in_f:
        <...>
  3. Use try / except clauses to handle imports that have moved. For example:

    try:
        from aiida.orm.utils.repository import FileType
    except ImportError:
        from aiida.repository.common import FileType
  4. To access the type / file_type attribute of a File, you can again use try / except clauses:

    some_file = File(<...>)
    try:
        file_type = some_file.file_type
    except AttributeError:
        file_type = some_file.type

    Or alternatively, getattr chaining:

    some_file = File(<...>)
    # The inner default of None avoids an AttributeError when `type` does not exist
    file_type = getattr(some_file, 'file_type', getattr(some_file, 'type', None))

Points 3 & 4 are needed only for cross-compatibility between AiiDA versions <=1.3 and >=2.0. The 1.4 release is compatible with both the old and new syntax, but will show a DeprecationWarning if the old syntax is used.

When using these workarounds (3 & 4), we recommend placing a comment into your code. For example:

# Workaround for compatibility with AiiDA version < 1.4

This will let you know to remove the workaround once your code no longer needs to be compatible with older AiiDA versions. Make sure the comment is always exactly the same, to simplify searching for it.
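As a sketch, the attribute workaround from point 4 can be wrapped in one helper so that the marker comment lives in a single place. The helper name get_file_type is hypothetical, and the two objects used to exercise it are stand-ins that only mimic the attribute names of File; they are not the real classes.

```python
from collections import namedtuple
from types import SimpleNamespace


def get_file_type(file_obj):
    """Return the file type of `file_obj` for both the old and the new attribute name."""
    # Workaround for compatibility with AiiDA version < 1.4
    try:
        return file_obj.file_type
    except AttributeError:
        return file_obj.type


# Hypothetical stand-ins that only mimic the attribute names of `File`.
OldFile = namedtuple('OldFile', ['name', 'type'])
old_style = OldFile(name='a.txt', type='FILE')
new_style = SimpleNamespace(name='a.txt', file_type='FILE')

print(get_file_type(old_style), get_file_type(new_style))
```

Once compatibility with older versions is no longer needed, the helper body reduces to returning file_obj.file_type.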

QueryBuilder

For the Computer class, the attribute name was already deprecated in AiiDA v1.0 and was replaced by label. However, the attribute name remained in the database table. This meant that in the QueryBuilder one had to continue using name. In AiiDA v2.0, the database table is now updated to match the ORM. If before you did the following:

QueryBuilder().append(Computer, filters={'name': 'localhost'}, project=['name']).all()

now you have to use

QueryBuilder().append(Computer, filters={'label': 'localhost'}, project=['label']).all()
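For scripts that must run against both versions, one possible approach is to pick the key from the installed version. The helper computer_label_key below is a hypothetical name, and it assumes that the major version is the first dot-separated component of the version string.

```python
def computer_label_key(aiida_version):
    """Return the QueryBuilder key holding the computer label for a version string."""
    major = int(aiida_version.split('.')[0])
    return 'label' if major >= 2 else 'name'


key = computer_label_key('2.0.0')
filters = {key: 'localhost'}
project = [key]
print(filters, project)

# With AiiDA installed, this could be used as follows (not executed here):
#   import aiida
#   from aiida.orm import Computer, QueryBuilder
#   key = computer_label_key(aiida.__version__)
#   QueryBuilder().append(Computer, filters={key: 'localhost'}, project=[key]).all()
```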

REST API

The attribute name for the entity Computer was renamed to label.

Transport plugins

PR #3787 introduced a change to the API of transport plugins, to also support transferring bytes (rather than only Unicode strings) in the stdout/stderr of "remote" commands (via the transport).

The required changes in your plugin (if you wrote a transport plugin) are:

  • rename the exec_command_wait function in your plugin implementation to exec_command_wait_bytes
  • ensure that you have a stdin in the parameters (the signature should be exec_command_wait_bytes(self, command, stdin=None, **kwargs)) and that you (also) accept bytes as input in the stdin parameter. Ideally, if you get bytes, you shouldn't do any encoding/decoding, to ensure your plugin also works if the stdin contains binary data.
  • return bytes for stdout and stderr (most probably you are already getting bytes internally, so just do not decode them to strings)

See e.g. the changes to the local transport plugin for an example of what needs to be changed.

Note that one can still call exec_command_wait, which is now defined in the parent Transport class. It has an optional encoding parameter (default utf8, as it used to be) and takes care of the decoding. More details can be found in the PR and in the corresponding commit message, including how to support both v1.6 and v2.0 of AiiDA (by still also defining exec_command_wait in your plugin during the transition period).
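The division of labour described above can be sketched with two toy classes. Neither is the real aiida-core code: BaseTransport and ShellTransport are hypothetical stand-ins, the (retval, stdout, stderr) return triple is an assumption of this sketch, and the example assumes a POSIX shell with cat and echo available.

```python
import subprocess


class BaseTransport:
    """Hypothetical stand-in for the parent class; NOT aiida's `Transport`."""

    def exec_command_wait_bytes(self, command, stdin=None, **kwargs):
        raise NotImplementedError

    def exec_command_wait(self, command, stdin=None, encoding='utf-8', **kwargs):
        # The parent decodes, so plugins only need to deal with bytes.
        retval, stdout, stderr = self.exec_command_wait_bytes(command, stdin=stdin, **kwargs)
        return retval, stdout.decode(encoding), stderr.decode(encoding)


class ShellTransport(BaseTransport):
    """Toy plugin that runs commands in a local shell (extra kwargs are ignored)."""

    def exec_command_wait_bytes(self, command, stdin=None, **kwargs):
        if isinstance(stdin, str):
            stdin = stdin.encode('utf-8')  # bytes are passed through untouched
        result = subprocess.run(command, shell=True, input=stdin, capture_output=True, check=False)
        return result.returncode, result.stdout, result.stderr


transport = ShellTransport()
print(transport.exec_command_wait_bytes('cat', stdin=b'\x00\xff'))
print(transport.exec_command_wait('echo hello'))
```

The first call passes binary data through without any decoding, while the second goes through the decoding wrapper and returns strings.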

Miscellaneous

  • The Transport.get_valid_transports() method has been removed, use get_entry_point_names('aiida.transports') instead, with aiida.plugins.entry_point.get_entry_point_names.
  • The Scheduler.get_valid_schedulers() method has been removed, use get_entry_point_names('aiida.schedulers') instead, with aiida.plugins.entry_point.get_entry_point_names.

Unit tests

This affects only plugins still using the PluginTestCase class.

Background

Since 2017 (v0.11.0), AiiDA offered a PluginTestCase class that made it easy for plugin developers to set up a fully functioning test environment. The test class was originally designed to work with the unittest package, but testing in aiida-core (as well as most plugins) moved to pytest.

The PluginTestCase class could still be run through pytest (and the aiida-plugin-cutter included an example of this), but as testing through unittest is being deprecated, the PluginTestCase only adds extra code to maintain and will be removed.

Migrating to pytest

The canonical way of writing tests in pytest is through simple test functions and pytest fixtures. See the pytest documentation for details.

However, pytest also offers support for test classes with unittest-style setup methods. For a minimalist approach to removing the dependency on the PluginTestCase, see this migration diff from the aiida-plugin-cutter.

Testing the migration

This is a temporary section with instructions for developers to have them test the database migrations that are necessary for the new repository implementation. The instructions below hopefully make it as easy as possible to test this:

  • Turn off the daemon
  • Clone the database that you want to migrate in Postgres
    • Login as the postgres user: sudo su - postgres
    • Load the postgres program: psql
    • Create the clone: CREATE DATABASE aiida_clone WITH TEMPLATE aiida_original_db OWNER aiida;
    • Make sure to replace aiida_original_db and aiida with the actual database name and database user
    • Get statistics: \c aiida_clone followed by select count(*) from db_dbnode;
  • Duplicate the profile whose database you want to test-migrate. The easiest way is to copy and paste the profile manually in the config.json
  • Change the AIIDADB_NAME key in the profile to that of the newly created database, in this example aiida_clone. There is no need to change the repository_url in the profile (see the next point).
  • IMPORTANT Even though the migration should not touch the repository, it is best to make sure it is backed up. The migration will only read from the existing repository and create the new repository along side it. The file content will therefore be duplicated so make sure you have enough space available on the file system.
  • Count number of files in original repository: time find /path/to/.aiida/repository/node -type f -printf x | wc -c
  • Get the size of the original repository: time du -sh /path/to/.aiida/repository/node
  • Check out the latest develop branch (the original branch, aiidateam/fix/3445/disk-object-store-repository, has been merged into develop). Also run pip install -U -e . to update the dependencies
  • Run the migration with time verdi -p profile-name database migrate -f (do this after collecting the statistics above, which were also timed, so that caches are used). NOTE: do not forget the time in front. We would like to gather this information to get an idea of how long the migrations typically take.
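Where find or du are not available, the two repository statistics from the steps above can be approximated in Python. This is a sketch, with repository_statistics being a hypothetical helper name. It sums apparent file sizes, so the result can differ from du, which reports disk usage. The demonstration runs on a temporary directory rather than on a real repository.

```python
import os
import tempfile


def repository_statistics(root):
    """Return (number of files, total apparent size in bytes) below `root`."""
    count = 0
    size = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            filepath = os.path.join(dirpath, filename)
            if os.path.islink(filepath):
                continue  # `find -type f` does not count symbolic links either
            count += 1
            size += os.path.getsize(filepath)
    return count, size


# Small demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'ab', 'cd'))
    with open(os.path.join(tmp, 'ab', 'cd', 'file1'), 'wb') as handle:
        handle.write(b'12345')
    with open(os.path.join(tmp, 'file2'), 'wb') as handle:
        handle.write(b'123')
    stats = repository_statistics(tmp)

print(stats)
```

For the actual test, the function would be pointed at /path/to/.aiida/repository/node, as in the shell commands above.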

Ideally, also start a verdi shell with the new profile (profile-name) and try to access some nodes with files to see if everything worked correctly (e.g. the inputcat or outputcat of a CalcJob, an array inside an ArrayData (or subclass), the content of a UpfData, ...). Check that the content seems correct and not scrambled.

Finally, it would be great if you could add the following information to the PR (you should have collected these numbers in the instructions above):

  • Database backend (Django or SqlAlchemy)
  • Number of nodes in the database
  • Number of files in the repository (and time taken)
  • Size of the repository (and time taken)
  • Time taken for the actual migration (and any detailed report of errors, if any)

In addition, it would be great if you could report any problems that you encountered during the migration or suggest improvements for any error messaging. For example, currently if you perform the migration and it fails and then migrate again, you will get a RuntimeError saying that the container already exists and to delete it manually.

This has been fixed as of #4889. We should probably catch these exceptions and print a nice critical error instead.

The above is not really true, that PR addressed another problem. I already changed the RuntimeError in the original new repo PR #4345 and replaced it with a DatabaseMigrationError. This exception is also already caught in verdi database migrate and calls echo_critical with the error message. So I think the above request is already satisfied.