Prior check if migration needed on SqliteZipBackend initialisation #6963

GeigerJ2 · 2025-07-23T08:08:10Z

Problem

The SqliteZipBackend.initialise() method was always running the migration code during initialization, even when archives were already at the target version.

Solution

This PR introduces the check_migration_needed() function, which validates whether a migration is actually required before attempting it. The implementation consists of checks that were previously part of the migrate method and have now been factored out.

Now, initialise() performs this check early and returns immediately with appropriate logging if no migration is needed. The migrate() function has also been refactored to use this same validation logic.
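To illustrate the early check described above, here is a minimal sketch. It is not the code from this PR: the names `read_archive_version`, `needs_migration`, and `initialise_sketch`, as well as the `export_version` key, are assumptions made for the example. Only the presence of a `metadata.json` file inside the archive is taken from the test snippet quoted in the review.

```python
import json
import zipfile
from pathlib import Path

# Hypothetical key under which the archive stores its schema version.
VERSION_KEY = 'export_version'


def read_archive_version(archive: Path) -> str:
    """Return the version string stored in the archive's ``metadata.json``."""
    with zipfile.ZipFile(archive) as zf:
        metadata = json.loads(zf.read('metadata.json'))
    return metadata[VERSION_KEY]


def needs_migration(archive: Path, target_version: str) -> bool:
    """Return ``True`` only if the archive is not yet at ``target_version``."""
    return read_archive_version(archive) != target_version


def initialise_sketch(archive: Path, target_version: str) -> bool:
    """Return ``True`` if a migration would run, ``False`` if it is skipped."""
    if not needs_migration(archive, target_version):
        # Early return: the archive is already at the target version.
        return False
    # A real backend would call its migration routine here.
    return True
```

The point of the design is that reading a small metadata file is cheap, whereas running the migration machinery on an archive that is already up to date is wasted work.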

Tests were added for the check_migration_needed function, as well as for the two paths during initialise (migration required or not). Previously, there were no tests for the src/aiida/storage/sqlite_zip/migrator.py file, so test_migrator.py is newly added.

Closes #6961.


codecov · 2025-07-23T08:09:56Z

Codecov Report

❌ Patch coverage is 97.56098% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 79.06%. Comparing base (313f342) to head (e6e611d).

Files with missing lines	Patch %	Lines
src/aiida/storage/sqlite_zip/backend.py	96.78%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #6963      +/-   ##
==========================================
+ Coverage   79.05%   79.06%   +0.02%     
==========================================
  Files         566      566              
  Lines       43675    43696      +21     
==========================================
+ Hits        34522    34546      +24     
+ Misses       9153     9150       -3


agoscinski · 2025-07-24T09:30:02Z

src/aiida/manage/configuration/config.py

@@ -528,6 +528,7 @@ def create_profile(

        LOGGER.report('Initialising the storage backend.')
        try:
+            # PRCOMMENT: Not sure what this context manager is for?


This context manager temporarily changes sys.stdout (within the context), so it is effectively doing sys.stdout = mystdout. Since it does not store the redirected output anywhere, I think the author just wanted to prevent whatever profile.storage_cls.initialise(profile) outputs from being printed to the user's terminal. Outside of the context, sys.stdout is put back to what it was before.
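For readers unfamiliar with the pattern, here is a minimal sketch of such a context manager. It is an illustration only, not the helper used in the code base: `silence_stdout` and `noisy_initialise` are hypothetical names. The standard library also offers `contextlib.redirect_stdout`, which does the same job.

```python
import contextlib
import io
import sys


@contextlib.contextmanager
def silence_stdout():
    """Temporarily replace ``sys.stdout`` with a throwaway buffer."""
    original = sys.stdout
    sys.stdout = io.StringIO()  # anything printed lands here and is discarded
    try:
        yield
    finally:
        sys.stdout = original  # always restore, even if the body raises


def noisy_initialise() -> str:
    """Hypothetical stand-in for a call that prints progress messages."""
    print('Migrating to the head of the main branch')
    return 'done'


with silence_stdout():
    result = noisy_initialise()  # the print above never reaches the terminal

print(result)
```

Note that this hides everything written to stdout inside the block, which is exactly why it can also swallow messages the user should see.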

Ah, you have already removed it in a PR. Okay, I see your question was more about why we redirect this output instead of printing it to the terminal.

I am a bit afraid that the removal now triggers stdout output in places where you actually don't want it, because we use this function in places where the user would be confused by messages about migration. Check the usages of create_profile. For example, verdi presto now prints information about initialisation. I think in the case of verdi presto the additional prints are fine, and the other usages also seem fine to me, but I did not check thoroughly. I hope you checked it when merging the PR.

Yeah, you're right, that's a valid concern. Though, if a migration is happening, the user should always be told, I think. In the case I was looking at, archive migration as part of sqlite_zip profile migration, it did make sense to print the output. However, I didn't verify if it also captures other, unwanted output elsewhere. Will check this. Anyway, it might be better to solve this by setting the logger messages at different log levels, instead of capturing all stdout, as was done. TBD
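A minimal sketch of the log-level idea, using only the standard `logging` module. The logger name `storage_demo` and the choice of levels are assumptions for illustration; the project's own logger and its custom levels are not reproduced here.

```python
import io
import logging

# Collect output in a buffer so the effect of the level filter can be inspected.
stream = io.StringIO()

logger = logging.getLogger('storage_demo')  # hypothetical logger name
logger.propagate = False  # keep the demo independent of the root logger
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
logger.addHandler(handler)

# Only messages at WARNING or above are shown to the user.
logger.setLevel(logging.WARNING)

logger.info('Initialising the storage backend.')           # filtered out
logger.warning('Migrating archive to the latest version.')  # shown

output = stream.getvalue()
print(output, end='')
```

Compared with redirecting stdout, this keeps the decision about what the user sees at the place where each message is emitted.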

OK, with verdi presto and verdi profile setup core.sqlite_dos, the only additional logging that comes from the change in PR #6964 is:

Report: Migrating to the head of the main branch

Which is probably even wrong in the first place... Similar to the spirit of this PR, why is there even any migration-related code being called during creation of a fresh profile, with new databases 🤔


GeigerJ2 · 2025-08-19T12:44:47Z

@superstar54, maybe we can review this together during the coding days? Would you have time?

superstar54 · 2025-08-20T04:43:41Z

> @superstar54, maybe we can review this together during the coding days? Would you have time?

Sure!

superstar54

Hi @GeigerJ2 , thanks for the work. Overall looks good to me. I added a few comments on the tests and logger.

superstar54 · 2025-08-22T09:18:04Z

tests/storage/sqlite_zip/test_migrator.py

+    # Test: force overwrite existing output
+    output_path.write_text('existing content')
+    migrate(input_path, output_path, latest_version, force=True)
+    assert zipfile.is_zipfile(output_path)


Here, you should check if the output_path is new and is different from the previous output_path.

This is actually implicitly tested via is_zipfile. The original file is a text file, hence it would fail the is_zipfile test without the migration. I added an explicit check for this failure above, to make it more clear.
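The reasoning above can be checked in isolation with a short sketch. The file name and contents are made up for the example, and writing the zip by hand stands in for whatever the migration produces.

```python
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as dirpath:
    output_path = Path(dirpath) / 'output.zip'

    # A plain text file is not a valid zip archive, whatever its suffix.
    output_path.write_text('existing content')
    was_zip_before = zipfile.is_zipfile(output_path)

    # Overwriting it with a real archive makes the check pass.
    with zipfile.ZipFile(output_path, 'w') as zf:
        zf.writestr('metadata.json', '{}')
    is_zip_after = zipfile.is_zipfile(output_path)

print(was_zip_before, is_zip_after)
```

So an `is_zipfile` assertion after the forced migration can only pass if the text file was actually replaced.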

superstar54 · 2025-08-22T09:19:56Z

tests/storage/sqlite_zip/test_migrator.py

+        zf.writestr('metadata.json', json.dumps(metadata))
+    assert input_path.exists()
+
+    # Test: different paths, should copy file


Consider adding a test when the paths are identical.

Good catch! This case was actually not handled and instead raised an exception in the migrate function, so I added an early return there, as well as a test.
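A sketch of the general pattern, assuming the failure came from copying a file onto itself. `copy_archive` is a hypothetical helper; the actual guard added to migrate may look different.

```python
import shutil
from pathlib import Path


def copy_archive(inpath: Path, outpath: Path) -> bool:
    """Copy ``inpath`` to ``outpath``; return ``False`` if nothing was copied."""
    if inpath.resolve() == outpath.resolve():
        # Early return: without this guard, shutil.copyfile would raise
        # shutil.SameFileError for identical source and destination.
        return False
    shutil.copyfile(inpath, outpath)
    return True
```

Resolving both paths before comparing also catches the case where the same file is referred to by a relative and an absolute path.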


superstar54 · 2025-08-22T09:43:20Z

src/aiida/storage/sqlite_zip/backend.py


            # The archive exists but ``reset == False``, so we try to migrate to the latest schema version. If the
            # migration works, we replace the original archive with the migrated one.
            with tempfile.TemporaryDirectory() as dirpath:
                filepath_migrated = Path(dirpath) / 'migrated.zip'
-                LOGGER.report(f'Migrating existing {cls.__name__}')
-                migrate(filepath_archive, filepath_migrated, cls.version_head())
+                LOGGER.report(f'Migrating existing {cls.__name__} to {target_version}')


If check_migration_needed returns the current_version:

Suggested change

- LOGGER.report(f'Migrating existing {cls.__name__} to {target_version}')
+ LOGGER.report(f'Migrating existing {cls.__name__} from version {current_version} to {target_version}')

If not:

Suggested change

- LOGGER.report(f'Migrating existing {cls.__name__} to {target_version}')
+ LOGGER.report(f'Migrating existing {cls.__name__} to version {target_version}')

superstar54 · 2025-08-22T09:43:56Z

src/aiida/storage/sqlite_zip/migrator.py

+        raise StorageMigrationError(msg)
+
+    # check if migration is needed
+    return current_version != target_version


You can return the current_version and use it in the log.

Suggested change

- return current_version != target_version
+ return current_version != target_version, current_version

I tried this, but didn't like this approach because it muddied the return of the function, and in most cases, I ended up just throwing the current_version return away (apart from logging). This led to a general refactor now, where I split the check_migration_needed code into two functions, get_current_archive_version and validate_archive_versions, both as staticmethods of the SqliteZipBackend. I feel like this is more readable. Happy to read your thoughts :)

I implemented that, but was not too happy with it, as this approach muddies the return of the function, and in most cases, I was just throwing away the current version return. I actually ended up refactoring the code, and split the logic of the check_migration_needed function into get_current_archive_version and validate_archive_versions, both implemented as staticmethods of the SqliteZipBackend class. Happy to hear your thoughts :)
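The trade-off discussed here can be sketched as follows. This is not the refactored code from the PR: the function names, the `export_version` key, and the list of versions are hypothetical, and plain functions are used instead of staticmethods to keep the example short.

```python
from typing import Sequence

# Hypothetical, ordered list of schema versions (oldest first).
KNOWN_VERSIONS = ('v1', 'v2', 'v3')


def get_current_version(metadata: dict) -> str:
    """Return the version recorded in the metadata (key name is an assumption)."""
    return metadata['export_version']


def validate_versions(current: str, target: str, known: Sequence[str] = KNOWN_VERSIONS) -> None:
    """Raise ``ValueError`` for unknown versions or a requested downgrade."""
    for version in (current, target):
        if version not in known:
            raise ValueError(f'Unknown version: {version}')
    if known.index(current) > known.index(target):
        raise ValueError(f'Cannot migrate backwards from {current} to {target}')


def describe_migration(metadata: dict, target: str) -> str:
    """Show how a caller combines the two single-purpose functions."""
    current = get_current_version(metadata)
    validate_versions(current, target)
    if current == target:
        return f'Already at version {target}, no migration needed'
    return f'Migrating from version {current} to {target}'
```

Each function returns one thing, so callers that only need validation do not have to unpack and discard a tuple, while the caller that logs still has `current` at hand.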

GeigerJ2 · 2025-08-22T09:53:51Z

Thanks a lot for the review, @superstar54! Will address your comments today :)


GeigerJ2 · 2025-08-27T18:31:53Z

@superstar54, OK, took a bit longer, sorry :D I addressed all your points. Could I ask you to have another look? Thanks!

GeigerJ2 and others added 2 commits July 23, 2025 09:34

Add check if migration needed

a757626

[pre-commit.ci] auto fixes from pre-commit.com hooks

6b62cef

for more information, see https://pre-commit.ci

agoscinski reviewed Jul 24, 2025


GeigerJ2 added 4 commits August 12, 2025 09:56

Merge remote-tracking branch 'upstream/main' into sqlite-check-migrat…

19ecf32

…ion-needed

Fix failing test due to report message

5e5349a

Add test_migrator.py

a2e7f01

Clean up tests

b66fbe6

GeigerJ2 marked this pull request as ready for review August 12, 2025 12:51

GeigerJ2 force-pushed the sqlite-check-migration-needed branch from 90f3f38 to b66fbe6 Compare August 13, 2025 07:20

GeigerJ2 and others added 5 commits August 13, 2025 09:31

Try codecov passing now

60d87cc

Merge branch 'main' into sqlite-check-migration-needed

f3ed6cb

final test check

f965513

add test for migration triggered on initialise

d76760b

.

66ac06a

GeigerJ2 requested review from unkcpz and khsrali and removed request for unkcpz and khsrali August 13, 2025 10:49

GeigerJ2 self-assigned this Aug 19, 2025

GeigerJ2 assigned superstar54 and unassigned GeigerJ2 Aug 21, 2025

superstar54 requested changes Aug 22, 2025


GeigerJ2 and others added 4 commits August 22, 2025 11:54

Merge remote-tracking branch 'upstream/main' into sqlite-check-migrat…

0ef8534

…ion-needed

Merge branch 'main' into sqlite-check-migration-needed

4354b2e

Merge branch 'main' into sqlite-check-migration-needed

289548f

Adress review comments and refactor check_migration_needed

2d223c5

GeigerJ2 and others added 2 commits August 27, 2025 13:00

Fix failing tests

2fe8e6d

Merge branch 'main' into sqlite-check-migration-needed

e6e611d
