Option to test/verify backups #290

Open
2 tasks done
strugee opened this issue Jul 26, 2021 · 11 comments · May be fixed by #785

Comments

@strugee

strugee commented Jul 26, 2021


I'd love some way to verify the backups I've already made. Ideally this would be done automatically for me on a regular basis.

Maybe select a backup to verify every night semi-randomly, giving a stronger weight to both recent backups and backups that hadn't been checked in a while?
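One way the weighted nightly pick could look, as a rough sketch only. Nothing here is Seedvault code: the `Backup` record, `verification_weight` and `pick_backup_to_verify` are hypothetical names, and the weighting formula is just one plausible choice that favours recent backups and backups that have gone unchecked for a long time.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Backup:
    # Hypothetical record; Seedvault's real metadata may look different.
    name: str
    created: datetime
    last_verified: Optional[datetime] = None  # None means never verified


def days_between(earlier: datetime, later: datetime) -> float:
    """Elapsed days from `earlier` to `later`, clamped at zero."""
    return max((later - earlier).total_seconds() / 86400.0, 0.0)


def verification_weight(backup: Backup, now: datetime) -> float:
    """Higher for recent backups and for backups not checked in a while."""
    age_days = days_between(backup.created, now)
    recency = 1.0 / (1.0 + age_days)  # 1.0 for a brand-new backup, decays with age
    # A never-verified backup counts as unchecked since it was created.
    last_check = backup.last_verified or backup.created
    unchecked_days = days_between(last_check, now)
    staleness = unchecked_days / (1.0 + unchecked_days)  # grows towards 1.0
    return recency + staleness


def pick_backup_to_verify(
    backups: List[Backup], now: datetime, rng: Optional[random.Random] = None
) -> Backup:
    """Pick one backup at random, in proportion to its weight."""
    if not backups:
        raise ValueError("no backups to choose from")
    rng = rng or random.Random()
    weights = [verification_weight(b, now) for b in backups]
    return rng.choices(backups, weights=weights, k=1)[0]


# Example: last night's backup versus an old one that was checked a week ago.
now = datetime.now()
candidates = [
    Backup("last-night", created=now - timedelta(days=1)),
    Backup("three-months-old", created=now - timedelta(days=90),
           last_verified=now - timedelta(days=7)),
]
print(pick_backup_to_verify(candidates, now).name)
```

Because the pick is random, every backup keeps a non-zero chance of being selected, so nothing is starved of verification forever.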

@chirayudesai

This comment was marked as outdated.

@strugee
Author

strugee commented Jul 27, 2021

I'm not sure, because I don't know exactly what kind of metadata Seedvault retains. Ideally Seedvault would be able to cryptographically verify that the data hasn't been tampered with or corrupted, and it would do a dry-run restore (i.e. everything that a restore would do except actually passing the data to the system APIs to be restored) to check for format issues, like a bug in the original serialization.

Mostly the Nextcloud app is slow and fragile and errors on that side have caused numerous backups I've manually run to abort partway through. So from a user perspective I want to have some assurance that what was uploaded before the failure is usable.

@capshort

This comment was marked as off-topic.

@grote

This comment was marked as off-topic.

grote added this to the 3.x milestone Jan 4, 2023
@gdt

This comment was marked as duplicate.

@gdt

gdt commented Feb 2, 2023

Arguably automatic validation (using the key that is already inside) should be configurable, and perhaps enabled by default. It could run as the first step of a backup: validate the previous backup, do the new backup, then report both statuses.

@grote
Collaborator

grote commented Sep 25, 2023

as the first step in a backup, basically validate previous

As we don't have a server-side component that can perform verification tasks, we would need to download all data for verification. I am not sure we can do that by default before all backups (especially those tasks scheduled by the OS), because it would introduce significant delays and bandwidth usage.

@nettnikl
Contributor

as the first step in a backup, basically validate previous

As we don't have a server-side component that can perform verification tasks, we would need to download all data for verification. I am not sure we can do that by default before all backups (especially those tasks scheduled by the OS), because it would introduce significant delays and bandwidth usage.

Maybe we do not need that? I think in most use cases we can rely on files not being maliciously modified on the target storage. So, the metadata has to be validated. Then we can check whether each file has the expected size, and if it does, it has probably been written to the storage as expected. To validate bit-correctness, one could then run a script on the target storage that checks filename = hashsum(content). Some future storage backends that store hashsums of their content might even support this out of the box.
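The filename = hashsum(content) check could be scripted roughly as follows. This is only a sketch under loud assumptions: it assumes every file on the target storage is named after the lowercase hex SHA-256 of its content, which this thread does not confirm for Seedvault, and `find_mismatches` and `expected_sizes` are made-up names for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(
    root: str, expected_sizes: Optional[Dict[str, int]] = None
) -> List[Tuple[Path, str]]:
    """Return (path, reason) pairs for files that fail the size or hash check.

    ASSUMPTION: file name == hex SHA-256 of the file content.
    `expected_sizes` optionally maps file names to sizes taken from
    already-validated metadata; the cheap size check runs before hashing.
    """
    problems = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if expected_sizes is not None and path.name in expected_sizes:
            if path.stat().st_size != expected_sizes[path.name]:
                problems.append((path, "size"))
                continue
        if sha256_hex(path) != path.name.lower():
            problems.append((path, "hash"))
    return problems
```

Calling `find_mismatches("/path/to/backup/folder")` on the storage host would list every file whose name does not match its content hash. The size check needs no file content to be read, so it stays cheap even when the hash pass is too expensive to run every time.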

@strugee
Author

strugee commented Sep 26, 2023

To validate bit-correctness, one could then run a script to check for filename = hashsum(content) on target storage (on some future storage backends (which store hashsums of stored content) this might even be supported ootb).

Modern Nextcloud versions support checksums automatically.

@nettnikl
Contributor

Modern Nextcloud versions support checksums automatically.

That sounds great! Could you check if the filenames align with the sums your NC calculates?

grote linked a pull request Oct 30, 2024 that will close this issue
@grote
Collaborator

grote commented Oct 30, 2024

Metadata now gets validated before each backup run. #785 will allow you to do manual integrity checks of random samples of data. Let's make a new ticket for scheduling such checks automatically for those who want them.
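For readers wondering what checking a random sample might involve, here is a minimal sketch. It is not taken from #785 and makes no claim about how that pull request works: `sample_for_check`, `blob_sizes` and the size-based budget are all hypothetical. The idea is to download and verify only a random subset of stored blobs, covering roughly a chosen fraction of the total bytes.

```python
import random
from typing import Dict, List, Optional


def sample_for_check(
    blob_sizes: Dict[str, int], fraction: float, rng: Optional[random.Random] = None
) -> List[str]:
    """Pick random blobs until about `fraction` of all stored bytes is covered.

    `blob_sizes` maps a (hypothetical) blob name to its size in bytes.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = rng or random.Random()
    names = list(blob_sizes)
    rng.shuffle(names)  # random order, so each run checks a different subset
    budget = fraction * sum(blob_sizes.values())
    picked, used = [], 0
    for name in names:
        if used >= budget:
            break
        picked.append(name)
        used += blob_sizes[name]
    return picked
```

A small fraction keeps download time and bandwidth low, while repeated runs gradually cover more of the stored data.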

6 participants