User migration via export file #3054
…etween instances (#38) Co-authored-by: Daniel Burgess <[email protected]> Co-authored-by: Hugh Rundle <[email protected]> Co-authored-by: dannymate <[email protected]> Co-authored-by: hughrun <[email protected]> Reviewed-on: https://codeberg.org/GuildAlpha/bookwyrm/pulls/38 Co-authored-by: CSDUMMI <[email protected]> Co-committed-by: CSDUMMI <[email protected]>
* cleans up some test logging
* cleans up some commented-out code
* adds export_job model tests
* reconsiders some tests in export user view tests
Complete Migrations of Bookwyrm Accounts across instances Merging this into `user-migration` branch to enable final work on this within the main Bookwyrm repository. We will pull in the final PR from there into `main` when ready. Thanks to @CSDUMMI and the crew for this huge job.
* fix Safari not downloading with the correct filename
* add FAILED status
* don't provide download link for stopped jobs
add USER_EXPORT_COOLDOWN_HOURS setting for controlling user exports and imports
- makes user_import_time_limit a site setting rather than a value in settings.py (note this applies to exports as well as imports)
- admins can change user_import_time_limit from UI
- admins can cancel stuck user imports
- disabling new imports also disables user imports
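As a rough illustration of what a cooldown setting such as `USER_EXPORT_COOLDOWN_HOURS` implies, here is a minimal sketch. The helper name `can_export`, the value 48, and the idea of comparing against the time of the previous export are assumptions made for the example; the commits above do not show how BookWyrm actually enforces the limit.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Placeholder value for illustration only; the real default is not stated here.
USER_EXPORT_COOLDOWN_HOURS = 48


def can_export(
    last_export: Optional[datetime],
    now: datetime,
    cooldown_hours: int = USER_EXPORT_COOLDOWN_HOURS,
) -> bool:
    """Hypothetical check: allow an export once the cooldown has elapsed."""
    if last_export is None:
        # No previous export, so nothing to wait for.
        return True
    return now >= last_export + timedelta(hours=cooldown_hours)


previous = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(can_export(previous, previous + timedelta(hours=47)))
print(can_export(previous, previous + timedelta(hours=48)))
```

Passing `now` explicitly keeps the check easy to test without patching the clock.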
complete most outstanding user migrate tasks
formatting and linting fixes
fix tests and linting
once more into the linting breach!
oops import Any
Hi!
I took a look at the mypy/pylint errors you mentioned, please see below.
Thanks @dato those fixes do indeed seem to resolve most of the problems. And thanks @mouse-reeve for picking up my over-eager linting "fix" with the tuple. I've sorted these out on my local,
(edit - I think it was just a borked environment, I tested again after a couple of tweaks and all good).
I think the test was failing because it was extremely brittle, not because of anything wrong with the code itself.
I can't get ANY of the tests to run locally. It looks like migration 0187 is referencing That doesn't fix the failing migration test, but I think that's because that test is really brittle -- it looks like it's failing because it pairs the database schema pegged to a specific migration with the current code, which is in a different state. I propose deleting it.
On reflection, I wonder if you have a local copy of migration 0186 that was allowing your tests to pass? At any rate, if the commit I just made passes, I think that's good? If not (or if you don't think we should delete that test) I'm happy to nix it.
Looks like that worked! I think I messed up a merge somewhere, but also this PR has taken so long with so many different merges it's a bit trickier to keep all the migrations aligned. I would recommend testing a
merge migrations and lint
notification type migration after merge
Adds a link in the text of the notification, and fixes references to notification type in the model
I think this wording is a little clearer
Hello! (I apologize that I didn't run these tests much earlier.) I tested this on my development instance, so I exported data for my main user (id=2), then created an empty user (id=90) and imported into that account. While books in shelves were imported correctly:
something weird (bad?) happened with statuses: it would seem they simply got reassigned to user id 90 (in the web UI, indeed the origin user has no statuses in their Activity tab after the import):
Importing into the export instance is arguably a corner case, albeit one I think people might need to make use of for username changes. I haven't had a chance to look into the code yet, though I suspect the fix might be easier than this write-up, heh. Edited to add: I was wondering if this was minor enough that the PR could be merged anyway, but I just realized this means a malicious user could (perhaps?) craft an archive.json that deletes another user’s data.
@dato thanks for being so thorough.
I hadn't expected this, but it makes sense. In the original way I wrote this it wouldn't have happened, but making export files as close to ActivityPub objects has a lot of advantages so we use The issue here is in `instance = parsed.to_model(model=cls, save=True, overwrite=True)` The only way I could get this to work is to use
I admit this is something I hadn't thought of, and you're right, it would be possible for a malicious user - though you would
I took a closer look because of this comment, and there is a second issue because the new statuses preserve the
(¹) It fails because those URLs don't exist any more. But if this happens in cross-instance imports (I'm not sure that it does), it could be considered worse since those permalinks should stay in-instance. Edited to add: confirmed it happens when importing into a separate instance:
dang. I missed that, I think I assumed it would create a new remote_id but logically, why would it? I think I see what needs to happen here, will take a look tonight my time (and also remove a few rogue
I’m so glad you caught that, it does make sense that the old remote id remains. Every model should have a function to generate the remote id which we can re-run and save: bookwyrm/bookwyrm/models/base_model.py, line 39 in 198c003
I think I've got a working solution for this. This solution will mean the import still works for everything else, but statuses won't be created if the source account hasn't been set as I'm not sure of the best way to present any errors to the user though. If it fails to import statuses that will probably be because they're a legit user who forgot to do things in the correct order, or maybe doesn't have access to the old account any more, so we need to provide clear info about why statuses haven't imported rather than just silently not importing them. Any suggestions @mouse-reeve ?
Great! I have two questions if that's okay:
Thanks!!
However, that creates a new problem in that we don't always want to do that when importing (e.g. if for some reason a user imports the same file twice). So I'll have to add some custom logic there.
- remote_id is now updated on import of statuses
- statuses cannot be imported unless source has target listed in alsoKnownAs or movedTo
- add alert boxes to import and export screens advising of the above
- update tests accordingly
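The alias rule in this commit can be sketched as a small predicate. This is only an illustration of the stated condition, assuming the source account is available as an ActivityPub-style actor dictionary with `alsoKnownAs` (a list) and `movedTo` (a single id); the function name `may_import_statuses` and the example URLs are hypothetical and are not taken from the BookWyrm code.

```python
from typing import Any, Dict


def may_import_statuses(source_actor: Dict[str, Any], target_remote_id: str) -> bool:
    """Hypothetical check: the source account must point at the target account."""
    also_known_as = source_actor.get("alsoKnownAs") or []
    if isinstance(also_known_as, str):
        # Tolerate a single alias given as a bare string.
        also_known_as = [also_known_as]
    moved_to = source_actor.get("movedTo")
    return target_remote_id in also_known_as or moved_to == target_remote_id


target = "https://new.example/user/reader"
print(may_import_statuses({"alsoKnownAs": [target]}, target))
print(may_import_statuses({"movedTo": "https://other.example/user/x"}, target))
```

When the predicate is false, the rest of the import can still proceed; only statuses are skipped, which is why the commit also adds alert boxes explaining the requirement.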
fix upsert_statuses
- does the latest commit resolve the bug @dato identified?
- does it introduce any new problems?
- are the messages about setting an alias sufficient?
I believe the commit is good, yes!
One tiny comment below, but absolutely not a blocker, is a corner case (again).
Thanks!
👏🏼👏🏼👏🏼
This PR aims to implement #1012. It introduces new export and import functionality that allows users to export their entire account and all associated media in a self-contained tar file.
Resolves #1012
Resolves #2666
Archive structure
The tar.gz archive contains three parts:
- `archive.json`: Containing the user settings, names and books used by the user.
- `avatar.png`/`avatar.jpg`: A user's avatar image if present
- `covers/`: the book covers used by the user

Included
Not included
Move activity for user migration #2970)

Also note that for reasons of expediency the JSON export is a custom structure and not in the form of ActivityPub JSON-LD.
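To make the archive layout concrete, here is a minimal sketch that reads such a tar.gz with Python's standard `tarfile` module and sorts its members into the three parts described above. The helper names `summarize_export` and `build_sample_export`, and the placeholder contents of `archive.json`, are assumptions for the demo; the real key structure of `archive.json` is not shown in this description.

```python
import io
import json
import tarfile


def summarize_export(fileobj):
    """Return the parsed archive.json plus the avatar and cover member names."""
    summary = {"data": None, "avatar": None, "covers": []}
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            if member.name == "archive.json":
                summary["data"] = json.load(tar.extractfile(member))
            elif member.name in ("avatar.png", "avatar.jpg"):
                summary["avatar"] = member.name
            elif member.name.startswith("covers/"):
                summary["covers"].append(member.name)
    return summary


def build_sample_export():
    """Build a tiny in-memory archive with placeholder contents for the demo."""
    buffer = io.BytesIO()
    files = {
        "archive.json": json.dumps({"placeholder": True}).encode(),
        "avatar.png": b"not a real image",
        "covers/example.jpg": b"not a real image",
    }
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


print(summarize_export(build_sample_export()))
```

The sketch only lists member names rather than extracting files to disk, which sidesteps path-handling concerns when inspecting an untrusted upload.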
Implementation
This project was mostly implemented in collaboration during the first Guild Alpha Sprint.
We decided to use a custom JSON archive as the basis of this proposal, as @hughrun had already done work on that, which we used as a base.
Contributors:
Special thanks to @Ryuno-Ki and @circlebuilder
Guild Alpha added:
Due to time constraints we couldn't implement everything needed. Subsequently @hughrun has added:
Related work
This was originally offered as #2980 - it was pulled into a new branch so the work could be completed
This is a sibling PR with #2970
Remaining work - help wanted
`mypy` doesn't like `bookwyrm/utils/tar.py` and I'm not sure what to do or whether I can just tell mypy to ignore types for now. There's also a related failing test, `test_bookwyrm_import_job`, which indicates I probably need to refactor a test there, but I can't figure out what I should be doing.
Other work - possibly in scope
Move activity for user migration #2970 as we can probably consolidate some menus and definitely will need some clear instructions. Possibly we could pull them both into `main` but check for potential streamlining before releasing as production-ready.
Future work - not in scope