Synchronisation Background Task
Note: This is part one of a two-part description of the approach for a background synchronisation task. Part two can be found here
To resolve possible sync inconsistencies among clients, a background task synchronises the folder across all clients at regular intervals (currently every 10 minutes).
- Elect a master client
- The elected master client fetches the object stores from all of the user's other clients
- Merge object stores
- Fetch missing files
- Send merged object store to all other clients
- All other clients check their object stores for differences and fetch missing files from the client which sent the merged object store (since only that client is guaranteed to have all files present)
- Restart event aggregation
- Why do we need a master client? A master client is currently needed to coordinate the temporary halt of event aggregation. If each client simply synchronised its object store with the other clients independently, the clients could fail to reach a consistent state in certain circumstances, e.g. when another client modifies or deletes a file during the sync. Since the syncing client has paused its event aggregation, it would not receive any notification about that change.
The first step of reconciling the synchronised folder consists of selecting a master client.
Each client that has reached the end of its background task interval starts a BackgroundSyncer.
Within this thread, all PeerIds are fetched and the highest ID is tentatively selected as master. If the peer that started the BackgroundSyncer detects that it owns the highest PeerId, no further requests are sent to other clients.
Otherwise, all other clients are asked for their opinion on the master election. A client may deny such a request if and only if it is still working on another background task; otherwise it must accept.
After all clients have responded, a master is selected only if every client has answered the corresponding MasterElectionRequest positively. If any client has denied the request, the master election concludes that another master is still working on a background task, and the current interval step is skipped.
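The election rule described above can be sketched as follows. This is a minimal illustration, not the project's implementation: peer IDs are modelled as plain integers, `elect_master`, `ElectionResult` and the `ask_peer` callback (standing in for sending a MasterElectionRequest and reading the answer) are hypothetical names, and the reading that the peer with the highest ID becomes master once all others accept is an assumption drawn from the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ElectionResult:
    master_id: Optional[int]  # None when no master could be selected
    skipped: bool             # True when this interval step must be skipped


def elect_master(
    own_id: int,
    peer_ids: List[int],
    ask_peer: Callable[[int, int], bool],
) -> ElectionResult:
    """Tentatively pick the highest peer ID and confirm it with the others.

    ask_peer(peer, candidate) is a hypothetical stand-in for a
    MasterElectionRequest; it returns False only if that peer is still
    busy with another background task.
    """
    candidate = max(peer_ids)

    # The initiating peer owns the highest ID: no requests are sent.
    if candidate == own_id:
        return ElectionResult(master_id=own_id, skipped=False)

    # Otherwise every other client is asked and must answer.
    others = [peer for peer in peer_ids if peer != own_id]
    answers = [ask_peer(peer, candidate) for peer in others]

    if all(answers):
        return ElectionResult(master_id=candidate, skipped=False)

    # At least one denial: another master is still working.
    return ElectionResult(master_id=None, skipped=True)
```

A single denial is enough to skip the interval, which keeps two background tasks from overlapping.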
Once a client is elected master, the BackgroundSyncer continues by starting an InitSyncExchangeHandler.
This causes all clients (including the master) to completely stop their event aggregation. From this point on, no client is notified about changes on the filesystem (see FAQ: Why do we need a master client?).
The master client will then start to fetch the object stores from all other clients.
Once all other object stores are retrieved, the master client merges them one-by-one.
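The one-by-one merge can be pictured as a fold over the retrieved object stores. The sketch below is only an illustration under stated assumptions: the text does not specify the merge rule, so an object store is modelled here as a mapping from path to a hypothetical `Entry` (version counter plus deleted flag), and "the higher version wins" is an assumed rule rather than the project's actual one.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Dict, Iterable


@dataclass(frozen=True)
class Entry:
    version: int           # assumed monotonically increasing per path
    deleted: bool = False  # tombstone for removed files


ObjectStore = Dict[str, Entry]


def merge_two(left: ObjectStore, right: ObjectStore) -> ObjectStore:
    """Merge two stores; for each path the higher version wins (assumption)."""
    merged = dict(left)  # copy so the inputs stay untouched
    for path, entry in right.items():
        current = merged.get(path)
        if current is None or entry.version > current.version:
            merged[path] = entry
    return merged


def merge_all(own: ObjectStore, others: Iterable[ObjectStore]) -> ObjectStore:
    """Fold the other clients' stores into the master's own store one by one."""
    return reduce(merge_two, others, own)
```

For example, `merge_all(own, [store_a, store_b])` first merges `store_a` into `own` and then merges `store_b` into that intermediate result.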
After merging the object stores, the master compares its own object store with the merged one, deletes all files it should have deleted, and fetches all files it should have downloaded before. If a missing file cannot be fetched from a client (e.g. because that client deleted the file in the synchronised folder while synchronising), the file is marked as deleted and removed from the master's disk.
Side note: If event aggregation were still running here, such events would be propagated to other clients.
Side note: Since the object store is not updated during the reconciliation step, most data loss is prevented because only files that are in the object store are compared. Data loss is still possible if a file is modified while a different version of it exists on another client, since the file is then fetched from the other client and its contents are overwritten (see https://github.com/p2p-sync/sync/issues/4).
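The master's clean-up step described above can be sketched as a comparison followed by delete and fetch actions. This is a hedged illustration only: object stores are modelled as `path -> (version, deleted)` mappings, the disk as an in-memory dictionary, and `plan_cleanup`, `apply_cleanup` and the `fetch` callback (returning `None` when no client can deliver the file) are hypothetical names, not the project's API.

```python
from typing import Callable, Dict, Optional, Set, Tuple

# path -> (version, deleted); an assumed simplification of an object store
ObjectStore = Dict[str, Tuple[int, bool]]


def plan_cleanup(own: ObjectStore, merged: ObjectStore) -> Tuple[Set[str], Set[str]]:
    """Return (paths to delete locally, paths to fetch from other clients)."""
    to_delete: Set[str] = set()
    to_fetch: Set[str] = set()
    for path, (version, deleted) in merged.items():
        mine = own.get(path)
        if deleted:
            # Deleted elsewhere but still present here.
            if mine is not None and not mine[1]:
                to_delete.add(path)
        elif mine is None or mine[1] or mine[0] != version:
            # Missing here, deleted here, or held in a different version.
            to_fetch.add(path)
    return to_delete, to_fetch


def apply_cleanup(
    own: ObjectStore,
    merged: ObjectStore,
    fetch: Callable[[str], Optional[bytes]],
    disk: Dict[str, bytes],
) -> ObjectStore:
    """Apply the plan to the (in-memory) disk and return the resulting store."""
    to_delete, to_fetch = plan_cleanup(own, merged)
    result = dict(merged)
    for path in to_delete:
        disk.pop(path, None)
    for path in sorted(to_fetch):
        content = fetch(path)
        if content is None:
            # Fetch failed: mark the file as deleted and remove it from disk.
            result[path] = (merged[path][0], True)
            disk.pop(path, None)
        else:
            disk[path] = content
    return result
```

Note that a fetched file simply overwrites the local copy in this sketch, which is exactly the overwrite behaviour the data-loss side note warns about.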
After cleaning up its own synchronised folder, the master sends the merged object store to all clients.
All other clients start to clean up their synchronised folder after receiving the master's result. Since only the master is guaranteed to have all files, each client requests missing files from the master.
Side note: The same issue about data loss applies here too: https://github.com/p2p-sync/sync/issues/4
To avoid losing modifications made during a sync, the cleaned-up object store is then compared with a newly indexed one. Any changes detected between the cleaned-up object store and the disk (i.e. the newly indexed object store) are then propagated to the other clients.
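The comparison between the cleaned-up object store and a freshly indexed one amounts to a diff. In the minimal sketch below, both stores are modelled as `path -> content hash` mappings, and the function name `diff_stores` and the change labels are hypothetical; the real object store format may differ.

```python
from typing import Dict, List, Tuple


def diff_stores(cleaned: Dict[str, str], indexed: Dict[str, str]) -> List[Tuple[str, str]]:
    """List the changes that happened on disk while event aggregation was paused.

    cleaned: path -> content hash after the reconciliation step
    indexed: path -> content hash from re-indexing the folder on disk
    """
    changes: List[Tuple[str, str]] = []
    for path in sorted(cleaned.keys() | indexed.keys()):
        before = cleaned.get(path)
        after = indexed.get(path)
        if before is None:
            changes.append(("create", path))   # appeared during the sync
        elif after is None:
            changes.append(("delete", path))   # removed during the sync
        elif before != after:
            changes.append(("modify", path))   # content changed during the sync
    return changes


cleaned = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
indexed = {"a.txt": "h1", "b.txt": "h9", "d.txt": "h4"}
for kind, path in diff_stores(cleaned, indexed):
    print(kind, path)
```

Each resulting change would then be propagated to the other clients in the same way a regular filesystem event is.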
- Commons
- Persistence Layer
- Versioning Layer
- Event Aggregation Layer
- Network Layer
- Core (this repository)
- End-User Client