This commit is contained in:
Joey Hess 2012-08-22 15:45:20 -04:00
parent 6873ca0c1b
commit e43feeb5b4


@@ -3,15 +3,42 @@ all the other git clones, at both the git level and the key/value level.
## immediate action items
* Sync with all available remotes on startup.
* TransferScanner should avoid unnecessary scanning of remotes.
This is particularly important for scans queued by the NetWatcher,
which can be polling, or could be after a momentary blip in network
connectivity. The TransferScanner could check the remote's git-annex
branch; if it is not ahead of the local git-annex branch, then
there's nothing to transfer. **Except** if the tree was not already
up-to-date before the loss of connectivity. So doing this needs
tracking of when the tree is not yet fully up-to-date.
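The branch check described above could be sketched as follows. This is only an illustrative shell sketch using `git merge-base --is-ancestor`, not git-annex's actual implementation; the throwaway repository setup exists purely to make the example self-contained.

```shell
#!/bin/sh
# Sketch only: illustrates the proposed shortcut, not git-annex's code.
set -e

# Set up a throwaway repo with a local git-annex branch and a remote
# tracking ref pointing at the same commit.
dir=$(mktemp -d); cd "$dir"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m init
git branch git-annex
git update-ref refs/remotes/origin/git-annex refs/heads/git-annex

# The proposed shortcut: if the remote's git-annex branch is an
# ancestor of (i.e. not ahead of) the local one, there is nothing
# new to transfer, so the scan can be skipped.
if git merge-base --is-ancestor refs/remotes/origin/git-annex git-annex
then
    result="scan skipped"
else
    result="scan queued"
fi
echo "$result"
```

Note that, per the **Except** caveat above, even a "scan skipped" result would only be safe if the tree was already fully up-to-date before connectivity was lost.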
* Optimisations in 5c3e14649ee7c404f86a1b82b648d896762cbbc2 temporarily
broke content syncing in some situations; that syncing needs to be added back.
Now syncing a disconnected remote only starts a transfer scan if the
remote's git-annex branch has diverged, which indicates it probably has
new files. But that leaves open the cases where the local repo has
new files; and where the two repos git branches are in sync, but the
content transfers are lagging behind; and where the transfer scan has
never been run.
Need to track locally whether we're believed to be in sync with a remote.
This includes:
* All local content has been transferred to it successfully.
* The remote has been scanned once for data to transfer from it, and all
transfers initiated by that scan succeeded.
Note the complication that, if the remote has itself initiated a transfer,
our queued transfer will be thrown out as unnecessary. But if the remote's
transfer then fails, that needs to be noticed.
If we're going to track failed transfers, we could just set a flag,
and use that flag later to initiate a new transfer scan. We need a flag
in any case, to ensure that a transfer scan is run for each new remote.
The flag could be `.git/annex/transfer/scanned/uuid`.
But, if failed transfers are tracked, we could also record them, in
order to retry them later, without the scan. I'm thinking about a
directory like `.git/annex/transfer/failed/{upload,download}/uuid/`,
which failed transfer log files could be moved to.
Note that a remote may lose content it had before, so when requeuing
a failed download, we should check the location log to see if the remote
still has the content, and if not, queue a download from elsewhere. (And a
remote may receive content we were uploading from elsewhere, so check the
location log when requeuing a failed upload too.)
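As a rough illustration of the proposed flag file and failed-transfer directories: the layout below follows the paths suggested above, but the demo uuid, log file name, and control flow are all hypothetical, not git-annex's actual code.

```shell
#!/bin/sh
# Sketch only: hypothetical layout under .git/annex, demo uuid.
set -e
uuid=0000-demo-uuid
dir=$(mktemp -d); cd "$dir"   # stand-in for .git/annex

# Per-remote flag: has a transfer scan ever been completed for this remote?
mkdir -p transfer/scanned
scanned=transfer/scanned/$uuid

# Per-remote directories that failed transfer log files could be moved
# to, so they can be retried later without a full scan.
mkdir -p "transfer/failed/upload/$uuid" "transfer/failed/download/$uuid"

if [ ! -e "$scanned" ]
then
    echo "no scan flag for $uuid; queueing full transfer scan"
    # ... run the scan; only on success, set the flag:
    touch "$scanned"
fi

# Simulate recording a failed download, then counting what is left
# to retry on the next opportunity.
touch "transfer/failed/download/$uuid/SHA256-demo.log"
retries=$(ls "transfer/failed/download/$uuid" | wc -l)
echo "failed downloads to retry: $retries"
```

Keeping the flag separate from the failed-transfer logs matches the text above: the flag answers "has a scan ever run for this remote", while the per-remote directories carry the individual transfers still needing a retry.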
* Ensure that when a remote receives content, and updates its location log,
it syncs that update back out. Prerequisite for:
* After git sync, identify new content that we don't have that is now available
@@ -49,6 +76,10 @@ all the other git clones, at both the git level and the key/value level.
that need to be done to sync with a remote. Currently it walks the git
working copy and checks each file.
## misc todo
* --debug often shows unnecessary work being done. Optimise.
## data syncing
There are two parts to data syncing. First, map the network and second,
@@ -163,8 +194,5 @@ redone to check it.
finishes. **done**
* Test MountWatcher on KDE, and add whatever dbus events KDE emits when
drives are mounted. **done**
* Possibly periodically, or when the network connection
changes, or some heuristic suggests that a remote was disconnected from
us for a while, queue remotes for processing by the TransferScanner.
**done**; both network-manager and wicd connection events are supported,
and it falls back to polling every 30 minutes when neither is available.
* It would be nice if, when a USB drive is connected,
syncing starts automatically. Use dbus on Linux? **done**