update

2012-08-22 15:45:20 -04:00 · e43feeb5b4
commit e43feeb5b4
parent 6873ca0c1b
1 changed file with 42 additions and 14 deletions
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -3,15 +3,42 @@ all the other git clones, at both the git level and the key/value level.
 ## immediate action items
-* Sync with all available remotes on startup.
-* TransferScanner should avoid unnecessary scanning of remotes.
-  This is particularly important for scans queued by the NetWatcher,
-  which can be polling, or could be after a momentary blip in network
-  connectivity. The TransferScanner could check the remote's git-annex
-  branch; if it is not ahead of the local git-annex branch, then
-  there's nothing to transfer. **Except** if the tree was not already
-  up-to-date before the loss of connectivity. So doing this needs
-  tracking of when the tree is not yet fully up-to-date.
+* Optimisations in 5c3e14649ee7c404f86a1b82b648d896762cbbc2 temporarily
+  broke content syncing in some situations, which need to be added back.
+
+  Now syncing a disconnected remote only starts a transfer scan if the
+  remote's git-annex branch has diverged, which indicates it probably has
+  new files. But that leaves open the cases where the local repo has
+  new files; and where the two repos' git branches are in sync, but the
+  content transfers are lagging behind; and where the transfer scan has
+  never been run.
  Need to track locally whether we're believed to be in sync with a remote.
  This includes:
  * All local content has been transferred to it successfully.
  * The remote has been scanned once for data to transfer from it, and all
    transfers initiated by that scan succeeded.
  Note the complication that, if it's initiated a transfer, our queued
  transfer will be thrown out as unnecessary. But if its transfer then
  fails, that needs to be noticed.
  If we're going to track failed transfers, we could just set a flag,
  and use that flag later to initiate a new transfer scan. We need a flag
  in any case, to ensure that a transfer scan is run for each new remote.
  The flag could be `.git/annex/transfer/scanned/uuid`.
  But, if failed transfers are tracked, we could also record them, in 
  order to retry them later, without the scan. I'm thinking about a
  directory like `.git/annex/transfer/failed/{upload,download}/uuid/`,
  which failed transfer log files could be moved to.
  Note that a remote may lose content it had before, so when requeuing
  a failed download, should check the location log to see if it still has
  the content, and if not, queue a download from elsewhere. (And, a remote
  may get content we were uploading from elsewhere, so check the location
  log when queuing a failed Upload too.)
 * Ensure that when a remote receives content, and updates its location log,
  it syncs that update back out. Prerequisite for:
 * After git sync, identify new content that we don't have that is now available
@@ -49,6 +76,10 @@ all the other git clones, at both the git level and the key/value level.
  that need to be done to sync with a remote. Currently it walks the git
  working copy and checks each file.
 ## misc todo
 * --debug will show often unnecessary work being done. Optimise.
 ## data syncing
 There are two parts to data syncing. First, map the network and second,
@@ -163,8 +194,5 @@ redone to check it.
  finishes. **done**
 * Test MountWatcher on KDE, and add whatever dbus events KDE emits when
  drives are mounted. **done**
-* Possibly periodically, or when the network connection
+* It would be nice if, when a USB drive is connected, 
-  changes, or some heuristic suggests that a remote was disconnected from
+  syncing starts automatically. Use dbus on Linux? **done**
  us for a while, queue remotes for processing by the TransferScanner. 
  **done**; both network-manager and wicd connection events are supported,
  and it falls back to polling every 30 minutes when neither is available.
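
The first hunk proposes a per-remote "scanned" flag, a directory of failed transfer logs, and a location-log check before a failed transfer is requeued. A minimal sketch of that bookkeeping follows, in Python for brevity. It is not git-annex's implementation. Every function name is hypothetical, the `locations` dict (key to set of remote uuids) is only a stand-in for the real location log, and the paths simply follow the ones floated in the notes.

```python
import shutil
from pathlib import Path


def scanned_flag(git_dir, uuid):
    # Path suggested in the notes: .git/annex/transfer/scanned/uuid
    return Path(git_dir) / "annex" / "transfer" / "scanned" / uuid


def needs_scan(git_dir, uuid):
    # A remote with no flag has never had a full transfer scan.
    return not scanned_flag(git_dir, uuid).exists()


def mark_scanned(git_dir, uuid):
    flag = scanned_flag(git_dir, uuid)
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.touch()


def record_failure(git_dir, direction, uuid, log_file):
    # Move a transfer log file into
    # .git/annex/transfer/failed/{upload,download}/uuid/
    # so the transfer can be retried later without a scan.
    dest = Path(git_dir) / "annex" / "transfer" / "failed" / direction / uuid
    dest.mkdir(parents=True, exist_ok=True)
    log_file = Path(log_file)
    return Path(shutil.move(str(log_file), str(dest / log_file.name)))


def plan_retry(direction, uuid, key, locations):
    # Decide what to requeue for a failed transfer of `key` with remote `uuid`.
    # Returns (direction, remote_uuid), or None when nothing should be queued.
    holders = locations.get(key, set())
    if direction == "download":
        if uuid in holders:
            return ("download", uuid)
        # The remote lost the content: try any other remote that has it.
        others = sorted(holders)
        return ("download", others[0]) if others else None
    # Upload: skip it if the remote already got the content from elsewhere.
    if uuid in holders:
        return None
    return ("upload", uuid)
```

Under these assumptions, a failed download from a remote that no longer holds the key is redirected to another holder, and a failed upload is dropped once the location log shows the remote already has the content.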