diff --git a/Backend/SHA.hs b/Backend/SHA.hs
index 7abbf8035a..95ce4a7701 100644
--- a/Backend/SHA.hs
+++ b/Backend/SHA.hs
@@ -97,16 +97,17 @@ keyValueE :: SHASize -> KeySource -> Annex (Maybe Key)
 keyValueE size source = keyValue size source >>= maybe (return Nothing) addE
 	where
 		addE k = return $ Just $ k
-			{ keyName = keyName k ++ extension
+			{ keyName = keyName k ++ selectExtension (keyFilename source)
 			, keyBackendName = shaNameE size
 			}
-		naiveextension = takeExtension $ keyFilename source
-		extension
-			-- long or newline containing extensions are
-			-- probably not really an extension
-			| length naiveextension > 6 ||
-				'\n' `elem` naiveextension = ""
-			| otherwise = naiveextension
+
+selectExtension :: FilePath -> String
+selectExtension = join "." . reverse . take 2 . takeWhile shortenough .
+	reverse . split "." . takeExtensions
+	where
+		shortenough e
+			| '\n' `elem` e = False -- newline in extension?!
+			| otherwise = length e <= 4 -- long enough for "jpeg"
 
 {- A key's checksum is checked during fsck. -}
 checkKeyChecksum :: SHASize -> Key -> FilePath -> Annex Bool
diff --git a/debian/changelog b/debian/changelog
index 1c44f59526..5eaf9d52eb 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -9,6 +9,8 @@ git-annex (3.20120630) UNRELEASED; urgency=low
     but avoids portability problems.
   * Use SHA library for files less than 50 kb in size, at which point it's
     faster than forking the more optimised external program.
+  * SHAnE backends are now smarter about composite extensions, such as
+    .tar.gz Closes: #680450
 
  -- Joey Hess Sun, 01 Jul 2012 15:04:37 -0400
 
diff --git a/doc/design/assistant/blog/day_25__transfer_queueing.mdwn b/doc/design/assistant/blog/day_25__transfer_queueing.mdwn
new file mode 100644
index 0000000000..35922c0d11
--- /dev/null
+++ b/doc/design/assistant/blog/day_25__transfer_queueing.mdwn
@@ -0,0 +1,41 @@
+So as not to bury the lead, I've been hard at work on my first day in
+Nicaragua, and ** the git-annex assistant fully syncs files (including
+their contents) between remotes now !! **
+
+Details follow..
+
+Made the committer thread queue Upload Transfers when new files
+are added to the annex. Currently it tries to transfer the new content
+to *every* remote; this innefficiency needs to be addressed later.
+
+Made the watcher thread queue Download Transfers when new symlinks
+appear that point to content we don't have. Typically, that will happen
+after an automatic merge from a remote. This needs to be improved as it
+currently adds Transfers from every remote, not just those that have the
+content.
+
+This was the second place that needed an ordered list of remotes
+to talk to. So I cached such a list in the DaemonStatus state info.
+This will also be handy later on, when the webapp is used to add new
+remotes, so the assistant can know about them immediately.
+
+Added YAT (Yet Another Thread), number 15 or so, the transferrer thread
+that waits for transfers to be queued and runs them. Currently a naive
+implementation, it runs one transfer at a time, and does not do anything
+to recover when a transfer fails.
+
+Actually transferring content requires YAT, so that the transfer
+action can run in a copy of the Annex monad, without blocking
+all the assistant's other threads from entering that monad while a transfer
+is running. This is also necessary to allow multiple concurrent transfers
+to run in the future.
+
+This is a very tricky peice of code, because that thread will modify the
+git-annex branch, and its parent thread has to invalidate its cache in
+order to see any changes the child thread made. Hopefully that's the extent
+of the complication of doing this. The only reason this was possible at all
+is that git-annex already support multiple concurrent processes running
+and all making independant changes to the git-annex branch, etc.
+
+After all my groundwork this week, file content transferring is now
+fully working!
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 76e2e18322..343a0e4aaa 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -21,8 +21,11 @@ all the other git clones, at both the git level and the key/value level.
   Watcher. **done**
 * Write basic Transfer handling thread. Multiple such threads need to be able
   to be run at once. Each will need its own independant copy of the
-  Annex state monad.
+  Annex state monad. **done**
 * Write transfer control thread, which decides when to launch transfers.
+  **done**
+* Check that download transfer triggering code works (when a symlink appears
+  and the remote does *not* upload to us.
 * At startup, and possibly periodically, look for files we have that
   location tracking indicates remotes do not, and enqueue Uploads for
   them. Also, enqueue Downloads for any files we're missing.
@@ -86,35 +89,6 @@ reachable remote.
 This is worth doing first, since it's the simplest way to get the basic
 functionality of the assistant to work. And we'll need this anyway.
 
-### transfer tracking
-
-Transfer threads started/stopped as necessary to move data.
-(May sometimes want multiple threads downloading, or uploading, or even both.)
-
-	startTransfer :: TransferQueue -> Transfer -> Annex ()
-	startTransfer q transfer = error "TODO"
-
-	stopTransfer :: TransferQueue -> TransferID -> Annex ()
-	stopTransfer q transfer = error "TODO"
-
-The assistant needs to find out when `git-annex-shell` is receiving or
-sending (triggered by another remote), so it can add data for those too.
-This is important to avoid uploading content to a remote that is already
-downloading it from us, or vice versa, as well as to in future let the web
-app manage transfers as user desires.
-
-For files being received, it can see the temp file, but other than lsof
-there's no good way to find the pid (and I'd rather not kill blindly).
-
-For files being sent, there's no filesystem indication. So git-annex-shell
-(and other git-annex transfer processes) should write a status file to disk.
-
-Can use file locking on these status files to claim upload/download rights,
-which will avoid races.
-
-This status file can also be updated periodically to show amount of transfer
-complete (necessary for tracking uploads).
-
 ## other considerations
 
 This assumes the network is connected. It's often not, so the
diff --git a/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment b/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment
new file mode 100644
index 0000000000..3c2f5addba
--- /dev/null
+++ b/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment
@@ -0,0 +1,7 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ subject="comment 1"
+ date="2012-07-05T17:04:34Z"
+ content="""
+I haven't tried it either, but I think it should work ok, as long as you bear in mind that to git-annex, each submodule will be treated as a separate git repository.
+"""]]
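
Review note on the `Backend/SHA.hs` hunk: the rule that `selectExtension` introduces (keep at most the last two extension components, each at most 4 characters long and free of newlines) can be tried out in isolation with the rough sketch below. It is not the committed code. `selectExtensionSketch` and `splitOn` are hypothetical names, `splitOn` stands in for the `split` helper the patch uses, and the explicit leading dot is an assumption about the intended result (read by hand, the composition in the patch looks like it may drop that dot when two components are kept, which is worth checking).

```haskell
import Data.List (intercalate)
import System.FilePath (takeExtensions)

-- Hypothetical helper: split a string on a single delimiter character.
-- It stands in for the split function used in the patch.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOn d rest

-- Sketch of the extension rule: walk the extension components from the
-- right, stop at the first one that is too long or contains a newline,
-- and keep at most two. The leading dot is added explicitly (assumption).
selectExtensionSketch :: FilePath -> String
selectExtensionSketch f
    | null kept = ""
    | otherwise = intercalate "." ("" : kept)
  where
    kept = reverse . take 2 . takeWhile shortEnough . reverse
         . filter (not . null) . splitOn '.' $ takeExtensions f
    -- 4 characters is long enough for "jpeg", as in the patch
    shortEnough e = '\n' `notElem` e && length e <= 4

main :: IO ()
main = mapM_ report ["photo.jpeg", "backup.tar.gz", "notes.description", "README"]
  where
    report f = putStrLn (f ++ " -> " ++ show (selectExtensionSketch f))
```

Under these assumptions `backup.tar.gz` keeps `.tar.gz`, while `notes.description` keeps nothing because its only component is longer than 4 characters.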