2011-09-28 19:15:42 +00:00
|
|
|
{- git cat-file interface, with handle automatically stored in the Annex monad
|
|
|
|
-
|
2013-05-15 22:46:38 +00:00
|
|
|
- Copyright 2011-2013 Joey Hess <joey@kitenet.net>
|
2011-09-28 19:15:42 +00:00
|
|
|
-
|
|
|
|
- Licensed under the GNU GPL version 3 or higher.
|
|
|
|
-}
|
|
|
|
|
2011-10-04 04:40:47 +00:00
|
|
|
module Annex.CatFile (
|
2011-11-12 21:45:12 +00:00
|
|
|
catFile,
|
detect and recover from branch push/commit race
Dealing with a race without using locking is exceedingly difficult and tricky.
Fully tested, I hope.
There are three places left where the branch can be updated, that are not
covered by the race recovery code. Let's prove they're all immune to the
race:
1. tryFastForwardTo checks to see if a fast-forward can be done,
and then does git-update-ref on the branch to fast-forward it.
If a push comes in before the check, then either no fast-forward
will be done (ok), or the push set the branch to a ref that can
still be fast-forwarded (also ok)
If a push comes in after the check, the git-update-ref will
undo the ref change made by the push. It's as if the push did not come
in, and the next git-push will see this, and try to re-do it.
(acceptable)
2. When creating the branch for the very first time, an empty index
is created, and a commit of it made to the branch. The commit's ref
is recorded as the current state of the index. If a push came in
during that, it will be noticed the next time a commit is made to the
branch, since the branch will have changed. (ok)
3. Creating the branch from an existing remote branch involves making
the branch, and then getting its ref, and recording that the index
reflects that ref.
If a push creates the branch first, git-branch will fail (ok).
If the branch is created and a racing push is then able to change it
(highly unlikely!) we're still ok, because it first records the ref into
the index.lck, and then updates the index. The race can cause the
index.lck to have the old branch ref, while the index has the newly pushed
branch merged into it, but that only results in an unnecessary update of
the index file later on.
2011-12-11 22:39:53 +00:00
|
|
|
catObject,
|
2013-09-19 20:30:37 +00:00
|
|
|
catTree,
|
2012-06-10 23:58:34 +00:00
|
|
|
catObjectDetails,
|
2013-01-05 19:26:22 +00:00
|
|
|
catFileHandle,
|
2012-12-12 23:20:38 +00:00
|
|
|
catKey,
|
2013-01-05 19:26:22 +00:00
|
|
|
catKeyFile,
|
2013-08-22 17:57:07 +00:00
|
|
|
catKeyFileHEAD,
|
2011-09-28 19:15:42 +00:00
|
|
|
) where
|
|
|
|
|
2012-06-20 17:13:40 +00:00
|
|
|
import qualified Data.ByteString.Lazy as L
|
2013-05-15 22:46:38 +00:00
|
|
|
import qualified Data.Map as M
|
2013-09-19 20:30:37 +00:00
|
|
|
import System.PosixCompat.Types
|
2011-11-12 21:45:12 +00:00
|
|
|
|
2011-10-05 20:02:51 +00:00
|
|
|
import Common.Annex
|
improve type signatures with a Ref newtype
In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for
those. Note that this does not prevent mixing up, e.g., refs and branches
at the type level. Since git really doesn't care, except rare cases like
git update-ref, or git tag -d, that seems ok for now.
There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref
may or may not be a tree-ish, depending on the object type, so there seems
no point in trying to represent it at the type level.
2011-11-16 06:23:34 +00:00
|
|
|
import qualified Git
|
2011-09-28 19:15:42 +00:00
|
|
|
import qualified Git.CatFile
|
|
|
|
import qualified Annex
|
2012-06-10 23:58:34 +00:00
|
|
|
import Git.Types
|
2013-05-12 22:18:48 +00:00
|
|
|
import Git.FilePath
|
2013-09-19 20:30:37 +00:00
|
|
|
import Git.FileMode
|
2013-11-07 17:55:36 +00:00
|
|
|
import qualified Git.Ref
|
2011-09-28 19:15:42 +00:00
|
|
|
|
2011-11-16 06:23:34 +00:00
|
|
|
catFile :: Git.Branch -> FilePath -> Annex L.ByteString
|
2011-11-12 21:45:12 +00:00
|
|
|
catFile branch file = do
	h <- catFileHandle
	liftIO $ Git.CatFile.catFile h branch file
|
|
|
|
|
2011-12-11 22:39:53 +00:00
|
|
|
catObject :: Git.Ref -> Annex L.ByteString
|
|
|
|
catObject ref = do
	h <- catFileHandle
	liftIO $ Git.CatFile.catObject h ref
|
|
|
|
|
2013-09-19 20:30:37 +00:00
|
|
|
catTree :: Git.Ref -> Annex [(FilePath, FileMode)]
|
|
|
|
catTree ref = do
	h <- catFileHandle
	liftIO $ Git.CatFile.catTree h ref
|
|
|
|
|
2013-10-20 21:50:51 +00:00
|
|
|
catObjectDetails :: Git.Ref -> Annex (Maybe (L.ByteString, Sha, ObjectType))
|
2012-06-10 23:58:34 +00:00
|
|
|
catObjectDetails ref = do
	h <- catFileHandle
	liftIO $ Git.CatFile.catObjectDetails h ref
|
|
|
|
|
2013-05-15 22:46:38 +00:00
|
|
|
{- There can be multiple index files, and a different cat-file is needed
 - for each. This is selected by setting GIT_INDEX_FILE in the gitEnv. -}
|
2011-11-12 21:45:12 +00:00
|
|
|
catFileHandle :: Annex Git.CatFile.CatFileHandle
|
2013-05-15 22:46:38 +00:00
|
|
|
catFileHandle = do
	m <- Annex.getState Annex.catfilehandles
	indexfile <- fromMaybe "" . maybe Nothing (lookup "GIT_INDEX_FILE")
		<$> fromRepo gitEnv
	case M.lookup indexfile m of
		Just h -> return h
		Nothing -> do
			h <- inRepo Git.CatFile.catFileStart
			let m' = M.insert indexfile h m
			Annex.changeState $ \s -> s { Annex.catfilehandles = m' }
			return h
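
The caching pattern used by catFileHandle (one handle per index file, started lazily and then reused) can be sketched outside the Annex monad. This is only an illustration under stated assumptions: `FakeHandle`, `HandleCache`, `newCache` and `getHandle` are hypothetical names that do not exist in git-annex, an `IORef` stands in for the Annex state, and no real cat-file process is started.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import qualified Data.Map as M

-- Hypothetical stand-in for a running cat-file process; the Int only
-- records the order in which handles were started.
newtype FakeHandle = FakeHandle Int
    deriving (Eq, Show)

-- Handles started so far, keyed by index file ("" means the default index).
type HandleCache = M.Map FilePath FakeHandle

-- Create an empty cache (assumption: the real code keeps this in Annex state).
newCache :: IO (IORef HandleCache)
newCache = newIORef M.empty

-- Reuse the handle cached for this index file, or start and record a new one.
getHandle :: IORef HandleCache -> FilePath -> IO FakeHandle
getHandle cache indexfile = do
    m <- readIORef cache
    case M.lookup indexfile m of
        Just h -> return h
        Nothing -> do
            let h = FakeHandle (M.size m)
            writeIORef cache (M.insert indexfile h m)
            return h
```

Calling `getHandle` twice with the same index file returns the same handle; a different index file gets its own entry in the map.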
|
2012-12-12 23:20:38 +00:00
|
|
|
|
2013-09-19 20:30:37 +00:00
|
|
|
{- From the Sha or Ref of a symlink back to the key.
 -
 - Requires a mode witness, to guarantee that the file is a symlink.
 -}
|
|
|
|
catKey :: Ref -> FileMode -> Annex (Maybe Key)
|
2013-09-20 00:09:03 +00:00
|
|
|
catKey = catKey' True
|
|
|
|
|
|
|
|
catKey' :: Bool -> Ref -> FileMode -> Annex (Maybe Key)
|
|
|
|
catKey' modeguaranteed ref mode
|
2013-09-19 20:30:37 +00:00
|
|
|
	| isSymLink mode = do
|
2013-09-20 00:09:03 +00:00
|
|
|
		l <- fromInternalGitPath . encodeW8 . L.unpack <$> get
|
2013-09-19 20:30:37 +00:00
|
|
|
		return $ if isLinkToAnnex l
			then fileKey $ takeFileName l
			else Nothing
	| otherwise = return Nothing
|
2013-09-20 00:09:03 +00:00
|
|
|
  where
	-- If the mode is not guaranteed to be correct, avoid
	-- buffering the whole file content, which might be large.
	-- 8192 is enough if it really is a symlink.
	get
		| modeguaranteed = catObject ref
		| otherwise = L.take 8192 <$> catObject ref
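
The bounded read in `get` relies on lazy ByteStrings: taking a prefix means the rest of a large object never has to be buffered. A minimal sketch of just that decision, assuming the content has already been obtained as a lazy ByteString; `linkTargetLimit` and `boundedContent` are hypothetical names introduced here for illustration.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- Upper bound on how much of an object is kept when it only might be a
-- symlink; 8192 matches the value used in the code above.
linkTargetLimit :: Int64
linkTargetLimit = 8192

-- Keep the whole content when the mode is trusted, otherwise only a prefix.
boundedContent :: Bool -> L.ByteString -> L.ByteString
boundedContent modeguaranteed content
    | modeguaranteed = content
    | otherwise = L.take linkTargetLimit content
```

With an untrusted mode, a 10000-byte object is cut to its first 8192 bytes; with a trusted mode it is passed through unchanged.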
|
2013-09-19 20:30:37 +00:00
|
|
|
|
|
|
|
{- Looks up the file mode corresponding to the Ref using the running
|
|
|
|
- cat-file.
|
|
|
|
-
|
|
|
|
- Currently this always has to look in HEAD, because cat-file --batch
|
|
|
|
- does not offer a way to specify that we want to look up a tree object
|
|
|
|
- in the index. So if the index has a file staged not as a symlink,
|
2013-09-20 00:09:03 +00:00
|
|
|
- and it is a symlink in HEAD, the wrong mode is obtained.
|
2013-09-19 20:30:37 +00:00
|
|
|
- Also, we have to assume the file is a symlink if it's not yet committed
|
2013-09-20 00:09:03 +00:00
|
|
|
- to HEAD. For these reasons, modeguaranteed is not set.
|
2013-09-19 20:30:37 +00:00
|
|
|
-}
|
|
|
|
catKeyChecked :: Bool -> Ref -> Annex (Maybe Key)
|
2013-09-20 00:09:03 +00:00
|
|
|
catKeyChecked needhead ref@(Ref r) =
	catKey' False ref =<< findmode <$> catTree treeref
|
2013-09-19 20:30:37 +00:00
|
|
|
  where
	pathparts = split "/" r
	dir = intercalate "/" $ take (length pathparts - 1) pathparts
	file = fromMaybe "" $ lastMaybe pathparts
	treeref = Ref $ if needhead then "HEAD" ++ dir ++ "/" else dir ++ "/"
	findmode = fromMaybe symLinkMode . headMaybe .
		map snd . filter (\p -> fst p == file)
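
The path handling in catKeyChecked (split the ref into a directory part and a file name, list the directory's tree, then look up the file's mode with a fallback) can be sketched with base only. This is a hedged illustration, not the module's code: `splitOnChar`, `treeAndFile` and `modeOrDefault` are hypothetical names, plain strings stand in for `Ref`, the example ref shape `:./sub/file` is an assumption, and `splitOnChar` is a simplified replacement for the `split` helper (it always returns at least one element).

```haskell
import Data.List (intercalate)
import Data.Maybe (fromMaybe)

-- Split on a separator character; a simplified stand-in for the `split`
-- helper used above (always returns at least one element).
splitOnChar :: Char -> String -> [String]
splitOnChar c s = case break (== c) s of
    (a, []) -> [a]
    (a, _ : rest) -> a : splitOnChar c rest

-- Given a ref string such as ":./sub/file", return the tree-ish to list
-- and the file name to look for in that listing.
treeAndFile :: Bool -> String -> (String, String)
treeAndFile needhead r = (treeref, file)
  where
    parts = splitOnChar '/' r
    dir = intercalate "/" (init parts)
    file = last parts
    treeref = (if needhead then "HEAD" else "") ++ dir ++ "/"

-- Look up the file's mode in a tree listing, falling back to a default
-- when the file is absent (the code above falls back to the symlink mode).
modeOrDefault :: mode -> String -> [(FilePath, mode)] -> mode
modeOrDefault def file = fromMaybe def . lookup file
```

For example, `treeAndFile True ":./sub/file"` evaluates to `("HEAD:./sub/", "file")`, and `modeOrDefault` returns the default when the listing has no entry for the file.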
|
2013-01-05 19:26:22 +00:00
|
|
|
|
2013-05-25 04:37:41 +00:00
|
|
|
{- From a file in the repository back to the key.
|
|
|
|
-
|
|
|
|
- Ideally, this should reflect the key that's staged in the index,
|
|
|
|
- not the key that's committed to HEAD. Unfortunately, git cat-file
|
|
|
|
- does not refresh the index file after it's started up, so things
|
|
|
|
- newly staged in the index won't show up. It does, however, notice
|
|
|
|
- when branches change.
|
|
|
|
-
|
|
|
|
- For command-line git-annex use, that doesn't matter. It's perfectly
|
|
|
|
- reasonable for things staged in the index after the currently running
|
2013-09-19 20:30:37 +00:00
|
|
|
- git-annex process to not be noticed by it. However, we do want to see
|
|
|
|
- what's in the index, since it may have uncommitted changes not in HEAD.
|
2013-05-25 04:37:41 +00:00
|
|
|
-
|
|
|
|
- For the assistant, this is much more of a problem, since it commits
|
|
|
|
- files and then needs to be able to immediately look up their keys.
|
|
|
|
- OTOH, the assistant doesn't keep changes staged in the index for very
|
|
|
|
- long at all before committing them -- and it won't look at the keys
|
|
|
|
- of files until after committing them.
|
|
|
|
-
|
|
|
|
- So, this gets info from the index, unless running as a daemon.
|
2013-01-05 19:26:22 +00:00
|
|
|
-}
|
|
|
|
catKeyFile :: FilePath -> Annex (Maybe Key)
|
2013-05-25 04:37:41 +00:00
|
|
|
catKeyFile f = ifM (Annex.getState Annex.daemon)
|
2013-08-22 17:57:07 +00:00
|
|
|
	( catKeyFileHEAD f
|
2013-11-07 17:55:36 +00:00
|
|
|
	, catKeyChecked True $ Git.Ref.fileRef f
|
2013-05-25 04:37:41 +00:00
|
|
|
	)
|
2013-08-22 17:57:07 +00:00
|
|
|
|
|
|
|
catKeyFileHEAD :: FilePath -> Annex (Maybe Key)
|
2013-11-07 17:55:36 +00:00
|
|
|
catKeyFileHEAD f = catKeyChecked False $ Git.Ref.fileFromRef Git.Ref.headRef f
|