add --force-large/--force-small

These options make it easier to override the annex.largefiles configuration
(and are potentially safer, as they avoid bugs like the smudge bug fixed
in the last release).

Deleted some old comments that were posted to the man page discussing such
options.

Updated docs that used -c annex.largefiles to use the options.

Note that addSmallOverridden was needed to avoid the clean filter running
on the file. It would be possible to make addFile also update the index
directly, rather than going via git add. However, it was not necessary,
and I want to avoid breaking on some edge case, particularly if the code in
addSmallOverridden has some oversight.
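
A minimal sketch of the staging choice that addSmallOverridden makes, not git-annex's actual code. `Staging`, `chooseStaging`, and `ownerExecuteBit` are hypothetical names, and the file's kind and permission bits are passed in as plain values rather than read from the filesystem. It assumes the usual git index modes: 100644 for a regular file, 100755 for an executable one.

```haskell
import Data.Bits ((.&.))
import Data.Word (Word32)

-- Hypothetical model; none of these names come from git-annex.
data Staging
  = ViaGitAdd              -- let `git add` stage the file
  | DirectToIndex String   -- stage a blob in the index with this git mode
  deriving (Eq, Show)

-- POSIX owner-execute permission bit (octal 100).
ownerExecuteBit :: Word32
ownerExecuteBit = 0o100

-- Symlinks go through `git add`, as in the diff; regular files are staged
-- directly so that the clean filter never sees them.
chooseStaging :: Bool -> Word32 -> Staging
chooseStaging isSymlink mode
  | isSymlink = ViaGitAdd
  | mode .&. ownerExecuteBit /= 0 = DirectToIndex "100755"
  | otherwise = DirectToIndex "100644"

main :: IO ()
main = do
  print (chooseStaging True 0o777)
  print (chooseStaging False 0o755)
  print (chooseStaging False 0o644)
```

Keeping the decision pure makes the three cases easy to check without touching a repository; the real code additionally has to hash the file content before it can stage it.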

Also, when annex.addunlocked is set and annex.largefiles does not match a file,
git annex add --force-large works, but git status will then show the file
as added, with an unstaged modification. The unstaged modification adds the
file to git. This is identical behavior to using -c annex.largefiles=nothing
when annex.addunlocked is set. This does not prevent committing what was
intended to be added. I have not gotten to the bottom of why git thinks
the file is modified and runs it through the clean filter in this case.
Joey Hess 2020-01-01 14:03:06 -04:00
parent 022dead40a
commit 503788238c
11 changed files with 74 additions and 84 deletions

View file

@ -1,3 +1,11 @@
+git-annex (7.20191231) UNRELEASED; urgency=medium
+
+  * add: --force-large/--force-small options make it easier to override
+    annex.largefiles configuration (and potentially safer as it avoids
+    bugs like the smudge bug fixed in the last release).
+
+ -- Joey Hess <id@joeyh.name>  Wed, 01 Jan 2020 12:51:40 -0400
+
 git-annex (7.20191230) upstream; urgency=medium
 
   * Optimised processing of many files, especially by commands like find

View file

@ -1,6 +1,6 @@
 {- git-annex command
  -
- - Copyright 2010-2017 Joey Hess <id@joeyh.name>
+ - Copyright 2010-2020 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@ -17,8 +17,12 @@ import qualified Database.Keys
 import Annex.FileMatcher
 import Annex.Link
 import Annex.Tmp
+import Annex.HashObject
 import Messages.Progress
+import Git.Types
 import Git.FilePath
+import qualified Git.UpdateIndex
+import Utility.FileMode
 import qualified Utility.RawFilePath as R
 
 cmd :: Command
@ -32,6 +36,7 @@ data AddOptions = AddOptions
 	, includeDotFiles :: Bool
 	, batchOption :: BatchMode
 	, updateOnly :: Bool
+	, largeFilesOverride :: Maybe Bool
 	}
 
 optParser :: CmdParamsDesc -> Parser AddOptions
@ -47,18 +52,31 @@ optParser desc = AddOptions
 		<> short 'u'
 		<> help "only update tracked files"
 		)
+	<*> (parseforcelarge <|> parseforcesmall)
+  where
+	parseforcelarge = flag Nothing (Just True)
+		( long "force-large"
+		<> help "add all files to annex, ignoring other configuration"
+		)
+	parseforcesmall = flag Nothing (Just False)
+		( long "force-small"
+		<> help "add all files to git, ignoring other configuration"
+		)
 
 seek :: AddOptions -> CommandSeek
 seek o = startConcurrency commandStages $ do
 	largematcher <- largeFilesMatcher
 	addunlockedmatcher <- addUnlockedMatcher
-	let gofile file = ifM (checkFileMatcher largematcher (fromRawFilePath file) <||> Annex.getState Annex.force)
-		( start file addunlockedmatcher
-		, ifM (annexAddSmallFiles <$> Annex.getGitConfig)
-			( startSmall file
-			, stop
-			)
-		)
+	let gofile file = case largeFilesOverride o of
+		Nothing -> ifM (checkFileMatcher largematcher (fromRawFilePath file) <||> Annex.getState Annex.force)
+			( start file addunlockedmatcher
+			, ifM (annexAddSmallFiles <$> Annex.getGitConfig)
+				( startSmall file
+				, stop
+				)
+			)
+		Just True -> start file addunlockedmatcher
+		Just False -> startSmallOverridden file
 	case batchOption o of
 		Batch fmt
 			| updateOnly o ->
@ -82,6 +100,29 @@ addSmall file = do
 	showNote "non-large file; adding content to git repository"
 	addFile file
 
+startSmallOverridden :: RawFilePath -> CommandStart
+startSmallOverridden file = starting "add" (ActionItemWorkTreeFile file) $
+	next $ addSmallOverridden file
+
+addSmallOverridden :: RawFilePath -> Annex Bool
+addSmallOverridden file = do
+	showNote "adding content to git repository"
+	let file' = fromRawFilePath file
+	s <- liftIO $ getSymbolicLinkStatus file'
+	if isSymbolicLink s
+		then addFile file
+		else do
+			-- Can't use addFile because the clean filter will
+			-- honor annex.largefiles and it has been overridden.
+			-- Instead, hash the file and add to the index.
+			sha <- hashFile file'
+			let ty = if isExecutable (fileMode s)
+				then TreeExecutable
+				else TreeFile
+			Annex.Queue.addUpdateIndex =<<
+				inRepo (Git.UpdateIndex.stageFile sha ty file')
+			return True
+
 addFile :: RawFilePath -> Annex Bool
 addFile file = do
 	ps <- forceParams
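
The dispatch that `gofile` performs after this change can be sketched as a pure decision function. This is a simplified, hypothetical model rather than code from git-annex: `Target` and `decide` are made-up names, and the three `Bool` arguments stand in for the result of the annex.largefiles matcher, the force state, and the annex.addsmallfiles setting.

```haskell
-- Hypothetical model of the dispatch in gofile; not git-annex code.
data Target = ToAnnex | ToGit | Skip
  deriving (Eq, Show)

decide
  :: Maybe Bool  -- largeFilesOverride: Just True for --force-large, Just False for --force-small
  -> Bool        -- does annex.largefiles match the file?
  -> Bool        -- is the force state set?
  -> Bool        -- is annex.addsmallfiles enabled?
  -> Target
decide override matchesLarge forced addSmallFiles = case override of
  Just True -> ToAnnex
  Just False -> ToGit
  Nothing
    | matchesLarge || forced -> ToAnnex
    | addSmallFiles -> ToGit
    | otherwise -> Skip

main :: IO ()
main = mapM_ print
  [ decide (Just True) False False False
  , decide (Just False) True False False
  , decide Nothing True False True
  , decide Nothing False False True
  , decide Nothing False False False
  ]
```

The point of the model is that an override short-circuits every other input: with `Just False` the file goes to git even when annex.addsmallfiles is disabled, which matches the `--force-small` description in the man page.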

View file

@ -39,6 +39,16 @@ annexed content, and other symlinks.
   Add gitignored files.
 
+* `--force-large`
+
+  Treat all files as large files, ignoring annex.largefiles configuration,
+  and add to the annex.
+
+* `--force-small`
+
+  Treat all files as small files, ignoring annex.largefiles configuration,
+  and add to git, also ignoring annex.addsmallfiles configuration.
+
 * `--backend`
 
   Specifies which key-value backend to use.

View file

@ -1,12 +0,0 @@
[[!comment format=mdwn
username="rrnewton@63c9faa1997c908b1dc04dfdca33c809660cd158"
nickname="rrnewton"
avatar="http://cdn.libravatar.org/avatar/638acc3e55c2bb09aa0dcca5b5c8acb6"
subject="Flag to force same behavior as annex.largefiles attribute?"
date="2018-05-21T05:29:06Z"
content="""
When in [direct mode](https://git-annex.branchable.com/direct_mode), the \"add the non-large file directly to the git repository\" behavior described above is very useful, because the option of typing simply `git add foo`, does not exist as it does in [indirect mode](https://git-annex.branchable.com/git-annex-indirect/).
However, I can't see any combination of flags that trigger this behavior. I suppose it can be accomplished by temporarily setting [annex.largefiles](https://git-annex.branchable.com/tips/largefiles/) to a huge value before executing `git annex add` (i.e. creating a `.gitattributes` and then deleting it). I think I'll try that as a work-around, but it would be great to have a flag that accomplishes this.
"""]]

View file

@ -1,12 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2018-05-21T16:36:51Z"
content="""
@rrnewton I know people do commonly accomplish this
by something like `git -c annex.largefiles='exclude(*)' annex add`
A shorter way to write that would only be useful for direct mode,
so I'm inclined not to add it, but open a todo item if you want to discuss
that.
"""]]

View file

@ -1,14 +0,0 @@
[[!comment format=mdwn
username="rrnewton@63c9faa1997c908b1dc04dfdca33c809660cd158"
nickname="rrnewton"
avatar="http://cdn.libravatar.org/avatar/638acc3e55c2bb09aa0dcca5b5c8acb6"
subject="Sounds great!"
date="2018-05-21T18:09:35Z"
content="""
That's fabulous. A Bash alias around that command is really all I need when working in direct mode. (And the archive's too damn big to switch back and forth between direct/indirect.)
I was just too much a newb with git attributes to know it could be done that way. For discoverability, maybe that command could be placed in an \"examples\" section in the primary documentation above?
"""]]

View file

@ -1,8 +0,0 @@
[[!comment format=mdwn
username="timeless-ventricle"
avatar="http://cdn.libravatar.org/avatar/0b220fa4c0b59e883f360979ee745d63"
subject="comment 4"
date="2019-01-06T12:24:49Z"
content="""
@joey I'm obviously missing something here, why would a shorter way to write that only be useful for direct mode? I don't understand what the connection is between direct mode and wanting to specify whether this is a \"regular git\" file or an annexed file (except that direct mode is not supported in v7)? I thought it was considered supported to have a mix of both large binary files and text files? Even if some text files are large, I think I want to add them as files whose content is tracked by git, so I think I want to choose 'by hand' -- is that not really supported / considered a bad idea for some reason?
"""]]

View file

@ -1,10 +0,0 @@
[[!comment format=mdwn
username="joey"
subject="""comment 5"""
date="2019-01-22T21:10:37Z"
content="""
Because "git add foo" does not work in direct mode.
This is really not the place to be having a conversation about this. If you
want something changed in git-annex, open a bug report or todo item.
"""]]

View file

@ -1,16 +0,0 @@
[[!comment format=mdwn
username="johnmario.itec19@69a7b742534851b36216e0f951f1a00dbb9067cd"
nickname="johnmario.itec19"
avatar="http://cdn.libravatar.org/avatar/2f07ffce1656bdcd6aa19aaab7517975"
subject="commenting on git-annex-add"
date="2019-09-02T06:21:27Z"
content="""
Yes you can do that. Simplest way is to git add the files you want to directly be in the git repo (e.g. the source code) and git annex add the large files.
You can then check in any changes to the source code files (or anything else you added with git add) to github as normal.
You can manage the storage and versioning of the large files using git annex commands. Git annex supports using AWS S3 and/or glacier for backing up the files. It can also back them up to a server you control over ssh or to an external drive (or any combination of the above). http://git-annex.branchable.com/special_remotes/
With the latest version of git annex, you can also set up automatically filters that decide which types/sizes of files to check in directly to git vs which ones to store as links in the annex. https://git-annex.branchable.com/tips/largefiles/
For more tech related assistance or support <a href=\"https://uaedatarecovery.com/data-recovery-dubai/\">Data Recovery Dubai</a>
"""]]

View file

@ -89,7 +89,7 @@ If you've set up an annex.largefiles configuration but want to force a file to
 be stored in the annex, you can temporarily override the configuration like
 this:
 
-	git annex add -c annex.largefiles=anything smallfile
+	git annex add --force-large smallfile
 
 ## converting git to annexed
@ -97,7 +97,7 @@ When you have a file that is currently stored in git, and you want to
 convert that to be stored in the annex, here's how to accomplish that:
 
 	git rm --cached file
-	git annex add -c annex.largefiles=anything file
+	git annex add --force-large file
 	git commit file
 
 This first removes the file from git's index cache, and then adds it back
@ -111,7 +111,7 @@ convert that to be stored in git, here's how to accomplish that:
 	git annex unlock file
 	git rm --cached file
-	git -c annex.largefiles=nothing add file
+	git annex add --force-small file
 	git commit file
 
 You can modify the file after unlocking it and before adding it to

View file

@ -1,5 +1,6 @@
-Make `git-annex add --annex` and `git-annex add --git` add a specific file to
-annex or git, bypassing annex.largefiles and all other configuration and state.
+Make `git-annex add --force-large` and `git-annex add --force-small`
+add a specific file to annex or git, bypassing annex.largefiles
+and all other configuration and state.
 
 One reason to want this is that it avoids users doing stuff like this:
@ -11,3 +12,5 @@ Such a temporary setting of annex.largefiles can be problimatic, as explored in
 Also, this could also be used to easily switch a file from one storage to
 the other. I suppose the file would have to be touched first to make git-annex
 add process it?
+
+> [[done]] --[[Joey]]