{- git-annex command
 -
 - Copyright 2010-2021 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Command.Add where

import Command
import Annex.Ingest
import Logs.Location
import Annex.Content
fully support core.symlinks=false in all relevant symlink handling code
Refactored annex link code into nice clean new library.
Audited and dealt with calls to createSymbolicLink.
Remaining calls are all safe, because:
Annex/Link.hs: ( liftIO $ createSymbolicLink linktarget file
only when core.symlinks=true
Assistant/WebApp/Configurators/Local.hs: createSymbolicLink link link
test if symlinks can be made
Command/Fix.hs: liftIO $ createSymbolicLink link file
command only works in indirect mode
Command/FromKey.hs: liftIO $ createSymbolicLink link file
command only works in indirect mode
Command/Indirect.hs: liftIO $ createSymbolicLink l f
refuses to run if core.symlinks=false
Init.hs: createSymbolicLink f f2
test if symlinks can be made
Remote/Directory.hs: go [file] = catchBoolIO $ createSymbolicLink file f >> return True
fast key linking; catches failure to make symlink and falls back to copy
Remote/Git.hs: liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True
ditto
Upgrade/V1.hs: liftIO $ createSymbolicLink link f
v1 repos could not be on a filesystem w/o symlinks
Audited and dealt with calls to readSymbolicLink.
Remaining calls are all safe, because:
Annex/Link.hs: ( liftIO $ catchMaybeIO $ readSymbolicLink file
only when core.symlinks=true
Assistant/Threads/Watcher.hs: ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file))
code that fixes real symlinks when inotify sees them
It's ok to not fix pseudo-symlinks.
Assistant/Threads/Watcher.hs: mlink <- liftIO (catchMaybeIO $ readSymbolicLink file)
ditto
Command/Fix.hs: stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do
command only works in indirect mode
Upgrade/V1.hs: getsymlink = takeFileName <$> readSymbolicLink file
v1 repos could not be on a filesystem w/o symlinks
Audited and dealt with calls to isSymbolicLink.
(Typically used with getSymbolicLinkStatus, but that is just used because
getFileStatus is not as robust; it also works on pseudolinks.)
Remaining calls are all safe, because:
Assistant/Threads/SanityChecker.hs: | isSymbolicLink s -> addsymlink file ms
only handles staging of symlinks that were somehow not staged
(might need to be updated to support pseudolinks, but this is
only a belt-and-suspenders check anyway, and I've never seen the code run)
Command/Add.hs: if isSymbolicLink s || not (isRegularFile s)
avoids adding symlinks to the annex, so not relevant
Command/Indirect.hs: | isSymbolicLink s -> void $ flip whenAnnexed f $
only allowed on systems that support symlinks
Command/Indirect.hs: whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do
ditto
Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f
used to find unlocked files, only relevant in indirect mode
Utility/FSEvents.hs: | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s
Utility/FSEvents.hs: | Files.isSymbolicLink s ->
Utility/INotify.hs: | Files.isSymbolicLink s ->
Utility/INotify.hs: checkfiletype Files.isSymbolicLink addSymlinkHook f
Utility/Kqueue.hs: | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change
all above are lower-level, not relevant
Audited and dealt with calls to isSymLink.
Remaining calls are all safe, because:
Annex/Direct.hs: | isSymLink (getmode item) =
This is looking at git diff-tree objects, not files on disk
Command/Unused.hs: | isSymLink (LsTree.mode l) = do
This is looking at git ls-tree, not file on disk
Utility/FileMode.hs:isSymLink :: FileMode -> Bool
Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode
low-level
Done!!
2013-02-17 19:05:55 +00:00
import qualified Annex
import qualified Annex.Queue
import qualified Database.Keys
import Annex.FileMatcher
import Annex.Link
import Annex.Tmp
import Messages.Progress
import Git.FilePath
import Config.GitConfig
import Config.Smudge
Added --no-check-gitignore option for finer grained control than using --force.
add, addurl, importfeed, import: Added --no-check-gitignore option
for finer grained control than using --force.
(--force is used for too many different things, and at least one
of these also uses it for something else. I would like to reduce
--force's footprint until it only forces drops or a few other data
losses. For now, --force still disables checking ignores too.)
addunused: Don't check .gitignores when adding files. This is a behavior
change, but I justify it by analogy with git add of a gitignored file
adding it, asking to add all unused files back should add them all back,
not skip some. The old behavior was surprising.
In Command.Lock and Command.ReKey, CheckGitIgnore False does not change
behavior, it only makes explicit what is done. Since these commands are run
on annexed files, the file is already checked into git, so git add won't
check ignores.
2020-09-18 17:12:04 +00:00
import Utility.OptParse
smudge: check for known annexed inodes before checking annex.largefiles
smudge: Fix a case where an unlocked annexed file that annex.largefiles
does not match could get its unchanged content checked into git, due to git
running the smudge filter unnecessarily.
When the file has the same inodecache as an already annexed file,
we can assume that the user is not intending to change how it's stored in
git.
Note that checkunchangedgitfile already handled the inverse case, where the
file was added to git previously. That goes further and actually sha1
hashes the new file and checks if it's the same hash in the index.
It would be possible to generate a key for the file and see if it's the
same as the old key, however that could be considerably more expensive than
sha1 of a small file is, and it is not necessary for the case I have, at
least, where the file is not modified or touched, and so its inode will
match the cache.
git-annex add was changed, when adding a small file, to remove the inode
cache for it. This is necessary to keep the recipe in
doc/tips/largefiles.mdwn for converting from annex to git working.
It also avoids bugs/case_where_using_pathspec_with_git-commit_leaves_s.mdwn
which the earlier try at this change introduced.
2021-05-10 17:05:08 +00:00
import Utility.InodeCache
import Annex.InodeSentinal
import qualified Utility.RawFilePath as R

cmd :: Command
cmd = notBareRepo $
	withGlobalOptions opts $
		command "add" SectionCommon "add files to annex"
			paramPaths (seek <$$> optParser)
  where
	opts =
		[ jobsOption
		, jsonOptions
		, jsonProgressOption
		, fileMatchingOptions LimitDiskFiles
		]

data AddOptions = AddOptions
	{ addThese :: CmdParams
	, batchOption :: BatchMode
	, updateOnly :: Bool
	, largeFilesOverride :: Maybe Bool
	, checkGitIgnoreOption :: CheckGitIgnore
	}

optParser :: CmdParamsDesc -> Parser AddOptions
optParser desc = AddOptions
	<$> cmdParams desc
	<*> parseBatchOption False
	<*> switch
		( long "update"
		<> short 'u'
		<> help "only update tracked files"
		)
	<*> (parseforcelarge <|> parseforcesmall)
	<*> checkGitIgnoreSwitch
  where
	parseforcelarge = flag Nothing (Just True)
		( long "force-large"
		<> help "add all files to annex, ignoring other configuration"
		)
	parseforcesmall = flag Nothing (Just False)
		( long "force-small"
		<> help "add all files to git, ignoring other configuration"
		)

checkGitIgnoreSwitch :: Parser CheckGitIgnore
checkGitIgnoreSwitch = CheckGitIgnore <$>
	invertableSwitch "check-gitignore" True
		(help "Do not check .gitignore when adding files")
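
The switch above defaults to checking .gitignore and is turned off with --no-check-gitignore. As a rough illustration of how such an invertible switch can be built, here is a standalone sketch on top of optparse-applicative. It is not the code behind invertableSwitch: the names invertibleSwitchSketch and parseArgs are hypothetical, and attaching the help text only to the "no-" variant is an assumption based on how the help string reads.

```haskell
import Control.Applicative ((<|>))
import Options.Applicative

-- Hypothetical stand-in for an invertible switch: "--NAME" yields True,
-- "--no-NAME" yields False, and the default applies when neither is given.
-- (Assumption: the help text is shown only for the "no-" variant.)
invertibleSwitchSketch :: String -> Bool -> String -> Parser Bool
invertibleSwitchSketch name def helptext =
    flag' True (long name <> hidden)
        <|> flag' False (long ("no-" ++ name) <> help helptext)
        <|> pure def

-- Run a parser against an explicit argument list instead of the real argv.
parseArgs :: Parser a -> [String] -> Maybe a
parseArgs p = getParseResult . execParserPure defaultPrefs (info p mempty)

checkGitIgnoreSketch :: Parser Bool
checkGitIgnoreSketch = invertibleSwitchSketch "check-gitignore" True
    "Do not check .gitignore when adding files"

main :: IO ()
main = do
    print (parseArgs checkGitIgnoreSketch [])
    print (parseArgs checkGitIgnoreSketch ["--no-check-gitignore"])
    print (parseArgs checkGitIgnoreSketch ["--check-gitignore"])
```

Using execParserPure keeps the sketch easy to try in GHCi, since no real command line is needed.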

seek :: AddOptions -> CommandSeek
seek o = startConcurrency commandStages $ do
	largematcher <- largeFilesMatcher
	addunlockedmatcher <- addUnlockedMatcher
	annexdotfiles <- getGitConfigVal annexDotFiles
	let gofile (si, file) = case largeFilesOverride o of
		Nothing ->
			ifM (pure (annexdotfiles || not (dotfile file))
				<&&> (checkFileMatcher largematcher file
				<||> Annex.getState Annex.force))
				( start o si file addunlockedmatcher
				, ifM (annexAddSmallFiles <$> Annex.getGitConfig)
					( startSmall o si file
					, stop
					)
				)
		Just True -> start o si file addunlockedmatcher
		Just False -> startSmallOverridden o si file
	case batchOption o of
added -z
Added -z option to git-annex commands that use --batch, useful for
supporting filenames containing newlines.
It only controls input to --batch, the output will still be line delimited
unless --json or etc is used to get some other output. While git often
makes -z affect both input and output, I don't like tying them together,
and making it affect output would have been a significant complication,
and also git-annex output is generally not intended to be machine parsed,
unless using --json or a format option.
Commands that take pairs like "file key" still separate them with a space
in --batch mode. All such commands take care to support filenames with
spaces when parsing that, so there was no need to change it, and it would
have needed significant changes to the batch machinery to separate those
with a null.
To make fromkey and registerurl support -z, I had to give them a --batch
option. The implicit batch mode they enter when not provided with input
parameters does not support -z as that would have complicated option
parsing. Seemed better to move these toward using the same --batch as
everything else, though the implicit batch mode can still be used.
This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-09-20 20:09:21 +00:00
		Batch fmt
			| updateOnly o ->
				giveup "--update --batch is not supported"
			| otherwise -> batchFiles fmt gofile
		NoBatch -> do
			-- Avoid git ls-files complaining about files that
			-- are not known to git yet, since this will add
			-- them. Instead, have workTreeItems warn about other
			-- problems, like files that don't exist.
			let ww = WarnUnmatchWorkTreeItems
			l <- workTreeItems ww (addThese o)
			let go a = a ww (commandAction . gofile) l
			unless (updateOnly o) $
				go (withFilesNotInGit (checkGitIgnoreOption o))
			go withFilesMaybeModified
			go withUnmodifiedUnlockedPointers
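
The branching in gofile can be easier to follow as a pure decision table. The sketch below restates it with plain booleans standing in for the monadic checks. It is only an illustration: Destination, Inputs, and decide are hypothetical names that do not exist in git-annex, and the real code runs these checks in the Annex monad rather than on a record.

```haskell
-- Where a file ends up after the checks in gofile.
data Destination = ToAnnex | ToGit | Skipped
    deriving (Eq, Show)

-- Hypothetical flattened view of the values gofile consults.
data Inputs = Inputs
    { override :: Maybe Bool       -- --force-large / --force-small
    , annexDotfiles :: Bool        -- annex.dotfiles setting
    , isDotfile :: Bool            -- the file is a dotfile
    , matchesLargeFiles :: Bool    -- annex.largefiles matcher result
    , forced :: Bool               -- --force was given
    , addSmallFiles :: Bool        -- annex.addsmallfiles setting
    }

decide :: Inputs -> Destination
decide i = case override i of
    Just True -> ToAnnex
    Just False -> ToGit
    Nothing
        | dotfileOk && (matchesLargeFiles i || forced i) -> ToAnnex
        | addSmallFiles i -> ToGit
        | otherwise -> Skipped
  where
    -- Dotfiles are only annexed when annex.dotfiles allows it.
    dotfileOk = annexDotfiles i || not (isDotfile i)

main :: IO ()
main = do
    let base = Inputs Nothing False False True False True
    print (decide base)                          -- ToAnnex
    print (decide base { isDotfile = True })     -- ToGit
    print (decide base { matchesLargeFiles = False, addSmallFiles = False }) -- Skipped
    print (decide base { override = Just False }) -- ToGit
```

Note that an explicit override bypasses every other check, including the dotfile one, which matches the Just True and Just False branches above.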

{- Pass file off to git-add. -}
startSmall :: AddOptions -> SeekInput -> RawFilePath -> CommandStart
startSmall o si file =
	starting "add" (ActionItemTreeFile file) si $
		next $ addSmall (checkGitIgnoreOption o) file

addSmall :: CheckGitIgnore -> RawFilePath -> Annex Bool
addSmall ci file = do
	showNote "non-large file; adding content to git repository"
	addFile Small ci file

startSmallOverridden :: AddOptions -> SeekInput -> RawFilePath -> CommandStart
startSmallOverridden o si file =
	starting "add" (ActionItemTreeFile file) si $ next $ do
		showNote "adding content to git repository"
		addFile Small (checkGitIgnoreOption o) file

data SmallOrLarge = Small | Large

addFile :: SmallOrLarge -> CheckGitIgnore -> RawFilePath -> Annex Bool
addFile smallorlarge ci file = do
	ps <- gitAddParams ci
	cps <- case smallorlarge of
		-- In case the file is being converted from an annexed file
		-- to be stored in git, remove the cached inode, so that
		-- if the smudge clean filter later runs on the file,
		-- it will not remember it was annexed.
		--
		-- The use of bypassSmudgeConfig prevents the smudge
		-- filter from being run. So the changes to the database
		-- can be queued up and not flushed to disk immediately.
		Small -> do
			maybe noop Database.Keys.removeInodeCache
				=<< withTSDelta (liftIO . genInodeCache file)
			return bypassSmudgeConfig
		Large -> return []
	Annex.Queue.addCommand cps "add" (ps++[Param "--"])
		[fromRawFilePath file]
	return True

start :: AddOptions -> SeekInput -> RawFilePath -> AddUnlockedMatcher -> CommandStart
start o si file addunlockedmatcher = do
	mk <- liftIO $ isPointerFile file
	maybe go fixuppointer mk
  where
	go = ifAnnexed file addpresent add
	add = liftIO (catchMaybeIO $ R.getSymbolicLinkStatus file) >>= \case
		Nothing -> stop
		Just s
			| not (isRegularFile s) && not (isSymbolicLink s) -> stop
make CommandStart return a StartMessage
The goal is to be able to run CommandStart in the main thread when -J is
used, rather than unnecessarily passing it off to a worker thread, which
incurs overhead that is significant when the CommandStart is going to
quickly decide to stop.
To do that, the message it displays needs to be displayed in the worker
thread, after the CommandStart has run.
Also, the change will mean that CommandStart will no longer necessarily
run with the same Annex state as CommandPerform. While its docs already
said it should avoid modifying Annex state, I audited all the
CommandStart code as part of the conversion. (Note that CommandSeek
already sometimes runs with a different Annex state, and that has not been
a source of any problems, so I am not too worried that this change will
lead to breakage going forward.)
The only modification of Annex state I found was it calling
allowMessages in some Commands that default to noMessages. Dealt with
that by adding a startCustomOutput and a startingUsualMessages.
This lets a command start with noMessages and then select the output it
wants for each CommandStart.
One bit of breakage: onlyActionOn has been removed from commands that used it.
The plan is that, since a StartMessage contains an ActionItem,
when a Key can be extracted from that, the parallel job runner can
run onlyActionOn' automatically. Then commands won't need to worry about
this detail. Future work.
Otherwise, this was a fairly straightforward process of making each
CommandStart compile again. Hopefully other behavior changes were mostly
avoided.
In a few cases, a command had a CommandStart that called a CommandPerform
that then called showStart multiple times. I have collapsed those
down to a single start action. The main command to perhaps suffer from it
is Command.Direct, which used to show a start for each file, and no
longer does.
Another minor behavior change is that some commands used showStart
before, but had an associated file and a Key available, so were changed
to ShowStart with an ActionItemAssociatedFile. That will not change the
normal output or behavior, but --json output will now include the key.
This should not break it for anyone using a real json parser.
2019-06-06 19:42:30 +00:00
			| otherwise ->
				starting "add" (ActionItemTreeFile file) si $
make CommandStart return a StartMessage
The goal is to be able to run CommandStart in the main thread when -J is
used, rather than unnecessarily passing it off to a worker thread, which
incurs overhead that is significant when the CommandStart is going to
quickly decide to stop.
To do that, the message it displays needs to be displayed in the worker
thread, after the CommandStart has run.
Also, the change will mean that CommandStart will no longer necessarily
run with the same Annex state as CommandPerform. While its docs already
said it should avoid modifying Annex state, I audited all the
CommandStart code as part of the conversion. (Note that CommandSeek
already sometimes runs with a different Annex state, and that has not been
a source of any problems, so I am not too worried that this change will
lead to breakage going forward.)
The only modification of Annex state I found was it calling
allowMessages in some Commands that default to noMessages. Dealt with
that by adding a startCustomOutput and a startingUsualMessages.
This lets a command start with noMessages and then select the output it
wants for each CommandStart.
One bit of breakage: onlyActionOn has been removed from commands that used it.
The plan is that, since a StartMessage contains an ActionItem,
when a Key can be extracted from that, the parallel job runner can
run onlyActionOn' automatically. Then commands won't need to worry about
this detail. Future work.
Otherwise, this was a fairly straightforward process of making each
CommandStart compile again. Hopefully other behavior changes were mostly
avoided.
In a few cases, a command had a CommandStart that called a CommandPerform
that then called showStart multiple times. I have collapsed those
down to a single start action. The main command to perhaps suffer from it
is Command.Direct, which used to show a start for each file, and no
longer does.
Another minor behavior change is that some commands used showStart
before, but had an associated file and a Key available, so were changed
to ShowStart with an ActionItemAssociatedFile. That will not change the
normal output or behavior, but --json output will now include the key.
This should not break it for anyone using a real json parser.
2019-06-06 19:42:30 +00:00
|
|
|
if isSymbolicLink s
|
2021-01-04 17:12:28 +00:00
|
|
|
then next $ addFile Small (checkGitIgnoreOption o) file
|
Added --no-check-gitignore option for finer grained control than using --force.
add, addurl, importfeed, import: Added --no-check-gitignore option
for finer grained control than using --force.
(--force is used for too many different things, and at least one
of these also uses it for something else. I would like to reduce
--force's footprint until it only forces drops or a few other data
losses. For now, --force still disables checking ignores too.)
addunused: Don't check .gitignores when adding files. This is a behavior
change, but I justify it by analogy: just as git add of a gitignored file
adds it, asking to add all unused files back should add them all back,
not skip some. The old behavior was surprising.
In Command.Lock and Command.ReKey, CheckGitIgnore False does not change
behavior; it only makes explicit what is done. Since these commands are run
on annexed files, the file is already checked into git, so git add won't
check ignores.
2020-09-18 17:12:04 +00:00
|
|
|
else perform o file addunlockedmatcher
|
2019-08-30 17:54:57 +00:00
|
|
|
addpresent key =
|
2019-12-06 19:37:12 +00:00
|
|
|
liftIO (catchMaybeIO $ R.getSymbolicLinkStatus file) >>= \case
|
2017-12-05 19:00:50 +00:00
|
|
|
Just s | isSymbolicLink s -> fixuplink key
|
2018-09-12 17:53:03 +00:00
|
|
|
_ -> add
|
2020-11-10 16:10:51 +00:00
|
|
|
fixuplink key =
|
2021-03-12 18:09:19 +00:00
|
|
|
starting "add" (ActionItemTreeFile file) si $
|
2020-11-10 16:10:51 +00:00
|
|
|
addingExistingLink file key $ do
|
|
|
|
liftIO $ removeFile (fromRawFilePath file)
|
|
|
|
addLink (checkGitIgnoreOption o) file key Nothing
|
|
|
|
next $ cleanup key =<< inAnnex key
|
|
|
|
fixuppointer key =
|
2021-03-12 18:09:19 +00:00
|
|
|
starting "add" (ActionItemTreeFile file) si $
|
2020-11-10 16:10:51 +00:00
|
|
|
addingExistingLink file key $ do
|
|
|
|
Database.Keys.addAssociatedFile key =<< inRepo (toTopFilePath file)
|
2021-01-04 17:12:28 +00:00
|
|
|
next $ addFile Large (checkGitIgnoreOption o) file
|
2010-11-02 23:04:24 +00:00
|
|
|
|
2020-09-18 17:12:04 +00:00
|
|
|
perform :: AddOptions -> RawFilePath -> AddUnlockedMatcher -> CommandPerform
|
|
|
|
perform o file addunlockedmatcher = withOtherTmp $ \tmpdir -> do
|
2019-12-20 19:01:34 +00:00
|
|
|
lockingfile <- not <$> addUnlocked addunlockedmatcher
|
2021-03-01 20:34:40 +00:00
|
|
|
(MatchingFile (FileInfo file file Nothing))
|
2021-01-25 17:55:01 +00:00
|
|
|
True
|
2016-01-07 21:39:59 +00:00
|
|
|
let cfg = LockDownConfig
|
|
|
|
{ lockingFile = lockingfile
|
2019-05-07 17:04:39 +00:00
|
|
|
, hardlinkFileTmpDir = Just tmpdir
|
2021-09-02 17:45:21 +00:00
|
|
|
, checkWritePerms = True
|
2016-01-07 21:39:59 +00:00
|
|
|
}
|
2019-12-04 17:15:34 +00:00
|
|
|
ld <- lockDown cfg (fromRawFilePath file)
|
2019-06-25 17:12:47 +00:00
|
|
|
let sizer = keySource <$> ld
|
bwlimit
Added annex.bwlimit and remote.name.annex-bwlimit config that works for git
remotes and many but not all special remotes.
This nearly works, at least for a git remote on the same disk. With it set
to 100kb/1s, the meter displays an actual bandwidth of 128 kb/s, with
occasional spikes to 160 kb/s. So it needs to delay just a bit longer...
I'm unsure why.
However, at the beginning a lot of data flows before it determines the
right bandwidth limit. A granularity of less than 1s would probably improve
that.
And, I don't know yet if it makes sense to have it be 100kb/1s rather than
100kb/s. Is there a situation where the user would want a larger
granularity? Does granularity need to be configurable at all? I only used that
format for the config really in order to reuse an existing parser.
This can't be supported for external special remotes, or for ones that
themselves shell out to an external command. (Well, it could, but it
would involve pausing and resuming the child process tree, which seems
very hard to implement and very strange besides.) There could also be some
built-in special remotes that it still doesn't work for, due to them not
having a progress meter whose display blocks the bandwidth-using thread.
But I don't think there are actually any that run downloads in a separate
thread from the one that displays the progress meter.
Sponsored-by: Graham Spencer on Patreon
2021-09-21 20:58:02 +00:00
|
|
|
v <- metered Nothing sizer Nothing $ \_meter meterupdate ->
|
2020-09-18 17:12:04 +00:00
|
|
|
ingestAdd (checkGitIgnoreOption o) meterupdate ld
|
2019-06-25 17:12:47 +00:00
|
|
|
finish v
|
2013-09-25 20:07:11 +00:00
|
|
|
where
|
2016-02-16 18:43:43 +00:00
|
|
|
finish (Just key) = next $ cleanup key True
|
|
|
|
finish Nothing = stop
|
2012-06-06 00:28:34 +00:00
|
|
|
|
2016-02-16 18:43:43 +00:00
|
|
|
cleanup :: Key -> Bool -> CommandCleanup
|
|
|
|
cleanup key hascontent = do
|
2019-01-14 17:03:35 +00:00
|
|
|
maybeShowJSON $ JSONChunk [("key", serializeKey key)]
|
2014-01-05 18:09:57 +00:00
|
|
|
when hascontent $
|
|
|
|
logStatus key InfoPresent
|
2013-02-05 17:41:48 +00:00
|
|
|
return True
|