git-annex/Git
Joey Hess 6a3bd283b8
add restage log
When pointer files need to be restaged, they're first written to the
log, and then when the restage operation runs, it reads the log. This
way, if the git-annex process is interrupted before it can do the
restaging, a later git-annex process can do it.
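
The write-then-read pattern described above might be sketched as follows. This is a minimal illustration, not git-annex's code: the names logRestage and runRestage, the one-path-per-line log format, and the absence of locking are all assumptions made for the example. It also assumes that restaging the same file twice is harmless, since an interruption between the restage action and the log removal leaves the entries in place for the next run.

```haskell
import Control.Exception (evaluate)
import Control.Monad (unless, when)
import System.Directory (doesFileExist, removeFile)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Hypothetical log format: one pointer file path per line.
-- (Paths containing newlines are not handled in this sketch.)

-- | Record that a pointer file needs restaging. This happens before any
-- restaging is attempted, so the record survives an interrupted process.
logRestage :: FilePath -> FilePath -> IO ()
logRestage logFile pointerFile = appendFile logFile (pointerFile ++ "\n")

-- | Read a whole file strictly, so the handle is closed before the
-- caller removes the file.
readFileStrict :: FilePath -> IO String
readFileStrict f = withFile f ReadMode $ \h -> do
    s <- hGetContents h
    _ <- evaluate (length s)
    return s

-- | Run the restage action on every logged file, then clear the log.
-- If the process dies before the log is removed, the entries are still
-- on disk and a later call picks them up.
runRestage :: FilePath -> ([FilePath] -> IO ()) -> IO ()
runRestage logFile restage = do
    exists <- doesFileExist logFile
    when exists $ do
        files <- lines <$> readFileStrict logFile
        unless (null files) $ restage files
        removeFile logFile
```

A caller would invoke logRestage for each pointer file as it is modified, and runRestage once the queued work is flushed. A later process that starts with a non-empty log simply runs the same runRestage.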

Currently, this lets a git-annex get/drop command be interrupted and
then re-run, and as long as it gets/drops additional files, it will
clean up after the interrupted command. But more changes are
needed to make it easier to restage after an interrupted process.

Kept using the git queue to run the restage action, even though the
list of files that it builds up for that action is not actually used by
the action. This could perhaps be simplified to make restaging a cleanup
action that gets registered, rather than using the git queue for it. But
I wasn't sure if that would cause visible behavior changes. For example,
when dropping a large number of files, the git queue currently flushes
periodically, so it restages incrementally rather than all at the
end.

restagePointerFiles reads the restage log twice, once to get
the number of files and their size, and a second time to process it.
This seemed better than reading the whole file into memory, since
potentially a huge number of files could be in there. Probably the OS
will cache the file in memory and there will not be much performance
impact. It might be better to keep running tallies in another file,
but updating that atomically with the log seems hard.
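
The two-pass approach can be illustrated with a small constant-memory fold over a log file. This is only a sketch under stated assumptions: the "<size> <path>" line format and the names parseLine, foldLog, tallyLog and streamLog are invented for the example and are not the real calcRestageLog or streamRestageLog.

```haskell
import Data.Char (isDigit)
import System.IO (IOMode (ReadMode), hGetLine, hIsEOF, withFile)

-- Hypothetical log line format: "<size in bytes> <path>".
parseLine :: String -> Maybe (Integer, FilePath)
parseLine l = case break (== ' ') l of
    (sz, ' ' : path)
        | not (null sz) && all isDigit sz -> Just (read sz, path)
    _ -> Nothing

-- | Fold over the log one line at a time, so memory use does not grow
-- with the number of entries. Unparseable lines are skipped.
foldLog :: FilePath -> a -> (a -> (Integer, FilePath) -> IO a) -> IO a
foldLog logFile z f = withFile logFile ReadMode (go z)
  where
    go acc h = do
        eof <- hIsEOF h
        if eof
            then return acc
            else do
                l <- hGetLine h
                acc' <- maybe (return acc) (f acc) (parseLine l)
                acc' `seq` go acc' h

-- | First pass: number of entries and their total size.
tallyLog :: FilePath -> IO (Int, Integer)
tallyLog logFile = foldLog logFile (0, 0) step
  where
    step (n, total) (sz, _) =
        let n' = n + 1
            total' = total + sz
        in n' `seq` total' `seq` return (n', total')

-- | Second pass: run an action on each logged path.
streamLog :: FilePath -> (FilePath -> IO ()) -> IO ()
streamLog logFile act = foldLog logFile () $ \() (_, path) -> act path
```

Because each pass opens the file separately, entries appended between the two passes are seen by streamLog but not counted by tallyLog.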

Also note that it's possible for calcRestageLog to see different log
contents than streamRestageLog does, since more files may be added to
the log in between. That is ok; it will only cause the filterprocessfaster
heuristic to operate with slightly out of date information, so it may make
the wrong choice for the files that got added and be a little slower than ideal.

Sponsored-by: Dartmouth College's DANDI project
2022-09-23 15:47:24 -04:00
..
Command update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
Remote Removed support for git versions older than 2.1 2019-09-11 16:14:43 -04:00
AutoCorrect.hs all commands building except for assistant 2019-12-05 14:41:18 -04:00
Branch.hs sync --quiet 2021-07-19 11:28:47 -04:00
BuildVersion.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
CatFile.hs separate handles for cat-file and cat-file --batch-check 2021-09-24 13:16:13 -04:00
CheckAttr.hs mincopies 2021-01-06 14:15:19 -04:00
CheckIgnore.hs more RawFilePath conversion 2020-11-03 10:11:04 -04:00
Command.hs convert some error to giveup 2021-12-09 14:36:54 -04:00
Config.hs skip checkRepoConfigInaccessible when git directory specified explicitly 2022-09-20 14:52:43 -04:00
ConfigTypes.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
Construct.hs skip checkRepoConfigInaccessible when git directory specified explicitly 2022-09-20 14:52:43 -04:00
Credential.hs cache credentials in memory when doing http basic auth to a git remote 2022-09-09 14:20:32 -04:00
CurrentRepo.hs skip checkRepoConfigInaccessible when git directory specified explicitly 2022-09-20 14:52:43 -04:00
DiffTree.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
DiffTreeItem.hs ByteString Ref continued 2020-04-07 11:54:27 -04:00
Env.hs convert TopFilePath to use RawFilePath 2019-12-09 15:07:21 -04:00
FileMode.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
Filename.hs add newtypes for QuickCheck to avoid LANG=C issues 2020-11-09 20:21:18 -04:00
FilePath.hs more RawFilePath conversion 2020-10-29 12:03:50 -04:00
FilterProcess.hs filter-process: Fix protocol for empty files 2022-07-13 17:13:54 -04:00
Fsck.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
GCrypt.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
HashObject.hs more RawFilePath conversion 2020-10-29 12:03:50 -04:00
History.hs convert to withCreateProcess for async exception safety 2020-06-03 15:48:09 -04:00
Hook.hs fix build on windows 2020-11-20 12:53:25 -04:00
Index.hs more RawFilePath conversion 2020-11-05 18:45:37 -04:00
LockFile.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
LsFiles.hs convert some error to giveup 2021-12-09 14:36:54 -04:00
LsTree.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
Merge.hs sync --quiet 2021-07-19 11:28:47 -04:00
Objects.hs more RawFilePath conversion 2020-11-05 18:45:37 -04:00
PktLine.hs update 2021-11-05 10:53:11 -04:00
Queue.hs add restage log 2022-09-23 15:47:24 -04:00
Ref.hs remove errant print debug 2021-10-03 18:18:04 -04:00
RefLog.hs ByteString Ref continued 2020-04-07 11:54:27 -04:00
Remote.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
Repair.hs improve createDirectoryUnder to allow alternate top directories 2022-08-12 12:52:37 -04:00
Sha.hs started converting Ref from String to ByteString 2020-04-06 17:14:49 -04:00
Ssh.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
Status.hs convert TopFilePath to use RawFilePath 2019-12-09 15:07:21 -04:00
Tree.hs ImportableContentsChunkable 2021-10-08 13:15:22 -04:00
Types.hs skip checkRepoConfigInaccessible when git directory specified explicitly 2022-09-20 14:52:43 -04:00
UnionMerge.hs fix all remaining -Wincomplete-uni-patterns warnings 2020-04-15 13:55:08 -04:00
UpdateIndex.hs add restage log 2022-09-23 15:47:24 -04:00
Url.hs avoid partial functions in Git.Url 2021-01-18 15:07:23 -04:00
Version.hs more RawFilePath conversion 2020-10-29 12:03:50 -04:00