git-annex/Annex
Joey Hess 579d9b60c1
improve concurrency of move/copy --from --to
Use separate stages for download and upload. In the common case where
it downloads the file from one remote and then uploads to the other,
those are by far the most expensive operations, and there's a decent
chance the two remotes bottleneck on different resources.

Suppose it's being run with -J2 and a bunch of 10 MB files. Two threads
will be started both downloading from the src remote. They will probably
finish at the same time. Then two threads will be started uploading to
the dst remote. They will probably take the same time as well. Before
this change, it would alternate back and forth, bottlenecking on src and dst.
With this change, as soon as the two threads start uploading to dst, two
more threads are able to start, downloading from src. So bandwidth to
both remotes is saturated more often.
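
The -J2 scenario above can be sketched as a toy timeline. This is only an idealized model, not git-annex's implementation: it assumes every batch of transfers takes exactly one tick and that both remotes are equally fast. The names `Tick`, `batchesOf`, `singleStage` and `twoStage` are hypothetical and do not appear in the git-annex source.

```haskell
module Main where

-- One tick of the idealized timeline:
-- (files downloading from src, files uploading to dst).
type Tick = ([Int], [Int])

-- Split a list into batches of at most k items (k is assumed positive).
batchesOf :: Int -> [a] -> [[a]]
batchesOf _ [] = []
batchesOf k xs = let (a, b) = splitAt k xs in a : batchesOf k b

-- Hypothetical model of the old behavior: a batch is downloaded, then
-- uploaded, and only then does the next batch start downloading.
singleStage :: Int -> [Int] -> [Tick]
singleStage j files = concat [[(b, []), ([], b)] | b <- batchesOf j files]

-- Hypothetical model of separate download and upload stages: while one
-- batch uploads to dst, the next batch is already downloading from src.
twoStage :: Int -> [Int] -> [Tick]
twoStage _ [] = []
twoStage j files = zip (bs ++ [[]]) ([] : bs)
  where
    bs = batchesOf j files

main :: IO ()
main = do
  let files = [1 .. 6]
  putStrLn "single stage:"
  mapM_ print (singleStage 2 files)
  putStrLn "separate stages:"
  mapM_ print (twoStage 2 files)
```

With six files and two workers per stage, the single-stage model needs six ticks, while the separate-stage model needs four, since src and dst are both busy in every tick except the first and last.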

Other commands that use transferStages only send in one direction at a
time. So the worker threads for the other direction will sit idle, and
there will be no change in their behavior.

Sponsored-by: Dartmouth College's DANDI project
2023-01-24 13:59:39 -04:00
AdjustedBranch Typo fix unncessary -> unnecessary. 2022-08-20 09:40:19 -04:00
Branch handle transitions with read-only unmerged git-annex branches 2021-12-28 13:23:32 -04:00
Concurrent differentiate between concurrency enabled at command line and by git config 2020-09-16 11:47:12 -04:00
Content Typo fix unncessary -> unnecessary. 2022-08-20 09:40:19 -04:00
Debug implement fastDebug 2021-04-06 15:24:28 -04:00
LockPool fine-grained locking when annex.pidlock is enabled 2021-12-03 17:20:21 -04:00
MetaData update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
SpecialRemote added an optional cost= configuration to all special remotes 2023-01-12 13:42:28 -04:00
VectorClock deal better with clock skew situations, using vector clocks 2021-08-04 12:33:46 -04:00
View Fix test suite failure on Windows 2021-08-24 14:03:29 -04:00
Action.hs avoid flushing keys db queue after each Annex action 2022-10-12 14:12:23 -04:00
AdjustedBranch.hs move several readonly values to AnnexRead 2022-06-28 15:40:19 -04:00
AutoMerge.hs support git 2.34.0's handling of merge conflict between annexed and non-annexed file 2021-11-22 16:10:24 -04:00
BloomFilter.hs Revert "data type that starts off using a set but converts to a bloom filter when large" 2020-07-01 20:12:19 -04:00
Branch.hs work around git segfault 2022-08-04 14:20:57 -04:00
BranchState.hs disable journalIgnorable in enableInteractiveBranchAccess 2022-07-15 13:48:41 -04:00
CatFile.hs read a consistent amount from pointer file 2022-02-23 12:52:34 -04:00
ChangedRefs.hs improve createDirectoryUnder to allow alternate top directories 2022-08-12 12:52:37 -04:00
CheckAttr.hs mincopies 2021-01-06 14:15:19 -04:00
CheckIgnore.hs move several readonly values to AnnexRead 2022-06-28 15:40:19 -04:00
Common.hs add annex.dbdir (WIP) 2022-08-11 16:58:53 -04:00
Concurrent.hs use ResourcePool for hash-object handles 2022-07-25 17:32:39 -04:00
Content.hs finishing up move --from --to 2023-01-23 17:43:48 -04:00
CopyFile.hs incremental verification for retrieval from import remotes 2022-05-09 15:39:43 -04:00
CurrentBranch.hs refactor getCurrentBranch 2018-10-19 17:29:18 -04:00
Debug.hs fix fastDebug to check if debugging is actually enabled 2021-04-06 16:28:37 -04:00
Difference.hs include git-annex-shell back in 2019-12-02 11:51:52 -04:00
DirHashes.hs Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https. 2020-09-01 15:16:35 -04:00
Drop.hs prevent numcopies or mincopies being configured to 0 2022-03-28 15:20:34 -04:00
Environment.hs include git-annex-shell back in 2019-12-02 11:51:52 -04:00
Export.hs convert Key to ShortByteString 2021-10-05 20:20:08 -04:00
ExternalAddonProcess.hs use fastDebug everywhere it can be used 2021-04-06 15:41:24 -04:00
FileMatcher.hs Support "inbackend" in preferred content expressions 2022-09-26 16:06:49 -04:00
Fixup.hs fix a bug that prevented git-annex init from working in a submodule 2021-01-21 15:33:15 -04:00
GitOverlay.hs add: Significantly speed up adding lots of non-large files to git 2021-01-04 13:12:28 -04:00
HashObject.hs use ResourcePool for hash-object handles 2022-07-25 17:32:39 -04:00
Hook.hs don't try to remove pre-commit-annex and post-update-annex-hooks 2020-10-19 13:13:49 -04:00
Import.hs all keys are still present on versioned remote after import of a tree 2022-10-11 13:05:40 -04:00
Ingest.hs move several readonly values to AnnexRead 2022-06-28 15:40:19 -04:00
Init.hs don't frontload reconcileStaged in git-annex init 2022-11-18 13:58:47 -04:00
InodeSentinal.hs add debugging in sameInodeCache 2021-07-26 10:58:07 -04:00
Journal.hs add annex.dbdir (WIP) 2022-08-11 16:58:53 -04:00
Link.hs refector for legibility 2022-09-23 18:53:06 -04:00
Locations.hs fix deadlock in restagePointerFiles 2022-12-08 14:36:11 -04:00
LockFile.hs add annex.dbdir (WIP) 2022-08-11 16:58:53 -04:00
LockPool.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
Magic.hs Serialize use of C magic library, which is not thread safe. 2020-09-17 17:27:42 -04:00
MetaData.hs switch to readMaybe to handle values with leading number followed by non-number 2022-12-22 14:33:47 -04:00
Multicast.hs use programPath consistently, not readProgramFile 2020-03-30 16:06:27 -04:00
Notification.hs fix build when dbus is enabled 2022-07-05 13:06:45 -04:00
NumCopies.hs move several readonly values to AnnexRead 2022-06-28 15:40:19 -04:00
Path.hs Make git-annex enable-tor work when using the linux standalone build 2022-10-26 15:45:08 -04:00
Perms.hs remove unncessary do block 2022-09-26 13:10:25 -04:00
PidLock.hs fix windows build 2022-09-26 12:08:04 -04:00
Queue.hs add restage log 2022-09-23 15:47:24 -04:00
RemoteTrackingBranch.hs refactor 2019-11-11 19:10:52 -04:00
ReplaceFile.hs improve createDirectoryUnder to allow alternate top directories 2022-08-12 12:52:37 -04:00
SpecialRemote.hs info: Added --autoenable option 2022-06-01 14:20:38 -04:00
Ssh.hs Added annex.adviceNoSshCaching config. 2021-05-27 12:37:49 -04:00
StallDetection.hs bwlimit 2021-09-21 16:58:10 -04:00
TaggedPush.hs Ref ByteString conversion done 2020-04-07 17:41:09 -04:00
Tmp.hs add annex.dbdir (WIP) 2022-08-11 16:58:53 -04:00
Transfer.hs improve concurrency of move/copy --from --to 2023-01-24 13:59:39 -04:00
TransferrerPool.hs fix restaging of transferred files after stalldetection kicks in 2022-09-23 15:55:40 -04:00
UntrustedFilePath.hs importfeed: Fix reversion that caused some '.' in filenames to be replaced with '_' 2020-08-05 11:35:00 -04:00
UpdateInstead.hs v7 for all repositories 2019-08-30 14:09:14 -04:00
Url.hs don't force use of conduit in withUrlOptionsPromptingCreds 2022-09-09 16:07:32 -04:00
UUID.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
VariantFile.hs more RawFilePath 2019-12-18 17:10:28 -04:00
VectorClock.hs deal better with clock skew situations, using vector clocks 2021-08-04 12:33:46 -04:00
Verify.hs incremental verification for retrieval from all export remotes 2022-05-09 13:49:33 -04:00
Version.hs v8 repositories automatically upgrade to v9 2022-07-25 16:20:04 -04:00
View.hs turn of PackageImports in cabal file 2022-02-25 13:16:36 -04:00
Wanted.hs new matching options --want-get-by and --want-drop-by 2022-07-28 13:26:03 -04:00
WorkerPool.hs start splitting out readonly values from AnnexState 2021-04-02 15:51:44 -04:00
WorkTree.hs use lookupKeyStaged in --batch code paths 2022-10-26 14:43:06 -04:00
YoutubeDl.hs convert renameFile to moveFile to support cross-device moves 2022-12-20 15:17:50 -04:00