git-annex/Command
Joey Hess f617988a29
Make import --deduplicate and --skip-duplicates only hash once, not twice
import: --deduplicate and --skip-duplicates were implemented inefficiently;
they unnecessarily hashed each file twice. They have been improved to only
hash once.

The new approach is to lock down (minimally) and hash files, and then
reuse that information when importing them.

This was rather tricky, especially in detecting changes to files while
they are being imported.
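
The hash-once approach described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in Import.hs: the names LockedDown, lockDownAndHash, unchangedSince, decide and importOnce are hypothetical, FNV-1a stands in for the real key backend's hash, and "lock down" is reduced to recording size and modification time (a real implementation would likely record more, and mtime granularity can hide very fast edits).

```haskell
import qualified Data.ByteString as B
import Data.Bits (xor)
import Data.Time.Clock (UTCTime)
import Data.Word (Word64)
import System.Directory (getFileSize, getModificationTime)

-- Hypothetical record of what is remembered about a file after the
-- single hashing pass.
data LockedDown = LockedDown
  { ldPath  :: FilePath
  , ldSize  :: Integer
  , ldMTime :: UTCTime
  , ldHash  :: Word64
  } deriving (Eq, Show)

-- Stand-in hash (64-bit FNV-1a). Assumption: any content hash would do here.
fnv1a :: B.ByteString -> Word64
fnv1a = B.foldl' step 14695981039346656037
  where
    step h b = (h `xor` fromIntegral b) * 1099511628211

-- Record size and mtime first, then hash the content exactly once.
lockDownAndHash :: FilePath -> IO LockedDown
lockDownAndHash f = do
  size <- getFileSize f
  mtime <- getModificationTime f
  content <- B.readFile f
  return (LockedDown f size mtime (fnv1a content))

-- Cheap check that the file still looks the way it did when it was hashed.
unchangedSince :: LockedDown -> IO Bool
unchangedSince ld = do
  size <- getFileSize (ldPath ld)
  mtime <- getModificationTime (ldPath ld)
  return (size == ldSize ld && mtime == ldMTime ld)

data Decision = ChangedWhileImporting | SkipDuplicate | ImportFile
  deriving (Eq, Show)

-- Pure decision: first argument is "file unchanged?", second is
-- "hash already known?".
decide :: Bool -> Bool -> Decision
decide unchanged isDuplicate
  | not unchanged = ChangedWhileImporting
  | isDuplicate   = SkipDuplicate
  | otherwise     = ImportFile

-- Reuse the stored hash for the duplicate check instead of hashing again.
importOnce :: (Word64 -> Bool) -> LockedDown -> IO Decision
importOnce known ld = do
  unchanged <- unchangedSince ld
  return (decide unchanged (known (ldHash ld)))
```

With this shape, a caller would run lockDownAndHash once per file and pass the result to importOnce together with a lookup of already-known hashes; the second hashing pass is replaced by the cheaper unchangedSince check.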

The output of import changed slightly. While before it silently skipped
over files with e.g. --skip-duplicates, now it shows each file as it starts
to act on it. Since every file is hashed first thing, it would otherwise
not be clear what file import is chewing on. (Actually, it wasn't clear
before when any of the duplicates switches were used.)

This commit was sponsored by Alexander Thompson on Patreon.
2017-02-09 15:32:22 -04:00
Add.hs add: Stage modified non-large files when running in indirect mode. 2016-12-05 14:10:21 -04:00
AddUnused.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
AddUrl.hs Some optimisations to string splitting code. 2017-01-31 19:06:22 -04:00
Adjust.hs adjust: Add --fix adjustment, which is useful when the git directory is in a nonstandard place. 2016-05-16 17:18:33 -04:00
Assistant.hs assistant: Make --autostart --foreground wait for the children it starts. 2017-02-07 13:31:45 -04:00
Benchmark.hs change keys database to use IKey type with more efficient serialization 2016-01-12 14:01:50 -04:00
CalcKey.hs calckey: New plumbing command, calculates the key that would be used to refer to a file 2016-04-20 13:50:26 -04:00
CheckPresentKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Commit.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Config.hs config: New command for storing configuration in the git-annex branch. 2017-01-30 16:46:38 -04:00
ConfigList.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
ContentLocation.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Copy.hs copy, move, mirror: Support --json and --json-progress. 2016-09-09 16:24:26 -04:00
Dead.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Describe.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
DiffDriver.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Direct.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Drop.hs get, move, copy, mirror: Added --failed switch which retries failed copies/moves 2016-08-03 12:37:12 -04:00
DropKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
DropUnused.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
EnableRemote.hs add SetupStage parameter to RemoteType.setup 2017-02-07 14:55:58 -04:00
EnableTor.hs refactor 2016-12-30 12:31:17 -04:00
ExamineKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Expire.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Find.hs Removed dependency on json library; all JSON is now handled by aeson. 2016-07-26 19:15:34 -04:00
FindRef.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Fix.hs fix build warning on windows and android 2016-05-05 15:49:56 -04:00
Forget.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
FromKey.hs Make all --batch input, as well as fromkey and registerurl stdin be processed without requiring it to be in the current encoding. 2016-12-13 15:35:04 -04:00
Fsck.hs fsck --all --from was checking the content of files in the local repository, rather than on the special remote. 2016-11-16 15:33:57 -04:00
FuzzTest.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
GCryptSetup.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Get.hs enable forwardRetry for command-line transfers 2016-10-26 15:38:27 -04:00
Group.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
GroupWanted.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Help.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Import.hs Make import --deduplicate and --skip-duplicates only hash once, not twice 2017-02-09 15:32:22 -04:00
ImportFeed.hs Always use filesystem encoding for all file and handle reads and writes. 2016-12-24 14:46:31 -04:00
InAnnex.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Indirect.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Info.hs info: Support being passed a treeish, and show info about the annexed files in it similar to how a directory is handled. 2016-09-15 12:51:00 -04:00
Init.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
InitRemote.hs initremote: When a uuid= parameter is passed, use the specified UUID for the new special remote, instead of generating a UUID. 2017-02-07 15:10:41 -04:00
List.hs list: Do not include dead repositories. 2016-06-04 14:33:31 -04:00
Lock.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
LockContent.hs git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks. 2016-12-09 13:34:32 -04:00
Log.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
LookupKey.hs Fix reversion in lookupkey, contentlocation, and examinekey which caused them to sometimes output side messages. 2016-01-29 13:20:24 -04:00
Map.hs Some optimisations to string splitting code. 2017-01-31 19:06:22 -04:00
MatchExpression.hs matchexpression: Added --largefiles option to parse an annex.largefiles expression. 2016-02-03 16:58:36 -04:00
Merge.hs refactor 2016-04-22 14:35:48 -04:00
MetaData.hs metadata --batch: Fix bug when conflicting metadata changes were made in the same batch run. 2016-12-13 11:07:49 -04:00
Migrate.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Mirror.hs copy, move, mirror: Support --json and --json-progress. 2016-09-09 16:24:26 -04:00
Move.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
NotifyChanges.hs make tor hidden service work when directory watching is not available 2016-12-09 16:40:47 -04:00
NumCopies.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
P2P.hs wormhole pairing appid flag day 2021-12-31 2017-02-03 15:06:40 -04:00
PreCommit.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Proxy.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
ReadPresentKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
RecvKey.hs get, move, copy, mirror: Added --failed switch which retries failed copies/moves 2016-08-03 12:37:12 -04:00
RegisterUrl.hs Make all --batch input, as well as fromkey and registerurl stdin be processed without requiring it to be in the current encoding. 2016-12-13 15:35:04 -04:00
Reinit.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Reinject.hs avoid too-long command synopsis 2016-11-30 14:16:57 -04:00
ReKey.hs rekey --force: Incorrectly marked the new key's content as being present in the local repo even when it was not. 2016-12-19 18:18:57 -04:00
RemoteDaemon.hs remotedaemon: serve tor hidden service 2016-11-20 15:48:12 -04:00
Repair.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Required.hs started converting to use optparse-applicative 2015-07-08 13:36:25 -04:00
ResolveMerge.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
RmUrl.hs rekey: Added --batch mode. 2016-12-05 12:55:50 -04:00
Schedule.hs Fixed typo in Schedule.hs. 2016-11-24 07:37:33 -04:00
Semitrust.hs convert all commands to work with optparse-applicative 2015-07-08 15:08:02 -04:00
SendKey.hs enable forwardRetry for command-line transfers 2016-10-26 15:38:27 -04:00
SetKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
SetPresentKey.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Smudge.hs Make import --deduplicate and --skip-duplicates only hash once, not twice 2017-02-09 15:32:22 -04:00
Status.hs Removed dependency on json library; all JSON is now handled by aeson. 2016-07-26 19:15:34 -04:00
Sync.hs make sync --no-commit override annex.annex.autocommit 2017-02-03 14:36:14 -04:00
Test.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
TestRemote.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
TransferInfo.hs git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks. 2016-12-09 13:34:32 -04:00
TransferKey.hs remove TransferObserver 2016-08-03 13:46:20 -04:00
TransferKeys.hs Always use filesystem encoding for all file and handle reads and writes. 2016-12-24 14:46:31 -04:00
Trust.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Unannex.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Undo.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Ungroup.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Uninit.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Unlock.hs upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present. 2016-10-17 15:19:47 -04:00
Untrust.hs convert all commands to work with optparse-applicative 2015-07-08 15:08:02 -04:00
Unused.hs unused: When large files are checked right into git, avoid buffering their contents in memory. 2017-01-31 19:09:37 -04:00
Upgrade.hs When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5. 2016-10-05 16:23:09 -04:00
VAdd.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
VCycle.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Version.hs version: Display OS version and architecture too. 2016-05-05 16:06:01 -04:00
VFilter.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Vicfg.hs make git annex config settings editable in vicfg 2017-01-30 17:08:05 -04:00
View.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
VPop.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Wanted.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Watch.hs remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
WebApp.hs Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. 2016-11-15 21:29:54 -04:00
Whereis.hs better locking for json with -J 2016-09-09 15:51:34 -04:00