Commit graph

2652 commits

Author SHA1 Message Date
Ilya_Shlyakhter
13794595d6 Added a comment 2019-06-26 20:15:19 +00:00
Joey Hess
732f03b202
comment 2019-06-26 11:58:53 -04:00
Joey Hess
3bb8b62699
comments 2019-06-26 11:23:41 -04:00
Joey Hess
42c386fc47
add: Display progress meter when hashing files.
* add: Display progress meter when hashing files.
* add: Support --json-progress option.
2019-06-25 13:12:47 -04:00
Joey Hess
191bdaafc5
comment 2019-06-25 11:08:45 -04:00
Joey Hess
9d36c826c0
use fine-grained WorkerStages when transferring and verifying
This means that Command.Move and Command.Get don't need to
manually set the stage, and is a lot cleaner conceptually.

Also, this makes Command.Sync.syncFile use the worker pool better.
In the scenario where it first downloads content and then uploads it to
some other remotes, it will start in TransferStage, then enter VerifyStage
and then go back to TransferStage for each transfer to the remotes.
Before, it entered CleanupStage after the download, and stayed in it for
the upload, so too many transfer jobs could run at the same time.

Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent
inside onLocal. That has an Annex state for the remote, with no worker pool.
So the resulting calls to enteringStage won't block in there.

While Remote.Git.copyToRemote does do checksum verification, I
realized that should not use a verification slot in the WorkerPool
to do it, because it reads back from, e.g., a removable disk to checksum.
That will contend with other writes to that disk. It's best to treat
that checksum verification as just part of the transfer. So, removed the todo
item about that, as there's nothing needing to be done.
2019-06-19 13:24:20 -04:00
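The stage handling described in the commit above can be illustrated with a small sketch. It is not git-annex's Haskell implementation: `StagedPool`, `entering_stage`, `sync_file` and `run_demo` are hypothetical names, and the model assumes each stage simply has its own fixed number of slots. It shows a sync-like job going through a transfer stage, a verify stage, and then a fresh transfer stage per upload, so the number of simultaneous transfers stays bounded.

```python
import threading
from contextlib import contextmanager


class StagedPool:
    """Toy worker pool with a separate slot limit per stage (hypothetical)."""

    def __init__(self, slots_per_stage, stages=("transfer", "verify")):
        self._slots = {s: threading.Semaphore(slots_per_stage) for s in stages}
        self._lock = threading.Lock()
        self.active = {s: 0 for s in stages}  # jobs currently in each stage
        self.peak = {s: 0 for s in stages}    # highest occupancy seen so far

    @contextmanager
    def entering_stage(self, stage):
        # Block until the stage has a free slot, then record occupancy.
        self._slots[stage].acquire()
        with self._lock:
            self.active[stage] += 1
            self.peak[stage] = max(self.peak[stage], self.active[stage])
        try:
            yield
        finally:
            with self._lock:
                self.active[stage] -= 1
            self._slots[stage].release()


def sync_file(pool, log, name, remotes):
    # Download, verify, then re-enter the transfer stage for every upload.
    with pool.entering_stage("transfer"):
        log.append((name, "download"))
    with pool.entering_stage("verify"):
        log.append((name, "verify"))
    for remote in remotes:
        with pool.entering_stage("transfer"):
            log.append((name, "upload:" + remote))


def run_demo(n_files=6, slots=2):
    pool = StagedPool(slots)
    log = []
    threads = [
        threading.Thread(target=sync_file, args=(pool, log, "f%d" % i, ["a", "b"]))
        for i in range(n_files)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool, log
```

Calling `run_demo()` returns the pool and the event log; `pool.peak["transfer"]` never exceeds the slot count, which is the property the commit message says the earlier CleanupStage approach lost during uploads.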
Joey Hess
e19408ed9d
Merge branch 'master' of ssh://git-annex.branchable.com 2019-06-17 15:26:57 -04:00
Joey Hess
04cc470201
run download checksum verification in separate job pool
get, move, copy, sync: When -J or annex.jobs has enabled concurrency,
checksum verification uses a separate job pool from the one used for
downloads, to keep bandwidth saturated.

Not yet done for upload checksum verification, but that only affects
remotes on local disks.
2019-06-17 14:58:02 -04:00
Joey Hess
1a8d06d251
thought 2019-06-17 11:50:18 -04:00
jsag@f84637fe752e0235291a118b1cd007bafad0997e
ae9f2d5e6a 2019-06-17 12:43:17 +00:00
Joey Hess
e589a9b3fc
moving this to a bug 2019-06-12 15:00:14 -04:00
Joey Hess
8e5ea28c26
finish CommandStart transition
The hoped for optimisation of CommandStart with -J did not materialize.
In fact, not running CommandStart in parallel is slower than -J3.
So, CommandStart are still run in parallel.

(The actual bad performance I've been seeing with -J in my big repo
has to do with building the remoteList.)

But, this is still progress toward making -J faster, because it gets rid
of the onlyActionOn roadblock in the way of making CommandCleanup jobs
run separate from CommandPerform jobs.

Added OnlyActionOn constructor for ActionItem which fixes the
onlyActionOn breakage in the last commit.

Made CustomOutput include an ActionItem, so even things using it can
specify OnlyActionOn.

In Command.Move and Command.Sync, there were CommandStarts that used
includeCommandAction, and so output messages, which is no longer allowed.
Fixed by using startingCustomOutput, but that's still not quite right,
since it prevents message display for the includeCommandAction run
inside it too.
2019-06-12 13:24:01 -04:00
Joey Hess
3893d84764
todo 2019-06-06 12:02:27 -04:00
Joey Hess
3eac4e01a4
idea 2019-06-05 19:43:01 -04:00
Joey Hess
659640e224
separate queue for cleanup actions
When running multiple concurrent actions, the cleanup phase is run in a
separate queue from the main action queue. This can make some commands
faster, because less time is spent on bookkeeping in between each file
transfer.

But as far as I can see, nothing will be sped up much by this yet, because
all the existing cleanup actions are very light-weight. This is just groundwork
for deferring checksum verification to cleanup time.

This change does mean that if the user expects -J2 will mean that they see no
more than 2 jobs running at a time, they may be surprised to see 4 in some
cases (if the cleanup actions are slow enough to notice).

It might also make sense to enable background cleanup without the -J,
for at least one cleanup action. Indeed, that's the behavior that -J1
has now. At some point in the future, it may make sense to make the
behavior with no -J the same as -J1. The only reason it's not currently
is that git-annex can build w/o concurrent-output, and also any bugs
in concurrent-output (such as perhaps misbehaving on non-VT100 compatible
terminals) are avoided by default by only using it when -J is used.
2019-06-05 17:54:35 -04:00
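A minimal sketch of the two-queue arrangement described in the commit above, under stated assumptions: `run_with_cleanup_queue`, `perform` and `cleanup` are hypothetical names, and the model starts `jobs` workers for each queue, which is why a setting of 2 can show up to 4 jobs running at once. It is only an illustration of the idea, not git-annex's code.

```python
import queue
import threading


def run_with_cleanup_queue(items, perform, cleanup, jobs=2):
    """Run perform() and cleanup() in separate worker pools (illustrative)."""
    actions = queue.Queue()
    cleanups = queue.Queue()
    done = []  # list.append is atomic in CPython, so no extra lock is used

    def action_worker():
        while True:
            item = actions.get()
            if item is None:  # sentinel: no more actions
                break
            # Hand the result over and move on to the next action at once.
            cleanups.put(perform(item))

    def cleanup_worker():
        while True:
            result = cleanups.get()
            if result is None:  # sentinel: no more cleanups
                break
            done.append(cleanup(result))

    action_threads = [threading.Thread(target=action_worker) for _ in range(jobs)]
    cleanup_threads = [threading.Thread(target=cleanup_worker) for _ in range(jobs)]
    for t in action_threads + cleanup_threads:
        t.start()

    for item in items:
        actions.put(item)
    for _ in action_threads:
        actions.put(None)
    for t in action_threads:
        t.join()

    # All actions have finished, so every cleanup has been queued by now.
    for _ in cleanup_threads:
        cleanups.put(None)
    for t in cleanup_threads:
        t.join()
    return done
```

With light-weight cleanup functions the gain is small, which matches the commit's remark that this is groundwork for moving heavier work such as checksum verification into the cleanup phase.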
Joey Hess
7dcc815c29
more thoughts 2019-06-04 14:38:55 -04:00
Joey Hess
cd20dc4158
thoughts 2019-06-04 14:13:15 -04:00
Joey Hess
cd5e8be2dc
comment 2019-05-23 13:36:59 -04:00
Joey Hess
e06feb7316
honor preferred content when importing
Importing from a special remote honors its preferred content too; unwanted
files are not imported. But, some preferred content expressions can't be
checked before files are imported, and trying to import with such an
expression will fail.

Tested this with scenarios including changing the preferred content
expression and making sure merging the import didn't delete files that were
no longer wanted.

There was one minor inefficiency mentioned in the todo that I punted on.
2019-05-21 14:38:06 -04:00
Joey Hess
ec11575d17
hairiness 2019-05-21 12:54:57 -04:00
Joey Hess
3b9a19171a
Merge branch 'master' into preferred 2019-05-21 11:34:45 -04:00
Joey Hess
5e1221ad53
Improve shape of commit tree when importing from unversioned special remotes
Make the import have the previous import as a parent, so eg `git log --stat`
displays a useful diff.

Also a minor optimisation, only calculate the depth of the imported history
once.
2019-05-21 11:32:54 -04:00
Joey Hess
5af9e7f3d0
break out a todo 2019-05-21 11:10:13 -04:00
Joey Hess
97fd9da6e7
add back non-preferred files to imported tree
Prevents merging the import from deleting the non-preferred files from
the branch it's merged into.

adjustTree previously appended the new list of items to the old, which
could result in it generating a tree with multiple files with the same
name. That is not good and confuses some parts of git. Gave it a
function to resolve such conflicts.

That allowed dealing with the problem of what happens when the import
contains some files (or subtrees) with the same name as files that were
filtered out of the export. The files from the import win.
2019-05-20 16:43:52 -04:00
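The conflict handling described in the commit above can be sketched as follows. This is a toy model, not the real adjustTree: trees are represented as lists of `(name, content)` pairs, and `merge_tree_items` and `keep_imported` are hypothetical names. It shows why plain appending yields duplicate names and how a resolver function lets the imported file win.

```python
def merge_tree_items(imported, added_back, resolve):
    """Combine two item lists so that each name appears exactly once."""
    merged = {}
    for name, content in imported:
        merged[name] = content
    for name, content in added_back:
        if name in merged:
            # Same name on both sides: let the resolver pick the survivor.
            merged[name] = resolve(merged[name], content)
        else:
            merged[name] = content
    return sorted(merged.items())


def keep_imported(imported_content, added_back_content):
    # Policy from the commit message: the files from the import win.
    return imported_content
```

For example, merging imported items `a` and `b` with added-back items `b` and `c` gives a tree with exactly one `b`, carrying the imported content.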
Joey Hess
7d177b78e4
docs for export preferred content
This includes a note about how include= and exclude= match when exporting
a subtree. I don't know if the note is prominent enough, but the
behavior seems unsurprising enough.
2019-05-20 12:06:02 -04:00
Joey Hess
12451ea010
Merge branch 'master' into preferred 2019-05-20 10:00:03 -04:00
Joey Hess
8958556fe3
thought 2019-05-16 20:41:17 -04:00
Joey Hess
24c8b1b15a
update 2019-05-14 15:25:09 -04:00
Joey Hess
9411a7c93c
matching preferred content before key is known
This will let import try to match preferred content expressions before
downloading the content and generating its key.

If an expression needs a key, preferredContentParser with
preferredContentKeylessTokens will fail to parse it.

standard and groupwanted are not in preferredContentKeylessTokens
because they may refer to an expression that refers to a key.
That needs further work to support them.
2019-05-14 14:28:23 -04:00
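The idea of rejecting key-dependent expressions at parse time can be sketched like this. The sketch is not preferredContentParser: `parse_keyless` is a hypothetical helper, the tokenizer is deliberately naive, and the `KEYLESS` set is an assumption chosen for illustration (terms that match on file names only), not the actual contents of preferredContentKeylessTokens.

```python
# Assumption for illustration: terms that can be checked from the file name
# alone, before any content has been downloaded and hashed into a key.
KEYLESS = {"include", "exclude"}
OPERATORS = {"and", "or", "not", "(", ")"}


def parse_keyless(expression):
    """Tokenize an expression, failing if any term would need a key."""
    tokens = expression.replace("(", " ( ").replace(")", " ) ").split()
    for token in tokens:
        if token in OPERATORS:
            continue
        name = token.split("=", 1)[0]
        if name not in KEYLESS:
            # Terms such as copies=, standard or groupwanted land here.
            raise ValueError("%r cannot be matched before the key is known" % name)
    return tokens
```

An expression such as `include=*.jpg and not exclude=tmp/*` is accepted, while `standard` or `copies=2` raises an error, mirroring the parse failure the commit message describes.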
Joey Hess
a3e24ed533
more design work 2019-05-14 11:49:23 -04:00
Joey Hess
c5a61ee808
closing in on final design for this 2019-05-14 10:52:00 -04:00
Ilya_Shlyakhter
0610789285 Added a comment: checksums of remote data 2019-05-13 22:03:38 +00:00
Joey Hess
0c7569bb6f
close 2019-05-10 13:59:39 -04:00
Joey Hess
c77d79d343
close old todo 2019-05-10 13:54:32 -04:00
Joey Hess
ae562ad4d7
update old todo item with what still needs doing
removed old comments that are no longer relevant
2019-05-10 13:52:40 -04:00
Joey Hess
daa0c6c1c6
close old todo 2019-05-10 13:35:55 -04:00
Joey Hess
d32143e7ad
close 2019-05-10 13:34:44 -04:00
Joey Hess
ccfb800fa6
Merge branch 'master' of ssh://git-annex.branchable.com 2019-05-10 13:31:49 -04:00
Joey Hess
82186ca58f
annex.jobs=cpus etc
Added the ability to run one job per CPU (core), by setting annex.jobs=cpus,
or using option --jobs=cpus or -Jcpus.

Built with future expansion in mind, including not defaulting matching on
Concurrency so more constructors can later be added, and using "cpus"
instead of "0".
2019-05-10 13:27:08 -04:00
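A small sketch of how a jobs setting that accepts either a number or the word `cpus` might be interpreted. `parse_jobs` is a hypothetical helper written for illustration; git-annex's own parsing is in Haskell and may differ in its details, including how it treats invalid values.

```python
import os


def parse_jobs(value):
    """Turn a jobs setting such as "4" or "cpus" into a worker count."""
    if value == "cpus":
        # os.cpu_count() may return None, so fall back to a single job.
        return max(1, os.cpu_count() or 1)
    count = int(value)  # raises ValueError for non-numeric input
    if count < 1:
        raise ValueError("job count must be at least 1")
    return count
```

Using a named value rather than overloading `0` leaves room for further named settings later, which is the design choice the commit message points to.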
Ilya_Shlyakhter
e0c73c7f29 Added a comment 2019-05-09 21:07:39 +00:00
Ilya_Shlyakhter
5638ae9688 Added a comment 2019-05-07 00:59:31 +00:00
Joey Hess
b03e65d260
Improved locking when multiple git-annex processes are writing to the .git/index file 2019-05-06 15:15:12 -04:00
Joey Hess
4bc99e4c21
add todo 2019-05-06 14:58:59 -04:00
Joey Hess
6845c1e020
comment 2019-05-06 12:16:19 -04:00
Ilya_Shlyakhter
437fa438e3 Added a comment 2019-05-03 16:31:18 +00:00
Ilya_Shlyakhter
6535d0c1b2 Added a comment 2019-05-03 16:26:40 +00:00
Ilya_Shlyakhter
03a20b225a Added a comment 2019-05-03 16:11:06 +00:00
Joey Hess
40c749387f
comment 2019-05-03 11:53:03 -04:00
Ilya_Shlyakhter
64bcaff016 added todo for speculate-can-get : extension of speculate-present 2019-05-03 15:34:41 +00:00
Joey Hess
700a3f2787
Merge branch 'master' into import-from-s3 2019-05-01 14:30:52 -04:00
Joey Hess
a405ae015d
remove simple fast-forward todo
I think the history looks nice enough without that special case.
2019-05-01 14:29:52 -04:00
Joey Hess
a32f31235a
reuse old imported commits
This avoids proliferation of different import commits for the same
trees, and makes the resulting git history nice.
2019-05-01 14:20:26 -04:00
Joey Hess
83a420dd66
update todo 2019-04-30 16:31:46 -04:00
Joey Hess
1503b86a14
make import tree from remote generate a merge commit
This way no history is lost, neither what was exported to the remote,
nor the history of changes that is imported from it. No complicated
correlation of two possibly very different histories is needed, just
record what we know and then git merge will do a good job.

Also, it notices when the remote tracking branch doesn't need to be updated,
and avoids doing anything, so noop remotes are super cheap.

The only catch here is that, since the commits generated for imports
from the remote don't have a stable date or author/committer, each
(non-noop) import generates different commits for the same imported
trees. So, when the imported remote tracking branch is merged into master
and then a change is imported again, there will be an extra series of
commits, which will get more and more expensive each time.

This seems to call for making stable commits for imports. Also that
seems a good idea to make importing in several repositories have the
same result.
2019-04-30 16:13:21 -04:00
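Why an unstable date or author produces a different commit for the same tree can be shown by computing a commit object ID by hand. The sketch below follows git's object format (a `commit <size>` header, a NUL byte, then the commit text, hashed with SHA-1); `commit_id` is a hypothetical helper, and the identity and timestamps used with it are made-up illustration values.

```python
import hashlib

# ID of git's well-known empty tree, used here only as a sample tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, ident, timestamp, message):
    """Compute the SHA-1 object ID git would assign to this commit."""
    stamp = "%s %d +0000" % (ident, timestamp)
    lines = ["tree " + tree]
    lines += ["parent " + parent for parent in parents]
    lines += ["author " + stamp, "committer " + stamp, "", message]
    body = ("\n".join(lines) + "\n").encode("utf-8")
    header = ("commit %d\0" % len(body)).encode("ascii")
    return hashlib.sha1(header + body).hexdigest()
```

Two calls with the same tree, parents, identity, timestamp and message return the same ID, while changing only the timestamp returns a different one. Fixing those fields for import commits is therefore enough to make repeated imports of the same tree converge on one commit.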
Joey Hess
cd5e685fd1
comment 2019-04-26 10:18:55 -04:00
Joey Hess
3e85707ccf
Merge branch 'master' of ssh://git-annex.branchable.com 2019-04-26 10:17:15 -04:00
yarikoptic
b71a1b5483 wishlist for add --json-progress 2019-04-25 16:33:19 +00:00
Joey Hess
2a6824bf9d
close 2019-04-25 10:49:55 -04:00
Joey Hess
ca385a09c1
rename problem 2019-04-24 15:52:05 -04:00
Joey Hess
5b09b016fe
update 2019-04-24 15:22:02 -04:00
Ilya_Shlyakhter
ae04ab3b91 re: backend variants that compute checksum of chunk checksums 2019-04-24 17:40:13 +00:00
Joey Hess
2d0dd34916
initial work toward correctly merging deeper import histories
Pure code is tested working, even with histories that merge
several lines of development. Needs to be hooked up to git histories
next.
2019-04-23 16:34:19 -04:00
Joey Hess
48d30d8753
Merge branch 'master' into import-from-s3 2019-04-23 15:34:26 -04:00
Joey Hess
c3f5e7863c
some more todos 2019-04-23 15:34:11 -04:00
Joey Hess
8d01b00507
update status 2019-04-23 14:50:33 -04:00
Joey Hess
a42e7a012a
refuse unsafe store to unversioned exporttree with old aws version
I've developed a patch to aws, once it gets merged, the real version
number of aws can be filled in.
2019-04-23 14:39:30 -04:00
Joey Hess
ae21c88640
tested S3 import/export with versioned bucket
rename and delete working
2019-04-23 13:43:41 -04:00
Joey Hess
0c878899ea
update status 2019-04-23 13:21:38 -04:00
Ilya_Shlyakhter
4c79f2b4ac added suggestion to use git-replace to better implement git-annex-migrate 2019-04-22 01:26:55 +00:00
Joey Hess
2f79cb4b45
versioned import from S3 is working
Still some bugs and two stubbed methods to implement though.
2019-04-19 15:13:49 -04:00
Joey Hess
55a5d9679a
implemented mkImportableContentsVersioned 2019-04-19 13:39:33 -04:00
Joey Hess
1968f6d9c6
designing S3 GetBucketObjectVersions to ImportableContents algo
I think I have a good algo now, at least poorly explained in English..
2019-04-18 16:25:04 -04:00
Joey Hess
2f740d14da
hmm 2019-04-16 13:18:59 -04:00
Joey Hess
a474304f1d
Merge branch 'master' of ssh://git-annex.branchable.com 2019-04-15 13:49:16 -04:00
Joey Hess
c0c38e986d
added renameremote command 2019-04-15 13:49:03 -04:00
Joey Hess
de7a510da1
update 2019-04-15 13:00:46 -04:00
Joey Hess
00b1943927
close 2019-04-15 12:59:39 -04:00
Joey Hess
72b01b0faf
todo 2019-04-15 12:55:56 -04:00
Joey Hess
40fe5e8927
todo 2019-04-12 11:49:38 -04:00
Ilya_Shlyakhter
9a7cef06e3 added suggestion for git-annex-get --batch --key 2019-04-11 23:41:17 +00:00
Joey Hess
d3d6a45918
thoughts 2019-04-10 12:01:52 -04:00
Joey Hess
7b6d0da9b8
adb import
As well as adding the necessary methods, a few other changes to the adb
remote:

* Use ".annextmp" extension for temp files, to avoid conflict with other
  temp files.
* Stop using "echo $?" to get exit status of command inside adb.
  There were two problems; first the "echo" just before it meant it was
  always 0! And secondly, it seems kind of random on my phone whether it's
  1 or 0, not dependent on whether the command seems to have succeeded.
2019-04-09 17:52:41 -04:00
Joey Hess
7bf18f23e5
todo 2019-04-09 14:07:47 -04:00
Joey Hess
0a14dfd383
comment 2019-04-09 11:08:18 -04:00
Joey Hess
4af55c1f30
Merge branch 'master' of ssh://git-annex.branchable.com 2019-04-05 11:41:46 -04:00
yarikoptic
7db8eaf512 initial question about possible "globus" special remote 2019-04-05 02:25:12 +00:00
Joey Hess
1f3245ddf5
close as basis of this is wrong 2019-04-04 12:50:55 -04:00
Joey Hess
727ac0451a
comment 2019-04-03 13:14:54 -04:00
Joey Hess
bc302b56ae
test patch 2019-03-28 16:16:28 -04:00
Joey Hess
c68ae14268
further thought 2019-03-28 15:46:14 -04:00
Joey Hess
b09c6e3016
todo item based on behavior yoh showed me 2019-03-28 14:04:20 -04:00
Joey Hess
e035bc5324
minor typos 2019-03-27 11:15:20 -04:00
Ilya_Shlyakhter
1c334f74d6 Added a comment 2019-03-26 18:27:04 +00:00
Ilya_Shlyakhter
9b4e06d8c2 fixed a typo 2019-03-26 17:51:19 +00:00
Ilya_Shlyakhter
38669f0817 Added a comment: simplifying the interface 2019-03-26 17:40:33 +00:00
Ilya_Shlyakhter
72b788dfaf re: documenting git-annex dependencies 2019-03-24 18:47:28 +00:00
Joey Hess
9ada4b38c1
comment 2019-03-22 10:30:22 -04:00
Joey Hess
59c8119b2a
comment 2019-03-22 10:18:07 -04:00
Joey Hess
5fea7efee7
comment and todo 2019-03-22 09:23:31 -04:00
Ilya_Shlyakhter
438ff50013 Added a comment 2019-03-19 20:40:50 +00:00