Commit graph

3217 commits

Author SHA1 Message Date
Joey Hess
9d9f1f85d6
comment 2020-07-27 11:37:12 -04:00
Joey Hess
36d1621c35
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-27 11:34:03 -04:00
Joey Hess
3953c7a0ce
add DEBUG 2020-07-27 11:31:00 -04:00
ghen1
ad391da6b8 2020-07-27 15:24:21 +00:00
Joey Hess
32e1d7bc31
add 2020-07-24 14:11:08 -04:00
Joey Hess
c5ea2e9d12
better benchmark for move/copy speedup 2020-07-24 13:34:12 -04:00
Joey Hess
18f1fb5841
drop performance improvements
Sped up seeking files to drop by 2x, and also some performance
improvements to checking numcopies.

Interestingly, the seek speedup is not due to precaching, but I think is
due to calling getParsed earlier.

Annex.Drop had to be changed to check inAnnex there, since it was removed
from Command.Drop. All other users of Command.Drop already checked inAnnex
themselves.

This commit was sponsored by Ryan Newton on Patreon.
2020-07-24 13:27:46 -04:00
Joey Hess
d732ef1a89
move, copy: Sped up seeking for annexed files to operate on by a factor of nearly 2x. 2020-07-24 12:56:02 -04:00
Joey Hess
4685612f43
small git-annex get speedup
Remove a redundant inAnnex check. checkContentPresent handles that,
and after the last commit it also does so in batch mode.
2020-07-22 14:29:30 -04:00
Joey Hess
1be92381ec
unify batch mode with non-batch by using AnnexedFileSeeker 2020-07-22 14:23:28 -04:00
Ilya_Shlyakhter
59917f8a6d Added a comment: external backend protocol 2020-07-21 17:43:27 +00:00
Joey Hess
abd56fb019
Fix a bug in find --batch in the previous version. 2020-07-20 19:50:53 -04:00
Joey Hess
f71310fed0
comment 2020-07-20 14:19:13 -04:00
Joey Hess
d1300eca2e
draft external backend protocol 2020-07-20 14:05:49 -04:00
Joey Hess
1489fbbdde
bug 2020-07-19 18:26:57 -04:00
yarikoptic
6a05388877 Added a comment 2020-07-18 05:09:54 +00:00
yarikoptic
7ee0bcbee7 Added a comment 2020-07-18 05:09:32 +00:00
yarikoptic
4ab711e153 Added a comment 2020-07-18 04:50:12 +00:00
yarikoptic
a6b0147b7f Added a comment 2020-07-18 04:49:49 +00:00
yarikoptic
d7b4df85e4 Added a comment 2020-07-18 04:34:26 +00:00
yarikoptic
5215fe92b9 Added a comment 2020-07-18 04:34:05 +00:00
yarikoptic
360de9446e Added a comment 2020-07-18 03:57:20 +00:00
yarikoptic
c46b9ac4ae initial 2nd wave of whining about startup time and to consider prelink or alike 2020-07-18 03:54:34 +00:00
yarikoptic
4f152089eb Added a comment: Windows build of file (which includes libmagic) 2020-07-16 22:01:17 +00:00
Joey Hess
a3a8779501
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-16 15:08:51 -04:00
Ilya_Shlyakhter
77299ae6e5 Added a comment: external backends 2020-07-16 17:30:55 +00:00
Joey Hess
5ab3849da3
thought 2020-07-15 20:42:53 -04:00
Joey Hess
034f958b09
comment 2020-07-15 14:02:31 -04:00
Joey Hess
360dc386e7
comment 2020-07-15 10:08:37 -04:00
Joey Hess
1bc015bff4
tag datalad at yoh's req 2020-07-15 09:51:57 -04:00
Joey Hess
e66ba410fc
todo 2020-07-14 21:44:31 -04:00
Joey Hess
f9b4a9f650
update 2020-07-14 14:47:22 -04:00
Joey Hess
7b2d236556
importfeed: stream metadata for 5% speedup
On top of the 10% speedup from streaming url logs.
2020-07-14 14:35:26 -04:00
Joey Hess
535cdc8d48
importfeed: Made checking known urls step around 10% faster.
This was a bit disappointing, I was hoping for a 2x speedup. But, I think
the metadata lookup is wasting a lot of time and also needs to be made to
stream.

The changes to catObjectStreamLsTree were benchmarked to not also speed
up --all around 3% more. Seems I managed to make it polymorphic after all.
2020-07-14 12:47:51 -04:00
Joey Hess
75aab72d23
mostly done with location log precaching
Some nice wins.
2020-07-13 17:04:02 -04:00
Joey Hess
df58609804
convert sync to use seekFilteredKeys
This only speeds up sync --content from 34.75 to 33.17 seconds;
location log precaching will probably be a bigger win.
2020-07-13 15:02:52 -04:00
Joey Hess
c70ae68d7e
update 2020-07-13 11:49:24 -04:00
Joey Hess
415d394222
thought 2020-07-13 11:04:57 -04:00
Joey Hess
a32b6f9812
update 2020-07-10 15:49:03 -04:00
Joey Hess
412b09e17e
update 2020-07-10 15:23:12 -04:00
Joey Hess
2468eefc6d
2x speedup for annex file seeking on the horizon 2020-07-10 14:02:48 -04:00
Joey Hess
1df9e72a78
update 2020-07-10 13:31:47 -04:00
Joey Hess
6b9d1c1317
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-10 13:16:11 -04:00
Joey Hess
6e9fcf468d
streamkeys branch 2020-07-09 14:48:03 -04:00
branchable@bafd175a4b99afd6ed72501042e364ebd3e0c45e
bbc3800369 Added a comment: Update on my auto-commit / auto-sync scripts 2020-07-09 14:23:15 +00:00
Ilya_Shlyakhter
96aad5458b Added a comment: re: git-annex-cat 2020-07-09 01:06:37 +00:00
Ilya_Shlyakhter
75b96059af Added a comment: git-annex-cat 2020-07-09 00:21:02 +00:00
Joey Hess
9f6bd6cc05
add inRepoDetails
planned to use for an optimisation

most things using stagedDetails were not expecting to get dup files in a
conflicted merge and deal with them, so converted them to use
inRepoDetails.
2020-07-08 15:36:35 -04:00
Joey Hess
c1eaf5b930
note 2020-07-08 14:21:37 -04:00
Joey Hess
d08c178f97
avoid catObjectStream skipping over unavailable shas
Not needed as it's used for --all, but will be needed later.
2020-07-08 13:57:17 -04:00
Joey Hess
de3d7d044d
make catObjectStream support newline and carriage return in filenames
Turns out the %(rest) trick was not needed. Instead, just maintain a
list of files we've asked for, and each cat-file response is for the
next file in the list.

This actually benchmarks 25% faster than before! Very surprising, but it
must be due to needing to shove less data through the pipe, and parse
less.
2020-07-08 13:49:03 -04:00
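
The approach described in commit de3d7d044d (keep a list of the files asked for, and treat each cat-file response as belonging to the next file in that list) can be sketched roughly as below. This is an illustrative Python sketch, not git-annex's actual Haskell code; the function name pair_responses and the sample data are hypothetical. It assumes the documented `git cat-file --batch` output format: a header line `<sha> <type> <size>`, then the contents and a trailing newline, or `<object> missing` when the object is unavailable.

```python
import io
from collections import deque


def pair_responses(requested, stream):
    """Match each batch response to the next requested filename.

    `requested` is the ordered list of filenames whose objects were asked
    for; `stream` is a binary file-like object carrying the responses.
    Filenames never pass through the pipe, so they may contain newlines
    or carriage returns without confusing the parser.
    """
    pending = deque(requested)
    while pending:
        filename = pending.popleft()
        header = stream.readline()
        if not header:
            raise EOFError("stream ended before all responses arrived")
        fields = header.rstrip(b"\n").split(b" ")
        if fields[-1] == b"missing":
            # Unavailable object: report it rather than silently skipping,
            # so responses stay aligned with the request list.
            yield filename, None
            continue
        size = int(fields[2])
        content = stream.read(size)
        stream.read(1)  # consume the newline that follows the contents
        yield filename, content


# Simulated responses for three requests, the second one missing.
fake = io.BytesIO(b"aaa blob 5\nhello\nbbb missing\nccc blob 3\na\nb\n")
for name, content in pair_responses(["x\ny", "gone", "z\r"], fake):
    print(repr(name), content)
```

Because only object names are written to the pipe and the filenames stay in the local queue, less data has to be sent and parsed, which fits the speedup the commit message reports.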
Joey Hess
2cf6717aec
thoughts 2020-07-08 10:51:24 -04:00
Joey Hess
5849bd6340
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-07 16:50:26 -04:00
Joey Hess
afd9b2f667
idea 2020-07-07 16:49:44 -04:00
yarikoptic
c9d0bf0e6a reassign to datalad - generic enhancement 2020-07-07 19:05:59 +00:00
Joey Hess
ba0adefe4c
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-07 14:19:46 -04:00
Joey Hess
d010ab04be
sped up the --all option by 2x to 16x by using git cat-file --buffer
This assumes that no location log files will have a newline or carriage
return in their name. catObjectStream skips any such files due to
cat-file not supporting them.

Keys have been prevented from containing newlines since 2011,
commit 480495beb4. If some old repo
had a key with a newline in it, --all will just skip processing that key.
Other things, like .git/annex/unused files, certainly assume no newlines in
keys too, and AFAICR, such keys never actually worked.

Carriage return is escaped by preSanitizeKeyName since 2013. WORM keys
generated before that point could perhaps contain a CR. (URL probably not,
HTTP probably doesn't support a URL with a raw CR in it.) So, added
a warning in fsck about such keys. Although, fsck --all will naturally
skip them, so won't be able to warn about them. Not entirely
satisfactory, but I'll bet there are not really any such keys in
existence.

Thanks to Lukey for finding this optimisation.
2020-07-07 13:54:04 -04:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
3b6754e2a5 2020-07-07 10:26:00 +00:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
8a9323f5b5 2020-07-07 10:24:29 +00:00
Lukey
56f5d99ceb Added a comment 2020-07-06 21:20:58 +00:00
Joey Hess
9468675ba9
note 2020-07-06 15:12:26 -04:00
Joey Hess
d66fc1a464
Revert "async exception safety for coprocesses"
This reverts commit 7013798df5.
2020-07-06 15:11:28 -04:00
Joey Hess
dfa1c21b8a
comment
and update changelog with benchmark results
2020-07-06 13:39:42 -04:00
Joey Hess
9a2fbc2ea8
comment 2020-07-06 11:58:14 -04:00
Ilya_Shlyakhter
f6af30a7af Added a comment 2020-07-03 19:55:36 +00:00
Joey Hess
d89b52086e
close 2020-07-03 14:31:12 -04:00
Joey Hess
85506a7015
import: Added --no-content option, which avoids downloading files from a special remote
Only supported by some special remotes: directory
I need to check the rest and they're currently missing methods until I do.

git-annex sync --no-content does not yet use this to do imports
2020-07-03 13:41:57 -04:00
Joey Hess
a8099b9896
thought 2020-07-03 12:02:07 -04:00
Joey Hess
89108d6f5a
thought 2020-07-02 21:56:00 -04:00
Joey Hess
e463ef1b91
comment 2020-07-02 20:13:19 -04:00
Joey Hess
8fc9788363
fix comment 2020-07-02 20:05:36 -04:00
yarikoptic
edef3c25b3 Added a comment: map2url? 2020-07-02 20:41:15 +00:00
Ilya_Shlyakhter
df65c4796d Added a comment 2020-07-02 20:22:26 +00:00
yarikoptic
b7a78cbb26 Added a comment 2020-07-02 20:14:20 +00:00
Joey Hess
3353ff236a
comment 2020-07-02 15:30:16 -04:00
Joey Hess
f8ed8a916c
design 2020-07-02 14:35:59 -04:00
Joey Hess
a88b671bd9
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-02 14:17:31 -04:00
Joey Hess
caaeba0be9
thoughts 2020-07-02 14:15:47 -04:00
yarikoptic
1d51db3b02 Added a comment: more ideas for async implementation 2020-07-02 17:44:44 +00:00
Joey Hess
fe1f4632a4
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-02 10:00:11 -04:00
Joey Hess
00c9eb4c78
comment 2020-07-01 20:12:10 -04:00
Ilya_Shlyakhter
d03902f7ff Added a comment: annex.thin for importing from directory special remote 2020-07-01 22:23:58 +00:00
Lukey
5a64acf790 Added a comment 2020-07-01 20:37:13 +00:00
Joey Hess
640dbaaaf8
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-01 15:15:47 -04:00
Joey Hess
11c2886578
overlapping todos 2020-07-01 15:06:36 -04:00
Ilya_Shlyakhter
d1232e385b Added a comment 2020-07-01 17:33:49 +00:00
Ilya_Shlyakhter
6eb318cd53 Added a comment: git pack files 2020-07-01 17:32:45 +00:00
Joey Hess
424b1912d6
followup and add link 2020-07-01 12:28:44 -04:00
Joey Hess
a496ab602d
todo 2020-07-01 12:07:11 -04:00
Joey Hess
07dff32bd4
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-01 11:23:39 -04:00
Lukey
ffb03cc959 Added a comment 2020-07-01 14:32:12 +00:00
Joey Hess
98a8a6da81
todo 2020-06-30 18:41:47 -04:00
Joey Hess
8f508d4406
comments 2020-06-30 16:41:31 -04:00
Joey Hess
1d335520df
Merge branch 'master' of ssh://git-annex.branchable.com 2020-06-30 12:27:19 -04:00
Joey Hess
137450c9fe
thoughts 2020-06-30 12:24:08 -04:00
Lukey
e6ca4cd0df 2020-06-30 15:46:57 +00:00
yarikoptic
692cea01e4 an idea on a (more) efficient transfer via async external remote protocol 2020-06-30 04:37:22 +00:00
Joey Hess
7fd20146e1
all easy cases done
bup can't do it after all, because removeKey deletes the git branch. And
the rest seem too hard to tackle today.
2020-06-26 14:24:48 -04:00
Joey Hess
76721b62dd
does not make sense to lockContent on web
Looked into this, and dropKey from web actually removes the url,
so git-annex won't try to get content from it.

So, if lockContent were implemented for web, and the web was left as the
only thing containing an object, another repo could at the same time
drop from web and remove its url, leaving no way to get the object.

Add to that, of course, the web is typically set untrusted, and so
implementing lockContent would not then be useful.

Similar reasoning applies to the bittorrent special remote, as well
as the fact that it does not even implement checkKey.
2020-06-26 13:58:28 -04:00
Joey Hess
b316a85ede
update 2020-06-26 13:54:23 -04:00