Commit graph

3656 commits

Author SHA1 Message Date
Ilya_Shlyakhter
9bde743eae Added a comment: install clean/smudge filter only when needed 2021-03-18 15:47:03 +00:00
Joey Hess
02e74c010b
todo 2021-03-16 17:18:35 -04:00
Joey Hess
7dabbe4520
comment 2021-03-16 14:56:08 -04:00
yarikoptic
d0b56f113c an idea to avoid lengthy Scanning for unlocked files (this may take some time) 2021-03-15 17:57:53 +00:00
Ilya_Shlyakhter
b4022decb8 Added a comment: re: individually hash chunks 2021-03-15 15:23:09 +00:00
0xloem@0bd8a79a57e4f0dcade8fc81d162c37eae4d6730
32f773ae8f 2021-03-15 07:08:08 +00:00
Joey Hess
37263e97c7
comment 2021-03-12 11:45:28 -04:00
meribold
96b549779a Added a comment 2021-03-12 03:58:08 +00:00
Joey Hess
22d8ec6d74
comment 2021-03-11 13:06:24 -04:00
Joey Hess
06f86379b0
todo from comment 2021-03-11 12:26:10 -04:00
Joey Hess
b05f8458fb
remove tag 2021-03-08 14:47:40 -04:00
Joey Hess
e8065ee99d
close todo 2021-03-05 14:46:09 -04:00
Joey Hess
8d983b6432
comment 2021-03-05 12:39:57 -04:00
Joey Hess
eb594c710e
unregisterurl: New command
Implemented by generalizing registerurl. Without the implicit batch mode
of registerurl, since that is only a backwards compatibility thing
(see commit 1d1054faa6).
2021-03-01 14:28:24 -04:00
Joey Hess
06665d733a
comment 2021-03-01 13:52:19 -04:00
Joey Hess
db2e0485ae
Merge branch 'master' of ssh://git-annex.branchable.com 2021-03-01 13:10:50 -04:00
Ilya_Shlyakhter
cd682cf227 Added a comment: issue of pushing refs to annexed files but not info on how to fetch them 2021-03-01 16:46:57 +00:00
Joey Hess
9835fa5d01
todo 2021-02-26 12:12:41 -04:00
kyle
4c48791d41 Added a comment 2021-02-25 21:25:52 +00:00
kyle
9d96f41185 Added a comment: setpresentkey 0 2021-02-25 21:18:50 +00:00
yarikoptic
e5a24a502c or just a rmurl --key? 2021-02-25 20:37:25 +00:00
yarikoptic
4c1ffaac56 TODO for unregisterurl 2021-02-25 20:27:46 +00:00
Joey Hess
62d5a73bdd
unannex, uninit: Avoid running git rm once per annexed file, for a large speedup. 2021-02-22 12:56:11 -04:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
1e4fac1046 Added a comment 2021-02-21 15:50:49 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
26c19de0d9 Add workaround 2021-02-20 19:05:42 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
c37bfccb63 Initial report 2021-02-20 19:04:18 +00:00
yarikoptic
a876884987 initial observation about slow uninit 2021-02-19 17:08:39 +00:00
Joey Hess
cb7bb3e4b9
comment 2021-02-10 21:49:25 -04:00
Joey Hess
e3832af5d5
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-10 16:40:16 -04:00
Joey Hess
f44d4704c6
incremental checksum for local remotes
This benchmarks only slightly faster than the old git-annex. Eg, for a 1
GB file, 14.56s vs 15.57s. (On a RAM disk; there would certainly be
more of an effect if the file was written to disk and didn't stay in
cache.)

Commenting out the updateIncremental calls makes the same run take 6.31s.
It may be that overhead in the implementation, other than the actual
checksumming, is slowing it down. Eg, MVar access.

(I also tried using 10x larger chunks, which did not change the speed.)
2021-02-10 16:05:24 -04:00
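The idea behind the incremental checksum commit (f44d4704c6) can be sketched as follows. This is a minimal Python illustration, not git-annex's Haskell implementation: the function name `copy_with_incremental_hash`, the chunk size, and the choice of SHA-256 are all assumptions made for the example. The hash is updated as each chunk is written, so the file does not need a second read pass to be verified.

```python
import hashlib
import io


def copy_with_incremental_hash(src, dst, chunk_size=64 * 1024):
    """Copy src to dst, feeding every chunk to the hasher as it is written.

    Hypothetical helper: src and dst are binary file-like objects.
    Returns the hex digest of everything that was copied.
    """
    hasher = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        hasher.update(chunk)  # incremental update, no second read of the file
    return hasher.hexdigest()


if __name__ == "__main__":
    data = b"x" * 200_000
    out = io.BytesIO()
    digest = copy_with_incremental_hash(io.BytesIO(data), out)
    # The incremental digest matches hashing the whole content at once.
    print(digest == hashlib.sha256(data).hexdigest())
```

The digest is independent of the chunk size, which is consistent with the commit's note that larger chunks did not change the result, only (potentially) the speed.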
Joey Hess
6487a75d33
comment 2021-02-10 13:15:00 -04:00
yarikoptic
f5c3eb86f9 Added a comment 2021-02-09 22:48:07 +00:00
Joey Hess
d8598dc3a0
comment 2021-02-09 17:05:56 -04:00
Joey Hess
fd51b0cd83
comment 2021-02-09 13:42:49 -04:00
Joey Hess
cbe84b62b9
close 2021-02-08 18:17:59 -04:00
Joey Hess
8f3554a7a8
close as dup 2021-02-08 14:19:43 -04:00
Joey Hess
dd39e9e255
suggest when user may want annex.stalldetection
When annex.stalldetection is not enabled, and a likely stall is detected,
display a suggestion to enable it.

Note that the progress meter display is not taken down when displaying
the message, so it will display like this:

	0%    8 B                 0 B/s
	  Transfer seems to have stalled. To handle stalling transfers, configure annex.stalldetection
	0%    10 B                0 B/s

Although of course if it's really stalled, it will never update
again after the message. Taking down the progress meter and starting
a new one doesn't seem too necessary given how unusual this is,
also this does help show the state it was at when it stalled.

Use of uninterruptibleCancel here is ok, the thread it's canceling
only does STM transactions and sleeps. The annex thread that gets
forked off is separate to avoid it being canceled, so that it
can be joined back at the end.

A module cycle required moving from dupState the precaching of the
remote list. Doing it at startConcurrency should cover all the cases
where the remote list is used in concurrent actions.

This commit was sponsored by Kevin Mueller on Patreon.
2021-02-03 15:57:19 -04:00
Joey Hess
135757d64a
automatic stall detection
annex.stalldetection can now be set to "true" to make git-annex do
automatic stall detection when it detects a remote is updating its transfer
progress consistently enough.

This commit was sponsored by Luke Shumaker on Patreon.
2021-02-03 13:33:57 -04:00
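A rough sketch of what stall detection over progress updates might look like, in Python for illustration. The class `StallDetector`, its parameters `min_bytes` and `window_seconds`, and the decision rule are assumptions for this example; the commit does not describe how git-annex decides that a remote updates its progress "consistently enough".

```python
class StallDetector:
    """Hypothetical detector: a transfer counts as stalled when fewer than
    min_bytes were transferred during the last window_seconds."""

    def __init__(self, min_bytes, window_seconds):
        self.min_bytes = min_bytes
        self.window_seconds = window_seconds
        self.samples = []  # (timestamp, total_bytes_transferred), in time order

    def update(self, timestamp, total_bytes):
        """Record a progress update from the transfer."""
        self.samples.append((timestamp, total_bytes))

    def is_stalled(self, now):
        if not self.samples:
            return False
        window_start = now - self.window_seconds
        if self.samples[0][0] > window_start:
            # The transfer has not been observed for a full window yet.
            return False
        # Bytes already transferred when the window began.
        baseline = max(b for t, b in self.samples if t <= window_start)
        latest = self.samples[-1][1]
        return latest - baseline < self.min_bytes


if __name__ == "__main__":
    detector = StallDetector(min_bytes=1024, window_seconds=60)
    for t, b in [(0, 0), (10, 8), (20, 10)]:
        detector.update(t, b)
    print(detector.is_stalled(now=30))
    print(detector.is_stalled(now=90))
```

Waiting for a full window before reporting a stall avoids false alarms at the start of a transfer, when little progress data is available.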
Joey Hess
904689f11b
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-02 19:39:24 -04:00
guardcat
ebb805b611 2021-02-02 21:04:21 +00:00
Joey Hess
aec2cf0abe
addon commands
Seems only fair, that, like git runs git-annex, git-annex runs
git-annex-foo.

Implementation relies on O.forwardOptions, so that any options are passed
through to the addon program. Note that this includes options before the
subcommand, eg: git-annex -cx=y foo

Unfortunately, git-annex eats the --help/-h options.
This is because it uses O.hsubparser, which injects that option into each
subcommand. Seems like this should be possible to avoid somehow, to let
commands display their own --help, instead of the dummy one git-annex
displays.

The two step searching mirrors how git works, it makes finding
git-annex-foo fast when "git annex foo" is run, but will also support fuzzy
matching, once findAllAddonCommands gets implemented.

This commit was sponsored by Dr. Land Raider on Patreon.
2021-02-02 16:32:49 -04:00
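The two-step search described in the addon commands commit (aec2cf0abe) can be sketched like this. It is a Python illustration under stated assumptions, not the Haskell code: `find_addon` and `find_all_addons` are hypothetical names (the commit only mentions a not yet implemented findAllAddonCommands), and the directories to search are passed in explicitly rather than read from PATH.

```python
import os
import tempfile

PREFIX = "git-annex-"


def find_addon(name, path_dirs):
    """Step 1: look for exactly git-annex-<name>. Cheap, no directory listing."""
    for d in path_dirs:
        candidate = os.path.join(d, PREFIX + name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def find_all_addons(path_dirs):
    """Step 2: list every git-annex-* executable, eg as input to fuzzy matching."""
    names = set()
    for d in path_dirs:
        try:
            entries = os.listdir(d)
        except OSError:
            continue  # unreadable or missing directory
        for entry in entries:
            full = os.path.join(d, entry)
            if (entry.startswith(PREFIX) and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                names.add(entry[len(PREFIX):])
    return sorted(names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        addon = os.path.join(tmp, "git-annex-foo")
        with open(addon, "w") as f:
            f.write("#!/bin/sh\n")
        os.chmod(addon, 0o755)
        print(find_addon("foo", [tmp]) == addon)
        print(find_all_addons([tmp]))
```

The exact lookup only probes one filename per directory, which is why it stays fast for the common case; the full listing is only needed when the exact name is not found.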
Joey Hess
d0fe0c5e10
close old todo 2021-02-02 13:25:58 -04:00
Joey Hess
696ee5d464
close 2021-02-02 13:21:34 -04:00
Joey Hess
233b2ab133
reject 2021-02-02 13:19:28 -04:00
Joey Hess
80a16a9dc8
remove no longer relevant part 2021-02-02 13:15:32 -04:00
Joey Hess
811399c8a1
meant to close this earlier 2021-02-02 13:12:47 -04:00
Joey Hess
cdbf80d338
comment 2021-02-02 13:10:07 -04:00
Joey Hess
5cad76198a
close 2021-02-02 12:52:00 -04:00
Joey Hess
d631304237
remove priority tag (unused) 2021-02-02 12:42:20 -04:00
Joey Hess
02b5bd224e
move to todo 2021-01-29 13:07:49 -04:00
Joey Hess
b372d962ae
Added GETGITREMOTENAME to external special remote protocol 2021-01-26 12:42:47 -04:00
Joey Hess
a11dad646c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-01-26 11:32:52 -04:00
Joey Hess
8a8491b5b9
update 2021-01-26 11:32:40 -04:00
Joey Hess
4d8ce4464f
add 2021-01-26 11:27:14 -04:00
hello@6d9a437cebceb2fc657f93c4f30a7ba859e9309c
8d3bc1d3cb Added a comment: Other benefits of proposed improvements to the smudge/clean API 2021-01-26 14:25:59 +00:00
Joey Hess
9b2711167c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-01-19 13:20:22 -04:00
Joey Hess
73df633a62
omit inode from ContentIdentifier for directory special remote
Directory special remotes with importtree=yes now avoid unnecessary overhead
when inodes of files have changed, as happens whenever a FAT filesystem
gets remounted.

A few unusual edge cases of modifications won't be detected and
imported. I think they're unusual enough not to be a concern. It would
be possible to add a config setting that controls whether to compare
inodes too, but does not seem worth bothering the user about currently.

I chose to continue to use the InodeCache serialization, just with the
inode zeroed. This way, if I later change my mind or make it
configurable, can parse it back to an InodeCache and operate on it. The
overhead of storing a 0 in the content identifier log seems worth it.

There is a one-time cost to this change; all directory special remotes
with importtree=yes will re-hash all files once, and will update the
content identifier logs with zeroed inodes.

This commit was sponsored by Brett Eisenberg on Patreon.
2021-01-19 13:15:07 -04:00
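The effect of zeroing the inode (commit 73df633a62) can be illustrated with a small Python sketch. The `ContentId` tuple and its three fields are assumptions standing in for the serialized InodeCache; the real format is not shown in the commit message.

```python
from typing import NamedTuple


class ContentId(NamedTuple):
    """Hypothetical stand-in for a content identifier built from file metadata."""
    inode: int
    size: int
    mtime: int


def zero_inode(cid):
    """Keep the same shape, but store 0 where the inode used to be."""
    return cid._replace(inode=0)


def same_content(a, b):
    """Compare two identifiers while ignoring the inode."""
    return zero_inode(a) == zero_inode(b)


if __name__ == "__main__":
    before_remount = ContentId(inode=1001, size=4096, mtime=1611076507)
    after_remount = ContentId(inode=2002, size=4096, mtime=1611076507)
    # Only the inode changed, so the file is not treated as modified.
    print(same_content(before_remount, after_remount))
```

Keeping the inode field present but zeroed means an identifier can still be parsed back into the original structure later, which matches the reasoning given in the commit. The trade-off is also visible here: a modification that preserves both size and mtime would go unnoticed.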
Lukey
b11aa063ae Added a comment 2021-01-19 16:27:07 +00:00
Joey Hess
2b458c2d68
comment and todo 2021-01-19 11:56:27 -04:00
Joey Hess
96a7a1fb71
close 2021-01-11 12:26:52 -04:00
Joey Hess
515f54bd70
idea 2021-01-10 16:32:44 -04:00
Joey Hess
4694c2bb87
Merge branch 'master' into requirednumcopies 2021-01-06 14:24:09 -04:00
Joey Hess
cc89699457
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.

Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.

This commit was sponsored by Denis Dzyubenko on Patreon.
2021-01-06 14:15:19 -04:00
Joey Hess
8d8cdbec56
branch 2021-01-05 14:28:54 -04:00
Joey Hess
5ce61c6b2a
add: Significantly speed up adding lots of non-large files to git
* add: Significantly speed up adding lots of non-large files to git,
  by disabling the annex smudge filter when running git add.
* add --force-small: Run git add rather than updating the index itself,
  so any other smudge filters than the annex one that may be enabled will
  be used.
2021-01-04 13:12:28 -04:00
Joey Hess
e7b0754171
comment and todo 2021-01-04 12:26:48 -04:00
Joey Hess
9a3998392e
close import_tree todo
Split out two todos for things that were mentioned as still open items
in there. Most of the others were already dealt with. I didn't open a
new todo for the import from readonly S3 bucket because I guess if
someone needs that, they can ask for it.
2020-12-30 13:40:49 -04:00
Joey Hess
b16e6fb4e6
borg appendonly config 2020-12-28 16:23:38 -04:00
Joey Hess
6280af2901
generate more compact git-annex branch for imports
Especially from borg, where the content identifier logs
all end up being the same identical file!

But also, for other imports, the location tracking logs can,
in some cases, be identical files.

Bonus optimisation: Avoid looking up (and parsing when set)
GIT_ANNEX_VECTOR_CLOCK env var every time a log is written to.
Although the lookup does happen at startup even when no
log will be written now.
2020-12-23 15:25:16 -04:00
Joey Hess
7916fc98a3
graft in imported tree to avoid gc
Fix a bug that could prevent getting files from an importtree=yes remote,
because the imported tree was allowed to be garbage collected.
2020-12-23 14:27:38 -04:00
Joey Hess
e3d356fe84
borg: add subdir= config
Note that, after changing it with enableremote, syncing won't rescan
known archives in the borg repo using the changed config. Probably not a
problem?

Also used File in some places where filenames that could theoretically
start with - are passed to borg, to avoid it confusing them with
options.
2020-12-23 13:12:11 -04:00
Joey Hess
1574972ba9
make sync --content get from third-party populated remotes like borg 2020-12-23 12:10:39 -04:00
Joey Hess
79729bcd76
todo 2020-12-22 16:37:19 -04:00
Joey Hess
a2fe994ebb
move unimplemented option to todo 2020-12-22 16:28:13 -04:00
Joey Hess
cd4c68924b
merged borg
Still a couple related todos, but it's basically usable now.
2020-12-22 16:22:44 -04:00
Joey Hess
2335476e1e
todo 2020-12-22 16:19:02 -04:00
Joey Hess
4254e2297d
implement retrieveExportWithContentIdentifier
Moved out an XXX to a todo

This seems about ready to merge..
2020-12-22 16:16:48 -04:00
Joey Hess
f31bdd0b19
todo 2020-12-22 15:01:07 -04:00
Joey Hess
82e43da936
todo 2020-12-22 15:00:11 -04:00
Joey Hess
f62aee0525
fix handling of importtree-only remotes
Don't want to try to use these remotes as key/value remotes, which will
surely fail. It only recently became possible for importtree to be set
w/o exporttree, so before this code was ok.

(cherry picked from commit 97599cb0f7f4115aa5a3e81a91ee3d1d6c52dc84)
2020-12-18 15:13:30 -04:00
Joey Hess
4c63cab467
todo 2020-12-17 16:30:51 -04:00
Joey Hess
2abda21123
update 2020-12-15 16:35:06 -04:00
Joey Hess
117d270bb4
comment 2020-12-15 16:34:16 -04:00
Joey Hess
74c1e0660b
propagate git-annex -c on to transferrer child process
git -c was already propagated via environment, but need this for
consistency.

Also, notice it does not use gitAnnexChildProcess to run the
transferrer. So nothing is done to avoid it taking the
pid lock. It's possible that the caller is already doing something that
took the pid lock, and if so, the transferrer will certainly fail,
since it needs to take the pid lock too. This may prevent combining
annex.stalldetection with annex.pidlock, but I have not verified it's
really a problem. If it was, it seems git-annex would have to take
the pid lock when starting a transferrer, and hold it until shutdown,
or would need to take pid lock when starting to use a transferrer,
and hold it until done with a transfer and then drop it. The latter
would require starting the transferrer with pid locking disabled for the
child process, so assumes that the transferrer does not do anything that
needs locking when not running a transfer.
2020-12-15 11:36:25 -04:00
Joey Hess
75acf5f440
improve some edge cases around partial initialization
* Guard against running in a repo where annex.uuid is set but
  annex.version is not set, or vice-versa.
* Avoid autoinit when a repo does not have annex.version or annex.uuid
  set, but has a git-annex objects directory, suggesting it was used
  by git-annex before.
2020-12-14 13:17:43 -04:00
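The two guards from the partial initialization commit (75acf5f440) amount to a small decision table, sketched here in Python. The function name `check_init_state` and the returned labels are hypothetical; only the two config keys and the objects-directory hint come from the commit message.

```python
def check_init_state(config, has_objects_dir):
    """Classify a repository's git-annex initialization state.

    config: dict of git config keys to values (hypothetical representation).
    has_objects_dir: whether a git-annex objects directory exists.
    """
    has_uuid = "annex.uuid" in config
    has_version = "annex.version" in config
    if has_uuid != has_version:
        # Exactly one of the two is set: partially initialized, refuse to run.
        return "partial-init-error"
    if has_uuid:
        return "initialized"
    if has_objects_dir:
        # Neither is set, but the repo looks previously used: do not autoinit.
        return "skip-autoinit"
    return "autoinit-ok"


if __name__ == "__main__":
    print(check_init_state({"annex.uuid": "some-uuid"}, has_objects_dir=True))
    print(check_init_state({}, has_objects_dir=True))
    print(check_init_state({}, has_objects_dir=False))
```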
Joey Hess
9aaab02e44
add 2020-12-13 18:50:35 -04:00
Joey Hess
3c76a31b15
response and related todo 2020-12-11 16:21:16 -04:00
Joey Hess
e6692b66f1
remove
I need to think about this some more, not clear if it's a todo
item specific to stalldetection at all. Remotes with this behavior
also show no progress when run with -J. And some other remotes don't
update any progress meters at all, eg adb is that way and so are hook
remotes and of course external remotes don't have to send progress info.
2020-12-11 15:49:39 -04:00
Joey Hess
fadf47557f
note 2020-12-11 15:48:03 -04:00
Joey Hess
90eadbce49
close 2020-12-11 15:45:55 -04:00
Joey Hess
d3f78da0ed
propagate signals to the transferrer process group
Done on unix; could not quite implement it on Windows.

The signal library gets part of the way needed for windows.
But I had to open https://github.com/pmlodawski/signal/issues/1 because
it lacks raiseSignal.

Also, I don't know what the equivalent of getProcessGroupIDOf is on
windows. And System.Process does not provide a way to send any signal to
a process group except for SIGINT.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-12-11 15:32:00 -04:00
Joey Hess
79c765b727
meant to close this earlier 2020-12-11 14:25:54 -04:00
Joey Hess
095cdc7e83
extend transferrer protocol to send progress bar total size updates
New protocol is not back-compat with old one, but it's never been
released so that's ok.
2020-12-11 12:42:28 -04:00
Joey Hess
263fd1d459
ugh 2020-12-11 11:51:46 -04:00
Joey Hess
a422a056f2
make getViaTmpFrom no longer update location log
All callers adjusted to update it themselves.

In Command.ReKey, and Command.SetKey, the cleanup action already did,
so it was updating the log twice before.

This fixes a bug when annex.stalldetection is set, as now
Command.Transferrer can skip updating the location log, and let it be
updated by the calling process.
2020-12-11 11:50:13 -04:00
Joey Hess
a6ed23b82f
todo 2020-12-11 11:09:19 -04:00
Joey Hess
d3c11eac5d
2 bugs involving new stalldetection feature 2020-12-10 17:46:58 -04:00
Joey Hess
3fa2bc2eed
rename back, there were links to this 2020-12-09 13:14:54 -04:00
Joey Hess
05c0543e8e
move new interface to git-annex transfer
This is to avoid breakage when upgrading or downgrading git-annex with a
process running that uses the interface. It's better to keep the
compatibility code for a few years than worry about such breakage.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-12-09 12:33:56 -04:00
Joey Hess
41f2c308ff
stall detection is working
New config annex.stalldetection, remote.name.annex-stalldetection, which
can be used to deal with remotes that stall during transfers, or are
sometimes too slow to want to use.

This commit was sponsored by Luke Shumaker on Patreon.
2020-12-08 15:22:18 -04:00
Joey Hess
09ed9f7d1f
reminder for later 2020-12-08 15:20:05 -04:00
Joey Hess
c4d489f7d4
add todo
Not going to do this yet, so remember for later.
2020-12-08 15:17:35 -04:00
Joey Hess
438d5be1f7
support prompt in message serialization
That seems to be the last thing needed for message serialization.
Although it's only used in the assistant currently, so hard to tell if I
forgot something.

At this point, it should be possible to start using transferkeys
when performing transfers, which will allow killing a transferkeys
process if a transfer times out or stalls. But that's for another day.

This commit was sponsored by Ethan Aubin.
2020-12-04 14:54:09 -04:00
Joey Hess
4d9f416949
idea 2020-12-04 00:00:40 -04:00
Joey Hess
bf76ae2c90
mention new branch 2020-12-03 16:29:22 -04:00
Joey Hess
1858b65d88
design work 2020-12-02 14:31:24 -04:00
Joey Hess
ee86972f66
thought 2020-11-30 13:27:45 -04:00
Joey Hess
e843334a40
comment 2020-11-30 13:13:56 -04:00
Lukey
36b4a253e7 2020-11-29 19:06:37 +00:00
Lukey
7e86da7701 2020-11-29 19:04:34 +00:00
Joey Hess
a6f7017eba
Merge branch 'master' of ssh://git-annex.branchable.com 2020-11-23 13:00:36 -04:00
Joey Hess
8b8ee68a9c
update 2020-11-23 13:00:20 -04:00
Cebtenzzre
8816f0d8ab Added a comment 2020-11-22 16:07:16 +00:00
Joey Hess
1be38362aa
retitle 2020-11-16 15:17:48 -04:00
Joey Hess
805af01562
bug fix
really inefficient but it does solve dropping
2020-11-16 14:57:51 -04:00
Joey Hess
0896038ba7
annex.adjustedbranchrefresh
Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to one file, then eg, `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey likewise (so sends over ssh from old versions
  of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
2020-11-16 14:27:28 -04:00
Joey Hess
26cf26caca
Merge branch 'master' into symlink-missing 2020-11-16 10:03:12 -04:00
Joey Hess
5a8d01f63e
examinekey: Added a "file" format variable
For consistency with find, and for easier scripting.
2020-11-16 09:59:11 -04:00
yarikoptic
13bab4f2cf Added a comment 2020-11-14 02:00:13 +00:00
Joey Hess
f07670a282
measurement 2020-11-13 15:57:35 -04:00
Joey Hess
56aabccda4
close 2020-11-13 15:54:33 -04:00
Joey Hess
b9351922d2
add todo 2020-11-13 15:50:35 -04:00
Joey Hess
7566aa6bc5
examinekey: Added --migrate-to-backend
Note that, the way the SeekInput parser is written to support batch mode,
it's actually possible to do git-annex examinekey
"SHA1--foo foo.tar.gz" --migrate-to-backend=SHA1E

While that might be kind of useful to support multiple migrations not using
batch mode, I have not documented it. It would be better to take pairs of
key and file in that case.
2020-11-12 14:09:14 -04:00
Joey Hess
12e32d1dee
examinekey: Added two new format variables: objectpath and objectpointer 2020-11-12 13:02:31 -04:00
Joey Hess
c5141b469a
comment 2020-11-12 12:59:27 -04:00
Joey Hess
d7da4ee00a
comment 2020-11-12 12:29:15 -04:00
yarikoptic
60a71f90cc adding a note pointing to tentative recipe 2020-11-11 20:10:33 +00:00
yarikoptic
07e9f43c63 todo/question on how to get full path to the key knowing metadata but having no file 2020-11-11 19:29:14 +00:00
Lukey
9d598265e4 2020-11-05 18:41:25 +00:00
Joey Hess
664bec4297
comment 2020-11-02 22:34:04 -04:00
yarikoptic
5b4d5f6d64 Added a comment 2020-11-02 20:37:47 +00:00
Joey Hess
40679616ed
comment 2020-11-02 15:01:17 -04:00
yarikoptic
79874325b8 a plea for more --debug output 2020-11-02 18:04:20 +00:00
michael.hanke@c60e12358aa3fc6060531bdead1f530ac4d582ec
0170b5468f Added a comment: Documentation of demand 2020-10-27 14:59:47 +00:00
yarikoptic
c24480c061 Added a comment: windows build with magic 2020-10-22 19:32:55 +00:00
Joey Hess
0133b7e5a8
move: Improve resuming a move that was interrupted after the object was transferred
In cases where numcopies checks prevented the resumed move from dropping
the object from the source repository, it now relies on a log of recent
moves to replicate the behavior of the interrupted command.

Performance: Probably noticeable impact, since it has to add to the log,
check the log, and remove from the log. Seems worth it to avoid this
annoying edge case. The log functions are pretty well optimised to avoid
unnecessary work.

A performance improvement to make later would be to avoid cleanup doing
anything if it's not written to the log file, and has confirmed that the
log file does not contain the log line.

This commit was sponsored by Jake Vosloo on Patreon.
2020-10-21 10:31:56 -04:00
Joey Hess
15ea0e6c0a
design done 2020-10-19 14:57:02 -04:00
Joey Hess
5009c1ce68
update 2020-10-19 14:49:16 -04:00
Joey Hess
c337b58caf
more thoughts 2020-10-19 14:19:41 -04:00
Joey Hess
3c6cfacb19
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-19 12:19:33 -04:00
Joey Hess
e408658eaa
close 2020-10-19 12:16:15 -04:00
Ilya_Shlyakhter
1ac3735ac9 Added a comment: copying directly from directory special remote to the cloud 2020-10-18 21:00:59 +00:00
Joey Hess
e4a8b7b26f
comment 2020-10-13 18:57:30 -04:00
Joey Hess
9a5cd96f0d
Fix a memory leak introduced in the last release
The problem was this line:

	cleanup = and <$> sequence (map snd v)

That caused all of v to be held onto until the end, when the cleanup action
was run.

I could not seem to find a bang pattern that avoided the leak, so I
resorted to a IORef, rather clunky, but not a performance problem because
it will only be written once per git ls-files, so typically just 1 time.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-10-13 16:31:01 -04:00
Joey Hess
d41795c8ff
blew a day investigating this and still don't understand it fully 2020-10-13 15:33:59 -04:00
Joey Hess
cf0ff9f53d
comment 2020-10-13 12:24:13 -04:00
Joey Hess
b72a5c0680
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-12 16:14:51 -04:00
Joey Hess
db1def72ee
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 16:07:30 -04:00
Joey Hess
4124862ae0
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 15:47:46 -04:00
Joey Hess
fc16057f99
profiling
Currently a bit stuck, but at least starting to get some clues on this
memory leak.

This commit was sponsored by Ethan Aubin.
2020-10-12 14:48:26 -04:00
Joey Hess
81216931c6
comment 2020-10-12 12:32:53 -04:00
Lukey
308e5b3a81 Added a comment 2020-10-11 14:17:47 +00:00
Joey Hess
46371797ec
design for finally slaying this beast
This commit was sponsored by Ethan Aubin.
2020-10-08 10:09:12 -04:00
yarikoptic
8035e2073d Added a comment 2020-10-05 23:17:08 +00:00
Joey Hess
471bcfaf37
comment 2020-10-05 14:51:19 -04:00
kyle
a39f9e8c7f Added a comment 2020-10-05 18:09:07 +00:00
yarikoptic
0bb7569e1c Added a comment 2020-10-05 18:02:51 +00:00
kyle
64ca40e11c Added a comment 2020-10-05 17:42:26 +00:00
yarikoptic
4bb980bfb9 Added a comment 2020-10-05 15:26:10 +00:00
kyle
742ce7ec87 Added a comment 2020-10-01 19:07:48 +00:00
yarikoptic
1bf55a8fc1 Added a comment 2020-10-01 17:26:10 +00:00
Joey Hess
37b1f2f2ed
response 2020-10-01 13:08:09 -04:00
yarikoptic
23103e225a Added a comment 2020-10-01 14:06:34 +00:00
Joey Hess
1610d94776
addurl: Avoid a redundant git ignores check for speed
Ensure that checkCanAdd is used everywhere a file is added to git,
so git add is run with -f, presumably avoiding the work it would usually
do to check ignores.
2020-09-29 13:00:41 -04:00
Joey Hess
d10cbaa084
comment 2020-09-29 12:25:40 -04:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
a3083436dc removed 2020-09-29 08:36:37 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
7b37f22880 Added a comment 2020-09-29 08:36:19 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
ff0784dad9 Added a comment 2020-09-29 08:32:22 +00:00
Joey Hess
3eaaec3113
consistently use importKey when available
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.

Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.

Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.

This commit was sponsored by Noam Kremen on Patreon.
2020-09-28 15:27:46 -04:00
Joey Hess
15c1ee16d9
import --no-content: Check annex.largefiles
Import small files into git, the same as is done when importing with content.
Which means, for small files, --no-content does download them.

If the largefiles expression needs the file content available
(due to mimetype or mimeencoding being used), the import will fail.

This commit was sponsored by Jake Vosloo on Patreon.
2020-09-28 13:28:57 -04:00
Joey Hess
9e676f062f
split out todo 2020-09-28 10:40:13 -04:00
Joey Hess
6a41a615b9
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-25 13:51:20 -04:00
Joey Hess
13f9c88123
add todo 2020-09-25 13:51:04 -04:00
Lukey
12eb7a3ceb Added a comment 2020-09-25 16:33:42 +00:00
Joey Hess
ace02f41b0
seek: defer matcher check until more info is known
Sped up seeking for files to operate on, when using options like --copies
or --in, by around 20%.

Benchmark showed an improvement for --copies from 155 seconds to 121
seconds, and --in remote will be similar to that.

For --in here, the speedup was less, 5-10% or so.

(both warm cache)

This commit was sponsored by Jack Hill on Patreon.
2020-09-24 17:59:12 -04:00
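As a quick check of the "around 20%" figure in commit ace02f41b0, the benchmark numbers for --copies can be worked through in Python. The helper names are made up for this example; only the two timings come from the commit message.

```python
def percent_time_saved(before, after):
    """Share of the original run time that was eliminated, in percent."""
    return (before - after) / before * 100


def speedup_factor(before, after):
    """How many times faster the new run is."""
    return before / after


if __name__ == "__main__":
    # Timings for --copies from the commit message, in seconds.
    print(round(percent_time_saved(155, 121), 1))  # 21.9
    print(round(speedup_factor(155, 121), 2))      # 1.28
```

So the run time dropped by about 22%, which is equivalent to running roughly 1.28 times as fast.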
Joey Hess
d89984b121
sync --all avoid unnecessary first pass
Sped up seeking to around twice as fast, by avoiding a pass over the
worktree files when preferred content expressions of the local repo and
remotes don't use include=/exclude=.

Thanks to Lukey for identifying the optimisation.

This commit was sponsored by Brock Spratlen on Patreon.
2020-09-24 15:12:09 -04:00
Joey Hess
c1b4d76e6b
make MatchFiles introspectable
matchNeedsFileContent is not used yet, but shows how to add information
about terminals. That one would be needed for
https://git-annex.branchable.com/todo/sync_fast_import/

Note the tricky bit in Annex.FileMatcher.call where it folds over the
included matcher to propagate the information.

This commit was sponsored by Svenne Krap on Patreon.
2020-09-24 14:01:53 -04:00
Joey Hess
6d95361f35
add meta todo 2020-09-24 12:54:54 -04:00
Joey Hess
4d4f963c46
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-24 12:42:32 -04:00
Joey Hess
68f9766544
Improve --debug output to show pid of processes that are started and stopped
getPid returns Nothing if the process has already been stopped, and in that
case, the pid will not be displayed. I think that would only happen if
waitForProcess or similar gets called more than once on the same process
handle though.

getPid on unix has an overhead of only a MVar read. On Windows it needs to
make a syscall, so it will probably be more expensive. While the added expense
happens even when debug logging is disabled, it should be small enough
compared with the overhead of starting a process that it's not a problem.

(It does occur to me that a debugM that took an IO String could only run it
when debugging is really enabled, which would improve performance. It does
not seem possible to use the current hslogger interface to do that though;
it does not expose the information that would be needed.)
2020-09-24 12:39:57 -04:00
Lukey
221b47162d 2020-09-24 16:36:12 +00:00
yarikoptic
9e033f3001 initial TODO for making failure messages for processes be more informative 2020-09-23 13:03:38 +00:00
Joey Hess
6a5e0cbfc7
Improve the "Try making some of these repositories available" message
With some hints for the user for what to do.

Took care to avoid changing the json output. It would have been ok to add
the new separated lists to it, in addition to the old list, but I didn't
do that because I didn't see much point.
2020-09-22 14:10:30 -04:00
Joey Hess
5cfcf1f05f
cache remote.log
Unlikely to speed up any of the existing uses much, but I want to use it
in a message that might be displayed many times.
2020-09-22 13:52:26 -04:00
Joey Hess
ebdce707da
fix typo 2020-09-22 13:26:49 -04:00
Joey Hess
361ef19999
wording 2020-09-22 12:39:33 -04:00
Joey Hess
41044de833
comment 2020-09-22 12:24:22 -04:00
yarikoptic
c44cd27520 Added a comment 2020-09-18 20:11:27 +00:00
Joey Hess
46a7fcef0d
close 2020-09-18 13:21:32 -04:00
Joey Hess
186c3827d0
comment 2020-09-18 13:21:00 -04:00
Joey Hess
500454935f
comment 2020-09-18 12:08:11 -04:00
Joey Hess
956ff1350a
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-18 12:00:12 -04:00
Joey Hess
81a38df5a7
add missing CHECKURL-FAILURE ErrorMsg to docs 2020-09-18 11:58:18 -04:00
yarikoptic
b48acde47d some whining about check-ignore 2020-09-18 15:30:23 +00:00
yarikoptic
f913822c03 initial todo to add ErrorMsg to all -FAILURE responses 2020-09-17 23:39:42 +00:00
Joey Hess
83df401d93
Merge branch 'batchasync' into master 2020-09-16 13:02:58 -04:00
Joey Hess
10f9107c1b
close 2020-09-16 13:02:35 -04:00
Joey Hess
929de3bb37
groundwork complete 2020-09-15 16:29:38 -04:00
Joey Hess
c6e159550d
update 2020-09-14 16:57:47 -04:00
Joey Hess
63d6cb27a9
thoughts 2020-09-10 13:13:39 -04:00
Joey Hess
6813373490
todo 2020-09-10 09:08:40 -04:00
Joey Hess
2bb933eb60
import: Retry downloads that fail
Also, using the transfer machinery for this makes eg, git-annex info show
in-progress imports, and makes --notify-start/finish work.
2020-09-04 13:54:05 -04:00
Joey Hess
46eb48d7c0
Retry transfers to exporttree=yes remotes same as for other remotes
The comment about noRetry is not well-justified, because transfers to many
remotes cannot be resumed, but retries are still allowed for those.
2020-09-04 13:24:08 -04:00
Joey Hess
1d244bafbd
Limit retrying of failed transfers when forward progress is being made to 5
To avoid some unusual edge cases where too much retrying could result in
far more data transfer than makes sense.
2020-09-04 12:46:37 -04:00
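The retry cap from commit 1d244bafbd can be sketched as a loop that keeps retrying only while each failed attempt moved the transfer forward, and gives up after 5 such retries. This is a Python illustration: `transfer_with_retries`, the `attempt(offset)` callback shape, and the choice to stop immediately when no progress was made are assumptions, not git-annex's actual logic.

```python
MAX_PROGRESS_RETRIES = 5  # the limit named in the commit message


def transfer_with_retries(attempt, max_retries=MAX_PROGRESS_RETRIES):
    """Run attempt(offset) until it succeeds or retrying stops making sense.

    attempt is a hypothetical callback returning (ok, new_offset), where
    new_offset is how many bytes have been transferred so far.
    """
    offset = 0
    retries = 0
    while True:
        ok, new_offset = attempt(offset)
        if ok:
            return True
        made_progress = new_offset > offset
        offset = max(offset, new_offset)
        if not made_progress or retries >= max_retries:
            return False
        retries += 1


if __name__ == "__main__":
    calls = []

    def always_fails_with_progress(offset):
        calls.append(offset)
        return (False, offset + 1)

    print(transfer_with_retries(always_fails_with_progress))
    print(len(calls))  # 1 initial attempt + 5 retries
```

Without the cap, a transfer that fails after each small step forward could be retried indefinitely, moving far more data in total than the file is worth.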
Joey Hess
5a9f518a42
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-02 12:25:52 -04:00
Joey Hess
854cd2ad47
httpalso: support exporttree=yes
Also tested what happens if the other special remote has importtree=yes
and exporttree=yes, and in that case, download via httpalso works too,
without needing to implement any importtree methods here.

It might be possible to make it automatically set exporttree=yes if the
--sameas does. Didn't try, will probably be layering issues.

Or perhaps it should be inherited by sameas like some
other configs? But then, wouldn't it also make sense to inherit
importtree=yes? But as shown here, it's not needed by this kind of
remote.
2020-09-02 11:26:00 -04:00
Joey Hess
8656afd3e1
rename http special remote to httpalso
"http" was too generic and easy to confuse with web. The new name makes
clear it's used in addition to some other remote. And other protocols
can use the same naming scheme.
2020-09-02 10:41:53 -04:00
yarikoptic
871257ee23 Added a comment 2020-09-01 20:50:02 +00:00
Joey Hess
20b06266d1
thought 2020-09-01 16:04:40 -04:00
Joey Hess
48dde6d5b0
link 2020-09-01 16:02:03 -04:00
Joey Hess
4bcfd56902
todo 2020-09-01 16:00:49 -04:00
Joey Hess
955f309cd5
comment 2020-09-01 15:51:41 -04:00
Joey Hess
d80876920f
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-01 15:36:56 -04:00
Joey Hess
5b177317b4
comment 2020-09-01 15:36:28 -04:00
yarikoptic
a70260160f noise about 500s 2020-09-01 19:29:29 +00:00
Joey Hess
571ec900ac
Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https.
With automatic layout learning!
2020-09-01 15:16:35 -04:00
Joey Hess
fccc9ab442
note common need of these two todos 2020-09-01 12:13:41 -04:00
Joey Hess
d91b2b9fe2
close as dup 2020-09-01 11:58:08 -04:00
Joey Hess
95d9a3cf8a
Merge branch 'asyncexternal' 2020-08-14 16:00:49 -04:00
Joey Hess
05b2b46a82
async extension done 2020-08-14 15:24:34 -04:00
Joey Hess
1ecbac4025
branch 2020-08-12 16:27:02 -04:00
Joey Hess
ddf69bf5b8
draft async extension 2020-08-11 16:42:09 -04:00
Joey Hess
db1c6da84b
close 2020-08-11 14:01:22 -04:00
Lukey
7077a7c2e7 Added a comment 2020-08-04 14:10:33 +00:00
Joey Hess
88e5ebcda7
runshell LD_HWCAP_MASK=0 optimisation 2020-08-03 14:34:15 -04:00
yarikoptic
7bd1e392dc Added a comment 2020-07-31 21:48:21 +00:00
yarikoptic
592db0629e Added a comment 2020-07-31 21:37:48 +00:00
kyle
960180dece Added a comment 2020-07-31 21:30:34 +00:00
yarikoptic
2b9d8b4e08 Added a comment 2020-07-31 21:23:57 +00:00
Joey Hess
f5724d78d4
comment 2020-07-31 16:47:23 -04:00
Joey Hess
8b4e5e6f68
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-31 14:58:02 -04:00
Joey Hess
c4ec52b9ae
Slightly sped up the linux standalone bundle
Reduce the number of directories listed in libdirs, which makes the linker
check a lot fewer dead ends looking for directories.

Eliminated some directories that didn't really contain shared libraries,
or only contained the linker.

That left only 2, one in lib and one in usr/lib, so consolidate those two.

Doing it this way, rather than just consolidating all libs that might exist
into a single directory means that, if there are optimised versions of some
libs, eg in lib/subarch/foo.so, and lib/subarch2/foo.so, they don't get
moved around in a way that would make the linker pick the wrong one.
2020-07-31 14:42:03 -04:00
Joey Hess
676257dfa8
comment 2020-07-31 13:29:48 -04:00
Ilya_Shlyakhter
85e57d8260 Added a comment: streaming data and external backends 2020-07-30 15:58:24 +00:00
Joey Hess
049807dbba
external backends implemented 2020-07-29 17:24:34 -04:00
yarikoptic
6b26802047 Added a comment 2020-07-28 15:26:43 +00:00
yarikoptic
61e96d7be4 Added a comment 2020-07-28 15:18:44 +00:00
yarikoptic
a846655d69 Added a comment 2020-07-27 21:40:55 +00:00
Joey Hess
3fafcc47bb
comment 2020-07-27 16:53:21 -04:00
Joey Hess
9d9f1f85d6
comment 2020-07-27 11:37:12 -04:00
Joey Hess
36d1621c35
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-27 11:34:03 -04:00
Joey Hess
3953c7a0ce
add DEBUG 2020-07-27 11:31:00 -04:00
ghen1
ad391da6b8 2020-07-27 15:24:21 +00:00
Joey Hess
32e1d7bc31
add 2020-07-24 14:11:08 -04:00
Joey Hess
c5ea2e9d12
better benchmark for move/copy speedup 2020-07-24 13:34:12 -04:00
Joey Hess
18f1fb5841
drop performance improvements
Sped up seeking files to drop by 2x, and also some performance
improvements to checking numcopies.

Interestingly, the seek speedup is not due to precaching, but I think is
due to calling getParsed earlier.

Annex.Drop had to be changed to check inAnnex there, since it was removed
from Command.Drop. All other users of Command.Drop already checked inAnnex
themselves.

This commit was sponsored by Ryan Newton on Patreon.
2020-07-24 13:27:46 -04:00
Joey Hess
d732ef1a89
move, copy: Sped up seeking for annexed files to operate on by a factor of nearly 2x. 2020-07-24 12:56:02 -04:00
Joey Hess
4685612f43
small git-annex get speedup
Remove a redundant inAnnex check. The checkContentPresent handles that,
and after the last commit also does in batch mode.
2020-07-22 14:29:30 -04:00
Joey Hess
1be92381ec
unify batch mode with non-batch by using AnnexedFileSeeker 2020-07-22 14:23:28 -04:00
Ilya_Shlyakhter
59917f8a6d Added a comment: external backend protocol 2020-07-21 17:43:27 +00:00
Joey Hess
abd56fb019
Fix a bug in find --batch in the previous version. 2020-07-20 19:50:53 -04:00
Joey Hess
f71310fed0
comment 2020-07-20 14:19:13 -04:00
Joey Hess
d1300eca2e
draft external backend protocol 2020-07-20 14:05:49 -04:00
Joey Hess
1489fbbdde
bug 2020-07-19 18:26:57 -04:00
yarikoptic
6a05388877 Added a comment 2020-07-18 05:09:54 +00:00
yarikoptic
7ee0bcbee7 Added a comment 2020-07-18 05:09:32 +00:00
yarikoptic
4ab711e153 Added a comment 2020-07-18 04:50:12 +00:00
yarikoptic
a6b0147b7f Added a comment 2020-07-18 04:49:49 +00:00
yarikoptic
d7b4df85e4 Added a comment 2020-07-18 04:34:26 +00:00
yarikoptic
5215fe92b9 Added a comment 2020-07-18 04:34:05 +00:00
yarikoptic
360de9446e Added a comment 2020-07-18 03:57:20 +00:00
yarikoptic
c46b9ac4ae initial 2nd wave of whining about startup time and to consider prelink or alike 2020-07-18 03:54:34 +00:00
yarikoptic
4f152089eb Added a comment: Windows build of file (which includes libmagic) 2020-07-16 22:01:17 +00:00
Joey Hess
a3a8779501
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-16 15:08:51 -04:00
Ilya_Shlyakhter
77299ae6e5 Added a comment: external backends 2020-07-16 17:30:55 +00:00
Joey Hess
5ab3849da3
thought 2020-07-15 20:42:53 -04:00
Joey Hess
034f958b09
comment 2020-07-15 14:02:31 -04:00
Joey Hess
360dc386e7
comment 2020-07-15 10:08:37 -04:00
Joey Hess
1bc015bff4
tag datalad at yoh's req 2020-07-15 09:51:57 -04:00
Joey Hess
e66ba410fc
todo 2020-07-14 21:44:31 -04:00
Joey Hess
f9b4a9f650
update 2020-07-14 14:47:22 -04:00
Joey Hess
7b2d236556
importfeed: stream metadata for 5% speedup
On top of the 10% speedup from streaming url logs.
2020-07-14 14:35:26 -04:00
Joey Hess
535cdc8d48
importfeed: Made checking known urls step around 10% faster.
This was a bit disappointing, I was hoping for a 2x speedup. But, I think
the metadata lookup is wasting a lot of time and also needs to be made to
stream.

The changes to catObjectStreamLsTree were benchmarked to not also speed
up --all around 3% more. Seems I managed to make it polymorphic after all.
2020-07-14 12:47:51 -04:00
Joey Hess
75aab72d23
mostly done with location log precaching
Some nice wins.
2020-07-13 17:04:02 -04:00
Joey Hess
df58609804
convert sync to use seekFilteredKeys
This only speeds up sync --content from 34.75 to 33.17 seconds;
location log precaching will probably be a bigger win.
2020-07-13 15:02:52 -04:00
Joey Hess
c70ae68d7e
update 2020-07-13 11:49:24 -04:00
Joey Hess
415d394222
thought 2020-07-13 11:04:57 -04:00
Joey Hess
a32b6f9812
update 2020-07-10 15:49:03 -04:00
Joey Hess
412b09e17e
update 2020-07-10 15:23:12 -04:00
Joey Hess
2468eefc6d
2x speedup for annex file seeking on the horizon 2020-07-10 14:02:48 -04:00
Joey Hess
1df9e72a78
update 2020-07-10 13:31:47 -04:00
Joey Hess
6b9d1c1317
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-10 13:16:11 -04:00
Joey Hess
6e9fcf468d
streamkeys branch 2020-07-09 14:48:03 -04:00
branchable@bafd175a4b99afd6ed72501042e364ebd3e0c45e
bbc3800369 Added a comment: Update on my auto-commit / auto-sync scripts 2020-07-09 14:23:15 +00:00
Ilya_Shlyakhter
96aad5458b Added a comment: re: git-annex-cat 2020-07-09 01:06:37 +00:00
Ilya_Shlyakhter
75b96059af Added a comment: git-annex-cat 2020-07-09 00:21:02 +00:00
Joey Hess
9f6bd6cc05
add inRepoDetails
planned to use for an optimisation

most things using stagedDetails were not expecting to get dup files in a
conflicted merge and deal with them, so converted them to use
inRepoDetails.
2020-07-08 15:36:35 -04:00
Joey Hess
c1eaf5b930
note 2020-07-08 14:21:37 -04:00
Joey Hess
d08c178f97
avoid catObjectStream skipping over unavailable shas
Not needed as it's used for --all, but will be needed later.
2020-07-08 13:57:17 -04:00
Joey Hess
de3d7d044d
make catObjectStream support newline and carriage return in filenames
Turns out the %(rest) trick was not needed. Instead, just maintain a
list of files we've asked for, and each cat-file response is for the
next file in the list.

This actually benchmarks 25% faster than before! Very surprising, but it
must be due to needing to shove less data through the pipe, and parse
less.
2020-07-08 13:49:03 -04:00
Joey Hess
2cf6717aec
thoughts 2020-07-08 10:51:24 -04:00
Joey Hess
5849bd6340
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-07 16:50:26 -04:00
Joey Hess
afd9b2f667
idea 2020-07-07 16:49:44 -04:00
yarikoptic
c9d0bf0e6a reassign to datalad - generic enhancement 2020-07-07 19:05:59 +00:00
Joey Hess
ba0adefe4c
Merge branch 'master' of ssh://git-annex.branchable.com 2020-07-07 14:19:46 -04:00
Joey Hess
d010ab04be
sped up the --all option by 2x to 16x by using git cat-file --buffer
This assumes that no location log files will have a newline or carriage
return in their name. catObjectStream skips any such files due to
cat-file not supporting them.

Keys have been prevented from containing newlines since 2011,
commit 480495beb4. If some old repo
had a key with a newline in it, --all will just skip processing that key.
Other things, like .git/annex/unused files certainly assume no newlines in
keys too, and AFAICR, such keys never actually worked.

Carriage return is escaped by preSanitizeKeyName since 2013. WORM keys
generated before that point could perhaps contain a CR. (URL probably not,
http probably doesn't support an URL with a raw CR in it.) So, added
a warning in fsck about such keys. Although, fsck --all will naturally
skip them, so won't be able to warn about them. Not entirely
satisfactory, but I'll bet there are not really any such keys in
existence.

Thanks to Lukey for finding this optimisation.
2020-07-07 13:54:04 -04:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
3b6754e2a5 2020-07-07 10:26:00 +00:00
timothy.sanders@a7ce3a8bae11a60e0c4cda9cb4aef24ec459bbab
8a9323f5b5 2020-07-07 10:24:29 +00:00
Lukey
56f5d99ceb Added a comment 2020-07-06 21:20:58 +00:00
Joey Hess
9468675ba9
note 2020-07-06 15:12:26 -04:00