37039 commits

Author SHA1 Message Date
Joey Hess
3252c4ccca
Merge branch 'master' of ssh://git-annex.branchable.com 2020-04-23 15:21:40 -04:00
Joey Hess
2aeb79249b
external: stop storing readonly=true in remote.log
readonly=true is used to make an external special remote that does not
need the external program to be installed. It was stored in the
remote.log by default, and so every time it was specified in an
enableremote or initremote, whatever value was used became the new
default for subsequent enableremotes of that remote.

That was surprising, and I consider it to be a bug.

It does not make much sense to pass it to initremote because then how
would you populate that remote with anything? You would have to
enableremote elsewhere, and store content there. I'm assuming nobody
used it that way.

Someone might rely on passing it to enableremote once, and then that
being inherited in other clones. But that is not how it's documented to
be used. It is barely documented in git-annex at all, only in the
external special remote protocol, and the documentation there says to
"Document that this external special remote can be used in readonly
mode." (by the user of it passing readonly=true to enableremote). The
one external special remote that I know of that does document that is
<https://github.com/bgilbert/gcsannex> (the one that motivated adding
it). That one's docs do say to pass it to enableremote.

So, it seemed safe to make this behavior change. If someone was in fact
relying on one of those behaviors, all their current repos will still
work as they configured them (although they will need to deal
with the related change in 9f3c2dfeda).
In new clones, they will find enableremote fails, complaining the
external program is not in path. An easy enough problem to recover from.
2020-04-23 15:21:26 -04:00
Joey Hess
9f3c2dfeda
stop using remote.name.annex-readonly for two distinct things 2020-04-23 14:56:03 -04:00
thk
697f7b93a2 2020-04-23 15:57:17 +00:00
kyle
d1dbd45743 remove spam 2020-04-22 13:53:20 +00:00
harimau
d23fee6a9e 2020-04-21 20:54:48 +00:00
Ilya_Shlyakhter
461e3c0b62 Added a comment: "dry run" option 2020-04-21 19:04:07 +00:00
Joey Hess
cd1676d604
fix bug involving local git remote and out of date location log
get --from, move --from: When used with a local git remote, these used to
silently skip files that the location log thought were present on the
remote, when the remote actually no longer contained them. Since that
behavior could be surprising, now instead display a warning.

I got very confused when I encountered this behavior, since it was silently
skipping a file I needed that whereis said was on the remote.

get without --from already displayed an "unable to access these remotes"
message, which, while a bit misleading in that the remote is likely
accessible but just doesn't contain the file, at least indicated something
went wrong.

Having get --from display a warning makes it in line with get
w/o --from, so seems certainly ok. It might be there are situations where
move --from is used, on eg a whole directory, and the user only wants to
move whatever is present in the remote, and is perfectly ok with files
that are not present being skipped. So I'm less sure about the new warning
being ok there. OTOH, only local git remotes avoided displaying a warning
in that case too, so this just brings them into line with other remotes.

(Also note that this makes it a little bit faster when dealing with a lot of
files, since it avoids a redundant stat of the file.)
2020-04-21 12:36:58 -04:00
Joey Hess
2f87c6db79
done 2020-04-21 11:30:49 -04:00
Joey Hess
87bab2d7c2
close 2020-04-21 11:29:51 -04:00
Joey Hess
04352ed9c5
check-ignore resource pool
Much like check-attr before.
2020-04-21 11:25:28 -04:00
Joey Hess
45fb7af21c
check-attr resource pool
Limited to min of -JN or number of CPU cores, because it will often be
CPU bound, once it's read the gitignore file for a directory.

In some situations it's more disk bound, but in any case it's unlikely
to be the main bottleneck that -J is used to avoid. Eg, when dropping,
this is used for numcopies checks, but the main bottleneck will be
accessing the remotes to verify presence. So the user might decide to
-J32 that, but having 32 check-attr processes would just waste however
many filehandles they open, and probably worsen their performance due to
CPU contention.

Note that I first tried just letting up to the -JN be started. However,
even when it's no bottleneck at all, that still results in all of them
being started. Why? Well, all the worker threads start up nearly
simultaneously, so there's a thundering herd.
2020-04-21 11:05:57 -04:00
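The sizing rule described in the commit above (the smaller of the -J value and the number of CPU cores) can be sketched as follows. This is an illustrative Python sketch, not git-annex's Haskell code; the function name pool_size and the cpu_count override are assumptions made for the example.

```python
import os


def pool_size(jobs, cpu_count=None):
    """Return how many helper processes to allow: min(-J value, CPU cores).

    `cpu_count` is a hypothetical override so the rule can be checked
    without depending on the machine the sketch runs on.
    """
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # Never go below one helper, even for odd inputs such as jobs=0.
    return max(1, min(jobs, cores))
```

Under this rule a user running with -J32 on a 4-core machine would get 4 helpers, while -J2 on an 8-core machine would still get 2.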
Joey Hess
cee6b344b4
cat-file resource pool
Avoid running a large number of git cat-file child processes when run with
a large -J value.

This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
2020-04-20 15:19:31 -04:00
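The general idea of a resource pool that bounds the number of child processes can be sketched as below. This is a minimal Python illustration under stated assumptions, not the git-annex implementation: the class name ResourcePool and the create factory are hypothetical, and the sketch does not reproduce the no-overhead path for runs without -J.

```python
import queue
import threading


class ResourcePool:
    """Hand out at most `limit` resources, creating them lazily and reusing them."""

    def __init__(self, limit, create):
        self._limit = limit
        self._create = create          # hypothetical factory, e.g. starts a helper process
        self._idle = queue.Queue()     # resources that have been released and can be reused
        self._created = 0
        self._lock = threading.Lock()

    def acquire(self):
        # Prefer a resource that some other worker has already finished with.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        # Otherwise create a new one, but only while under the limit.
        with self._lock:
            can_create = self._created < self._limit
            if can_create:
                self._created += 1
        if can_create:
            return self._create()
        # At the limit: block until another worker releases a resource.
        return self._idle.get()

    def release(self, resource):
        self._idle.put(resource)
```

A worker would call acquire() before talking to a helper and release() afterwards, so a large number of worker threads can share a small number of helpers.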
Joey Hess
87b7b0f202
comment 2020-04-20 12:06:14 -04:00
Joey Hess
5446379cd9
Merge branch 'master' of ssh://git-annex.branchable.com 2020-04-20 10:03:38 -04:00
thk
ffeef75917 2020-04-19 08:15:47 +00:00
Joey Hess
1b2dd74d8d
bug 2020-04-18 23:57:48 -04:00
yarikoptic
6c9c974e55 Added a comment 2020-04-18 02:14:32 +00:00
yarikoptic
67f0407477 Added a comment 2020-04-18 02:05:52 +00:00
Joey Hess
b480ce01f7
Merge branch 'master' of ssh://git-annex.branchable.com 2020-04-17 17:47:59 -04:00
Joey Hess
2da760fcae
comment 2020-04-17 17:32:49 -04:00
Joey Hess
529f488ec4
fix a thundering herd problem
Avoid repeatedly opening keys db when accessing a local git remote and -J
is used.

What was happening was that Remote.Git.onLocal created a new annex state
as each thread started up. The way the MVar was used did not prevent that.
And that, in turn, led to repeated opening of the keys db, as well as
probably other extra work or resource use.

Also managed to get rid of Annex.remoteannexstate, and it turned out there
was an unnecessary Maybe in the keysdbhandle, since the handle starts out
closed.
2020-04-17 17:09:29 -04:00
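The kind of problem fixed in the commit above, where each thread creates its own copy of an expensive state as it starts up, is commonly avoided by holding a lock across both the "is it there yet?" check and the creation. The following Python sketch shows that pattern; it is not the actual MVar-based Haskell code, and SharedState and count_creations are hypothetical names used only for illustration.

```python
import threading


class SharedState:
    """Create an expensive value at most once, even if many threads ask at once."""

    def __init__(self, create):
        self._create = create      # hypothetical factory, e.g. opens a database handle
        self._lock = threading.Lock()
        self._ready = False
        self._value = None

    def get(self):
        # Holding the lock across the check and the creation is what stops
        # simultaneous callers from each creating their own copy.
        with self._lock:
            if not self._ready:
                self._value = self._create()
                self._ready = True
            return self._value


def count_creations(workers=8):
    """Start `workers` threads together and report how often the factory ran."""
    calls = []
    state = SharedState(lambda: calls.append(1) or object())
    barrier = threading.Barrier(workers)

    def worker():
        barrier.wait()             # release all threads together to provoke the herd
        state.get()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls)
```

With this structure, count_creations(8) reports a single creation no matter how closely the threads start together.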
yarikoptic
a2b2708ab6 Added a comment: quick follow up 2020-04-17 20:34:55 +00:00
Joey Hess
fada5c120c
remove unused import 2020-04-17 15:19:49 -04:00
Joey Hess
fe9cf1256e
move remoteList into dupState
This does mean that RemoteDaemon.Transport.Tor's call runs it, otherwise
no change, but this is groundwork for doing more such expensive actions
in dupState.
2020-04-17 14:36:45 -04:00
Joey Hess
988317634b
comment 2020-04-17 14:11:17 -04:00
Joey Hess
6c39ec9b27
comment 2020-04-17 12:37:28 -04:00
Dan
b325dfea4d Added a comment: find wanted on remote? 2020-04-16 21:19:44 +00:00
yarikoptic
5dc513ccdb Added a comment 2020-04-16 03:43:13 +00:00
yarikoptic
1705d3657e Added a comment: it is many more "open files" in reality 2020-04-16 03:41:07 +00:00
Joey Hess
a7840c0e04
improve programPath
Fixes a failure mode where git-annex sync would try to run git-annex and
complain that it failed to find it in ~/.config/git-annex/program or PATH,
when there was a git-annex in /usr/bin/, but the original one was run
from elsewhere (eg, ~/bin) and happened not to be present any longer.

Now, it will fall back to using git-annex from PATH in such a case.
Which might fail due to some version incompatibility, but still better
than a misleading error message.

Also made readProgramFile only read the file, not look for git-annex in
PATH as a fallback. That fallback may have confused Assistant.Upgrade,
which really wants the value from the file.
2020-04-15 16:46:34 -04:00
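The fallback order described in the commit above (use the recorded program location if it still exists, otherwise look for git-annex in PATH) can be sketched roughly as follows. This is a simplified Python illustration, not git-annex's programPath; the function name program_path and its parameters are assumptions, and the handling of the ~/.config/git-annex/program file is reduced to a single recorded argument.

```python
import os
import shutil


def program_path(recorded, name="git-annex", search_path=None):
    """Pick the program to run: the recorded location if usable, else one from PATH.

    `recorded` stands in for the previously known location of the program
    (hypothetical parameter); `search_path` optionally overrides PATH.
    Returns None when nothing usable is found.
    """
    if recorded and os.path.isfile(recorded) and os.access(recorded, os.X_OK):
        return recorded
    # The recorded program is gone (or was never recorded): fall back to PATH.
    return shutil.which(name, path=search_path)
```

A caller would treat a None result as the point at which to report that git-annex could not be found.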
Joey Hess
957a87b437
fix absolute filenames fed into --batch and git-annex info 2020-04-15 16:04:05 -04:00
Joey Hess
a14168a321
reproduced 2020-04-15 15:06:53 -04:00
Joey Hess
503abb6d54
remove warning about git gc for annex.alwayscommit=false
I doubt that warning has ever been right, but I'm sure it is not right
now.

For there to be a risk of git gc deleting objects that are in the annex
index, journal files would have to be staged into it, and deleted, but
the index not committed to the git-annex branch. And AFAICS, there is no
code path where that actually happens. I considered adding one recently,
but didn't.

The way it actually works is, as long as the user has annex.alwayscommit=false
the data lives in the journal, where it's safe from git gc. Then when
git-annex is run w/o that config, the journal is staged into the index,
which is immediately committed to the branch. There's no window where
git gc could delete the objects, because git gc only deletes objects
after some time (2 weeks by default).

Now, if git-annex gets suspended at just the wrong time, or interrupted,
then yes, it's possible. But doesn't matter whether that config was ever
set or not. And many uses of git-annex also recover from that situation
by committing to the git-annex branch.
2020-04-15 14:25:33 -04:00
Joey Hess
ddadc0c1aa
Merge branch 'master' of ssh://git-annex.branchable.com 2020-04-15 14:20:06 -04:00
Joey Hess
891e9a81eb
close bug that was apparently fixed satisfactorily 2020-04-15 14:17:21 -04:00
Joey Hess
a2fed82267
close 2020-04-15 14:15:41 -04:00
Joey Hess
7ef030b576
close old bug since git-annex no longer uses rsync like it used to 2020-04-15 14:08:48 -04:00
kyle
1a040e0c0a Added a comment 2020-04-15 18:02:29 +00:00
Joey Hess
f85ca7dc80
fix all remaining -Wincomplete-uni-patterns warnings
A couple of these were probably actual bugs in edge cases. Most of the
changes I'm fine with. The fact that aeson's object returns something
that we know will be an Object, but the type checker does not know is
kind of annoying.
2020-04-15 13:55:08 -04:00
Joey Hess
43a9808292
disable journal read optimisation when alwayscommit=false
The journal read optimisation in aeca7c220 later got fixed in eedd73b84
to stage and commit any files that were left in the journal by a
previous git-annex run. That's necessary for the optimisation to work
correctly. But it also meant that alwayscommit=false started committing
the previous git-annex process's journalled changes, which defeated the
purpose of the config setting entirely.

So, disable the optimisation when alwayscommit=false, leaving the
files in the journal and not committing them. See my comments on the bug
report for why this seemed the best approach.

Also fixes a problem when annex.merge-annex-branches=false and there
are changes in the journal. That config indirectly prevents committing
the journal. (Which seems a bit odd given its name, but it always has..)
So, when there were changes in the journal, perhaps left there due to
alwayscommit=false being set before, the optimisation would prevent
git-annex from reading the journal files, and it would operate with out
of date information.
2020-04-15 13:24:33 -04:00
Joey Hess
0e4c92503e
fix warning
I don't think the NoConfigValue case ever actually occurs here.
2020-04-15 13:04:00 -04:00
Joey Hess
9f17242f29
comment 2020-04-15 12:48:55 -04:00
Joey Hess
8ac44498d6
comment 2020-04-15 12:43:22 -04:00
Joey Hess
1aa7082c9a
better response 2020-04-15 12:17:57 -04:00
Joey Hess
b241f579c0
hm 2020-04-15 12:14:39 -04:00
Joey Hess
f4d5ec1457
correction 2020-04-15 12:12:11 -04:00
Joey Hess
520ddf1c75
comment 2020-04-15 12:10:51 -04:00
Christoph.Schmidpeter@d3e5d124c7d5459315c2a9f983ab9a70b88e1d03
95b505c22f 2020-04-15 13:06:14 +00:00
Christoph.Schmidpeter@d3e5d124c7d5459315c2a9f983ab9a70b88e1d03
98c1047a29 removed 2020-04-15 13:00:12 +00:00