Commit graph

1045 commits

Author SHA1 Message Date
Joey Hess
9f3c2dfeda
stop using remote.name.annex-readonly for two distinct things 2020-04-23 14:56:03 -04:00
Joey Hess
cd1676d604
fix bug involving local git remote and out of date location log
get --from, move --from: When used with a local git remote, these used to
silently skip files that the location log thought were present on the
remote, when the remote actually no longer contained them. Since that
behavior could be surprising, now instead display a warning.

I got very confused when I encountered this behavior, since it was silently
skipping a file I needed that whereis said was on the remote.

get without --from already displayed a "unable to access these remotes"
message, which while a bit misleading in that the remote is likely
accessible, but just doesn't contain the file, at least indicated something
went wrong.

Having get --from display a warning makes it in line with get
w/o --from, so seems certianly ok. It might be there are situations where
move --from is used, on eg a whole directory, and the user only wants to
move whatever is present in the remote, and is perfectly ok with files
that are not present being skipped. So I'm less sure about the new warning
being ok there. OTOH, only local git remotes avoiding displaying a warning
in that case too, so this just brings them into line with other remotes.

(Also note that this makes it a little bit faster when dealing with a lot of
files, since it avoids a redundant stat of the file.)
2020-04-21 12:36:58 -04:00
Joey Hess
04352ed9c5
check-ignore resource pool
Much like check-attr before.
2020-04-21 11:25:28 -04:00
Joey Hess
45fb7af21c
check-attr resource pool
Limited to min of -JN or number of CPU cores, because it will often be
CPU bound, once it's read the gitignore file for a directory.

In some situations it's more disk bound, but in any case it's unlikely
to be the main bottleneck that -J is used to avoid. Eg, when dropping,
this is used for numcopies checks, but the main bottleneck will be
accessing the remotes to verify presence. So the user might decide to
-J32 that, but having 32 check-attr processes would just waste however
many filehandles they open, and probably worsen their performance due to
CPU contention.

Note that, I first tried just letting up to the -JN be started. However,
even when it's no bottleneck at all, that still results in all of them
being started. Why? Well, all the worker threads start up nearly
simulantaneously, so there's a thundering herd..
2020-04-21 11:05:57 -04:00
Joey Hess
cee6b344b4
cat-file resource pool
Avoid running a large number of git cat-file child processes when run with
a large -J value.

This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
2020-04-20 15:19:31 -04:00
Joey Hess
529f488ec4
fix a thundering herd problem
Avoid repeatedly opening keys db when accessing a local git remote and -J
is used.

What was happening was that Remote.Git.onLocal created a new annex state
as each thread started up. The way the MVar was used did not prevent that.
And that, in turn, led to repeated opening of the keys db, as well as
probably other extra work or resource use.

Also managed to get rid of Annex.remoteannexstate, and it turned out there
was an unncessary Maybe in the keysdbhandle, since the handle starts out
closed.
2020-04-17 17:09:29 -04:00
Joey Hess
957a87b437
fix absolute filenames fed into --batch and git-annex info 2020-04-15 16:04:05 -04:00
Joey Hess
5a62e8132d
When parsing git configs, support all the documented ways to write true and false, including "yes", "on", "1", etc.
This change does impact git-annex config
eg "git annex config --set annex.addunlocked on"
will store "on" and new git-annex will understand that value, while
old git-annex will error:
git-annex: bad annex.addunlocked configuration in git annex config:
Parse failure: near "on"
That seems acceptable.

Not special remote configs that are only documented as =true or =false
however. Having git-annex support other values for those would break
backwards compatability when used with old versions of git-annex. And
older versions ignore invalid special remote configs.. That would not
be a good combination.
2020-04-13 14:05:30 -04:00
Joey Hess
9cb69dbb76
support boolean git configs that are represented by the name of the setting with no value
Eg"core.bare" is the same as "core.bare = true".

Note that git treats "core.bare =" the same as "core.bare = false", so the
code had to become more complicated in order to treat the absense of a
value differently than an empty value. Ugh.
2020-04-13 13:35:22 -04:00
Joey Hess
ca9c6c5f60
Fix a potential failure to parse git config
Git has an obnoxious special case in git config, a line "foo" is the same
as "foo = true". That means there is no way to examine the output of
git config and tell if it was run with --null or not, since a "foo"
in the first line could be such a boolean, or could be followed by its
value on the next line if --null were used.

So, rather than trying to do such a detection, track the style of config
at all the points where it's generated.
2020-04-13 13:05:41 -04:00
Joey Hess
86426036a0
optimise catfile interface with ByteString and Attoparsec
Around 3% total speedup.

Profiling git annex find --not --in web, it's now bytestring end-to-end,
and there is only a little added overhead in eg accessing the Annex
state MVar (3%). The rest of the runtime is spent reading symlinks, and
in attoparsec.

This feels like the end of the optimisation road, without a major change
like caching information for faster queries.
2020-04-10 14:18:52 -04:00
Joey Hess
2caf579718
cache annex index filename for 1.5% speedup to queries 2020-04-10 13:37:04 -04:00
Joey Hess
aeca7c2207
Sped up query commands that read the git-annex branch by around 5%
The only price paid is one additional MVar read per write to the journal.
Presumably writing a journal file dominiates over a MVar read time by
several orders of magnitude.

--batch does not get the speedup because then it needs to notice when
another process has made a change. Also made the assistant and other damon
modes bypass the optimisation, which would not help them anyway.
2020-04-09 13:54:43 -04:00
Joey Hess
541803bb52
changelog for ByteString Ref 2020-04-08 14:10:29 -04:00
Joey Hess
7ebc118776
adb: Better messages when the adb command is not installed
After a user completely ignored the display of the exception probably
because it didn't make sense..

This does make it a little bit slower since it checks adb is in path each
time before running it. Also, it might display a lot of warnings about it
not being installed.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-04-02 10:48:28 -04:00
Joey Hess
87d5583a91
use programPath consistently, not readProgramFile
Improve git-annex's ability to find the path to its program, especially
when it needs to run itself in another repo to upgrade it.

Some parts of the code used readProgramFile, probably because I forgot that
programPath exists.

I noticed this when a git-annex auto-upgrade failed because it was running
git-annex upgrade --autoonly, but the code to run git-annex used
readProgramFile, which happened to point to an older build of git-annex.
2020-03-30 16:06:27 -04:00
Joey Hess
dbad6c5c39
releasing package git-annex version 8.20200330 2020-03-30 13:46:24 -04:00
Joey Hess
c9ac7aa338
patch applied 2020-03-26 15:18:47 -04:00
Joey Hess
173465592f
changelog for Kyle's other fix, and close bug 2020-03-26 13:18:41 -04:00
Joey Hess
c0ceb969e6
changelog for Kyle's fix 2020-03-26 13:12:49 -04:00
Joey Hess
bbe977a2b6
fsck: Fix reversion in 8.20200226 that made it incorrectly warn that hashed keys with an extension should be upgraded. 2020-03-20 13:09:16 -04:00
Joey Hess
4b92bbe8d7
webdav: Made exporttree remotes faster by caching connection to the server
Followed example of Remote.S3.
2020-03-20 12:48:43 -04:00
Joey Hess
c8fec6ab03
Fix a minor bug that caused options provided with -c to be passed multiple times to git. 2020-03-16 13:06:44 -04:00
Joey Hess
14a4a9f4cd
releasing package git-annex version 8.20200309 2020-03-09 17:08:16 -04:00
Joey Hess
4ce518998a
Fix upgrade failure when a file has been deleted from the working tree 2020-03-09 16:59:18 -04:00
Joey Hess
afe72d04ff
fix problems with upgrade of local remotes
Upgrade other repos than the current one by running git-annex upgrade
inside them, which avoids problems with upgrade code making assumptions
that the cwd will be inside the repo being upgraded.

In particular, this fixes a problem where upgrading a v7 repo to v8 caused
an ugly git error message.

I actually could not find a way to make Upgrade.V7 work properly
without changing directory to the remote. Once I got git ls-files to work,
the git cat-file failed because :path can only be used in the current git
repo.
2020-03-09 16:49:28 -04:00
Joey Hess
d930a2035c
Avoid converting .git file in a worktree or submodule to a symlink when the repository is not a git-annex repository.
This means it will still be a .git file when git-annex init runs. That's
ok, the repo probably contains no annexed objects yet, and even if it does,
git-annex init does not care if symlinks in the worktree don't point to the
objects.

I made init, at the end, run the conversion code. Not really necessary
because the next git-annex command could do it just as well. But, this
avoids commands that don't normally write to the repo needing to write to
it, which might avoid some problem or other, and seems worth avoiding
generally.
2020-03-09 14:54:14 -04:00
Joey Hess
c0a981cb0e
update comment 2020-03-09 14:31:28 -04:00
Joey Hess
8a58f67170
remove changelog for problem I've not fixed yet 2020-03-09 14:24:40 -04:00
Joey Hess
1978a24207
Fix bug that caused unlocked annexed dotfiles to be added to git by the smudge filter when annex.dotfiles was not set. 2020-03-09 14:20:02 -04:00
Joey Hess
70d24c0302
add a comment about CWD
While git ls-files can actually be used on a repo that is not in the
cwd, it works inconsistently. For example, this fails:

git --git-dir=../foo/.git --work-tree=../foo ls-files ../foo

But change some of the paths to absolute and it will succeed. That seems
like a bug in git.

OTOH, this succeeds:

git --git-dir=../foo/.git --work-tree=../foo ls-files

But, that lists paths relative to the top of the --work-tree,
rather than the usual listing them relative to the cwd. Because the cwd
is not in the repo. And so anything parsing the ls-files output of that
is likely to operate on files in the wrong location. Indeed, there is
code in Upgrade/ that has this problem!
2020-03-09 13:37:01 -04:00
Joey Hess
6a91471923
GETCONFIG name fix
Fix regression that prevented external special remotes from using GETCONFIG
to query values like "name". (Introduced in version 7.20200202.7.)
2020-03-09 12:38:04 -04:00
Joey Hess
093fde5abd
completed the createDirectoryIfMissing conversion
Remaining calls in the assistant and Annex.Ssh have been audited and are ok.
2020-03-06 12:55:03 -04:00
Joey Hess
88f721549d
Linux standalone: Use md5sum to shorten paths in .cache/git-annex/locales
md5sum is part of busybox, so is probably available unless it were compiled
out. If md5sum (or cut for that matter) is not available, it will
still use the whole path to $base, otherwise hash it.

Of course it's possible for md5sum to be available sometimes and not others
on the same system; in such an event the locales would be built twice for
the same bundle. The cleanup code will delete both sets once that
version of the bundle is upgraded.
2020-03-04 13:04:56 -04:00
Joey Hess
2099e40664
improve wording 2020-03-02 16:27:29 -04:00
Joey Hess
ccd8c43dc8
git-annex config: guard against non-repo-global configs
git-annex config: Only allow configs be set that are ones git-annex
actually supports reading from repo-global config, to avoid confused users
trying to set other configs with this.
2020-03-02 15:54:18 -04:00
Joey Hess
c578e8ebfd
close already fixed bug
and document it in the changelog for the release where it was fixed
2020-03-02 13:47:47 -04:00
Joey Hess
f9a116f056
stack.yaml: Updated to lts-14.27. 2020-03-02 13:25:51 -04:00
Joey Hess
f6d629e483
changelog and minor style 2020-02-28 12:57:55 -04:00
Joey Hess
2366e7fb84
catch whereisKey exception and provide error messages when external programs neglect to
* whereis: If a remote fails to report on urls where a key
  is located, display a warning, rather than giving up and not displaying
  any information.
* When external special remotes fail but neglect to provide an error
  message, say what request failed, which is better than displaying an
  empty error message to the user.
2020-02-27 14:09:18 -04:00
Joey Hess
c089f395b0
Bugfix: Don't ignore --debug when it is followed by -c 2020-02-27 00:52:37 -04:00
Joey Hess
8bc393a63b
releasing package git-annex version 8.20200226 2020-02-26 19:09:23 -04:00
Joey Hess
81e3faf810
Merge branch 'v7' 2020-02-26 18:15:18 -04:00
Joey Hess
d37975357d
Bugfix: export --tracking (a deprecated option) set annex-annex-tracking-branch, instead of annex-tracking-branch.
(cherry picked from commit a3a674d15b)
2020-02-26 18:08:04 -04:00
Joey Hess
e535da621c
Bugfix to getting content from an export remote with -J, when the export database was not yet populated.
(cherry picked from commit e520341500)
2020-02-26 18:07:20 -04:00
Joey Hess
8af6d2c3c5
fix encryption of content to gcrypt and git-lfs
Fix serious regression in gcrypt and encrypted git-lfs remotes.
Since version 7.20200202.7, git-annex incorrectly stored content
on those remotes without encrypting it.

Problem was, Remote.Git enumerates all git remotes, including git-lfs
and gcrypt. It then dispatches to those. So, Remote.List used the
RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt,
and that parser does not know about encryption fields, so did not
include them in the ParsedRemoteConfig. (Also didn't include other
fields specific to those remotes, perhaps chunking etc also didn't
get through.)

To fix, had to move RemoteConfig parsing down into the generate methods
of each remote, rather than doing it in Remote.List.

And a consequence of that was that ParsedRemoteConfig had to change to
include the RemoteConfig that got parsed, so that testremote can
generate a new remote based on an existing remote.

(I would have rather fixed this just inside Remote.Git, but that was not
practical, at least not w/o re-doing work that Remote.List already did.
Big ugly mostly mechanical patch seemed preferable to making git-annex
slower.)
2020-02-26 18:05:36 -04:00
Joey Hess
9050788b66
info: Fix display of the encryption value. (Some debugging junk had crept in.) 2020-02-26 15:02:23 -04:00
Joey Hess
e520341500
Bugfix to getting content from an export remote with -J, when the export database was not yet populated. 2020-02-26 14:57:29 -04:00
Joey Hess
9659f1c30f
annex.security.allowed-ip-addresses ports syntax
Extended annex.security.allowed-ip-addresses to let specific ports of an IP
address to be used, while denying use of other ports.
2020-02-25 15:45:52 -04:00
Joey Hess
1bb32098d6
jump right to v8, don't stop part way
* init --version: When the version given is one that automatically
  upgrades to a newer version, use the newer version instead.
* Auto upgrades from older repo versions, like v5, now jump right to v8.
2020-02-24 13:21:00 -04:00
Joey Hess
09df58c4ea
handle keys with extensions consistently in all locales
Fix some cases where handling of keys with extensions varied depending on
the locale.

A filename with a unicode extension would before generate a key with an
extension in a unicode locale, but not in LANG=C, because the extension
was not all alphanumeric. Also the the length of the extension could be
counted differently depending on the locale.

In a non-unicode locale, git-annex migrate would see that the extension
was not all alphanumeric and want to "upgrade" it. Now that doesn't happen.

As far as backwards compatability, this does mean that unicode
extensions are counted by the number of bytes, not number of characters.
So, if someone is using unicode extensions, they may find git-annex
stops using them when adding files, because their extensions are too
long. Keys already in their repo with the "too long" extensions will
still work though, so this only prevents adding the same content with
the same extension generating the same key. Documented this by
documenting that annex.maxextensionlength is a number of bytes.

Also, if a filename has an extension that is not valid utf-8 and the
locale is utf-8, the extension will be allowed now, and an old
git-annex, in the same locale would not, and would also want to
"upgrade" that.
2020-02-20 17:30:25 -04:00
Joey Hess
3d00a8a1dc
Makefile: Support newer versions of cabal that use the new-build system.
Unfortunately, cabal puts the binary in a very complicated path
and does not provide any good way to get it out, leaving no good choice
except to use find.

It may be possible to use cabal (new-)install --symlink-bindir,
and ask it to symlink to pwd, but with my older version of cabal,
that does not work.

The stack branch will probably also break once it uses a newer cabal,
didn't try to deal with that.
2020-02-20 14:13:21 -04:00
Joey Hess
c1bfe86f6a
mention speedup 2020-02-19 15:09:00 -04:00
Joey Hess
0f3ecbe7e8
tighten up description of v8 upgrade 2020-02-19 15:07:48 -04:00
Joey Hess
d431dba1a8
merged v8 into master 2020-02-19 14:37:56 -04:00
Joey Hess
029c883713
Merge branch 'master' into v8 2020-02-19 14:32:11 -04:00
Joey Hess
a22ed03d0f
tighten up docs of dotfiles changes 2020-02-19 14:29:50 -04:00
Joey Hess
79a0435b77
automate remote.name.skipFetchAll
initremote, enableremote: Set remote.name.skipFetchAll when the remote
cannot be fetched from by git, so git fetch --all will not try to use it.
2020-02-19 13:58:26 -04:00
Joey Hess
a3a674d15b
Bugfix: export --tracking (a deprecated option) set annex-annex-tracking-branch, instead of annex-tracking-branch. 2020-02-19 13:34:24 -04:00
Joey Hess
cd8a208b8c
releasing package git-annex version 7.20200219 2020-02-19 12:45:30 -04:00
Joey Hess
a78eb6dd58
sync --only-annex and annex.synconlyannex
* Added sync --only-annex, which syncs the git-annex branch and annexed
  content but leaves managing the other git branches up to you.
* Added annex.synconlyannex git config setting, which can also be set with
  git-annex config to configure sync in all clones of the repo.

Use case is then the user has their own git workflow, and wants to use
git-annex without disrupting that, so they sync --only-annex to get the
git-annex stuff in sync in addition to their usual git workflow.

When annex.synconlyannex is set, --not-only-annex can be used to override
it.

It's not entirely clear what --only-annex --commit or --only-annex
--push should do, and I left that combination not documented because I
don't know if I might want to change the current behavior, which is that
such options do not override the --only-annex. My gut feeling is that
there is no good reasons to use such combinations; if you want to use
your own git workflow, you'll be doing your own committing and pulling
and pushing.

A subtle question is, how should import/export special remotes be handled?
Importing updates their remote tracking branch and merges it into master.
If --only-annex prevented that git branch stuff, then it would prevent
exporting to the special remote, in the case where it has changes that
were not imported yet, because there would be a unresolved conflict.

I decided that it's best to treat the fact that there's a remote tracking
branch for import/export as an implementation detail in this case. The more
important thing is that an import/export special remote is entirely annexed
content, and so it makes a lot of sense that --only-annex will still sync
with it.
2020-02-17 16:33:10 -04:00
Joey Hess
879f52a116
annex.tune.branchhash1=true bugfix
Fix support for repositories tuned with annex.tune.branchhash1=true,
including --all not working and git-annex log not displaying anything for
annexed files.
2020-02-14 15:22:48 -04:00
Joey Hess
352963690a
fsck --from remote -J concurrency bug
fsck --from remote: Fix a concurrency bug that could make it incorrectly
detect that content in the remote is corrupt, and remove it, resulting in
data loss.
2020-02-14 14:52:15 -04:00
Joey Hess
399319ccbc
Avoid throwing fatal errors when asked to write to a readonly git remote on http
Test suite found one of them, looking for giveup turned up several more.
2020-02-14 14:38:13 -04:00
Joey Hess
a490947068
annex.sshcaching warning improvement and allow overridding build time default
* When git-annex is built with a ssh that does not support ssh connection
  caching, default annex.sshcaching to false, but let the user override it.
* Improve warning messages further when ssh connection caching cannot
  be used, to clearly state why.
2020-02-14 14:21:03 -04:00
Joey Hess
46bf2a259b
releasing package git-annex version 7.20200204 2020-02-04 14:33:03 -04:00
Joey Hess
c9357bdc0e
ifdef persistent-template 2.8.0 fixes
The i386ancient build has a ghc too old for these extensions.

Build with persistent-template 2.8.0 tested.
2020-02-04 13:53:00 -04:00
Joey Hess
ee718fb35d
Makefile: Really move the fish completion to the vendor_completions.d directory. 2020-02-04 12:10:09 -04:00
Joey Hess
4920df6573
Fix build with newest version of persistent-template.
This is untested because of rain, also I am operating from truncated
copiler error messages in a bug report that also doesn't mention what the
library version is. Still, it should work.

May break builds with old ghc, in particular DerivingStrategies is
I think fairly new? The pragmas could be ifdefed if necessary. Works with
ghc 8.6.5.
2020-02-04 12:03:30 -04:00
Joey Hess
467cc50bb4
releasing package git-annex version 7.20200202.7 2020-02-02 16:55:38 -04:00
Joey Hess
5c3d06b070
Makefile: Move the fish completion to the vendor_completions.d directory. 2020-01-23 16:42:08 -04:00
Joey Hess
5c3636037b
Display a warning when concurrency is enabled but ssh connection caching is not enabled or won't work due to a crippled filesystem
A warning message is unsatisfying. But erroring out is too hard a failure,
especially since it may well work fine if the user has enabled passwordless
ssh.

I did think about falling back to one ssh connection at a time in this
case, but it would have needed a rework of every ssh call, which
seems far overboard for such a niche problem. There's no single place where
git-annex runs ssh, so no one place that it could block a concurrent call
on a semaphore. And, even if it did fall back to one ssh connection at a
time, it seems to me that doing so without warning the user about the
problem just invites bug reports like "git-annex is ignoring my -J2 and
only doing one download at a time". So a warning is needed, and I suppose
is good enough.
2020-01-23 12:35:46 -04:00
Joey Hess
1883f7ef8f
support git remotes that need http basic auth
using git credential to get the password

One thing this doesn't do is wrap the password prompting inside the prompt
action. So with -J, the output can be a bit garbled.
2020-01-22 16:16:19 -04:00
Joey Hess
d227093002
avoid ugly error message
Http remotes that do expose a git config file, but are not initialized
resulted in an ugly and unncessary error message, now sqelched.

When git-annex-shell configlist is run w/o the autoinit field, it may
not generate a uuid for the repository. So in that case, it's not
unexpected for the config it does list to not include a UUID, and
dumping out the config in a warning message is not needed.

If configlist is asked to autoinit and we don't get back a config with a
UUID in it, that suggests some problem, and what we got back may not be
a config at all but some diagnostic message, so it does make sense to
output it then.
2020-01-22 11:57:20 -04:00
Joey Hess
5c6bf1be97
--whatelse is a better name than --describe-other-params
The use case is basically the user having forgotten, so --help would be
best, but it would be quite hard to include this in --help, since it may
even have to spin up an external special remote program.

I also considered --umm but typoed it the first time I tried it as
--uum, and while memorable, it's too cutesy. --whatelse is good because
it explicitly asks, what other params, besides the ones I've given?
2020-01-20 17:04:45 -04:00
Joey Hess
aa949bbb7d
initremote --describe-other-params
Does not yet include descriptions from external special remote programs.
2020-01-20 16:05:51 -04:00
Joey Hess
99cb3e75f1
add LISTCONFIGS to external special remote protocol
Special remote programs that use GETCONFIG/SETCONFIG are recommended
to implement it.

The description is not yet used, but will be useful later when adding a way
to make initremote list all accepted configs.

configParser now takes a RemoteConfig parameter. Normally, that's not
needed, because configParser returns a parter, it does not parse it
itself. But, it's needed to look at externaltype and work out what
external remote program to run for LISTCONFIGS.

Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported
used to use NoUUID. The code that now checks for Nothing used to behave
in some undefined way if the external program made requests that
triggered it.

Also, note that in externalSetup, once it generates external,
it parses the RemoteConfig strictly. That generates a
ParsedRemoteConfig, which is thrown away. The reason it's ok to throw
that away, is that, if the strict parse succeeded, the result must be
the same as the earlier, lenient parse.

initremote of an external special remote now runs the program three
times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again
LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least
one of those, and it should be possible to only run the program once.
2020-01-17 16:07:17 -04:00
Joey Hess
9c45eca37d
update 2020-01-15 14:08:44 -04:00
Joey Hess
71ecfbfccf
be stricter about rejecting invalid configurations for remotes
This is a first step toward that goal, using the ProposedAccepted type
in RemoteConfig lets initremote/enableremote reject bad parameters that
were passed in a remote's configuration, while avoiding enableremote
rejecting bad parameters that have already been stored in remote.log

This does not eliminate every place where a remote config is parsed and a
default value is used if the parse false. But, I did fix several
things that expected foo=yes/no and so confusingly accepted foo=true but
treated it like foo=no. There are still some fields that are parsed with
yesNo but not not checked when initializing a remote, and there are other
fields that are parsed in other ways and not checked when initializing a
remote.

This also lays groundwork for rejecting unknown/typoed config keys.
2020-01-10 14:52:48 -04:00
Joey Hess
5e4deb3620
support sha256 git repos
Git will eventually switch to sha2 and there will not be one single
shaSize anymore, but two (40 and 64).

Changed all parsers for git plumbing output to support both sizes of
shas.

One potential problem this does not deal with is, if somewhere in
git-annex it reads two shas from different sources, and compares them
to see if they're the same sha, it would fail if they're sha1 and sha256
of the same value. I don't know if that will really be a concern.
2020-01-07 12:22:19 -04:00
Joey Hess
2de3dddfd2
reinject --known: Fix bug that prevented it from working in a bare repo.
ifAnnexed in a bare repo passes to git cat-file :./filename , which it
refuses to do since the repo is bare.

Note that, reinject somefile someannexedfile in a bare repo silently does
nothing, because someannexedfile is never actually an annexed worktree
file, because the repo is bare.
2020-01-06 14:22:22 -04:00
Joey Hess
2cea674d1e
Merge branch 'master' into v8 2020-01-01 14:26:43 -04:00
Joey Hess
503788238c
add --force-annex/--force-git
options make it easier to override annex.largefiles configuration
(and potentially safer as it avoids bugs like the smudge bug fixed
in the last release)

Deleted some old comments that were posted to the man page discussing such
options.

Updated docs that used -c annex.largefiles to use the options.

Note that addSmallOverridden was needed to avoid the clean filter running
on the file. It would be possible to make addFile also update the index
directly, rather than going via git add. However, it was not necessary,
and I want to avoid breaking on some edge case, particularly if the code in
addSmallOverridden has some oversight.

Also, when annex.addunlocked is set and annex.largefiles does not match a file,
git annex add --force-large works, but git status will then show the file
as added, with a unstaged modification. The unstaged modification adds the
file to git. This is identical behavior to using -c annex.largefiles=nothing
when annex.addunlocked is set. This does not prevent committing what was
intended to be added. I have not gotten to the bottom of why git thinks
the file is modified and runs it through the clean filter in this case.
2020-01-01 14:03:06 -04:00
Joey Hess
985373f8e7
releasing package git-annex version 7.20191230 2019-12-30 14:49:31 -04:00
Joey Hess
ea3cb7d277
fix a case where file tracked by git unexpectedly becomes annex pointer file
smudge: When annex.largefiles=anything, files that were already stored in
git, and have not been modified could sometimes be converted to being
stored in the annex. Changes in 7.20191024 made this more of a problem.
This case is now detected and prevented.
2019-12-27 15:08:03 -04:00
Joey Hess
3cd3757236
annex.dotfiles
The git add behavior changes could be avoided if it turns out to be
really annoying, but then it would need to behave the old way when
annex.dotfiles=false and the new way when annex.dotfiles=true. I'd
rather not have the config option result in such divergent behavior as
`git annex add .` skipping a dotfile (old) vs adding to annex (new).

Note that the assistant always adds dotfiles to the annex.
This is surprising, but not new behavior. Might be worth making it also
honor annex.dotfiles, but I wonder if perhaps some user somewhere uses
it and keeps large files in a directory that happens to begin with a
dot. Since dotfiles and dotdirs are a unix culture thing, and the
assistant users may not be part of that culture, it seems best to keep
its current behavior for now.
2019-12-26 16:33:39 -04:00
Joey Hess
2b821eb225
Merge branch 'master' into sqlite 2019-12-26 15:15:42 -04:00
Joey Hess
444d5591ee
Improve file ordering behavior when one parameter is "." and other parameters are other directories
eg, `git-annex get . ..` used to order the files strangly, because it
did not realize that when git ls-files output eg "foo", that should be
grouped with the first set of files and not the second set.

Fixed by making            dirContains "." "./foo" = True
which makes sense, because dirContains ".." "../foo" = True
2019-12-20 18:01:29 -04:00
Joey Hess
37467a008f
annex.addunlocked expressions
* annex.addunlocked can be set to an expression with the same format used by
  annex.largefiles, in case you want to default to unlocking some files but
  not others.
* annex.addunlocked can be configured by git-annex config.

Added a git-annex-matching-expression man page, broken out from
tips/largefiles.

A tricky consequence of this is that git-annex add --relaxed
honors annex.addunlocked, but an expression might want to know the size
or content of an url, which it's not going to download. I decided it was
better not to fail, and just dummy up some plausible data in that case.

Performance impact should be negligible. The global config is already
loaded for annex.largefiles. The expression only has to be parsed once,
and in the simple true/false case, it should not do any additional work
matching it.
2019-12-20 15:56:25 -04:00
Joey Hess
5591622731
git-annex-config --set/--unset: No longer change the local git config setting
e53070c1f quietly made it set the local git config too, but that was never
documented anywhere, and it had surprising results. If I set
annex.largefiles globally in a repo, I would expect to be able to change it
in another repo, and the original repo would get the change and use it,
rather than being stuck on the old value set there.

And, if I have a local annex.largefiles and set a different global default,
I'd be surprised to have my local setting overwritten.

annex.securehashesonly does need to be set locally, since it's a security
feature and the global is only a default until it gets set locally. So
special cased.
2019-12-20 13:17:28 -04:00
Joey Hess
4acbb40112
git-annex config annex.largefiles
annex.largefiles can be configured by git-annex config, to more easily set
a default that will also be used by clones, without needing to shoehorn the
expression into the gitattributes file. The git config and gitattributes
override that.

Whenever something is added to git-annex config, we have to consider what
happens if a user puts a purposfully bad value in there. Or, if a new
git-annex adds some new value that an old git-annex can't parse.
In this case, a global annex.largefiles that can't be parsed currently
makes an error be thrown. That might not be ideal, but the gitattribute
behaves the same, and is almost equally repo-global.

Performance notes:

git-annex add and addurl construct a matcher once
and uses it for every file, so the added time penalty for reading the global
config log is minor. If the gitattributes annex.largefiles were deprecated,
git-annex add would get around 2% faster (excluding hashing), because
looking that up for each file is not fast. So this new way of setting
it is progress toward speeding up add.

git-annex smudge does need to load the log every time. As well as checking
the git attribute. Not ideal. Setting annex.gitaddtoannex=false avoids
both overheads.
2019-12-20 13:01:41 -04:00
Joey Hess
ce3fb0b2e5
fixed an oversight that had always prevented annex.resolvemerge from being honored, when it was configured by git-annex config
forgot to add it to the merge function
2019-12-20 11:00:08 -04:00
Joey Hess
f6c18f6940
Merge branch 'bs' into sqlite-bs 2019-12-18 15:14:44 -04:00
Joey Hess
7d9dff5b05
Merge branch 'master' into bs
and update changelog
2019-12-18 15:13:30 -04:00
Joey Hess
d5628a16b8
Merge branch 'bs' into sqlite-bs 2019-12-18 14:51:03 -04:00
Joey Hess
7fd5376334
inprogress: Support --key 2019-12-18 14:14:16 -04:00
Joey Hess
1bc7055a21
add back changelog entry 2019-12-18 13:53:10 -04:00
Joey Hess
c19211774f
use filepath-bytestring for annex object manipulations
git-annex find is now RawFilePath end to end, no string conversions.
So is git-annex get when it does not need to get anything.
So this is a major milestone on optimisation.

Benchmarks indicate around 30% speedup in both commands.

Probably many other performance improvements. All or nearly all places
where a file is statted use RawFilePath now.
2019-12-11 15:25:07 -04:00
Joey Hess
2f9a80d803
merging sqlite and bs branches
Since the sqlite branch uses blobs extensively, there are some
performance benefits, ByteStrings now get stored and retrieved w/o
conversion in some cases like in Database.Export.
2019-12-06 15:30:45 -04:00
Joey Hess
718fa83da6
mention optimisations 2019-12-05 11:46:55 -04:00
Joey Hess
960f62a564
typo 2019-11-22 19:48:34 -04:00
Joey Hess
81d402216d cache the serialization of a Key
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.

Previously attempted in 4536c93bb2
and reverted in 96aba8eff7.
The problems mentioned in the latter commit are addressed now:

Read/Show of KeyData is backwards-compatible with Read/Show of Key from before
this change, so Types.Distribution will keep working.

The Eq instance is fixed.

Also, Key has smart constructors, avoiding needing to remember to update
the cached serialization.

Used git-annex benchmark:
  find is 7% faster
  whereis is 3% faster
  get when all files are already present is 5% faster
Generally, the benchmarks are running 0.1 seconds faster per 2000 files,
on a ram disk in my laptop.
2019-11-22 17:49:16 -04:00
Joey Hess
7263aafd2b
Merge branch 'master' into sqlite 2019-11-22 12:49:35 -04:00
Joey Hess
92e1bb250b
simplify the name of the test cases 2019-11-21 17:38:58 -04:00
Joey Hess
58a8005441
Merge branch 'master' into sqlite 2019-11-21 17:28:27 -04:00
Joey Hess
a9888f6151
Windows: Fix handling of changes to time zone.
Used to work but was broken in version 7.20181031, specifically commit
5ab0f48ffb.

That this was not noticed over at least 1 daylight savings time zone
changes makes me wonder if the TSDelta stuff is still needed.
Perhaps the mtime on Windows no longer changes when the time zone is changed?

(cherry picked from commit 09ee6b0ccb)
2019-11-21 17:28:18 -04:00
Joey Hess
d4661959de
Merge branch 'master' into sqlite 2019-11-21 17:26:50 -04:00
Joey Hess
25ba8156bc
improve benchmark --databases
* benchmark: Changed --databases to take a parameter specifiying the size
  of the database to benchmark.
* benchmark --databases: Display size of the populated database.
* benchmark --databases: Improve the "addAssociatedFile to (new)"
  benchmark to really add new values, not overwriting old values.
2019-11-21 17:25:20 -04:00
Joey Hess
43f19ef00a
Fix bug that made bare repos be treated as non-bare when --git-dir was used.
Eg:

git clone url --bare r
git --git-dir r annex init

This resulted in worktree = Just "." and so several things that check
worktree to determine when the repo is bare ran code paths intended for
non-bare. One such code path[1] ran git checkout with --worktree=. which
actually makes it ignore core.bare config, and so the current directory
got populated with a checkout of the master branch in this example. There
was probably also other breakage.

The fix is a bit complicated because whether the repo is bare is not
known until after Git.Config reads the config, but Git.Config handles
setting the RepoLocations's worktree when core.worktree is set. So have
to assume the worktree is the cwd, let core.worktree override that,
and then if the repo turns out to be bare, it's set back to Nothing.
(And then GIT_WORK_TREE can still override all of that.)

[1] switchHEADBack, which runs even when the clone is not from a bare repo.
2019-11-21 13:26:02 -04:00
Joey Hess
b207d944f3
sync, assistant: Pull and push from git-lfs remotes.
Oversight, forgot to add it to gitSyncableRemote
2019-11-18 16:13:21 -04:00
Joey Hess
5877de5e80
git-lfs: remember urls, and autoenable remotes using known urls
* git-lfs: The url provided to initremote/enableremote will now be
  stored in the git-annex branch, allowing enableremote to be used without
  an url. initremote --sameas can be used to add additional urls.
* git-lfs: When there's a git remote with an url that's known to be
  used for git-lfs, automatically enable the special remote.
2019-11-18 16:09:09 -04:00
Joey Hess
cee14f147a
stop displaying rsync progress, and use git-annex's own progress display for local-to-local repo transfers
Reasons to do this include:

1. I've gotten pretty used to git-annex's own progress display, which is
   used for all transfers over ssh (except to old git-annex-shell),
   and for most special remote transfers. It's getting to seem weird to see
   the rsync progress display instead.
2. When -J was used, the rsync output could not be shown, and so there was
   no progress display. Now there will be.

Progress will also be displayed now when cp CoW is used. But I'd expect a CoW
copy to typically run so fast that the progress display will barely be
noticable.

This commit was sponsored by Peter on Patreon.
2019-11-15 13:21:06 -04:00
Joey Hess
a95efcbc55
releasing package git-annex version 7.20191114 2019-11-14 21:58:23 -04:00
Joey Hess
b321526473
OSX link libs into git-core directory
So that binaries in that directory can find the library next to them,
where they get modified to look.

This is a hack; it would be better for OSXMkLibs to build a list of what
libraries are needed where.

Unsure if this is needed due to a recent reversion, or is an older
problem, so updated changelog accordingly.
2019-11-14 18:31:58 -04:00
Joey Hess
f037ad92ec
OSX git-annex.app: Fix a regression that broke git-remote-https, git-remote-http, and git-shell
Putting the binaries in bundle/git-core/bin didn't work on OSX,
linker can't find the libraries next to those binaries where it expects to.
So instead put the binaries in the progDir.
2019-11-14 16:15:42 -04:00
Joey Hess
842449b086
linuxstandalone: Fix a regression that broke git-remote-https. 2019-11-14 15:08:23 -04:00
Joey Hess
667d38a8f1
Fix a crash (STM deadlock) when -J is used with multiple files that point to the same key
See the comment for a trace of the deadlock.

Added a new StartStage. New worker threads begin in the StartStage.
Once a thread is ready to do work, it moves away from the StartStage,
and no thread will ever transition back to it.

A thread that blocks waiting on another thread that is processing
the same key will block while in the StartStage. That other thread
will never switch back to the StartStage, and so the deadlock is avoided.
2019-11-14 13:51:09 -04:00
Joey Hess
890330f0fe
make --json-error-messages capture url download errors
Convert Utility.Url to return Either String so the error message can be
displated in the annex monad and so captured.

(When curl is used, its errors are still not caught.)
2019-11-12 13:52:38 -04:00
Joey Hess
3b34d123ed
Added annex.allowsign option.
This commit was sponsored by Ilya Shlyakhter on Patreon.
2019-11-11 16:28:56 -04:00
Joey Hess
aa010108cd
Merge branch 'master' into sqlite 2019-11-07 13:20:04 -04:00
Joey Hess
09ee6b0ccb
Windows: Fix handling of changes to time zone.
Used to work but was broken in version 7.20181031, specifically commit
5ab0f48ffb.

That this was not noticed over at least 1 daylight savings time zone
changes makes me wonder if the TSDelta stuff is still needed.
Perhaps the mtime on Windows no longer changes when the time zone is changed?
2019-11-06 14:36:49 -04:00
Joey Hess
73e928fcfb
prep release 2019-11-06 12:21:02 -04:00
Joey Hess
6147130e86
Merge branch 'master' into sqlite 2019-11-05 12:59:28 -04:00
Joey Hess
e2d4c133f5
init: fix data loss bug
Fix bug that lost modifications to unlocked files when init is re-ran in an
already initialized repo.

In retrospect needing scanUnlockedFiles False in the direct mode upgrade
path was a good hint that it was unsafe when used with True.

However, this bug did not affect upgrade from v5. In such an upgrade, an
unlocked file that is modified is left as-is. The only place
scanUnlockedFiles True did overwrite modified unlocked files is during an
git-annex init of a repo that was already initialized by git-annex.

(I also tried a scenario where the repo had not been initialized by
git-annex yet, but was cloned from a v7 repo with an unlocked file, and the
pointer file replaced with some other content, and the data loss did not
occur in that situation.)

Since the fixed scanUnlockedFiles avoids overwriting non-pointer files,
it should be safe to run in any situation, so there's no need any longer
for the parameter.
2019-11-05 12:41:15 -04:00
Joey Hess
09c7cbbaa8
update for things already fixed in this branch 2019-10-30 13:57:22 -04:00
Joey Hess
25f912de5b
benchmark: Add --databases to benchmark sqlite databases
Rescued from commit 11d6e2e260 which removed
db benchmarks in favor of benchmarking arbitrary git-annex commands. Which
is nice and general, but microbenchmarks are useful too.
2019-10-29 16:59:27 -04:00
Joey Hess
fd96408c67
releasing package git-annex version 7.20191024 2019-10-25 13:07:58 -04:00
Joey Hess
59b8294b2b
prep release 2019-10-24 14:40:36 -04:00
Joey Hess
31a5b58b2c
documentation for making git add only annex when configured by annex.largefiles
Code change should be trvial, but not yet implemented. This
significantly complicated the task of documenting how git-annex works.

I'm not sure how useful the annex.gitaddtoannex confguration is after
this change; seems that if a user has an annex.largefiles they will want
it applied consistently. But the last thing I want to hear is more
complaining from users about git add doing something they don't want it
to.

There's a pretty high risk users who got used to the git add behavior
and don't have annex.largefiles configured will miss the NEWS and
complain bitterly about their suddenly bloated repositories. Oh well.

Removed outdated comments about the old behavior to avoid confusion.
I don't know if I've found all the places that griping spread to.
2019-10-24 14:01:54 -04:00
Joey Hess
bd197be3ad
annex.gitaddtoannex configuration
Added annex.gitaddtoannex configuration. Setting it to false prevents
git add from usually adding files to the annex.
(Unless the file was annexed before, or a renamed annexed file is detected.)

Currently left at true; some users are encouraging it be set to false.
2019-10-23 15:29:46 -04:00
Joey Hess
bbdeb1a1a8
sync: Fix crash when there are submodules and an adjusted branch is checked out
Reverse adjusting the branch uses treeItemToTreeContent, which was missed
when adding submodule support earlier.
2019-10-23 11:52:56 -04:00
Joey Hess
9a5d9019ba
Deal with pkexec changing to root's home directory when running a command.
Wow, that's not documented anywhere, and seems like a major gotcha in
pkexec.

Broke enable-tor.
2019-10-21 12:39:19 -04:00
Joey Hess
5db79339a1
init: Fix a failure when used in a submodule on a crippled filesystem.
When the submodule's parent repo has an adjusted unlocked branch,
it gets cloned by git, but git checks out master. git annex init then
fails because it wants to enter the adjusted branch, but:

  adjusted branch adjusted/master(unlocked) already exists.

  Aborting because that branch may have changes that have not yet reached master

Note that init actually then exits 0, leaving master checked out.

This could also happen, absent submodules, if the parent repo has
an adjusted unlocked branch, but it is not checked out. In the more common
case where that branch is checked out, the clone uses the same branch,
so no problem.

The choices to fix this:

* Init could delete the existing adjusted branch, and re-adjust.
  But then running init inside an adjusted branch on a crippled filesystem
  would lose any changes that have not been synced back to master.
* Init could sync any changes back to master, but that would be very surprising
  behavior for it.
* Init could simply check out the existing adjusted branch. If the branch
  is diverged from master, well, sync will sort that out later.
  This mirrors the behavior of cloning a repo that has an adjusted branch
  checked out that has not yet been synced back to master.
  Picked this choice.
2019-10-21 11:41:15 -04:00
Joey Hess
f60e8f2c93
releasing package git-annex version 7.20191017 2019-10-17 18:19:47 -04:00
Joey Hess
904b175707
Fix build with persistent-2.10.
Added an additional constraint that persistent needs.
This also builds with persistent-2.9.2 without needing any cpp.
2019-10-17 11:58:31 -04:00
Joey Hess
5463f97ca2
OSX: Deal with symbolic link problem that caused git to not be included in the git-annex.dmg
Homebrew now has eg:

datalads-imac:~ joey$ ls -l /Users/joey/homebrew/Cellar/git/2.23.0/libexec/git-core
total 36776
lrwxr-xr-x   1 joey  staff       13 Aug 29 13:38 git -> ../../bin/git
lrwxr-xr-x   1 joey  staff       13 Aug 29 13:38 git-add -> ../../bin/git

So the target of the symlink also needs to be installed now.

Doing it in shell code was too hairy for my dentistry-addled brain, so
reimplemented in haskell. Also using it for building linuxstandalone.
2019-10-17 11:01:41 -04:00
Joey Hess
4306dfbe68
remove empty log files in transition
forget --drop-dead: Remove several classes of git-annex log files when they
become empty, further reducing the size of the git-annex branch.

Noticed while testing sameas uuid removal, but it could happen other times
too.

An empty log file is always treated by git-annex the same as no file
being present, and when the files are per-key, it can be a sizable space
saving to exclude them from the tree.
2019-10-14 16:04:15 -04:00
Joey Hess
9828f45d85
add RemoteStateHandle
This solves the problem of sameas remotes trampling over per-remote
state. Used for:

* per-remote state, of course
* per-remote metadata, also of course
* per-remote content identifiers, because two remote implementations
  could in theory generate the same content identifier for two different
  peices of content

While chunk logs are per-remote data, they don't use this, because the
number and size of chunks stored is a common property across sameas
remotes.

External special remote had a complication, where it was theoretically
possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or
EXPORTSUPPORTED. Since the uuid of the remote is typically generate in
Remote.setup, it would only be possible to pass a Maybe
RemoteStateHandle into it, and it would otherwise have to construct its
own. Rather than go that route, I decided to send an ERROR in this case.
It seems unlikely that any existing external special remote will be
affected. They would have to make up a git-annex key, and set state for
some reason during INITREMOTE. I can imagine such a hack, but it doesn't
seem worth complicating the code in such an ugly way to support it.

Unfortunately, both TestRemote and Annex.Import needed the Remote
to have a new field added that holds its RemoteStateHandle.
2019-10-14 13:51:42 -04:00
Joey Hess
37f725a9f7
Merge branch 'master' into sameas 2019-10-11 15:56:00 -04:00
Joey Hess
8131451c35
releasing package git-annex version 7.20191009 2019-10-09 12:33:09 -04:00
Joey Hess
f4dd7d5191
work around windows having infected git's plumbing
Work around git cat-file --batch's odd stripping of carriage return from
the end of the line (some windows infection), avoiding crashing when the
repo contains a filename ending in a carriage return.
2019-10-08 15:27:05 -04:00
Joey Hess
8966ba2cff
git-annex-standalone.rpm: Fix the git-annex-shell symlink 2019-10-08 14:43:28 -04:00
Joey Hess
53da7f1cf8
update uninit to handle all the v7 stuff
* uninit: Remove several git hooks that git-annex init sets up.
* uninit: Remove the smudge and clean filters that git-annex init sets up.
2019-10-08 14:34:00 -04:00
Joey Hess
1113caa53e
preserve unlocked file mtime when dropping
When dropping an unlocked file, preserve its mtime, which avoids git status
unncessarily running the clean filter on the file.

If the index file has close to the same mtime as a work tree file, git will
not trust the index to be up-to-date, and re-runs the clean filter
unncessarily. Preserving the mtime when depopulating a pointer file avoids
git status doing a little (or maybe a lot) of unncessary work.

There are other places that the mtime could be preserved, including other
places where pointer files are written perhaps, but also
populatePointerFile. But, I don't know of cases where those lead to git
status doing unncessary work, so I just fixed the one I'm aware of for now.
2019-10-08 14:01:12 -04:00
Joey Hess
2e6fd5de71
fix flipped diffUTCTime
fsck --incremental/--more: Fix bug that prevented the incremental fsck
information from being updated every 5 minutes as it was supposed to be; it
was only updated after 1000 files were checked, which may be more files
that are possible to fsck in a given fsck time window.

Thanks to Peter Simons for help with analysis of this bug.

Auditing for other cases of the same mistake, the keys db also had it
backwards. This seems unlikely to really have been a problem;
it would need associated files updates etc to be coming in slowly for some
reason and then be interrupted to cause any problem.

IIRC the design of the keys db assumes that any interruped
operation will be restarted, and so it can lose any buffered database
updates safely.
2019-10-03 09:54:19 -04:00
Joey Hess
61b384d2b7
add --sameas option, not yet used 2019-10-01 12:36:25 -04:00
Joey Hess
3066bdb1fb
fix annex.largefiles largerthan/smallerthan bug
Fix bug in handling of annex.largefiles that use largerthan/smallerthan.
When adding a modified file, it incorrectly used the file size of the old
version of the file, not the current size.

That was the only largefiles limit that didn't directly look at the file on
disk already. Added a new type to keep straight the two different ways such
a limit can be matched. I kind of wanted to extend MatchingFile or FileInfo
to indicate that the matcher is supposed to operate on files from disk or
annex, but it turned out to be too complex to implement it that way.

This also changes the LimitAnnexFiles case when lookupFileKey does not find
a key. It used to fall back to statting the file, now it always returns
False. I doubt the old code could really get to that point, but if it
somehow does, it's better for preferred content matching to be consistent.
2019-09-30 17:15:08 -04:00
Joey Hess
b90ddbc383
enable-tor: Use pkexec to run command as root when gksu and kdesu are not available.
gksu is no longer in debian, even stable

kdesu in debian is not installed in PATH any longer, though the executable
is still present under /usr/lib

pkexec is packagekit's replacement for those older commands.
2019-09-30 15:19:01 -04:00
Joey Hess
f2737a5fbe
enable-tor: Run kdesu with -c option. 2019-09-30 15:14:05 -04:00
Joey Hess
2b55a2b882
remotedaemon: Don't list --stop in help since it's not supported.
Also, move out of plumbing section. When using tor, the remotedaemon is
part of the user's workflow, as it runs the tor hidden service.
2019-09-30 14:40:46 -04:00
Joey Hess
090898a138
adjust --lock: This enters an adjusted branch where files are locked.
Straightforward, except for the issue of how to reverse LockAdjustment.

With --unlock, a commit that modifies/adds unlocked files gets reverse
adjusted to use locked files. That's fairly reasonable, I think.

But reversing --lock by unlocking all modified files feels wrong. Maybe
that's just because repositories typically seem to still have mostly
locked files in them (unless one is in an adjusted unlocked branch of
course!)

It may be that eventually how to reverse both will need to be configurable,
I don't know.
2019-09-27 14:23:25 -04:00
Joey Hess
9628ae2e67
Close sqlite databases more robustly.
Had a report of close throwing ErrorBusy on CIFS.

Retrying up to 16 seconds is a balance between hopefully waiting long
enough for the problem to clear up and waiting so long that git-annex seems
to hang.

The new dependency is free; persistent depends on unliftio-core.
2019-09-26 12:25:21 -04:00
Joey Hess
8af791d769
Test: Use more robust directory removal method.
I just had a test that crashed at cleanup on linux with:

.t/gpgtest/12/S.gpg-agent.browser: removeDirectoryRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:getSymbolicLinkStatus: does not exist (No such file or directory)
sleeping 10 seconds and will retry directory cleanup
git-annex: .t/gpgtest/14/S.gpg-agent.browser: removeDirectoryRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:getSymbolicLinkStatus: does not exist (No such file or directory)

removePathForcibly is supposed to be more robust to things in the directory vanishing while it's running, etc.
Will probably avoid such crashes.

It was added to directory-1.2.7, which comes with ghc since 8.0.2.
Since base >= 4.11.1.0 means ghc 8.4.4, I expect all builds will have it,
but I ifdefed it to be sure.
2019-09-24 16:59:37 -04:00
Joey Hess
6ae0a44c64
git-lfs: Added support for http basic auth 2019-09-24 14:46:20 -04:00
Joey Hess
de564df8b3
git-lfs: Only do endpoint discovery once when concurrency is enabled
This avoids some extra work, but I don't think it was possible for two ssh
endpoint discoveries run concurrently to both prompt for the ssh password;
Annex.Ssh itself deals with concurrency.

This is mostly groundwork for http password prompting.
2019-09-24 13:01:51 -04:00
Joey Hess
b13a350556
added --unlocked and --locked 2019-09-19 12:33:13 -04:00
Joey Hess
fda1bdd679
Added --mimetype and --mimeencoding file matching options.
Already had these for largefiles matching, but I forgot to add them as
command-line options.
2019-09-19 12:09:59 -04:00
Joey Hess
ab739242a3
releasing package git-annex version 7.20190912 2019-09-13 12:53:40 -04:00
Joey Hess
a8fea1644d
docs for git-annex-standalone rpm 2019-09-13 12:18:36 -04:00
Joey Hess
4508198507
building a standalone rpm from the standalone tarball
This allows the rpm to be built anywhere the necessary build deps are
available (including on debian) and the resulting package will work on as
broad a range of rpm distributions as the libc/kernel supports.

The DistributionUpdate changes to use the new script have not yet been
tested.
2019-09-13 11:53:17 -04:00
Joey Hess
4a4e08e123
release prep 2019-09-12 13:53:22 -04:00
Joey Hess
fef3cd055d
Removed support for git versions older than 2.1
debian oldoldstable has 2.1, and that's what i386ancient uses. It would be
better to require git 2.2, which is needed to use adjusted branches, but
can't do that w/o losing support for some old linux kernels or a
complicated git backport.
2019-09-11 16:14:43 -04:00
Joey Hess
061231621e
Merge branch 'master' into v7-default 2019-09-10 16:06:43 -04:00
Joey Hess
94c75d2bd9
init: Fix a reversion that broke initialization on systems that need to use pid locking
This brings back .git/annex/misctmp, but only for init. If an init
is interrupted while probing using that temp directory, the files it left
will get deleted 1 week later by a subsequent git-annex run.
2019-09-10 13:37:07 -04:00
Joey Hess
0af7ebdc2a
info: Display trust level when getting info on a uuid, same as on a remote. 2019-09-01 16:48:46 -04:00
Joey Hess
f845195354
Added annex.autoupgraderepository configuration
Can be set to false to prevent any automatic repository upgrades.

Also, removed direct mode specific upgrade code in Annex.Init, and made
needsUpgrade always include the name/path of the repo, so if
there's a problem it's clear what repo has the problem.

And, made needsUpgrade catch any exceptions that might occur during the
upgrade, so it can display a more useful error message than just the
exception.
2019-09-01 13:42:26 -04:00
Joey Hess
3f0eef4baa
v7 for all repositories
* Default to v7 for new repositories.
* Automatically upgrade v5 repositories to v7.
2019-08-30 14:09:14 -04:00
Joey Hess
1558e03014
Refuse to upgrade direct mode repositories when git is older than 2.22
That git fixed a memory leak that could cause an OOM during the upgrade.

Most git-annex builds have a new enough git already.
OSX git was upgraded with brew.

Linux i386ancient build's git was too old. Upgrading it to a fixed
git didn't work (due to the newer git not working with the old ssh,
https://bugs.chromium.org/p/git/issues/detail?id=7 )

Choices to deal with that were:

* Somehow make direct mode upgrade work with the old git, avoiding its
  OOM problem. One way would be to switch the repo to indirect mode
  first, and so upgrade to a repo with locked files. Not good when
  the filesystem does not support symlinks.
* backport the OOM fix from git 2.22
  (And do what about the version number so git-annex knows it's fixed?)
* backport openssh (and possibly more stuff)
* move the i386ancient build to at least Debian stretch (still backporting git)
  But this will make it no longer work with some of the ancient kernels it
  targets.

Of those, backporting the OOM fix seemed the best approach. Put "oomfix"
in the git version number to indicate it.

I have not automated building the git backport, so here's the patch I
used:

diff -ur orig/git-2.1.4/convert.c git-2.1.4/convert.c
--- orig/git-2.1.4/convert.c	2014-12-18 18:42:18.000000000 +0000
+++ git-2.1.4/convert.c	2019-08-29 20:05:04.371872338 +0100
@@ -404,7 +404,7 @@
 	if (start_async(&async))
 		return 0;	/* error was already reported */

-	if (strbuf_read(&nbuf, async.out, len) < 0) {
+	if (strbuf_read(&nbuf, async.out, 0) < 0) {
 		error("read from external filter %s failed", cmd);
 		ret = 0;
 	}
diff -ur orig/git-2.1.4/GIT-VERSION-GEN git-2.1.4/GIT-VERSION-GEN
--- orig/git-2.1.4/GIT-VERSION-GEN	2014-12-18 18:42:18.000000000 +0000
+++ git-2.1.4/GIT-VERSION-GEN	2019-08-29 20:06:39.132743228 +0100
@@ -1,7 +1,7 @@
 #!/bin/sh

 GVF=GIT-VERSION-FILE
-DEF_VER=v2.1.4
+DEF_VER=v2.1.4.oomfix

 LF='
 '
diff -ur orig/git-2.1.4/configure git-2.1.4/configure
--- orig/git-2.1.4/configure	2014-12-18 18:42:19.000000000 +0000
+++ git-2.1.4/configure	2019-08-29 20:27:45.896380015 +0100
@@ -580,8 +580,8 @@
 # Identity of this package.
 PACKAGE_NAME='git'
 PACKAGE_TARNAME='git'
-PACKAGE_VERSION='2.1.4'
-PACKAGE_STRING='git 2.1.4'
+PACKAGE_VERSION='2.1.4.oomfix'
+PACKAGE_STRING='git 2.1.4.oomfix'
 PACKAGE_BUGREPORT='git@vger.kernel.org'
 PACKAGE_URL=''

diff -ur orig/git-2.1.4/version git-2.1.4/version
--- orig/git-2.1.4/version	2014-12-18 18:42:19.000000000 +0000
+++ git-2.1.4/version	2019-08-29 20:06:17.572545210 +0100
@@ -1 +1 @@
-2.1.4
+2.1.4.oomfix
2019-08-29 15:24:41 -04:00
Joey Hess
4f59ac05b6
info: remove "repository mode"
info: Removed the "repository mode" from its output (including the --json
output) since with the removal of direct mode, there is no repository mode.
2019-08-29 14:12:22 -04:00
Joey Hess
d6e1f09ed2
init: Catch more exceptions when testing locking. 2019-08-29 12:19:07 -04:00
Joey Hess
586db7f06d
Avoid making a commit when upgrading from direct mode to v7
Three reasons:

* Committing as part of an upgrade is very unusual and unexpected.
* The commit was failing with a weird error message when done during an
  automatic upgrade.
* Let me remove more of that sweet^Whorrible direct mode code.
2019-08-26 16:35:44 -04:00
Joey Hess
adb89ee71b
update test suite for removal of direct mode
Removed that pass and all the complications of checking direct mode's
edge cases.
2019-08-26 15:07:10 -04:00
Joey Hess
20741b1eb4
Automatically convert direct mode repositories to v7 with adjusted unlocked branches
* Automatically convert direct mode repositories to v7 with adjusted
  unlocked branches and set annex.thin.
* init: When run on a crippled filesystem with --version=5,
  will error out, since version 7 is needed for adjusted unlocked branch.
* direct: This command always errors out as direct mode is no longer
  supported.
* indirect: This command has become a deprecated noop.
* proxy: This command is deprecated because it was only needed in direct
  mode. (But it continues to work.)

Also removed mentions of direct mode throughough the documentation.

I have not removed all the direct mode code yet.
2019-08-26 15:05:25 -04:00
Joey Hess
5877a15d7b
fix hard links when upgrading from direct mode
When upgrading a direct mode repo to v7 with adjusted unlocked branches,
fix a bug that prevented annex.thin from taking effect for the files in
working tree.

The hard links used to be ok, but commit 8e22114735 accidentially
broke them. It repopulates the worktree file, which is already a hard link,
and when it's creating the new file, the link count is already 2, and so it
doesn't make a hard link then.
2019-08-26 13:54:39 -04:00
Joey Hess
2fd27c6df5
assistant: When creating a new repository use v7 adjusted branches with annex.thin
Rather than direct mode, which this is a small step on the path to
removing.

Init on a crippled filesystem already used v7 adjusted branches,
and like that, this doesn't pose any interoperability issues with old
versions of git-annex that clone the same repo, because files are only
unlocked on the adjusted branch.
2019-08-26 12:54:14 -04:00
Joey Hess
c650389118
info: error out when file matching options used on non-directory
When file matching options are specified when getting info of
something other than a directory, they won't have any effect, so error out
to avoid confusion.

This commit was sponsored by mo on Patreon.
2019-08-24 13:20:19 -04:00
Joey Hess
972fd11f4e
releasing package git-annex version 7.20190819 2019-08-19 12:26:45 -04:00
Joey Hess
7f97575941
Makefile: Changed default zsh completion location to zsh default fpath.
Systems such as Debian that have overridden the default fpath will need to
set ZSH_COMPLETIONS_PATH.

I feel that Debian is causing unncessary complexity by making this change,
and have filed a bug report about it.

This also means that when git-annex is installed with PREFIX=/usr/local
it will use /usr/local/share/zsh/site-functions which works with probably
all versions of zsh.
2019-08-16 14:08:56 -04:00
Joey Hess
5fcaaf77db
Make git-annex-standalone.deb include the shell completions again
Was lost when the install-completions target was added.
2019-08-16 13:47:48 -04:00
Joey Hess
fa62c32233
Fix intermittent failure of the test suite
Its repeated opening and writing to the sqlite database somehow caused
inode cache information to occasionally be lost.

This loses code coverage, since running git-annex as a child process
prevents tracking what parts of the code are exercised. I have not looked
at the code coverage in a long time. It would probably be possible to
collect code coverage for the child procesess and merge it together.
2019-08-16 11:11:55 -04:00
Joey Hess
708fc6567f
S3: Fix encoding when generating public urls of S3 objects.
This code feels worryingly stringily typed, but using URI does not help
because the uriPath still has to be constructed with the right
uri-encoding.
2019-08-15 12:56:46 -04:00
Joey Hess
dc672863c3
init: Install working hook scripts when run on a crippled filesystem and on Windows 2019-08-13 15:14:17 -04:00
Joey Hess
b87ea12b6b
git-annex merge branch
* merge: When run with a branch parameter, merges from that branch.
  This is especially useful when using an adjusted branch, because
  it applies the same adjustment to the branch before merging it.
2019-08-09 13:21:15 -04:00
Joey Hess
b90ee6dc52
test: Add pass using adjusted unlocked branch
On second thought, the extra time running the test suite is worth it.
It will be gained back once we finally get rid of direct mode.

There are two failing tests, same two that have been failing on windows
(though the failure does not look identical). So this should also spare me
the Windows VM while fixing.
2019-08-09 11:34:10 -04:00
Joey Hess
298812a353
use separate main repo dir for each test suite pass
This way a failure to clean up the main repo dir from a previous pass
can't result in reusing that repo, which won't be configured right for the
current pass.
2019-08-08 14:29:28 -04:00
Joey Hess
70b71bf660
have init --version fail when repo is already initialized with other version
init: When the repo is already initialized, and --version requests a
different version, error out rather than silently not changing the version.
2019-08-08 14:13:02 -04:00
Joey Hess
3adc251f9d
Build with silently-1.2.5.1 on Windows; the old one used "NUL" which is not supported with recent versions of ghc. 2019-08-07 17:42:16 -04:00
Joey Hess
30ca02928c
Windows installer: Always install to 64 bit program files directory, since it needs 64 bit git now
I saw the installer not defaulting to any installation directory,
and I had to manually enter C:\Program Files\Git

Maybe it was choosing gitInstallDir32, and that was empty? Or the
conditional somehow failed. Simplifying so it will hopefully work again.
2019-08-07 14:05:03 -04:00
Joey Hess
bf5dd723d3
Fix querying git for object type when operating on a file containing newlines
This typo would make "git cat-file cat-file" fail, and the way it's used,
I think it broke querying all info from filenames containing newlines,
because the other queries are only run when it succeeds.
2019-08-07 13:35:42 -04:00
Joey Hess
fb7d92457f
support using gcrypt with git-lfs special remote 2019-08-05 13:43:45 -04:00
Joey Hess
8401b09e32
Allow setting up a gcrypt special remote with encryption=shared
It was documented to work, but seems it has been broken for a
while/forever.
2019-08-05 12:41:05 -04:00
Joey Hess
d1a0c7b16f
make --in=here fast
Use the same optimisation for --in=here as has always been used for --in=.
rather than the slow code path that unncessarily queries the git-annex
branch.

It looks like when "here" got added as an alias for "." back in 2012, I
forgot about this place.

Also sped up some very unlikely ways of referring to the current
repository.

Note that, this could in some rare corner case cause a behavior
change, if the git-annex branch and inAnnex disagree about whether content
is present in the local repository. But --in=. already behaved
that way, and the truth on the ground should win also.
2019-08-01 00:29:47 -04:00
Joey Hess
018b5b8173
Support building with socks-0.6 and persistant-template-2.7
persistent-template now needs UndecidableInstances.

socks changed defaultSocksConf to take a SockAddr.
2019-07-30 12:50:48 -04:00
Joey Hess
9fd37e65d0
prep release 2019-07-30 12:47:33 -04:00
Joey Hess
426053cb6c
Corrected some license statements
In 40ecf58d4b I changed the license of code I
wrote from GPL to AGPL. But, two files containing code I wrote combined
with code by others were updated to say their license is AGPL, while in
fact part of it was (the code I wrote) but part remained under the original
license (the code written by others).

Remote/Ddar.hs is now changed entirely back to GPL 3.

Annex/DirHashes.hs stays AGPL, but I broke out Utility/MD5.hs with the code
not written by me, and corrected its license statement to GPL-2, which
is the actual version of the GPL included with the code in its original
distribution at http://www.cs.ox.ac.uk/people/ian.lynagh/md5/
2019-07-28 14:27:33 -04:00
Joey Hess
875c7b5cc9
windows long filenames should be fixed now by new ghc 2019-07-22 09:44:09 -04:00
Joey Hess
ff85adba76
remove bundled rsync from windows build
rsync is only needed for rsync special remotes and git-annex-shell from
Debian oldstable. Since the library situation on windows for rsync required
a particular 32 bit build of git for it to work, and may also somehow need
git-annex to be 32 bit build, it's better to not include it.

This commit was sponsored by Jake Vosloo on Patreon.
2019-07-22 09:37:42 -04:00
Joey Hess
21ff5e1e5a
CoW probing
Improved probing when CoW copies can be made between files on the same
drive. Now supports CoW between BTRFS subvolumes. And, falls back to rsync
instead of using cp when CoW won't work, eg copies between repos on the
same EXT4 filesystem.

Rather than trying cp --reflink=always for each file copied to a remote,
it's tried once and if it fails it falls back to using rsync thereafter
for the lifetime of the Remote object. That avoids overhead of calling cp
which while small, will add up over a large number of files.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2019-07-17 14:19:08 -04:00
Joey Hess
7be690f326
check headRef not Branch.current
Support running v7 upgrade in a repo where there is no branch checked out,
but HEAD is set directly to some other ref.

This commit was sponsored by Jack Hill on Patreon.
2019-07-16 12:36:29 -04:00
Joey Hess
25f7a79217
stack.yaml: Build with http-client-0.5.14 to get a bug fix to http header parsing
The cabal file does not yet demand this version because it's not in Debian
yet and only affects use of certian broken http servers, but let's use it
when it's easily available.
2019-07-09 10:10:05 -04:00