Commit graph

275 commits

Author SHA1 Message Date
Joey Hess
656797b4e8
update for export 2017-09-04 14:25:00 -04:00
Joey Hess
db2a06b66f
init: Display an additional message when it detects a filesystem that allows writing to files whose write bit is not set. 2017-08-28 13:21:18 -04:00
Joey Hess
ee2f096e3b
Support building with feed-1.0, while still supporting older versions.
This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-08-28 12:29:28 -04:00
Joey Hess
c76ba5a15e
CVE-2017-12976 2017-08-20 16:50:53 -04:00
Joey Hess
252994e095
releasing package git-annex version 6.20170818 2017-08-18 11:19:14 -04:00
Joey Hess
55495c5a98
git-annex.cabal: Deal with breaking changes in Cabal 2.0
https://github.com/haskell/cabal/issues/4655

This means that when a module is conditionally imported via ifdef
depending on the OS or build flags, the cabal file has to mirror the
same logic there to only list the module then.

Since there are lots of OS's and lots of combinations of build flags
here, it's rather difficult to know if the cabal file has been completelty
correctly updated to match the source code.

So I am very unhappy with needing to update things in two places. I've
only tested this on linux with most build flags enables; this will
probably need significant time and testing to catch every cabal file
tweak that this change to Cabal requires. And it will be a continual
source of compile failures going forward when the code is modified and
the cabal file not also updated.

DRY DRY DRY, I repeat myself, but: DRY! Sigh..

(Also, had to remove all Build.* that are standalone programs from the
Other-Modules list, because since cabal passes those modules to ghc when
building git-annex, it complains that they use module Main. Those
modules are only used when building with the Makefile anyway, so this
change shouldn't break anything.)

This commit was sponsored by Thomas Hochstein on Patreon.
2017-08-18 11:08:58 -04:00
Joey Hess
df11e54788
avoid the dashed ssh hostname class of security holes
Security fix: Disallow hostname starting with a dash, which would get
passed to ssh and be treated an option. This could be used by an attacker
who provides a crafted ssh url (for eg a git remote) to execute arbitrary
code via ssh -oProxyCommand.

No CVE has yet been assigned for this hole.
The same class of security hole recently affected git itself,
CVE-2017-1000117.

Method: Identified all places where ssh is run, by git grep '"ssh"'
Converted them all to use a SshHost, if they did not already, for
specifying the hostname.

SshHost was made a data type with a smart constructor, which rejects
hostnames starting with '-'.

Note that git-annex already contains extensive use of Utility.SafeCommand,
which fixes a similar class of problem where a filename starting with a
dash gets passed to a program which treats it as an option.

This commit was sponsored by Jochen Bartl on Patreon.
2017-08-17 22:11:31 -04:00
Joey Hess
fdbfe88168
fix external script for filenames with spaces from protocol
Fix the external special remotes git-annex-remote-ipfs,
git-annex-remote-torrent and the example.sh template to correctly support
filenames with spaces.

This commit was sponsored by John Peloquin on Patreon.
2017-08-17 16:20:09 -04:00
Joey Hess
dafafad115
external: nice error message for keys with spaces in their name
External special remotes will refuse to operate on keys with spaces in
their names. That has never worked correctly due to the design of the
external special remote protocol. Display an error message suggesting
migration.

Not super happy with this, but it's a pragmatic solution. Better than
complicating the external special remote interface and all external special
remotes.

Note that I only made it use SafeKey in Request, not Response. git-annex
does not construct a Response, so that would not add any safety. And
presumably, if git-annex avoids feeding any such keys to an external
special remote, it will never have a reason to make a Response using such a
key. If it did, it would result in a protocol error anyway.

There's still a Serializeable instance for Key; it's used by P2P.Protocol.
There, the Key is always in the final position, so it's ok if it contains
spaces.

Note that the protocol documentation has been fixed to say that the File
may contain spaces. One way that can happen, even though the Key can't,
is when using direct mode, and the work tree filename contains spaces.
When sending such a file to the external special remote the worktree
filename is used.

This commit was sponsored by Thom May on Patreon.
2017-08-17 16:18:34 -04:00
Joey Hess
96c055eda2
migrate: WORM keys containing spaces will be migrated to not contain spaces anymore
To work around the problem that the external special remote protocol does
not support keys containing spaces.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-08-17 15:09:38 -04:00
Joey Hess
51801cff6a
Prevent spaces from being embedded in the name of new WORM keys, as that handing spaces in keys would complicate things like the external special remote protocol. 2017-08-17 14:46:33 -04:00
Joey Hess
d39c120afa
add annex-ignore-command and annex-sync-command configs
Added remote configuration settings annex-ignore-command and
annex-sync-command, which are dynamic equivilants of the annex-ignore
and annex-sync configurations.

For this I needed a new DynamicConfig infrastructure. Its implementation
should be as fast as before when there is no dynamic config, and it caches
so shell commands are only run once.

Note that annex-ignore-command exits nonzero when the remote should be ignored.
While that may seem backwards, it allows using the same command for it as
for annex-sync-command when you want to disable both.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-08-17 13:54:14 -04:00
Joey Hess
86428f6261
comment 2017-08-17 12:17:47 -04:00
Joey Hess
4173decf27
Windows: Win32 package has subsumed Win32-extras; update dependency. 2017-08-16 17:43:38 -04:00
Joey Hess
69dcb08d7a
Disable http-client's default 30 second response timeout when HEADing an url to check if it exists. Some web servers take quite a long time to answer a HEAD request. 2017-08-15 13:56:12 -04:00
Joey Hess
2eb6309d3e
move, copy: Support --batch. 2017-08-15 12:39:10 -04:00
Joey Hess
8526cd7c92
test: Avoid most situations involving failure to delete test directories
By forking a worker process and only deleting the test directory once it exits.

This way, if a test leaves files open, they'll get closed when the worker
exits, so avoiding failure to delete open files on Windows, and failure to
delete directories due to NFS lock files.

If a test leaves a git worker process running, the closed pipes should
cause the worker to exit too, also avoiding the problem there. The 10
second sleep ought to give plenty of time for such worker processes to
exit, although this is of course a race.

Finally, even if test directory fails to be deleted still,
it won't appear as if the last test in the test suite failed; the error
will be displayed at the very end.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 16:29:47 -04:00
Joey Hess
af6068525a
Fix a git-annex test failure when run on NFS due to NFS lock files preventing directory removal.
Should fix this:

    lock (v6 --force):                                    FAIL
      Exception: .git/annex/keys: removeDirectoryRecursive: unsatisfied constraints (Directory not empty)

Verified that the test case still catches the regression it's meant to.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 15:11:42 -04:00
Joey Hess
2cecc8d2a3
Added GIT_ANNEX_VECTOR_CLOCK environment variable
Can be used to override the default timestamps used in log files in the
git-annex branch. This is a dangerous environment variable; use with
caution.

Note that this only affects writing to the logs on the git-annex branch.
It is not used for metadata in git commits (other env vars can be set for
that).

There are many other places where timestamps are still used, that don't
get committed to git, but do touch disk. Including regular timestamps
of files, and timestamps embedded in some files in .git/annex/, including
the last fsck timestamp and timestamps in transfer log files.

A good way to find such things in git-annex is to get for getPOSIXTime and
getCurrentTime, although some of the results are of course false positives
that never hit disk (unless git-annex gets swapped out..)

So this commit does NOT necessarily make git-annex comply with some HIPPA
privacy regulations; it's up to the user to determine if they can use it in
a way compliant with such regulations.

Benchmarking: It takes 0.00114 milliseconds to call getEnv
"GIT_ANNEX_VECTOR_CLOCK" when that env var is not set. So, 100 thousand log
files can be written with an added overhead of only 0.114 seconds. That
should be by far swamped by the actual overhead of writing the log files
and making the commit containing them.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 14:19:58 -04:00
Joey Hess
81a861326d
fsck: Support --json.
One use case is to get a list of files that fsck fails on, in order to eg,
drop them from a remote.

This commit was sponsored by Nick Daly on Patreon.
2017-06-26 13:40:57 -04:00
Joey Hess
75cecbbe3f
Fix build with QuickCheck 2.10.
QuickCheck added an Arbitrary instance for CTime aka EpochTime. However,
while git-annex's instance disallowed times before the epoch, QuickCheck's
does not. So, rather than using its instance, convert from an Integer.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-06-17 13:04:48 -04:00
Joey Hess
e4100fd60e
releasing package git-annex version 6.20170520 2017-06-12 13:55:00 -04:00
Joey Hess
1426f7ff3a
disable closingTracked on OSX
Don't trust OSX FSEvents's eventFlagItemModified to be called when the last
writer of a file closes it; apparently that sometimes does not happen,
which prevented files from being quickly added.

This commit was sponsored by John Peloquin on Patreon.
2017-06-09 14:18:58 -04:00
Joey Hess
5cf7216774
zsh and fish completions
optparse-applicative-0.14.0.0 adds support for these, so have the
Makefile install their scripts when built with it.

CmdLine/GitAnnex/Options.hs now uses action "file" in cmdParams,
which affects the bash and zsh completions, letting them complete
filenames for subcommands that use that. This is not needed for
bash, since bash-completion.bash enables -o bashdefault, which
lets it complete filenames too. But it does not seem to break the bash
completions. It is needed for zsh; the zsh completion otherwise
does not complete filenames. The fish completion will always complete
filenames no matter what. Messy.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-06-09 11:38:20 -04:00
Joey Hess
4a92eac23e
assistant: Merge changes from refs/remotes/foo/master into master.
Previously, only sync branches were merged. This makes regular git push
into a repository watched by the assistant auto-merge.

While this does hardcode an assumption about what the remote tracking
branch is named, which some unusual git configurations won't match,
git-annex sync already made the same assumption.

Also, changed behavior when a tracking branch like
refs/remotes/synced/not/master is received. When on the master branch,
that used to get merged into it, but it's the tracking branch for
not/master, so should only be merged in when on the not/master branch.

This commit was sponsored by Ewen McNeill.
2017-06-07 16:17:46 -04:00
Joey Hess
ed639c140d
Fix bug that prevented transfer locks from working when run on SMB or other filesystem that does not support fcntl locks and hard links.
This commit was sponsored by Ethan Aubin.
2017-06-06 14:22:03 -04:00
Joey Hess
e23839acf3
Avoid error about git-annex-shell not being found when syncing with -J with a git remote where git-annex-shell is not installed.
This commit was sponsored by andrea rota.
2017-06-06 12:57:27 -04:00
Joey Hess
94351daba6
configuration to disable automatic merge conflict resolution
* Added annex.resolvemerge configuration, which can be set to false to
  disable the usual automatic merge conflict resolution done by git-annex
  sync and the assistant.
* sync: Added --no-resolvemerge option.

Note that disabling merge conflict resolution is probably not a good idea
in a direct mode repo or adjusted branch. Since updates to both are done
outside the usual work tree, if it fails the tree is not left in a
conflicted state, and it would be hard to manually resolve the conflict.
Still, made annex.resolvemerge be supported in those cases for consistency.

This commit was sponsored by Riku Voipio.
2017-06-01 12:51:01 -04:00
Joey Hess
bb060f000f
error when metadata set is used with file that does not exist
When setting metadata of a file that did not exist, no error message was
displayed, unlike getting metadata and most other git-annex commands. Fixed
this oversight.

Note that, if the file exists but is not annexed, there's no error.
This is the same behavior as other git-annex commands.

This commit was supported by the NSF-funded DataLad project.
2017-06-01 11:40:47 -04:00
Joey Hess
bb18026b2c
move --to=here
* move --to=here moves from all reachable remotes to the local repository.

The output of move --from remote is changed slightly, when the remote and
local both have the content. It used to say:
move foo ok
Now:
move foo (from theremote...) ok

That was done so that, when move --to=here is used and the content is
locally present and also in several remotes, it's clear which remotes the
content gets dropped from.

Note that move --to=here will report an error if a non-reachable remote
contains the file, even if the local repository also contains the file. I
think that's reasonable; the user may be intending to move all other copies
of the file from remotes.

OTOH, if a copy of the file is believed to be present in some repository
that is not a configured remote, move --to=here does not report an error.
So a little bit inconsistent, but erroring in this case feels wrong.

copy --to=here came along for free, but it's basically the same behavior as
git-annex get, and probably with not as good messages in edge cases
(especially on failure), so I've not documented it.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-05-31 17:00:18 -04:00
Joey Hess
e1cf095ae8
Avoid concurrent git-config setting problem when running concurrent threads.
See my comment. This only avoids the problem for -J; two git-annex
processes started at the same time could still both try to write to
.git/config and one fail. That would be very unlikely though, and it
doesn't really seem worth adding an additional layer of locking around
.git/config.

This commit was supported by the NSF-funded DataLad project.
2017-05-25 18:28:23 -04:00
Joey Hess
7db37ddde0
Fix transfer log file locking problem when running concurrent transfers.
orElse is great, but was not the right thing to use here because
waitTakeLock could retry for other reasons than the lock being held,
which made tryTakeLock fail when it shouldn't.

Instead, move the code to tryTakeLock and implement waitTakeLock using
tryTakeLock and retry.

(Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a
lock handle.)

This commit was supported by the NSF-funded DataLad project.
2017-05-25 17:40:23 -04:00
Joey Hess
9bddc6d5ca
Improve progress display when watching file size, in cases where a transfer does not resume.
This commit was supported by the NSF-funded DataLad project.
2017-05-25 14:30:18 -04:00
Joey Hess
35465b6062
initremote, enableremote: Support gpg subkeys suffixed with an exclamation mark, which forces gpg to use a specific subkey.
This commit was sponsored by Peter Hogg on Patreon.
2017-05-24 14:08:02 -04:00
Joey Hess
c6079c3ce8
releasing package git-annex version 6.20170519 2017-05-19 10:58:03 -04:00
Joey Hess
1d45e47e3f
clear regions before ssh prompt
When built with concurrent-output 1.9, ssh password prompts will no longer
interfere with the -J display.

To avoid flicker, only done when ssh actually does need to prompt;
ssh is first run in batch mode and if that succeeds the connection is up
and no need to clear regions.

This commit was supported by the NSF-funded DataLad project.
2017-05-16 15:50:11 -04:00
Joey Hess
9bcaef1ec4
Work around bug in git 2.13.0 involving GIT_COMMON_DIR that broke merging changes into adjusted branches.
Might want to remove this when it gets fixed, in case adjusted branches are
used in a repo with a great many refs, which would become unnecessarily
slow.

This commit was supported by the NSF-funded DataLad project.
2017-05-16 14:35:37 -04:00
Joey Hess
a1730cd6af
adeiu, MissingH
Removed dependency on MissingH, instead depending on the split
library.

After laying groundwork for this since 2015, it
was mostly straightforward. Added Utility.Tuple and
Utility.Split. Eyeballed System.Path.WildMatch while implementing
the same thing.

Since MissingH's progress meter display was being used, I re-implemented
my own. Bonus: Now progress is displayed for transfers of files of
unknown size.

This commit was sponsored by Shane-o on Patreon.
2017-05-16 01:03:52 -04:00
Joey Hess
6992fe133b
Ssh password prompting improved when using -J
When ssh connection caching is enabled (and when GIT_ANNEX_USE_GIT_SSH is
not set), only one ssh password prompt will be made per host, and only one
ssh password prompt will be made at a time.

This also fixes a race in prepSocket's stale ssh connection stopping
when run with -J. It was possible for one thread to start a cached ssh
connection, and another thread to immediately stop it, resulting in excess
connections being made.

This commit was supported by the NSF-funded DataLad project.
2017-05-11 17:36:03 -04:00
Joey Hess
884505279a
releasing package git-annex version 6.20170510 2017-05-10 15:37:16 -04:00
Joey Hess
4c1e3210fa
annex.backend is the new name for what was annex.backends
It takes a single key-value backend, rather than the unncessary and confusing list.
The old option still works if set.

Simplified some old old code too.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-05-09 15:04:07 -04:00
Joey Hess
bcf276655c
Keys marked as dead are now skipped by --all.
fsck already special-cased dead keys to make --all not report errors with
them, and it makes sense to also expand that to whereis. I think it makes
sense for dead keys to be skipped by all uses of --all, so mistakes can be
completely forgotten about and not come back to haunt us.

The speed impact of testing if the key is dead is negligible for fsck and
whereis, since they use the location log anyway and it gets cached.
This does slow down a few commands that support --all, in particular
metadata --all runs around 2x as slow. I don't think metadata
--all is often used though. It might slow down copy/move/mirror
--all and get --all.
log --all is not affected (does not use the normal --all machinery).

Dead keys will still be processed by --incomplete, --branch,
--failed, and --key. Although it would be unlikely for a dead key to
ave in incomplete or failed transfer. It seems to make perfect sense for
--branch to process keys on the branch, even if dead.

(fsck's special-casing of dead keys was left in, so if one of these options
causes a dead key to be fscked, there will be a nice message.)

This commit was supported by the NSF-funded DataLad project.
2017-05-09 12:55:21 -04:00
Joey Hess
e3184e54c9
version: Added "dependency versions" line.
This commit was sponsored by Anthony DeRobertis on Patreon.
2017-04-07 18:16:11 -04:00
Joey Hess
6896ac06e8
git annex add -u now supported, analagous to git add -u
Unlike git add -u, git annex add -u does not update the index for files
removed from the working tree. But then, "git add ." stages removals,
and "git annex add ." does not, so that's an existing divergence.

Seems that --update --batch would need to run git ls-files once per line of
batch input, which would surely be too slow, so just throw an error for
that.

This commit was supported by the NSF-funded DataLad project.
2017-04-07 15:55:45 -04:00
Joey Hess
57e923b712
gcrypt: Support re-enabling to change eg, encryption parameters.
This was never supported before. And it doesn't re-encrypt the
gcrypt repo to the new gcrypt-participants, but it does at least now not
crash, and set gcrypt-participants.

This commit was sponsored by andrea rota.
2017-04-07 14:10:34 -04:00
Joey Hess
99984967eb
enableremote: Fix re-enabling of existing gcrypt remotes, so that eg, encryption key changes take effect.
They were silently ignored, a reversion introduced in 6.20160527.

I don't like this regular git remote special case in enableremote, but I
can't see a way to get rid of it. So, check if the existing remote is
a Remote.Git

This commit was sponsored by Trenton Cronholm on Patreon.
2017-04-07 13:51:09 -04:00
Joey Hess
f406d16525
enableremote: When enabling a non-special remote, param=value parameters can't be used, so error out if any are provided.
This commit was sponsored by Riku Voipio.
2017-04-07 13:14:53 -04:00
Joey Hess
b6f26bac86
Disable git-annex's support for GIT_SSH and GIT_SSH_COMMAND, unless GIT_ANNEX_USE_GIT_SSH=1 is also set in the environment.
This is necessary because as feared, the extra -n parameter that git-annex
passes breaks uses of these environment variables that expect exactly the
parameters that git passes.

For example, see https://github.com/datalad/datalad/issues/1456

It would of course be possible to pre-close stdin before running ssh so not
needing the -n, and I think that would not even break ssh's password
caching. But it would probably involve a lot of work, possibly would need
to deal with some layering violations, and would be error-prone. The really
clean fix would be to make all the ssh stuff return a CreateProcess, which
could have the handle closed when appropriate, but that would be a large
reworing of the code base.

This commit was supported by the NSF-funded DataLad project.
2017-04-07 11:35:27 -04:00
Joey Hess
29e73f76ef
Added remote.<name>.annex-push and remote.<name>.annex-pull
The former can be useful to make remotes that don't get fully synced with
local changes, which comes up in a lot of situations.

The latter was mostly added for symmetry, but could be useful (though less
likely to be).

Implementing `remote.<name>.annex-pull` was a bit tricky, as there's no one
place where git-annex pulls/fetches from remotes. I audited all
instances of "fetch" and "pull". A few cases were left not checking this
config:

* Git.Repair can try to pull missing refs from a remote, and if the local
  repo is corrupted, that seems a reasonable thing to do even though
  the config would normally prevent it.
* Assistant.WebApp.Gpg and Remote.Gcrypt and Remote.Git do fetches
  as part of the setup process of a remote. The config would probably not
  be set then, and having the setup fail seems worse than honoring it if it
  is already set.

I have not prevented all the code that does a "merge" from merging branches
from remotes with remote.<name>.annex-pull=false. That could perhaps
be done, but it would need a way to map from branch name to remote name,
and the way refspecs work makes that hard to get really correct. So if the
user fetches manually, the git-annex branch will get merged, for example.
Anther way of looking at/justifying this is that the setting is called
"annex-pull", not "annex-merge".

This commit was supported by the NSF-funded DataLad project.
2017-04-05 13:22:35 -04:00
Joey Hess
c3970f6c1a
multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting.
This commit was supported by the NSF-funded DataLad project.
2017-03-30 19:35:30 -04:00