Commit graph

1504 commits

Author SHA1 Message Date
Joey Hess
df11e54788
avoid the dashed ssh hostname class of security holes
Security fix: Disallow hostname starting with a dash, which would get
passed to ssh and be treated an option. This could be used by an attacker
who provides a crafted ssh url (for eg a git remote) to execute arbitrary
code via ssh -oProxyCommand.

No CVE has yet been assigned for this hole.
The same class of security hole recently affected git itself,
CVE-2017-1000117.

Method: Identified all places where ssh is run, by git grep '"ssh"'
Converted them all to use a SshHost, if they did not already, for
specifying the hostname.

SshHost was made a data type with a smart constructor, which rejects
hostnames starting with '-'.

Note that git-annex already contains extensive use of Utility.SafeCommand,
which fixes a similar class of problem where a filename starting with a
dash gets passed to a program which treats it as an option.

This commit was sponsored by Jochen Bartl on Patreon.
2017-08-17 22:11:31 -04:00
Joey Hess
d39c120afa
add annex-ignore-command and annex-sync-command configs
Added remote configuration settings annex-ignore-command and
annex-sync-command, which are dynamic equivilants of the annex-ignore
and annex-sync configurations.

For this I needed a new DynamicConfig infrastructure. Its implementation
should be as fast as before when there is no dynamic config, and it caches
so shell commands are only run once.

Note that annex-ignore-command exits nonzero when the remote should be ignored.
While that may seem backwards, it allows using the same command for it as
for annex-sync-command when you want to disable both.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-08-17 13:54:14 -04:00
Joey Hess
4a92eac23e
assistant: Merge changes from refs/remotes/foo/master into master.
Previously, only sync branches were merged. This makes regular git push
into a repository watched by the assistant auto-merge.

While this does hardcode an assumption about what the remote tracking
branch is named, which some unusual git configurations won't match,
git-annex sync already made the same assumption.

Also, changed behavior when a tracking branch like
refs/remotes/synced/not/master is received. When on the master branch,
that used to get merged into it, but it's the tracking branch for
not/master, so should only be merged in when on the not/master branch.

This commit was sponsored by Ewen McNeill.
2017-06-07 16:17:46 -04:00
Joey Hess
86e4ea00b2
analysis
Also, added a comment to Assistant/Threads/Merger.hs to explain
why it only merges from /synced/ branches.
2017-06-07 13:45:18 -04:00
Joey Hess
94351daba6
configuration to disable automatic merge conflict resolution
* Added annex.resolvemerge configuration, which can be set to false to
  disable the usual automatic merge conflict resolution done by git-annex
  sync and the assistant.
* sync: Added --no-resolvemerge option.

Note that disabling merge conflict resolution is probably not a good idea
in a direct mode repo or adjusted branch. Since updates to both are done
outside the usual work tree, if it fails the tree is not left in a
conflicted state, and it would be hard to manually resolve the conflict.
Still, made annex.resolvemerge be supported in those cases for consistency.

This commit was sponsored by Riku Voipio.
2017-06-01 12:51:01 -04:00
Joey Hess
a1730cd6af
adeiu, MissingH
Removed dependency on MissingH, instead depending on the split
library.

After laying groundwork for this since 2015, it
was mostly straightforward. Added Utility.Tuple and
Utility.Split. Eyeballed System.Path.WildMatch while implementing
the same thing.

Since MissingH's progress meter display was being used, I re-implemented
my own. Bonus: Now progress is displayed for transfers of files of
unknown size.

This commit was sponsored by Shane-o on Patreon.
2017-05-16 01:03:52 -04:00
Joey Hess
44baa7b306
rename prompt since Messages added a function by that name 2017-05-15 21:35:56 -04:00
Joey Hess
e3184e54c9
version: Added "dependency versions" line.
This commit was sponsored by Anthony DeRobertis on Patreon.
2017-04-07 18:16:11 -04:00
Joey Hess
29e73f76ef
Added remote.<name>.annex-push and remote.<name>.annex-pull
The former can be useful to make remotes that don't get fully synced with
local changes, which comes up in a lot of situations.

The latter was mostly added for symmetry, but could be useful (though less
likely to be).

Implementing `remote.<name>.annex-pull` was a bit tricky, as there's no one
place where git-annex pulls/fetches from remotes. I audited all
instances of "fetch" and "pull". A few cases were left not checking this
config:

* Git.Repair can try to pull missing refs from a remote, and if the local
  repo is corrupted, that seems a reasonable thing to do even though
  the config would normally prevent it.
* Assistant.WebApp.Gpg and Remote.Gcrypt and Remote.Git do fetches
  as part of the setup process of a remote. The config would probably not
  be set then, and having the setup fail seems worse than honoring it if it
  is already set.

I have not prevented all the code that does a "merge" from merging branches
from remotes with remote.<name>.annex-pull=false. That could perhaps
be done, but it would need a way to map from branch name to remote name,
and the way refspecs work makes that hard to get really correct. So if the
user fetches manually, the git-annex branch will get merged, for example.
Anther way of looking at/justifying this is that the setting is called
"annex-pull", not "annex-merge".

This commit was supported by the NSF-funded DataLad project.
2017-04-05 13:22:35 -04:00
Joey Hess
c8e1e3dada
AssociatedFile newtype
To prevent any further mistakes like 301aff34c4

This commit was sponsored by Francois Marier on Patreon.
2017-03-10 13:35:31 -04:00
Joey Hess
def8dbd586
remove unused import 2017-03-08 16:04:36 -04:00
Joey Hess
af2a6d578e
assistant: Add 1/200th second delay between checking each file in the full transfer scan, to avoid using too much CPU.
The slowdown is not going to be large in typical small-ish repos.
And it does not seem to matter if the assistant reacts a little bit slower
in situations involving the expensive scan, since:

a) Those situations typically involve getting back in sync after something
   has changed on a remote, often after a disconnect of some duration.
   So taking a few seconds more is not noticable.
b) If the scan finds things that it needs to do, it will start
   blocking anyway after 10 transfers are queued (due to use of
   queueTransferWhenSmall). So, only the speed of finding the first 10
   transfers will be impacted by this change.

This commit was sponsored by Jochen Bartl on Patreon.
2017-03-06 13:32:47 -04:00
Joey Hess
27eca014be
fix up Read instance incompatability caused by recent commit
9c4650358c changed the Read instance for
Key.

I've checked all uses of that instance (by removing it and seeing what
breaks), and they're all limited to the webapp, except one.
That is GitAnnexDistribution's Read instance.

So, 9c4650358c would have broken upgrades
of git-annex from downloads.kitenet.net. Once the .info files there got
updated for a new release, old releases would have failed to parse them
and never upgraded.

To fix this, I found a way to make the .info files that contain
GitAnnexDistribution values be readable by the old version of git-annex.

This commit was sponsored by Ewen McNeill.
2017-02-24 18:59:12 -04:00
Joey Hess
9c4650358c
add KeyVariety type
Where before the "name" of a key and a backend was a string, this makes
it a concrete data type.

This is groundwork for allowing some varieties of keys to be disabled
in file2key, so git-annex won't use them at all.

Benchmarks ran in my big repo:

old git-annex info:

real	0m3.338s
user	0m3.124s
sys	0m0.244s

new git-annex info:

real	0m3.216s
user	0m3.024s
sys	0m0.220s

new git-annex find:

real	0m7.138s
user	0m6.924s
sys	0m0.252s

old git-annex find:

real	0m7.433s
user	0m7.240s
sys	0m0.232s

Surprising result; I'd have expected it to be slower since it now parses
all the key varieties. But, the parser is very simple and perhaps
sharing KeyVarieties uses less memory or something like that.

This commit was supported by the NSF-funded DataLad project.
2017-02-24 15:16:56 -04:00
Joey Hess
113b10cdc9
simpler more generic processTranscript'
This allows using functions that generate CreateProcess and passing the
result to processTranscript', which is more flexible, and also simpler
than the old interface.

This commit was sponsored by Riku Voipio.
2017-02-15 16:02:10 -04:00
Joey Hess
f07af03018
Run ssh with -n whenever input is not being piped into it
... to avoid it consuming stdin that it shouldn't.

This fixes git-annex-checkpresentkey --batch remote, which didn't output
results for all keys passed into it.

Other git-annex commands that communicate with a remote over ssh may also
have been consuming stdin that they shouldn't have, which could have
impacted using them in eg, shell scripts. For example, a shell script
reading files from stdin and passing them to git annex drop would be
impacted by this bug, whenever git annex drop ran git-annex-shell
checkpresent, it would consume part/all of the stdin that the shell script
was supposed to consume.

Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which
is used throughout git-annex to run ssh (in order for ssh connection
caching to work). Every call site was checked to see if it used
CreatePipe for stdin, and if not was marked NoConsumeStdin.
2017-02-15 15:08:46 -04:00
Joey Hess
f36adc2dbc
include problem PairData in error message
Users occasionally report this error firing, and I can't see why,
so include the rejected PairData in the error message.

This is safe even if it contains evil escape characters, because showing
it displays them in escaped form.

This commit was sponsored by Bruno BEAUFILS on Patreon.
2017-02-13 15:54:28 -04:00
Joey Hess
f617988a29
Make import --deduplicate and --skip-duplicates only hash once, not twice
import: --deduplicate and --skip-duplicates were implemented inneficiently;
they unncessarily hashed each file twice. They have been improved to only
hash once.

The new approach is to lock down (minimally) and hash files, and then
reuse that information when importing them.

This was rather tricky, especially in detecting changes to files while
they are being imported.

The output of import changed slightly. While before it silently skipped
over files with eg --skip-duplicates, now it shows each file as it starts
to act on it. Since every file is hashed first thing, it would otherwise
not be clear what file import is chewing on. (Actually, it wasn't clear
before when any of the duplicates switches were used.)

This commit was sponsored by Alexander Thompson on Patreon.
2017-02-09 15:32:22 -04:00
Joey Hess
5c804cf42e
add SetupStage parameter to RemoteType.setup
Most remotes have an idempotent setup that can be reused for
enableremote, but in a few cases, it needs to tell which, and whether
a UUID was provided to setup was used.

This is groundwork for making initremote be able to provide a UUID.
It should not change any behavior.

Note that it would be nice to make the UUID always be provided to setup,
and make setup not need to generate and return a UUID. What prevented
this simplification is Remote.Git.gitSetup, which needs to reuse the
UUID of the git remote when setting it up, and so has to return that
UUID.

This commit was sponsored by Thom May on Patreon.
2017-02-07 14:55:58 -04:00
Joey Hess
ed56dba868
annex.autocommit can be configured via git-annex config
... to control the default behavior in all clones of a repository.

This includes a new Configurable data type, so the GitConfig type indicates
which values can be configured this way.

The implementation should be quite efficient; the config log is only read
once, and only when a Configurable value has not already been set by
git-config.

Indeed, it would be nice in the future to extend this, so that git-config
is itself only read on demand. Some commands may not need to look at the
git configuration at all.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-02-03 13:58:53 -04:00
Joey Hess
9eb10caa27
Some optimisations to string splitting code.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.

As well as being an optimisation, this helps move toward eliminating use of
missingh.

(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)

I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2017-01-31 19:06:22 -04:00
Joey Hess
e92f2d1080
improve description of password prompting
Since the user does not know whether it will run su or sudo, indicate
whether the password prompt will be for root or the user's password,
when possible.

I assume that programs like gksu that can prompt for either depending on
system setup will make clear in their prompt what they're asking for.
2016-12-28 16:07:49 -04:00
Joey Hess
4d690dc680
re-enable enableTor 2016-12-28 11:23:22 -04:00
Joey Hess
18f16072fb
improve html, add screenshots 2016-12-27 17:13:32 -04:00
Joey Hess
b68d2a4b68
webapp: full wormhole pairing UI (untested)
This commit was sponsored by Riku Voipio.
2016-12-27 16:41:35 -04:00
Joey Hess
9a35077168
clean up some imports 2016-12-27 16:29:29 -04:00
Joey Hess
9e0aae036b
webapp: check that tor and magic wormhole are installed 2016-12-24 17:08:03 -04:00
Joey Hess
de79be2ba6
wording 2016-12-24 16:56:56 -04:00
Joey Hess
a196260924
mocked up wormhole pairing interface in webapp 2016-12-24 16:55:36 -04:00
Joey Hess
ab66bbfeb6
Merge branch 'master' into no-xmpp 2016-12-24 15:01:55 -04:00
Joey Hess
8484c0c197
Always use filesystem encoding for all file and handle reads and writes.
This is a big scary change. I have convinced myself it should be safe. I
hope!
2016-12-24 14:46:31 -04:00
Joey Hess
ac0cb5c2cc
max authtoken length is 128
It was stopping at 128, so the 512 was only incorrect, it didn't change
behavior.
2016-11-30 14:19:26 -04:00
Joey Hess
b039e5cc35
Merge branch 'master' into tor 2016-11-29 23:45:25 -04:00
Joey Hess
f872ef25c5
remove unused import 2016-11-29 23:44:54 -04:00
Joey Hess
398345cb26
Merge branch 'master' into tor 2016-11-29 15:45:29 -04:00
Joey Hess
af4d919793
unified AuthToken type between webapp and tor 2016-11-22 14:18:34 -04:00
Joey Hess
ae9f99f342
Relicense 5 source files that are not part of the webapp from AGPL to GPL.
Building w/o the webapp is not supposed to pull in any AGPLed files.

I appear to have written all the code in these files;
the only commit by anyone else is 64e844e1fe
and is a spelling fix that is not copyrightable.
2016-11-21 23:46:59 -04:00
Joey Hess
a101b8de37
remotedaemon: Fork to background by default. Added --foreground switch to enable old behavior.
Groundwork for tor hidden services, which the remotedaemon will serve.
2016-11-20 14:50:36 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
69915c6c9b
fix tricky warning with ghc 8
Whether Route was exported from Assistant.WebApp.Types
or not depended on the version of ghc. So, explictly export it.
2016-11-15 18:51:07 -04:00
Joey Hess
556b2ded2b
sync: Pass --allow-unrelated-histories to git merge when used with git git 2.9.0 or newer.
This makes merging a remote into a freshly created direct mode repository
work the same as it works in indirect mode.

The git-annex branches would get merged in any case by a sync,
since that doesn't use git merge.

This might need to be revisited later to better mirror git's behavior.
2016-11-15 18:26:17 -04:00
Joey Hess
d58148031b
remove xmpp support
I've long considered the XMPP support in git-annex a wart.
It's nice to remove it.

(This also removes the NetMessager, which was only used for XMPP, and the
daemonstatus's desynced list (likewise).)

Existing XMPP remotes should be ignored by git-annex.

This commit was sponsored by Brock Spratlen on Patreon.
2016-11-14 14:53:08 -04:00
Joey Hess
989520a104
remove redundant constraint 2016-11-11 18:39:05 -04:00
Joey Hess
9e97dc7c46
add missing type signature 2016-11-11 18:14:25 -04:00
Joey Hess
4643470537
webapp: Explicitly avoid checking for auth in static subsite requests.
Yesod didn't used to do auth checks for that, but this may have changed.
I don't have a way to reproduce the reported problem yet, but this change
certianly won't hurt anything.

This commit was sponsored by Thom May on Patreon.
2016-11-10 13:48:54 -04:00
Joey Hess
166d70db77
convert TMVars that are never left empty into TVars
This is probably more efficient, and it avoids mistakenly leaving them
empty.
2016-09-30 19:51:16 -04:00
Joey Hess
9cd3fb4110
remove redundant constraint 2016-09-15 00:04:56 -04:00
Joey Hess
29b6ab467a
switch away from deprecated interface
Again the new stuff works back to network-2.4, so no need to adjust cabal
bounds.
2016-09-05 14:39:44 -04:00
Joey Hess
d5c16df944
switch away from deprecated recvFrom
Network.Socket.ByteString is in network since before 2.4, so no need to
adjust bounds.
2016-09-05 12:10:24 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00