Commit graph

964 commits

Author SHA1 Message Date
Joey Hess
d39c120afa
add annex-ignore-command and annex-sync-command configs
Added remote configuration settings annex-ignore-command and
annex-sync-command, which are dynamic equivilants of the annex-ignore
and annex-sync configurations.

For this I needed a new DynamicConfig infrastructure. Its implementation
should be as fast as before when there is no dynamic config, and it caches
so shell commands are only run once.

Note that annex-ignore-command exits nonzero when the remote should be ignored.
While that may seem backwards, it allows using the same command for it as
for annex-sync-command when you want to disable both.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-08-17 13:54:14 -04:00
Joey Hess
86428f6261
comment 2017-08-17 12:17:47 -04:00
Joey Hess
4173decf27
Windows: Win32 package has subsumed Win32-extras; update dependency. 2017-08-16 17:43:38 -04:00
Joey Hess
69dcb08d7a
Disable http-client's default 30 second response timeout when HEADing an url to check if it exists. Some web servers take quite a long time to answer a HEAD request. 2017-08-15 13:56:12 -04:00
Joey Hess
2eb6309d3e
move, copy: Support --batch. 2017-08-15 12:39:10 -04:00
Joey Hess
8526cd7c92
test: Avoid most situations involving failure to delete test directories
By forking a worker process and only deleting the test directory once it exits.

This way, if a test leaves files open, they'll get closed when the worker
exits, so avoiding failure to delete open files on Windows, and failure to
delete directories due to NFS lock files.

If a test leaves a git worker process running, the closed pipes should
cause the worker to exit too, also avoiding the problem there. The 10
second sleep ought to give plenty of time for such worker processes to
exit, although this is of course a race.

Finally, even if test directory fails to be deleted still,
it won't appear as if the last test in the test suite failed; the error
will be displayed at the very end.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 16:29:47 -04:00
Joey Hess
af6068525a
Fix a git-annex test failure when run on NFS due to NFS lock files preventing directory removal.
Should fix this:

    lock (v6 --force):                                    FAIL
      Exception: .git/annex/keys: removeDirectoryRecursive: unsatisfied constraints (Directory not empty)

Verified that the test case still catches the regression it's meant to.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 15:11:42 -04:00
Joey Hess
2cecc8d2a3
Added GIT_ANNEX_VECTOR_CLOCK environment variable
Can be used to override the default timestamps used in log files in the
git-annex branch. This is a dangerous environment variable; use with
caution.

Note that this only affects writing to the logs on the git-annex branch.
It is not used for metadata in git commits (other env vars can be set for
that).

There are many other places where timestamps are still used, that don't
get committed to git, but do touch disk. Including regular timestamps
of files, and timestamps embedded in some files in .git/annex/, including
the last fsck timestamp and timestamps in transfer log files.

A good way to find such things in git-annex is to get for getPOSIXTime and
getCurrentTime, although some of the results are of course false positives
that never hit disk (unless git-annex gets swapped out..)

So this commit does NOT necessarily make git-annex comply with some HIPPA
privacy regulations; it's up to the user to determine if they can use it in
a way compliant with such regulations.

Benchmarking: It takes 0.00114 milliseconds to call getEnv
"GIT_ANNEX_VECTOR_CLOCK" when that env var is not set. So, 100 thousand log
files can be written with an added overhead of only 0.114 seconds. That
should be by far swamped by the actual overhead of writing the log files
and making the commit containing them.

This commit was supported by the NSF-funded DataLad project.
2017-08-14 14:19:58 -04:00
Joey Hess
81a861326d
fsck: Support --json.
One use case is to get a list of files that fsck fails on, in order to eg,
drop them from a remote.

This commit was sponsored by Nick Daly on Patreon.
2017-06-26 13:40:57 -04:00
Joey Hess
75cecbbe3f
Fix build with QuickCheck 2.10.
QuickCheck added an Arbitrary instance for CTime aka EpochTime. However,
while git-annex's instance disallowed times before the epoch, QuickCheck's
does not. So, rather than using its instance, convert from an Integer.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-06-17 13:04:48 -04:00
Joey Hess
e4100fd60e
releasing package git-annex version 6.20170520 2017-06-12 13:55:00 -04:00
Joey Hess
1426f7ff3a
disable closingTracked on OSX
Don't trust OSX FSEvents's eventFlagItemModified to be called when the last
writer of a file closes it; apparently that sometimes does not happen,
which prevented files from being quickly added.

This commit was sponsored by John Peloquin on Patreon.
2017-06-09 14:18:58 -04:00
Joey Hess
5cf7216774
zsh and fish completions
optparse-applicative-0.14.0.0 adds support for these, so have the
Makefile install their scripts when built with it.

CmdLine/GitAnnex/Options.hs now uses action "file" in cmdParams,
which affects the bash and zsh completions, letting them complete
filenames for subcommands that use that. This is not needed for
bash, since bash-completion.bash enables -o bashdefault, which
lets it complete filenames too. But it does not seem to break the bash
completions. It is needed for zsh; the zsh completion otherwise
does not complete filenames. The fish completion will always complete
filenames no matter what. Messy.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-06-09 11:38:20 -04:00
Joey Hess
4a92eac23e
assistant: Merge changes from refs/remotes/foo/master into master.
Previously, only sync branches were merged. This makes regular git push
into a repository watched by the assistant auto-merge.

While this does hardcode an assumption about what the remote tracking
branch is named, which some unusual git configurations won't match,
git-annex sync already made the same assumption.

Also, changed behavior when a tracking branch like
refs/remotes/synced/not/master is received. When on the master branch,
that used to get merged into it, but it's the tracking branch for
not/master, so should only be merged in when on the not/master branch.

This commit was sponsored by Ewen McNeill.
2017-06-07 16:17:46 -04:00
Joey Hess
ed639c140d
Fix bug that prevented transfer locks from working when run on SMB or other filesystem that does not support fcntl locks and hard links.
This commit was sponsored by Ethan Aubin.
2017-06-06 14:22:03 -04:00
Joey Hess
e23839acf3
Avoid error about git-annex-shell not being found when syncing with -J with a git remote where git-annex-shell is not installed.
This commit was sponsored by andrea rota.
2017-06-06 12:57:27 -04:00
Joey Hess
94351daba6
configuration to disable automatic merge conflict resolution
* Added annex.resolvemerge configuration, which can be set to false to
  disable the usual automatic merge conflict resolution done by git-annex
  sync and the assistant.
* sync: Added --no-resolvemerge option.

Note that disabling merge conflict resolution is probably not a good idea
in a direct mode repo or adjusted branch. Since updates to both are done
outside the usual work tree, if it fails the tree is not left in a
conflicted state, and it would be hard to manually resolve the conflict.
Still, made annex.resolvemerge be supported in those cases for consistency.

This commit was sponsored by Riku Voipio.
2017-06-01 12:51:01 -04:00
Joey Hess
bb060f000f
error when metadata set is used with file that does not exist
When setting metadata of a file that did not exist, no error message was
displayed, unlike getting metadata and most other git-annex commands. Fixed
this oversight.

Note that, if the file exists but is not annexed, there's no error.
This is the same behavior as other git-annex commands.

This commit was supported by the NSF-funded DataLad project.
2017-06-01 11:40:47 -04:00
Joey Hess
bb18026b2c
move --to=here
* move --to=here moves from all reachable remotes to the local repository.

The output of move --from remote is changed slightly, when the remote and
local both have the content. It used to say:
move foo ok
Now:
move foo (from theremote...) ok

That was done so that, when move --to=here is used and the content is
locally present and also in several remotes, it's clear which remotes the
content gets dropped from.

Note that move --to=here will report an error if a non-reachable remote
contains the file, even if the local repository also contains the file. I
think that's reasonable; the user may be intending to move all other copies
of the file from remotes.

OTOH, if a copy of the file is believed to be present in some repository
that is not a configured remote, move --to=here does not report an error.
So a little bit inconsistent, but erroring in this case feels wrong.

copy --to=here came along for free, but it's basically the same behavior as
git-annex get, and probably with not as good messages in edge cases
(especially on failure), so I've not documented it.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-05-31 17:00:18 -04:00
Joey Hess
e1cf095ae8
Avoid concurrent git-config setting problem when running concurrent threads.
See my comment. This only avoids the problem for -J; two git-annex
processes started at the same time could still both try to write to
.git/config and one fail. That would be very unlikely though, and it
doesn't really seem worth adding an additional layer of locking around
.git/config.

This commit was supported by the NSF-funded DataLad project.
2017-05-25 18:28:23 -04:00
Joey Hess
7db37ddde0
Fix transfer log file locking problem when running concurrent transfers.
orElse is great, but was not the right thing to use here because
waitTakeLock could retry for other reasons than the lock being held,
which made tryTakeLock fail when it shouldn't.

Instead, move the code to tryTakeLock and implement waitTakeLock using
tryTakeLock and retry.

(Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a
lock handle.)

This commit was supported by the NSF-funded DataLad project.
2017-05-25 17:40:23 -04:00
Joey Hess
9bddc6d5ca
Improve progress display when watching file size, in cases where a transfer does not resume.
This commit was supported by the NSF-funded DataLad project.
2017-05-25 14:30:18 -04:00
Joey Hess
35465b6062
initremote, enableremote: Support gpg subkeys suffixed with an exclamation mark, which forces gpg to use a specific subkey.
This commit was sponsored by Peter Hogg on Patreon.
2017-05-24 14:08:02 -04:00
Joey Hess
c6079c3ce8
releasing package git-annex version 6.20170519 2017-05-19 10:58:03 -04:00
Joey Hess
1d45e47e3f
clear regions before ssh prompt
When built with concurrent-output 1.9, ssh password prompts will no longer
interfere with the -J display.

To avoid flicker, only done when ssh actually does need to prompt;
ssh is first run in batch mode and if that succeeds the connection is up
and no need to clear regions.

This commit was supported by the NSF-funded DataLad project.
2017-05-16 15:50:11 -04:00
Joey Hess
9bcaef1ec4
Work around bug in git 2.13.0 involving GIT_COMMON_DIR that broke merging changes into adjusted branches.
Might want to remove this when it gets fixed, in case adjusted branches are
used in a repo with a great many refs, which would become unnecessarily
slow.

This commit was supported by the NSF-funded DataLad project.
2017-05-16 14:35:37 -04:00
Joey Hess
a1730cd6af
adeiu, MissingH
Removed dependency on MissingH, instead depending on the split
library.

After laying groundwork for this since 2015, it
was mostly straightforward. Added Utility.Tuple and
Utility.Split. Eyeballed System.Path.WildMatch while implementing
the same thing.

Since MissingH's progress meter display was being used, I re-implemented
my own. Bonus: Now progress is displayed for transfers of files of
unknown size.

This commit was sponsored by Shane-o on Patreon.
2017-05-16 01:03:52 -04:00
Joey Hess
6992fe133b
Ssh password prompting improved when using -J
When ssh connection caching is enabled (and when GIT_ANNEX_USE_GIT_SSH is
not set), only one ssh password prompt will be made per host, and only one
ssh password prompt will be made at a time.

This also fixes a race in prepSocket's stale ssh connection stopping
when run with -J. It was possible for one thread to start a cached ssh
connection, and another thread to immediately stop it, resulting in excess
connections being made.

This commit was supported by the NSF-funded DataLad project.
2017-05-11 17:36:03 -04:00
Joey Hess
884505279a
releasing package git-annex version 6.20170510 2017-05-10 15:37:16 -04:00
Joey Hess
4c1e3210fa
annex.backend is the new name for what was annex.backends
It takes a single key-value backend, rather than the unncessary and confusing list.
The old option still works if set.

Simplified some old old code too.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-05-09 15:04:07 -04:00
Joey Hess
bcf276655c
Keys marked as dead are now skipped by --all.
fsck already special-cased dead keys to make --all not report errors with
them, and it makes sense to also expand that to whereis. I think it makes
sense for dead keys to be skipped by all uses of --all, so mistakes can be
completely forgotten about and not come back to haunt us.

The speed impact of testing if the key is dead is negligible for fsck and
whereis, since they use the location log anyway and it gets cached.
This does slow down a few commands that support --all, in particular
metadata --all runs around 2x as slow. I don't think metadata
--all is often used though. It might slow down copy/move/mirror
--all and get --all.
log --all is not affected (does not use the normal --all machinery).

Dead keys will still be processed by --incomplete, --branch,
--failed, and --key. Although it would be unlikely for a dead key to
ave in incomplete or failed transfer. It seems to make perfect sense for
--branch to process keys on the branch, even if dead.

(fsck's special-casing of dead keys was left in, so if one of these options
causes a dead key to be fscked, there will be a nice message.)

This commit was supported by the NSF-funded DataLad project.
2017-05-09 12:55:21 -04:00
Joey Hess
e3184e54c9
version: Added "dependency versions" line.
This commit was sponsored by Anthony DeRobertis on Patreon.
2017-04-07 18:16:11 -04:00
Joey Hess
6896ac06e8
git annex add -u now supported, analagous to git add -u
Unlike git add -u, git annex add -u does not update the index for files
removed from the working tree. But then, "git add ." stages removals,
and "git annex add ." does not, so that's an existing divergence.

Seems that --update --batch would need to run git ls-files once per line of
batch input, which would surely be too slow, so just throw an error for
that.

This commit was supported by the NSF-funded DataLad project.
2017-04-07 15:55:45 -04:00
Joey Hess
57e923b712
gcrypt: Support re-enabling to change eg, encryption parameters.
This was never supported before. And it doesn't re-encrypt the
gcrypt repo to the new gcrypt-participants, but it does at least now not
crash, and set gcrypt-participants.

This commit was sponsored by andrea rota.
2017-04-07 14:10:34 -04:00
Joey Hess
99984967eb
enableremote: Fix re-enabling of existing gcrypt remotes, so that eg, encryption key changes take effect.
They were silently ignored, a reversion introduced in 6.20160527.

I don't like this regular git remote special case in enableremote, but I
can't see a way to get rid of it. So, check if the existing remote is
a Remote.Git

This commit was sponsored by Trenton Cronholm on Patreon.
2017-04-07 13:51:09 -04:00
Joey Hess
f406d16525
enableremote: When enabling a non-special remote, param=value parameters can't be used, so error out if any are provided.
This commit was sponsored by Riku Voipio.
2017-04-07 13:14:53 -04:00
Joey Hess
b6f26bac86
Disable git-annex's support for GIT_SSH and GIT_SSH_COMMAND, unless GIT_ANNEX_USE_GIT_SSH=1 is also set in the environment.
This is necessary because as feared, the extra -n parameter that git-annex
passes breaks uses of these environment variables that expect exactly the
parameters that git passes.

For example, see https://github.com/datalad/datalad/issues/1456

It would of course be possible to pre-close stdin before running ssh so not
needing the -n, and I think that would not even break ssh's password
caching. But it would probably involve a lot of work, possibly would need
to deal with some layering violations, and would be error-prone. The really
clean fix would be to make all the ssh stuff return a CreateProcess, which
could have the handle closed when appropriate, but that would be a large
reworing of the code base.

This commit was supported by the NSF-funded DataLad project.
2017-04-07 11:35:27 -04:00
Joey Hess
29e73f76ef
Added remote.<name>.annex-push and remote.<name>.annex-pull
The former can be useful to make remotes that don't get fully synced with
local changes, which comes up in a lot of situations.

The latter was mostly added for symmetry, but could be useful (though less
likely to be).

Implementing `remote.<name>.annex-pull` was a bit tricky, as there's no one
place where git-annex pulls/fetches from remotes. I audited all
instances of "fetch" and "pull". A few cases were left not checking this
config:

* Git.Repair can try to pull missing refs from a remote, and if the local
  repo is corrupted, that seems a reasonable thing to do even though
  the config would normally prevent it.
* Assistant.WebApp.Gpg and Remote.Gcrypt and Remote.Git do fetches
  as part of the setup process of a remote. The config would probably not
  be set then, and having the setup fail seems worse than honoring it if it
  is already set.

I have not prevented all the code that does a "merge" from merging branches
from remotes with remote.<name>.annex-pull=false. That could perhaps
be done, but it would need a way to map from branch name to remote name,
and the way refspecs work makes that hard to get really correct. So if the
user fetches manually, the git-annex branch will get merged, for example.
Anther way of looking at/justifying this is that the setting is called
"annex-pull", not "annex-merge".

This commit was supported by the NSF-funded DataLad project.
2017-04-05 13:22:35 -04:00
Joey Hess
c3970f6c1a
multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting.
This commit was supported by the NSF-funded DataLad project.
2017-03-30 19:35:30 -04:00
Joey Hess
3c8eb59860
When a http remote does not expose an annex.uuid config, only warn about it once, not every time git-annex is run.
Same behavior as for a ssh remote.
2017-03-29 12:43:47 -04:00
Joey Hess
0e7276b5ac stack.yaml: Update to lts-8.6. 2017-03-27 20:01:46 -04:00
Joey Hess
464291243c
releasing package git-annex version 6.20170321 2017-03-21 13:46:20 -04:00
Joey Hess
64f924dc93
sync --content-of=path
For when you want to sync only some files' contents, not the whole working
tree.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-03-20 16:00:48 -04:00
Joey Hess
faecd73f32
Support GIT_SSH and GIT_SSH_COMMAND
They are handled close the same as they are by git. However, unlike git,
git-annex sometimes needs to pass the -n parameter when using these.

So, this has the potential for breaking some setup, and perhaps there ought
to be a ANNEX_USE_GIT_SSH=1 needed to use these. But I'd rather avoid that
if possible, so let's see if anyone complains.

Almost all places where "ssh" was run have been changed to support the env
vars. Anything still calling sshOptions does not support them. In
particular, rsync special remotes don't. Seems that annex-rsync-transport
already gives sufficient control there.

(Fixed in passing: Remote.Helper.Ssh.toRepo used to extract
remoteAnnexSshOptions and pass them to sshOptions, which was redundant
since sshOptions also extracts those.)

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-03-17 16:20:37 -04:00
Joey Hess
999743c1e8
git-annex-shell: run all commands with noMessages
Fix bug when used with a recently cloned repository, where
"merging" messages were included in the output of configlist (and perhaps
other commands) and caused a "Failed to get annex.uuid configuration"
error.

This does not seem to have been a reversion.

I saw this with configlist, but it seems possible for other commands to be
effected, and it might not always happen only after a fresh clone. Eg, if a
foo/git-annex branch is pushed to the remote, the next git-annex-shell will
auto-merge it and display the message.

Decided to run all git-annex-shell commands with noMessages,
even ones that don't currently use stdout for structured communication.
Better to keep open the possibility for using stdout in the future.

This commit was supported by the NSF-funded DataLad project
2017-03-17 12:32:43 -04:00
Joey Hess
d1ecdd04b2
Windows: Fix bug in shell script shebang lookup code that caused a "delayed read on closed handle" error.
The bug was that withFile closes the handle afterwards, but the content
of the file was not read due to laziness. Using readFile avoids it.

This commit was sponsored by Nick Daly on Patreon.
2017-03-13 16:20:52 -04:00
Joey Hess
1c4e5f65fc
Drop support for building with old versions of directory, feed, and http-types. 2017-03-10 15:57:41 -04:00
Joey Hess
9ef7207d5a
Revert "Drop support for building without network-uri."
This reverts commit fc3925a1cd.

Need it in stable w/o backports for the ancient autobuilder.
2017-03-10 15:49:18 -04:00
Joey Hess
ca49a84ba5
Drop support for building with old versions of dns and http-conduit. 2017-03-10 15:49:14 -04:00
Joey Hess
fc3925a1cd
Drop support for building without network-uri.
network-uri is available in Debian stable (backports) and testing,
so no need to complicate the cabal file anymore
2017-03-10 15:38:15 -04:00
Joey Hess
5358fb992a
Windows: Improve handling of shebang in external special remote program, searching for the program in the PATH.
findShellCommand needs a full path to a file in order to check it for a
shebang on Windows. It was being run with only the base name of the external
special remote program, which would only work when it was in the current
directory.

This is why users in
https://github.com/DanielDent/git-annex-remote-rclone/pull/10 and elsewhere
were complaining that the previous improvements to git-annex didn't make
git-remote-rclone work on Windows.

Also, reworked checkearlytermination, which while it worked, seemed
to rely on a race condition. And, improved its error messages.

This commit was sponsored by Shane-o on Patreon.
2017-03-08 15:59:00 -04:00
Joey Hess
301aff34c4
fsck -q: When a file has bad content, include the name of the file in the warning message.
This commit was sponsored by Alexander Thompson on Patreon.
2017-03-08 15:15:20 -04:00
Joey Hess
0534152685
get -J: Improve distribution of jobs amoung remotes when there are more jobs than remotes.
It was distributing jobs to remotes that were not being used by any other
job. But, suppose that there are only 2 remotes, and -J10. In such a case,
the first 2 downloads would be distributed amoung the 2 remotes, but
the other 8 would all go to remote #1. Improved by keeping a counter
of how many jobs are assigned to a remote, and prefer remotes with fewer
jobs.

Note use of Data.Map.Strict to avoid blowing up space. I kept the
bang-patterns as-is, although probably not needed with Data.Map.Strict.

This commit was sponsored by Jack Hill on Patreon.
2017-03-08 14:49:30 -04:00
Joey Hess
af2a6d578e
assistant: Add 1/200th second delay between checking each file in the full transfer scan, to avoid using too much CPU.
The slowdown is not going to be large in typical small-ish repos.
And it does not seem to matter if the assistant reacts a little bit slower
in situations involving the expensive scan, since:

a) Those situations typically involve getting back in sync after something
   has changed on a remote, often after a disconnect of some duration.
   So taking a few seconds more is not noticable.
b) If the scan finds things that it needs to do, it will start
   blocking anyway after 10 transfers are queued (due to use of
   queueTransferWhenSmall). So, only the speed of finding the first 10
   transfers will be impacted by this change.

This commit was sponsored by Jochen Bartl on Patreon.
2017-03-06 13:32:47 -04:00
Joey Hess
11d3219985
Linux standalone builds put the bundled ssh last in PATH, so any system ssh will be preferred over it.
This commit was sponsored by Denis Dzyubenko on Patreon.
2017-03-02 17:40:40 -04:00
Joey Hess
874232f1a6
status: Propigate nonzero exit code from git status. 2017-03-02 14:09:42 -04:00
Joey Hess
34db79e1a5
Bugfix: Passing a command a filename that does not exist sometimes did not display an error, when a path to a directory was also passed.
It was relying on segmentPaths to work correctly, so when it didn't,
sometimes the file that did not exist got matched up with a non-null
list of results. Fixed by always checking if each parameter exists.

There are two reason segmentPaths might not work correctly.

For one, it assumes that when the original list of paths
has more than 100 paths, it's not worth paying the CPU cost to
preserve input orders.

And then, it fails when a directory such as "." or ".." or
/path/to/repo is in the input list, and the list of found paths
does not start with that same thing. It should probably not be using
dirContains, but something else.

But, it's not clear how to handle this fully. Consider
when [".", "subdir"] has been expanded by git ls-files to
["subdir/1", "subdir/2"]
-- Both of the inputs contained those results, so there's
no one right answer for segmentPaths. All these would be equally valid:
	[["subdir/1", "subdir/2"], []]
	[[], ["subdir/1", "subdir/2"]]
	[["subdir/1"], [""subdir/2"]]

So I've not tried to improve segmentPaths.
2017-03-02 13:06:20 -04:00
Joey Hess
a9e1e17d40
releasing package git-annex version 6.20170301.1 2017-03-01 12:46:26 -04:00
Joey Hess
ea1f812ebf
Fix reversion in yesterday's release that made SHA1E and MD5E backends not work. 2017-03-01 12:43:15 -04:00
Joey Hess
254b57aef7
6.20170301 version for hackage
No changes from 6.20170228; a new version number was needed due to a problem with Hackage.
2017-03-01 12:06:10 -04:00
Joey Hess
444278156c
releasing package git-annex version 6.20170228 2017-02-28 14:41:57 -04:00
Joey Hess
b5d21e884c
release prep 2017-02-28 13:31:17 -04:00
Joey Hess
e53070c1ff
inheritable annex.securehashesonly
* init: When annex.securehashesonly has been set with git-annex config,
  copy that value to the annex.securehashesonly git config.
* config --set: As well as setting value in git-annex branch,
  set local gitconfig. This is needed especially for
  annex.securehashesonly, which is read only from local gitconfig and not
  the git-annex branch.

doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn has the
rationalle for doing it this way. There's no perfect solution; this
seems to be the least-bad one.

This commit was supported by the NSF-funded DataLad project.
2017-02-27 16:08:23 -04:00
Joey Hess
9db064f50c
reorg 2017-02-27 15:04:03 -04:00
Joey Hess
49114cf4ea
securehash matching
Added --securehash option to match files using a secure hash function, and
corresponding securehash preferred content expression.

This commit was sponsored by Ethan Aubin.
2017-02-27 15:02:44 -04:00
Joey Hess
942e0174b3
make fsck check annex.securehashesonly, and new tip for working around SHA1 collisions with git-annex
This commit was sponsored by andrea rota.
2017-02-27 13:55:15 -04:00
Joey Hess
07f1e638ee
annex.securehashesonly
Cryptographically secure hashes can be forced to be used in a repository,
by setting annex.securehashesonly. This does not prevent the git repository
from containing files with insecure hashes, but it does prevent the content
of such files from being pulled into .git/annex/objects from another
repository.

We want to make sure that at no point does git-annex accept content into
.git/annex/objects that is hashed with an insecure key. Here's how it
was done:

* .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be
  written to it normally
* So every place that writes content must call, thawContent or modifyContent.
  We can audit for these, and be sure we've considered all cases.
* The main functions are moveAnnex, and linkToAnnex; these were made to
  check annex.securehashesonly, and are the main security boundary
  for annex.securehashesonly.
* Most other calls to modifyContent deal with other files in the KEY
  directory (inode cache etc). The other ones that mess with the content
  are:
	- Annex.Direct.toDirectGen, in which content already in the
	  annex directory is moved to the direct mode file, so not relevant.
	- fix and lock, which don't add new content
	- Command.ReKey.linkKey, which manually unlocks it to make a
	  copy.
* All other calls to thawContent appear safe.

Made moveAnnex return a Bool, so checked all callsites and made them
deal with a failure in appropriate ways.

linkToAnnex simply returns LinkAnnexFailed; all callsites already deal
with it failing in appropriate ways.

This commit was sponsored by Riku Voipio.
2017-02-27 13:33:59 -04:00
Joey Hess
40327cab6e
Removed support for building with the old cryptohash library.
Building with that library made git-annex not support SHA3; it's time for
that to always be supported in case SHA2 dominoes.
2017-02-24 20:56:26 -04:00
Joey Hess
6b52fcbb7e
SHA1 collisions in key names was more exploitable than I thought
Yesterday's SHA1 collision attack could be used to generate eg:

SHA256-sfoo--whatever.good
SHA256-sfoo--whatever.bad

Such that they collide. A repository with the good one could have the
bad one swapped in and signed commits would still verify.

I've already mitigated this.
2017-02-24 19:54:36 -04:00
Joey Hess
9de0767d0e
update 2017-02-24 12:31:23 -04:00
Joey Hess
35739a74c2
make file2key reject E* backend keys with a long extension
I am not happy that I had to put backend-specific code in file2key. But
it would be very difficult to avoid this layering violation.

Most of the time, when parsing a Key from a symlink target, git-annex
never looks up its Backend at all, so adding this check to a method of
the Backend object would not work.

The Key could be made to contain the appropriate
Backend, but since Backend is parameterized on an "a" that is fixed to
the Annex monad later, that would need Key to change to "Key a".

The only way to clean this up that I can see would be to have the Key
contain a LowlevelBackend, and put the validation in LowlevelBackend.
Perhaps later, but that would be an extensive change, so let's not do
it in this commit which may want to cherry-pick to backports.

This commit was sponsored by Ethan Aubin.
2017-02-24 11:22:15 -04:00
Joey Hess
102e04b30c
typo 2017-02-24 00:29:37 -04:00
Joey Hess
60d99a80a6
Tighten key parser to not accept keys containing a non-numeric fields, which could be used to embed data useful for a SHA1 attack against git.
Also todo about why this is important, and with some further hardening to
add.

This commit was sponsored by Ignacio on Patreon.
2017-02-24 00:17:25 -04:00
Joey Hess
75a15e1ad7
status: Pass --ignore-submodules=when option on to git status.
Didn't make --ignore-submodules without a value be handled because I can't
see a way to make optparse-applicative parse that. I've opened a bug
requesting a way to do that:
https://github.com/pcapriotti/optparse-applicative/issues/243
2017-02-20 17:01:24 -04:00
Joey Hess
7a0d6d81a0
make curl show http errors to stderr
* Run curl with -S, so HTTP errors are displayed, even when
  it's otherwise silent.
* When downloading in --json or --quiet mode, use curl in preference
  to wget, since curl is able to display only errors to stderr, unlike
  wget.

This does mean that downloadQuiet is only silent on stdout, not necessarily
on stderr, which affects a couple other calls of it. For example,
downloading the .git/config of a http remote may show an error message now,
perhaps with slightly suboptimal formatting due to other output.
2017-02-20 16:09:32 -04:00
Joey Hess
4a397b5313
Run wget with -nv instead of -q, so it will display HTTP errors.
This adds one extra line of output when a download is successful,
after the progress bar. I don't much like that, but wget does not provide a
way to show HTTP errors without it.
2017-02-20 15:25:02 -04:00
Joey Hess
a13c0ce66c
adjust: Fix behavior when used in a repository that contains submodules.
Also fixed the LsFiles parser to not assume its output has a fixed width
type field.
2017-02-20 13:44:55 -04:00
Joey Hess
c5cf5cf03a
git-annex.cabal: Make crypto-api a dependency even when built w/o webapp and test suite.
The p2p code made it always be needed.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-02-20 12:21:35 -04:00
Joey Hess
e6857e75a6
sync hack to make updateInstead work on eg FAT
sync: When syncing with a local repository located on a crippled
filesystem, run the post-receive hook there, since it wouldn't get run
otherwise. This makes pushing to repos on FAT-formatted removable drives
update them when receive.denyCurrentBranch=updateInstead.

Made Remote.Git export onLocal, which was cleaned up to not have so many
caveats about its use.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-02-17 15:21:52 -04:00
Joey Hess
d074532aff
post-recive hook to make updateInstead work in direct mode and adjusted branches
* Added post-recieve hook, which makes updateInstead work with direct
  mode and adjusted branches.
* init: Set up the post-receive hook.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-02-17 14:04:43 -04:00
Joey Hess
d0651bb567
make query commands not output extraneous messages
config group groupwanted numcopies schedule wanted required:  Avoid
displaying extraneous messages about repository auto-init, git-annex branch
merging, etc, when being used to get information.
2017-02-16 13:24:35 -04:00
Joey Hess
a73c8ce4a1
sync: Improve integration with receive.denyCurrentBranch=updateInstead
By displaying error messages from the remote then it fails to update
its checked out branch.

Error messages in the default receive.denyCurrentBranch are still
suppressed, which matches user expectations.

This commit was sponsored by Nick Daly on Patreon.
2017-02-15 16:13:30 -04:00
Joey Hess
f07af03018
Run ssh with -n whenever input is not being piped into it
... to avoid it consuming stdin that it shouldn't.

This fixes git-annex-checkpresentkey --batch remote, which didn't output
results for all keys passed into it.

Other git-annex commands that communicate with a remote over ssh may also
have been consuming stdin that they shouldn't have, which could have
impacted using them in eg, shell scripts. For example, a shell script
reading files from stdin and passing them to git annex drop would be
impacted by this bug, whenever git annex drop ran git-annex-shell
checkpresent, it would consume part/all of the stdin that the shell script
was supposed to consume.

Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which
is used throughout git-annex to run ssh (in order for ssh connection
caching to work). Every call site was checked to see if it used
CreatePipe for stdin, and if not was marked NoConsumeStdin.
2017-02-15 15:08:46 -04:00
Joey Hess
69baa45f14
sync, merge: Fail when the current branch has no commits yet, instead of not merging in anything from remotes and appearing to succeed.
At first I wanted to make it go ahead and merge into the newborn branch,
so made it use Git.Branch.currentUnsafe to get the current branch. But that
failed:

fatal: ambiguous argument 'refs/heads/master..refs/heads/synced/master':
unknown revision or path not in the working tree.

A whole nother code path to handle merging into newborn branches seemed
excessive, so went with displaying a warning and propigating failure
status.

This commit was sponsored by Brock Spratlen on Patreon.
2017-02-14 16:09:55 -04:00
Joey Hess
95390f0c27
releasing package git-annex version 6.20170214 2017-02-14 14:56:11 -04:00
Joey Hess
3b22ad9f47
Work around sqlite's incorrect handling of umask when creating databases.
Refactored some common code into initDb.

This only deals with the problem when creating new databases. If a repo
got bad permissions into it, it's up to the user to deal with it.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2017-02-13 17:39:16 -04:00
Joey Hess
976676a7b0
S3: Fix check of uuid file stored in bucket, which was not working.
The check was broken in two ways.. First, nowhere did it error out when
checkUUIDFile found a different UUID already in the file. Instead,
it overwrote the uuid file.

And, checkUUIDFile's implementation was for some reason always failing with
a ConnectionClosed exception. Apparently something to do with using two
different runResourceT's and a response getting GCed inbetween. I'm pretty
sure that used to work, but changed to a more obviously correct
implementation.

This commit was sponsored by Peter Hogg on Patreon.
2017-02-13 15:35:24 -04:00
Edward Betts
0750913136
correct spelling mistakes 2017-02-12 17:30:23 -04:00
Joey Hess
5e6ced7d0f
Improve pid locking code to work on filesystems that don't support hard links.
Probing for hard link support in the pid locking code is redundant since
git-annex init already probes that. But, it didn't seem worth threading
that data through; the pid locking code runs at most once per git-annex
process, and only on unusual filesystems. Optimising a single hard link
and unlink isn't worth it.

This commit was sponsored by Francois Marier on Patreon.
2017-02-10 15:22:28 -04:00
Joey Hess
e2c98f5788
Added git template directory to Linux standalone tarball and OSX app bundle.
Git does not provide a switch to find out where this directory is, and
while the git-init man page says it will always be in
/usr/share/git-core/templates, that's not the case on OSX with git
installed from homebrew. So, I used a hack taking the --man-path and
constructing a path from that. Works on both Debian and OSX at least.
2017-02-10 13:55:54 -04:00
Joey Hess
c1ece47ea0
import --reinject-duplicates
This is the same as running git annex reinject --known, followed by
git-annex import. The advantage to having it in one command is that it
only has to hash each file once; the two commands have to
hash the imported files a second time.

This commit was sponsored by Shane-o on Patreon.
2017-02-09 15:41:00 -04:00
Joey Hess
f617988a29
Make import --deduplicate and --skip-duplicates only hash once, not twice
import: --deduplicate and --skip-duplicates were implemented inneficiently;
they unncessarily hashed each file twice. They have been improved to only
hash once.

The new approach is to lock down (minimally) and hash files, and then
reuse that information when importing them.

This was rather tricky, especially in detecting changes to files while
they are being imported.

The output of import changed slightly. While before it silently skipped
over files with eg --skip-duplicates, now it shows each file as it starts
to act on it. Since every file is hashed first thing, it would otherwise
not be clear what file import is chewing on. (Actually, it wasn't clear
before when any of the duplicates switches were used.)

This commit was sponsored by Alexander Thompson on Patreon.
2017-02-09 15:32:22 -04:00
Joey Hess
e7e36b6e72
import: Changed how --deduplicate, --skip-duplicates, and --clean-duplicates determine if a file is a duplicate
Before, only content known to be present somewhere was considered a
duplicate. Now, any content that has been annexed before will be considered
a duplicate, even if all annexed copies of the data have been lost.

Note that --clean-duplicates and --deduplicate still check numcopies,
so won't delete duplicate files unless there's an annexed copy.

This makes import use the same method as reinject --known.

The man page already said that duplicate meant "its content is either
present in the local repository already, or git-annex knows of another
repository that contains it, or it was present in the annex before but has
been removed now". So, this is really only bringing the implementation into
line with the man page.

This commit was sponsored by Jochen Bartl on Patreon.
2017-02-07 17:41:58 -04:00
Joey Hess
27e89aeffc
initremote: When a uuid= parameter is passed, use the specified UUID for the new special remote, instead of generating a UUID.
This can be useful in some situations, eg when the same data can be
accessed via two different special remote backends.
2017-02-07 15:10:41 -04:00
Joey Hess
3439f3cc87
assistant: Make --autostart --foreground wait for the children it starts.
Before, the --foreground was ignored when autostarting.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-02-07 13:31:45 -04:00
Joey Hess
655f707990
Fix build with aws 0.16. Thanks, aristidb. 2017-02-07 13:01:57 -04:00
Joey Hess
3fe9d99f24
wormhole pairing appid flag day 2021-12-31
Wormhole pairing will start to provide an appid to wormhole on 2021-12-31.
An appid can't be provided now because Debian stable is going to ship a
older version of git-annex that does not provide an appid. Assumption is
that by 2021-12-31, this version of git-annex will be shipped in a Debian
stable release. If that turns out to not be the case, this change will need
to be cherry-picked into the git-annex in Debian stable, or its wormhole
pairing will break.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-02-03 15:06:40 -04:00
Joey Hess
06f307ad13
lost a changelog entry; put back 2017-02-03 14:40:53 -04:00
Joey Hess
b77903af48
New annex.synccontent config setting
.. which can be set to true to make git annex sync default to --content.

This may become the default at some point in the future.

As well as being configuable by git config, it can be configured by
git-annex config to control the default behavior in all clones of a
repository.

Had to add a separate --no-content switch to we can tell if it's been
explicitly set, and should override annex.synccontent. If --content was the
default, this complication would not be necessary.

This commit was sponsored by Jake Vosloo on Patreon.
2017-02-03 14:31:17 -04:00
Joey Hess
ed56dba868
annex.autocommit can be configured via git-annex config
... to control the default behavior in all clones of a repository.

This includes a new Configurable data type, so the GitConfig type indicates
which values can be configured this way.

The implementation should be quite efficient; the config log is only read
once, and only when a Configurable value has not already been set by
git-config.

Indeed, it would be nice in the future to extend this, so that git-config
is itself only read on demand. Some commands may not need to look at the
git configuration at all.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-02-03 13:58:53 -04:00
Joey Hess
ed60f60e9b
unused: Improved memory use significantly when there are a lot of differences between branches.
Argh, didn't need an accumulator here!

I think I use accumulators a lot more than I need to when recusively
processing lists..

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-01-31 19:42:00 -04:00
Joey Hess
062286135c
unused: When large files are checked right into git, avoid buffering their contents in memory.
This makes it a little bit slower since it has to check file size,
but worth it to fix a potential memory use problem.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-01-31 19:09:37 -04:00
Joey Hess
9eb10caa27
Some optimisations to string splitting code.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.

As well as being an optimisation, this helps move toward eliminating use of
missingh.

(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)

I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2017-01-31 19:06:22 -04:00
Joey Hess
3300911b14
lts-7.18 finally!
esqueleto finally got fixed, thanks to @bitemyapp

Since XMPP was removed, the previous build failures related to it should
no longer be a problem either.

Meanwhile, lts-5.18 fails to build anymore on Debian due to linker
hardening breaking the version of ghc stack uses with that version.

This commit was sponsored by Francois Marier on Patreon.
2017-01-31 12:27:08 -04:00
Joey Hess
339464e847
config: New command for storing configuration in the git-annex branch.
Any config names can be set using this; git-annex commands will only look
at specific ones that make sense and are worth the overhead of querying the
branch.

This might also be useful for storing whatever other config-type stuff the
user might want to shove into the git-annex branch.

This commit was sponsored by Jochen Bartl on Patreon.
2017-01-30 16:46:38 -04:00
Joey Hess
26d23e38f1
vicfg: Include the numcopies configuation.
Docs say vicfg can configure everything from git-annex branch,
so it ought to configure numcopies.

Note that commenting out existing numcopies does not unset it.

This commit was sponsored by Thom May on Patreon.
2017-01-30 15:27:25 -04:00
Joey Hess
280442ca2c
Remove -j short option for --json-progress; that option was already taken for --json.
This commit was sponsored by Trenton Cronholm.
2017-01-30 12:46:42 -04:00
Joey Hess
f275caf732
Increase default cost for p2p remotes from 200 to 1000. This makes git-annex prefer transferring data from special remotes when possible. 2017-01-06 15:23:30 -04:00
Joey Hess
8740cd9716
releasing package git-annex version 6.20170101 2016-12-31 23:59:56 -04:00
Joey Hess
10e4d93212
Support all common locations of the torrc file. 2016-12-28 15:12:31 -04:00
Joey Hess
b68d2a4b68
webapp: full wormhole pairing UI (untested)
This commit was sponsored by Riku Voipio.
2016-12-27 16:41:35 -04:00
Joey Hess
8484c0c197
Always use filesystem encoding for all file and handle reads and writes.
This is a big scary change. I have convinced myself it should be safe. I
hope!
2016-12-24 14:46:31 -04:00
Joey Hess
e08691b393
enable-tor: When run as a regular user, test a connection back to the hidden service over tor.
This way we know that after enable-tor, the tor hidden service is fully
published and working, and so there should be no problems with it at
pairing time.

It has to start up its own temporary listener on the hidden service. It
would be nice to have it start the remotedaemon running, so that extra
step is not needed afterwards. But, there may already be a remotedaemon
running, in communication with the assistant and we don't want to start
another one. I thought about trying to HUP any running remotedaemon, but
Windows does not make it easy to do that. In any case, having the user
start the remotedaemon themselves lets them know it needs to be running
to serve the hidden service.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2016-12-24 12:50:23 -04:00
Joey Hess
22252e8e4c
Revert "close"
This reverts commit 3aaabc906b.

Commit contained incomplete work.
2016-12-24 12:07:15 -04:00
Joey Hess
3aaabc906b
close 2016-12-22 13:59:21 -04:00
Joey Hess
f7ca2b92fb
enable-tor: No longer needs to be run as root.
When run by not root, su's to root automatically.

This commit was sponsored by Brock Spratlen on Patreon.
2016-12-20 17:40:36 -04:00
Joey Hess
944a6503b9
relocate tor socket out of /etc
weasel explained that apparmor limits on what files tor can read do not
apply to sockets (because they're not files). And apparently the
problems I was seeing with hidden services not being accessible had to
do with onion address propigation and not the location of the socket
file.

remotedaemon looks up the HiddenServicePort in torrc, so if it was
previously configured with the socket in /etc, that will still work.

This commit was sponsored by Denis Dzyubenko on Patreon.
2016-12-20 16:24:46 -04:00
Joey Hess
8f3b2c206c
Debian: Suggest tor and magic-wormhole.
Suggests, not recommends, because tor is not for everyone.
2016-12-20 15:26:14 -04:00
Joey Hess
e312ec3750
Fix build with directory-1.3.
See https://github.com/haskell/directory/issues/66
2016-12-20 15:23:59 -04:00
Joey Hess
a171e576b2
rekey --force: Incorrectly marked the new key's content as being present in the local repo even when it was not. 2016-12-19 18:18:57 -04:00
Joey Hess
95c8b37544
Linux standalone: Improve generation of locale definition files, supporting locales such as, en_GB.UTF-8. 2016-12-19 17:03:52 -04:00
Joey Hess
ccde0932a5
p2p --pair with magic wormhole (untested)
It builds. I have not tried to run it yet. :)

This commit was sponsored by Jake Vosloo on Patreon.
2016-12-18 16:51:41 -04:00
Joey Hess
38f9337e16
Revert "p2p --link now defaults to setting up a bi-directional link"
This reverts commit 3037feb1bf.

On second thought, this was an overcomplication of what should be the
lowest-level primitive. Let's build bi-directional links at the pairing
level with eg magic wormhole.
2016-12-16 18:26:07 -04:00
Joey Hess
bd811d3853
p2p: Added --one-way option.
This commit was sponsored by Fernando Jimenez on Patreon.
2016-12-16 16:43:37 -04:00
Joey Hess
3037feb1bf
p2p --link now defaults to setting up a bi-directional link
Both the local and remote git repositories get remotes added
pointing at one-another.

Makes pairing twice as easy!

Security: The new LINK command in the protocol can be sent repeatedly,
but only by a peer who has authenticated with us. So, it's entirely safe to
add a link back to that peer, or to some other peer it knows about.
Anything we receive over such a link, the peer could send us over the
current connection.

There is some risk of being flooded with LINKs, and adding too many
remotes. To guard against that, there's a hard cap on the number of remotes
that can be set up this way. This will only be a problem if setting up
large p2p networks that have exceptional interconnectedness.

A new, dedicated authtoken is created when sending LINK.

This also allows, in theory, using a p2p network like tor, to learn about
links on other networks, like telehash.

This commit was sponsored by Bruno BEAUFILS on Patreon.
2016-12-16 16:38:06 -04:00
Joey Hess
e67a310da1
p2p: --link no longer takes a remote name, instead the --name option can be used. 2016-12-16 15:37:50 -04:00
Joey Hess
469bfa7ff3
Make all --batch input, as well as fromkey and registerurl stdin be processed without requiring it to be in the current encoding. 2016-12-13 15:35:04 -04:00
Joey Hess
48d9624a2d
Revert ServerAliveInterval
Revert ServerAliveInterval change in 6.20161111, which caused problems
with too many old versions of ssh and unusual ssh configurations.

It should have not been needed anyway since ssh is supposted to
have TCPKeepAlive enabled by default.
2016-12-13 12:12:38 -04:00
Joey Hess
59fead6da3
Pass annex.web-options to wget and curl after other options, so that eg --no-show-progress can be set by the user to disable the default --show-progress. 2016-12-13 11:56:23 -04:00
Joey Hess
d9490685fd
metadata --batch: Fix bug when conflicting metadata changes were made in the same batch run.
1 microsecond delay is ugly.. but, maintaining an queue of a list of timestamps
and taking a new one from the queue each time around, or maintaining a timestamp
counter, would probably be slower.
2016-12-13 11:07:49 -04:00
Joey Hess
a52c011581
Debian: Build webapp on armel. 2016-12-11 21:30:07 -04:00
Joey Hess
bb66e098b1
linux standalone builds should have "unable to decommit memory" bug fixed 2016-12-11 15:37:52 -04:00
Joey Hess
73a79147b1
releasing package git-annex version 6.20161210 2016-12-10 12:23:18 -04:00
Joey Hess
749623df86
fixed 2016-12-10 10:47:16 -04:00
Joey Hess
15be5c04a6
git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks.
The attacker could just send a very lot of data, with no \n and it would
all be buffered in memory until the kernel killed git-annex or perhaps OOM
killed some other more valuable process.

This is a low impact security hole, only affecting communication between
local git-annex and git-annex-shell on the remote system. (With either
able to be the attacker). Only those with the right ssh key can do it. And,
there are probably lots of ways to construct git repositories that make git
use a lot of memory in various ways, which would have similar impact as
this attack.

The fix in P2P/IO.hs would have been higher impact, if it had made it to a
released version, since it would have allowed DOSing the tor hidden
service without needing to authenticate.

(The LockContent and NotifyChanges instances may not be really
exploitable; since the line is read and ignored, it probably gets read
lazily and does not end up staying buffered in memory.)
2016-12-09 13:34:32 -04:00
Joey Hess
2fb6fd7434
Merge branch 'master' into tor 2016-12-07 14:32:25 -04:00
Joey Hess
f61508aed4
add: Stage modified non-large files when running in indirect mode.
(This was already done in v6 mode and direct mode.)
2016-12-05 14:10:21 -04:00
Joey Hess
82d01f5619
rekey: Added --batch mode.
Would have liked to make the Parser parse the file and key pairs, but it
seems that optparse-applicative is unable to handle eg:

	many ((,) <$> argument <*> argument)

This commit was sponsored by Thomas Hochstein on Patreon.
2016-12-05 12:55:50 -04:00
Joey Hess
e65c31e56b
changelog 2016-12-05 12:16:35 -04:00
Joey Hess
93852dd7e8
rmurl: --batch
* rmurl: Multiple pairs of files and urls can be provided on the
  command line.
* rmurl: Added --batch mode.

This commit was sponsored by Trenton Cronholm on Patreon.
2016-12-05 12:10:07 -04:00
Joey Hess
bfc8305814
implement p2p command 2016-11-30 14:35:24 -04:00
Joey Hess
24593aaa32
Merge branch 'master' into tor 2016-11-30 14:16:36 -04:00
Joey Hess
8354612131
prefer xdot over dot
* map: Run xdot if it's available in PATH. On OSX, the dot command
  does not support graphical display, while xdot does.
* Debian: xdot is a better interactive viewer than dot, so Suggest
  xdot, rather than graphviz.
2016-11-30 12:50:49 -04:00
Joey Hess
398345cb26
Merge branch 'master' into tor 2016-11-29 15:45:29 -04:00
Joey Hess
ae9f99f342
Relicense 5 source files that are not part of the webapp from AGPL to GPL.
Building w/o the webapp is not supposed to pull in any AGPLed files.

I appear to have written all the code in these files;
the only commit by anyone else is 64e844e1fe
and is a spelling fix that is not copyrightable.
2016-11-21 23:46:59 -04:00
Joey Hess
070fb9e624
Added git-remote-tor-annex, which allows git pull and push to the tor hidden service.
Almost working, but there's a bug in the relaying.

Also, made tor hidden service setup pick a random port, to make it harder
to port scan.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2016-11-21 17:27:38 -04:00
Joey Hess
6e6d1a8c15
addurl: Fix bug in checking annex.largefiles expressions using largerthan, mimetype, and smallerthan; the first two always failed to match, and the latter always matched. 2016-11-21 11:30:53 -04:00
Joey Hess
74691ddf0e
remotedaemon: serve tor hidden service 2016-11-20 15:48:12 -04:00
Joey Hess
a101b8de37
remotedaemon: Fork to background by default. Added --foreground switch to enable old behavior.
Groundwork for tor hidden services, which the remotedaemon will serve.
2016-11-20 14:50:36 -04:00
Joey Hess
5680565122
releasing package git-annex version 6.20161118 2016-11-18 11:59:49 -04:00
Joey Hess
aedec5d08d
arm build uses 32kb page size
(Change was made in gitannexbuilder scripts not here.)
2016-11-16 18:05:42 -04:00
Joey Hess
2577f1c0a2
fsck --all --from was checking the content of files in the local repository, rather than on the special remote.
Straight up forgot to handle this case!

This commit was sponsored by Fernando Jimenez on Patreon.
2016-11-16 15:33:57 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
556b2ded2b
sync: Pass --allow-unrelated-histories to git merge when used with git git 2.9.0 or newer.
This makes merging a remote into a freshly created direct mode repository
work the same as it works in indirect mode.

The git-annex branches would get merged in any case by a sync,
since that doesn't use git merge.

This might need to be revisited later to better mirror git's behavior.
2016-11-15 18:26:17 -04:00
Joey Hess
6416ae9c09
unbreak all the autobuilders
git-annex.cabal: Loosen bounds on persistent to allow 2.5, which on Debian
has been patched to work with esqueleto. This may break cabal's resolver on
non-Debian systems; if so, either use stack to build, or run cabal with
--constraint='persistent ==2.2.4.1' Hopefully this mess with esqueleto will
be resolved soon.

https://github.com/prowdsponsor/esqueleto/issues/137
2016-11-15 11:19:57 -04:00
Joey Hess
e544cf7a31
releasing package git-annex version 6.20161111 2016-11-11 14:47:31 -04:00
Joey Hess
d48f4caaef
Linux standalone: Avoid using hard links in the tarball so it can be untarred on eg, afs which does not support them. 2016-11-10 15:12:30 -04:00
Joey Hess
4643470537
webapp: Explicitly avoid checking for auth in static subsite requests.
Yesod didn't used to do auth checks for that, but this may have changed.
I don't have a way to reproduce the reported problem yet, but this change
certianly won't hurt anything.

This commit was sponsored by Thom May on Patreon.
2016-11-10 13:48:54 -04:00
Joey Hess
c44ac268be
OSX: Remove RPATHs from git-annex binary, which are not needed, slow down startup, and break the OSX Sierra linker.
ghc 8.0.2 may make this unncessary, but it's not in a stackage version yet,
so put in a workaround.

Note that the linux builds already delete the RPATHs for similar reasons.

This commit was sponsored by Josh Taylor on Patreon.
2016-11-07 14:22:14 -04:00
Joey Hess
5afc2eaa54
reinject --known: Avoid second, unncessary checksum of file. 2016-11-07 12:07:36 -04:00
Joey Hess
5343544822
S3: Support the special case endpoint needed for the cn-north-1 region.
* S3: Support the special case endpoint needed for the cn-north-1 region.
* Webapp: Don't list the Frankfurt region, as this (and some other new
  regions) need V4 authorization which the aws library does not yet use.

This commit was sponsored by Nick Daly on Patreon.
2016-11-07 11:49:34 -04:00
Joey Hess
7ed96a2405
Make .git/annex/ssh.config file work with versions of ssh older than 7.3, which don't support Include.
When used with an older version of ssh, any ServerAliveInterval in
~/.ssh/config will be overridden by .git/annex/ssh.config.

This commit was sponsored by Josh Taylor on Patreon.
2016-11-07 10:32:57 -04:00
Joey Hess
e23028d19b
restart coprocess in raw mode
Restarting a crashing git process could result in filename encoding issues
when not in a unicode locale, as the restarted processes's handles were not
read in raw mode.

Since rawMode is always used when starting a coprocess, didn't bother
to parameterise it and just always enable it for simplicity.

This commit was sponsored by Jake Vosloo on Patreon.
2016-11-01 14:03:59 -04:00
Joey Hess
5b5dcabdcf
releasing package git-annex version 6.20161031 2016-10-31 18:56:18 -04:00
Joey Hess
b530e43285
Fix reversion in 6.20161012 that prevented adding files with a space in their name. 2016-10-31 18:39:37 -04:00
Joey Hess
ec2e1569e6
Linux standalone: Fix location of locale files in the bundle.
The Makefile was putting them in git-annex.linux/i18n/i18n, and so I18NPATH
did not point to the files. I think that on close enough to Debian systems,
localedef then fell back to using the system-wide locale files, while on
other systems it would fail to generate locales.
2016-10-31 14:31:46 -04:00
Joey Hess
2ad7b00e29
Assistant, repair: Fix ignoring of git fsck errors due to duplicate file entries in tree objects. 2016-10-31 14:00:37 -04:00
Joey Hess
5b84f367e4
prep release 2016-10-27 15:25:05 -04:00
Joey Hess
0ae08947ac
Run ssh with ServerAliveInterval 60
So that stalled transfers will be noticed within about 3 minutes,
even if TCPKeepAlive is disabled or doesn't work.

Rather than setting with -o, use -F with another config file,
so that any settings in ~/.ssh/config or /etc/ssh/ssh_config overrides this.
2016-10-26 16:41:34 -04:00
Joey Hess
8dcf79694d
enable forwardRetry for command-line transfers
If a transfer fails for some reason, but some data managed to be sent, the
transfer will be retried. (The assistant already did this.)

Possible impacts:

* More ssh prompts if ssh needs to prompt for a password to connect to a
  host, or is prompting about some other problem like a ssh key mismatch.

* More data transfer due to retrying, epecially when a remote does not
  support resuming a transfer.

  In the worst case, a lot of data will be transferred but it fails before
  the end, and then all that data gets transferred again plus one byte more;
  repeat until it manages to get the whole file.
2016-10-26 15:38:27 -04:00
Joey Hess
1a8ba7eab4
Improve ssh socket cleanup code to skip over the cruft that NFS sometimes puts in a directory when a file is being deleted. 2016-10-26 13:16:41 -04:00
Joey Hess
6e4fee1faf
test: Deal with gpg-agent behavior change that broke the test suite.
gpg-agent started deleting its socket file on shutdown, and this tickled an
ugly behavior in removeDirectoryRecursive,
https://github.com/haskell/directory/issues/60

Running removeDirectoryRecursive again on exception avoids the problem.
2016-10-18 16:56:38 -04:00
Joey Hess
090a922a98
Assistant, repair: Improved filtering out of git fsck lines about duplicate file entries in tree objects. 2016-10-18 11:19:41 -04:00
Joey Hess
0b1c061382
importfeed: Drop URL parameters from file extension.
Thanks, James MacMahon.
2016-10-17 16:02:05 -04:00
Joey Hess
10ca4b9788
Improve style of offline html build of website. 2016-10-17 15:55:49 -04:00
Joey Hess
8e22114735
upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present.
Closes https://github.com/datalad/datalad/issues/1020

The use of runWriter in scanUnlockedFiles broke due to this change;
it failed with blocked indefinitely in mvar, because the database write
handle was taken while linkFromAnnex needed to also write to it (to update
the inode cache). So, switched to using a separate runWriter for each call
to addAssociatedFileFast. A little less efficient, but not greatly; the
writes should all still be cached.
2016-10-17 15:19:47 -04:00
Joey Hess
ee309d6941
lock: Fix edge cases where data loss could occur in v6 mode.
In the case where the pointer file is in place, and not the content
of the object, lock's  performNew was called with filemodified=True,
which caused it to try to repopulate the object from an unmodified
associated file, of which there were none. So, the content of the object
got thrown away incorrectly. This was the cause (although not the root
cause) of data loss in https://github.com/datalad/datalad/issues/1020

The same problem could also occur when the work tree file is modified,
but the object is not, and lock is called with --force. Added a test case
for this, since it's excercising the same code path and is easier to set up
than the problem above.

Note that this only occurred when the keys database did not have an inode
cache recorded for the annex object. Normally, the annex object would be in
there, but there are of course circumstances where the inode cache is out
of sync with reality, since it's only a cache.

Fixed by checking if the object is unmodified; if so we don't need to
try to repopulate it. This does add an additional checksum to the unlock
path, but it's already checksumming the worktree file in another case,
so it doesn't slow it down overall.

Further investigation found a similar problem occurred when smudge --clean
is called on a file and the inode cache is not populated. cleanOldKeys
deleted the unmodified old object file in this case. This was also
fixed by checking if the object is unmodified.

In general, use of getInodeCaches and sameInodeCache is potentially
dangerous if the inode cache has not gotten populated for some reason.
Better to use isUnmodified. I breifly auited other places that check the
inode cache, and did not see any immediate problems, but it would be easy
to miss this kind of problem.
2016-10-17 13:58:43 -04:00
Joey Hess
c0cdac5c4a
releasing package git-annex version 6.20161012 2016-10-12 09:38:03 -04:00
Joey Hess
b82c3e0783
sync: Fix bug in adjusted branch merging that could cause recently added files to be lost when updating the adjusted branch.
The modification flag was not being set when making modifications deep
in a tree, so parent trees were not updated to contain the modified tree.

Seems to have exposed another bug where the wrong filename gets grafted in.

This commit was sponsored by Brock Spratlen on Patreon.
2016-10-10 15:00:45 -04:00
Joey Hess
933bc5c917
Support using v3 repositories without upgrading them to v5.
An easy change now that supportedVersions is a list. Since v3 and v5 are
identical other than version number, just add v3 to the list.

This commit was sponsored by andrea rota.
2016-10-05 16:53:09 -04:00
Joey Hess
f867fc157f
When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5.
Fixes a bug introduced with v6 mode that I didn't notice until now.
Probably not many v3 repos left out there, and upgrading them to v6 mode
is not disastrous, only a little premature.

This commit was sponsored by Riku Voipio
2016-10-05 16:23:09 -04:00
Joey Hess
34530e59d9
Avoid using a lot of memory when large objects are present in the git repository
.. and have to be checked to see if they are a pointed to an annexed file.

Cases where such memory use could occur included, but were not limited to:
  - git commit -a of a large unlocked file (in v5 mode)
  - git-annex adjust when a large file was checked into git directly
Generally, any use of catKey was a potential problem.

Fix by using git cat-file --batch-check to check size before catting.
This adds another git batch process, which is included in the CatFileHandle
for simplicity.

There could be performance impact, anywhere catKey is used. Particularly
likely to affect adjusted branch generation speed, and operations on
unlocked files in v6 mode. Hopefully since the --batch-check and
--batch read the same data, disk buffering will avoid most overhead.
Leaving only the overhead of talking to the process over the pipe and
whatever computation --batch-check needs to do.

This commit was sponsored by Bruno BEAUFILS on Patreon.
2016-10-05 15:24:13 -04:00
Joey Hess
aacd9b190d
Linux standalone: Include locale files in the bundle, and generate locale definition files for the locales in use when starting runshell.
Currently only done for utf-8 locales because the charset can easily be
told for those. Other locales don't include the charset in their name.

The locale definition is generated under git-annex.linux/locales.
So, this only works if the user can write there.

If locale generation fails for any reason, it's silently skipped.

The git-annex-standalone.deb installs the bundle under /usr, so this locale
generation won't work for non-root users.
2016-10-04 16:37:43 -04:00
Joey Hess
c079811226
Linux standalone: Add back the LOCPATH=/dev/null hack to avoid the system locale-archive being read.
Version mismatches between the system locale-archive and the glibc in the
bundle have been observed to cause git crashes.

Unfortunately, this causes locales to not be used in the linux standalone
bundle, as was the case until version 6.20160419.

glibc hardcodes the path to /usr/lib/locale/locale-archive and does not
let an environment variable cause a different locale-archive file to be used.

The only other option to include locales in the bundle would be to include
exploded locale definition directories in the bundle for a number of
locales, generated by localedef. But these take at least 300 kb per locale,
and there are a great many locales; it would be hundreds of megabytes to
include them all.

(Hmm, we could include localdef in the bundle, and check LANG in runshell
and compile the locale directories on the fly. This would need
/usr/share/i18n/ and /usr/lib/locale-archive to be included in the bundle.
It's.. doable.)

I know this is going to once again cause users of the bundle to complain
that eg, ls doesn't show their unicode filenames right. Better than strange
crashes though.
2016-10-04 12:53:09 -04:00
Joey Hess
5bf4623a1d
allow multiple concurrent external special remote processes
Multiple external special remote processes for the same remote will be
started as needed when using -J.

This should not beak any existing external special remotes, because running
multiple git-annex commands at the same time could already start multiple
processes for the same external special remotes.
2016-09-30 14:29:02 -04:00
Joey Hess
28c6209f55
Make --json-progress output be shown even when the size of a object is not known. 2016-09-29 16:59:48 -04:00
Joey Hess
161d891508
Add "total-size" field to --json-progress output. 2016-09-29 16:29:54 -04:00
Joey Hess
7cae6c746c
Optimised git-annex branch log file timestamp parsing. 10% speedup
This sped up git annex find --not --in web from 6.64s to 5.69s.
The optimised parser is probably more like 50% faster than the general one
it replaced.
2016-09-29 14:04:53 -04:00
Joey Hess
1cd02762bf
Optimisations to git-annex branch query and setting, avoiding repeated copies of the environment.
Speeds up commands like  "git-annex find --in remote" by over 50%.

Profiling showed that adjustGitEnv was 21% of the time and 37% of the
allocations of that command. It copied the environment each time with
getEnvironment.

The only repeated use of adjustGitEnv is in withIndexFile, which tends to
be run at least once per file. So, it was optimised by keeping a cache of
the environment, which can be reused.

There could be other better ways to optimise this. Maybe get the while
environment once at startup. But, then it would have to be serialized back
out each time running a child process, so I doubt that would be a net win.

It might be better to cache a version of the environment that is
pre-modified to use .git-annex/index. But, profiling doesn't show that
modifying the enviroment is taking any significant time.
2016-09-29 13:36:48 -04:00
Joey Hess
8794dcf27b
Optimisations to time it takes git-annex to walk working tree and find files to work on. Sped up by around 18%.
key2file and file2key were top cost centers according to profiling.
The repeated use of replace was not efficient. This new approach is quite a
lot more efficient.

This commit was sponsored by Denis Dzyubenko on Patreon.
2016-09-26 16:48:57 -04:00
Joey Hess
1678510680
prep release 2016-09-23 09:45:46 -04:00
Joey Hess
4b26aee92c
Revert "stack.yaml: Update to lts-7.0 (ghc 8)"
This reverts commit e181603103.

This broke the i386ancient autobuilder due to its use of
--flag git-annex:XMPP --flag=git-annex:dbus

-- Failure when adding dependencies:
fdo-notify: needed ((>=0.3)), stack configuration has no specified version
(latest applicable is 0.3.1)
gnutls: needed ((>=0.1.4)), stack configuration has no specified version
(latest applicable is 0.2)
network-protocol-xmpp: needed (-any), stack configuration has no specified
version (latest applicable is 0.4.8)

OSX autobuilder also seems hosed by it, so too soon.

De-revert later..
2016-09-21 18:01:23 -04:00
Joey Hess
c910004d50
addurl, importfeed: Improve behavior when file being added is gitignored. 2016-09-21 17:21:48 -04:00
Joey Hess
a569f195b7
fix bugs in handing of deep branches with sync and adjusted branches
* sync: Previously, when run in a branch with a slash in its name,
  such as "foo/bar", the sync branch was "synced/bar". That conflicted
  with the sync branch used for branch "bar", so has been changed to
  "synced/foo/bar".
* adjust: Previously, when adjusting a branch with a slash in its name,
  such as "foo/bar", the adjusted branch was "adjusted/bar(unlocked)".
  That conflicted with the adjusted branch used for branch "bar",
  so has been changed to "adjusted/foo/bar(unlocked)"
* Also, running sync in an adjusted branch did not correctly sync
  changes back to the parent branch when it had a slash in its name.
  This bug has been fixed.

Eliminate use of Git.Ref.under and Git.Ref.basename; using
Git.Ref.underBase and Git.Ref.base make everything handle deep branches
correctly.

Probably noone was adjusting deep branches, and v6 is still experimental
anyway, so I'm not going to worry about the mess that was left by that bug.

In the case of git-annex sync, using a fixed git-annex with an old unfixed
one will mean they use different sync branches for a deep branch, and so
they may stop syncing until the old one is upgraded. However, that's only
a problem when syncing between repositories without going via a central
bare repository. Added a warning about this to the CHANGELOG, but it's
probably not going to affect many people at all.

This commit was sponsored by Riku Voipio.
2016-09-21 15:23:47 -04:00
Joey Hess
0e30e71e9c
info: Support being passed a treeish, and show info about the annexed files in it similar to how a directory is handled. 2016-09-15 12:51:00 -04:00
Joey Hess
e181603103
stack.yaml: Update to lts-7.0 (ghc 8)
A few of these extra-deps are setting versions to work around various
library dep issues with ghc 8.
2016-09-15 00:37:05 -04:00
Joey Hess
ec3558fb79
Improve gpg secret key list parser to deal with changes in gpg 2.1.15. Fixes key name display in webapp.
gpg 2.1.15 (or so) seems to have added some new fields to the --with-colons
--list-secret-keys output. These include "fpr" and "grp", and come before
the "uid" line. So, the parser was giving up before it saw the name. Fix by
continuing to look for the uid line until the next "sec" line.

This commit was sponsored by Ole-Morten,Duesund on Patreon.
2016-09-14 13:31:00 -04:00
Joey Hess
3e22d60549
copy, move, mirror: Support --json and --json-progress. 2016-09-09 16:24:26 -04:00
Joey Hess
05d4438383
addurl, get: Added --json-progress option, which adds progress objects to the json output.
This doesn't work right when used with -J yet, and there is some really
ugly hand-crafting of part of the json output.
2016-09-09 15:06:54 -04:00
Joey Hess
f421a7f001
Remove key:null from git-annex add --json output. 2016-09-09 14:26:34 -04:00
Joey Hess
089c592977
buffer json output until done when in concurrent mode 2016-09-09 13:21:38 -04:00
Joey Hess
e0fae28c72
Rate limit console progress display updates to 10 per second. Was updating as frequently as changes were reported, up to hundreds of times per second, which used unncessary bandwidth when running git-annex over ssh etc. 2016-09-08 13:17:43 -04:00
Joey Hess
ad0a7f6cb3
prep release 2016-09-07 11:12:33 -04:00
Joey Hess
31289da691
get -J: Download different files from different remotes when the remotes have the same costs.
Only done in -J mode because only if there's concurrency can downloading
from two remotes be faster. Without concurrency, it's likely the case that
sequential downloads from the same remote are faster than switching back
and forth between two remotes.

There is some hairy MVar code here, but basically it just keeps
the activeremotes MVar full except when deciding which remote to assign
to a thread.

Also affects gets by sync --content -J

This commit was sponsored by Jochen Bartl.
2016-09-06 12:45:21 -04:00
Joey Hess
dd0dff9dc4
Assistant, repair: Filter out git fsck lines about duplicate file entries in tree objects. 2016-09-05 16:08:49 -04:00
Joey Hess
219e2fa157
Make --json and --quiet suppress automatic init messages
And any other messages that might be output before a command starts.

Fixes a reversion introduced in version 5.20150727.

During the optparse-applicative conversion, I needed a place to run
per-command global option setters, and I made it get run during the seek stage. But
that is too late to have --json and --quiet disable output produced in the
check stage. Fix is just to run those per-command global option setters at
the same time as the all-command global option setters.

This commit was sponsored by Thom May.
2016-09-05 15:34:38 -04:00
Joey Hess
49b3ef88f7
Android: Fix disabling use of cp --reflink=auto, curl, sha224, and sha384.
This was originally done in a7ef05a9, but got lost in some change to the
Makefile. Use CROSS_COMPILE=Android to tell configure that it's configuring
for android instead of passing it a parameter.
2016-09-05 14:11:35 -04:00
Joey Hess
5d70eaacaf
examimekey: Allow being run in a git repo that is not initialized by git-annex yet.
No reason not to; indeed there's no real reason to need a git repository
at all except the implementation uses the Annex monad.
2016-09-05 12:26:59 -04:00
Joey Hess
f1a9c5f248
Fix formatting of git-annex-smudge man page, and improve mdwn2man. Thanks, Jim Paris. 2016-09-05 12:16:03 -04:00
Joey Hess
f292f78366
Windows: Handle shebang in external special remote program. 2016-09-05 12:09:23 -04:00
Joey Hess
3752426ca1
releasing package git-annex version 6.20160808 2016-08-08 11:57:09 -04:00
Joey Hess
f461bcae4b
Re-enable accumulating transfer failure log files for command-line actions
This was disabled in commit 61ccf95004,
because only the assistant used them, and they were clutter. But, now
--failed also uses them.

Remove the failure log files after successful transfers. Should avoid
most of the clutter problems.

Commit 61ccf95004 mentions a subtle behavior
change, which has now been reverted:

    There is one behavior change from this. If glacier is being used, and a
    manual git annex get --from glacier fails because the file isn't available
    yet, the assistant will no longer later see that failed transfer file and
    retry the get.
2016-08-03 13:41:07 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00
Joey Hess
f0886a1bdd
info: When run on a file now includes an indication of whether the content is present locally. 2016-07-30 12:29:59 -04:00
Joey Hess
bf3327ff25
Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys. 2016-07-27 10:46:25 -04:00
Joey Hess
e5225f08fc
When built with ut uid-1.3.12, generate more random UUIDs than before
Use nextRandom to generate the random UUID, rather than using randomIO.
This gets fixes for the following two bugs in the uuid library.

However, this did not impact git-annex much, so a hard depedency has
not been added on uuid-1.3.12.

https://github.com/aslatter/uuid/issues/15
	"v4 UUIDs are not that random"

	This doesn't greatly affect git-annex, because even with only
	2^64 possible UUIDs, the chance that two git-annex repositories
	that are clones of the same git repo get the same UUID is miniscule.

	And, git-annex generates only one UUID per run, so preducting
	subsequent UUIDs is not a problem.

https://github.com/aslatter/uuid/issues/16
	"Remove Random instance for UUID, or mark it as deprecated"

	git-annex was using that instance; let's stop before it gets
	deprecated or removed.
2016-07-27 07:46:08 -04:00
Joey Hess
870873bdaa
Removed dependency on json library; all JSON is now handled by aeson.
I've eyeballed all --json commands, and the only difference should be
that some fields are re-ordered.
2016-07-26 19:15:34 -04:00
Joey Hess
8bc8469c38
saner format for metadata --json
metadata --json output format has changed, adding a inner json object
named "fields" which contains only the fields and their values.

This should be easier to parse than the old format, which mixed up
metadata fields with other keys in the json object.

Any consumers of the old format will need to be updated.

This adds a dependency on unordered-containers for parsing MetaData
from JSON, but it's a free dependency; aeson pulls in that library.
2016-07-26 15:41:04 -04:00
Joey Hess
d344f04d09
cabal constraints for aws and esqueleto
closes https://github.com/joeyh/git-annex/pull/55

* git-annex.cabal: Temporarily limit to http-conduit <2.2.0
  since aws 0.14.0 is not compatible with the newer version.
* git-annex.cabal: Temporarily limit to persistent <2.5
  since esqueleto 2.4.3 is not compatible with the newer version.
2016-07-22 12:41:28 -04:00
Joey Hess
bf8bf14e8e
--branch, stage 1
Added --branch option to copy, drop, fsck, get, metadata, mirror, move, and
whereis commands. This option makes git-annex operate on files that are
included in a specified branch (or other treeish).

The names of the files from the branch that are being operated on are not
displayed yet; only the keys. Displaying the filenames will need changes
to every affected command.

Also, note that --branch can be specified repeatedly. This is not really
documented, but seemed worth supporting, especially since we may later want
the ability to operate on all branches matching a refspec. However, when
operating on two branches that contain the same key, that key will be
operated on twice.
2016-07-20 12:05:26 -04:00
Joey Hess
2cbd1afdb6
prep release 2016-07-19 14:18:16 -04:00
Joey Hess
2619019630
Avoid any access to keys database in v5 mode repositories, which are not supposed to use that database. 2016-07-19 12:12:19 -04:00
Joey Hess
50e63f75d1
webapp: Escape unusual characters in ssh hostnames when generating mangled hostnames. This allows IPv6 addresses to be used on filesystems not supporting : in filenames. 2016-07-19 11:37:03 -04:00
Joey Hess
c4d011bf3e
log: Added --all option. 2016-07-17 15:15:08 -04:00
Joey Hess
154c939830
Speed up startup time by caching the refs that have been merged into the git-annex branch.
This can speed up git-annex commands by as much as a second, depending on
the number of remotes.
2016-07-17 12:24:34 -04:00
Joey Hess
79704528c0
Support checking presence of content at a http url that redirects to a ftp url. 2016-07-12 16:41:45 -04:00
Joey Hess
0c713a94bd
uninit: Fix crash due to trying to write to deleted keys db.
Reversion introduced by v6 mode support, affects v5 too.

Also fix a similar crash when the webapp is used to delete a repository.
2016-07-12 14:18:35 -04:00
Joey Hess
5642daa651
fsck: Fix a reversion in direct mode fsck of a file that is present when the location log thinks it is not. Reversion introduced in version 5.20151208. 2016-07-12 13:41:03 -04:00
Joey Hess
ea23660f9d
typo: s/enabled/disabled/ 2016-07-06 15:10:57 -04:00
Joey Hess
8b36d54319
Remove the EKG build flag, since Gentoo for some reason decided to enable this flag, depsite it not being intended for production use and so enabled by default. 2016-07-06 15:09:56 -04:00
Joey Hess
c4229be9a7
Remove unnecessary rpaths in the git-annex binary, but only when it's built using make, not cabal. This speeds up git-annex statup time by around 50%. 2016-07-06 14:40:18 -04:00
Joey Hess
5171e98988
drop: Add --batch and --json options. 2016-07-06 11:56:11 -04:00
Joey Hess
d6483deeb1
testremote: Fix crash when testing a freshly made external special remote.
Ignore exceptions when getting the cost and availability for the remote,
and return sane defaults. These defaults are not cached, so if a special
remote program has a transient problem, it will re-query it later.
2016-07-05 16:34:39 -04:00
Joey Hess
f3f6dfcf35
New url for git-remote-gcrypt, now maintained by spwhitton. 2016-07-05 11:30:58 -04:00
Joey Hess
ed8ecbea0c
get: Add --batch and --json options. 2016-07-05 08:57:45 -04:00
Joey Hess
486c355cc1
merged patch
closes https://github.com/joeyh/git-annex/pull/54
2016-06-13 21:53:43 -04:00
Joey Hess
de395dc48c
prep release 2016-06-13 14:57:52 -04:00
Joey Hess
bfd00a0f8c
v6: Fix bad merge in an adjusted branch that resulted in an empty tree. 2016-06-13 14:18:22 -04:00
Joey Hess
b146e13bff
highlight v6 changes 2016-06-09 15:17:54 -04:00
Joey Hess
b6b5a11601
Make git clean filter preserve the backend that was used for a file. 2016-06-09 15:17:08 -04:00
Joey Hess
7fe2ecff91
Fix update of associated files db when unlocking a file in a v6 repo. 2016-06-09 14:45:00 -04:00
Joey Hess
0bc7fee660
Make lock and unlock work in v6 repos on files whose content is not present. 2016-06-09 14:40:44 -04:00
Joey Hess
0249f3aff5
Fix bug in initialization of clone from a repo with an adjusted branch that had not been synced back to master.
This bug caused broken tree objects to get built by a later git annex sync.

This is a somewhat unlikely but not impossible situation, and the test
suite's union_merge_regression test tickled it when it was run on FAT.
2016-06-09 14:11:00 -04:00
Joey Hess
d62d81ee1c
Avoid a crash if getpwuid does not work, when querying the user's full name. 2016-06-08 13:48:03 -04:00
Joey Hess
9569d6be63
Fix bad automatic merge conflict resolution between an annexed file and a directory with the same name when in an adjusted branch.
When running in an overlay work tree, all unchanged files show as deleted,
so this code that stages deletions should not run.
2016-06-07 12:53:35 -04:00
Joey Hess
74e01a2d01
move --to: Better behavior when system is completely out of disk space; drop content from disk before writing location log.
I noticed move --to failing when there was no disk space. The file was sent
to the remote, but it crashed before it could be dropped locally. This
could fix that.
2016-06-05 13:51:22 -04:00
Joey Hess
9996e04f41
list: Do not include dead repositories. 2016-06-04 14:33:31 -04:00
Joey Hess
907fc62f2c
Fix initialization of a bare clone of a repo that has an adjusted branch checked out. 2016-06-02 17:02:38 -04:00
Joey Hess
8452ea45ca
remotedaemon: Fixed support for notifications of changes to gcrypt remotes, which was never tested and didn't quite work before. 2016-06-02 16:19:31 -04:00
Joey Hess
1dcfd797fa
fix git version 2016-06-02 15:55:11 -04:00
Joey Hess
72f0d3d384
Automatically enable v6 mode when initializing in a clone from a repo that has an adjusted branch checked out.
The clone also has the adjusted branch checked out, so it needs to be
initialized to a version that supports that.
2016-06-02 15:34:30 -04:00
Joey Hess
02a20288ef
Pass -S to git commit-tree when commit.gpgsign is set and when making a non-automatic commit, in order to preserve current behavior when used with git 1.9, which has stopped doing this itself. 2016-06-02 15:07:20 -04:00
Joey Hess
fbf5045d4f
sync --content: Fix bug that caused transfers of files to be made to a git remote that does not have a UUID. This particularly impacted clones from gcrypt repositories.
Added guard in Annex.Transfer to prevent this problem at a deeper level.

I'm unhappy ith NoUUID, but having Maybe UUID instead wouldn't help either
if nothing checked that there was a UUID. Since there legitimately need to
be Remotes that do not have a UUID, I can't see a way to fix it at the type
level, short making there be two separate types of Remotes.
2016-06-02 13:50:43 -04:00
Yaroslav Halchenko
64e844e1fe
minor typo fixes throughout
problematic
flexibility
2016-06-02 11:22:18 -04:00
Joey Hess
f4489cc415
Remove Makefile from cabal tarball; man page building is now handled by a small haskell program.
This actually runs faster than building the man pages from the makefile
did. But the main purpose is to let Setup.hs import Build.Mans and so not
need the makefile.
2016-05-31 13:58:13 -04:00
Joey Hess
84f20c9f69
Windows: Avoid terminating git-annex branch lines with \r\n when union merging. 2016-05-27 15:22:52 -04:00
Joey Hess
45308ec78b
Improve SHA*E extension extraction code.
Filter out over-long "extensions" before stripping out non-alphanumerics
from them, so that eg "foo.ba__________r" is not considered a .bar
extension.
2016-05-27 13:14:51 -04:00
Joey Hess
f21a425791
releasing package git-annex version 6.20160527 2016-05-27 12:35:18 -04:00
Joey Hess
5418c8a23a
prep release 2016-05-27 11:50:27 -04:00
Joey Hess
eba68572dc
Split lines in the git-annex branch on \r as well as \n, to deal with \r\n terminated lines written by some versions of git-annex on Windows.
This fixes strange displays in some cases, including whereis showing
many duplicate locations, and showing more total copies than actually
exist.

It's unknown if that lead to data loss when eg, dropping. At the moment,
it seems unlikely it could, since the UUID with \r's appended is not the
same as a UUID without, and so no remote matches it.

It's also unknown if \r's can leak in on windows, perhaps when merging the
git-annex branch.
2016-05-27 11:45:13 -04:00
Joey Hess
1b3bde0625
enableremote: Remove annex-ignore configuration from a remote. 2016-05-24 15:58:27 -04:00
Joey Hess
b33a649a25
enableremote: Can now be used to explicitly enable git-annex to use git remotes. Using the command this way prevents other git-annex commands from probing new git remotes to auto-enable them. 2016-05-24 15:24:38 -04:00
Joey Hess
f79875ef3b
Updated cabal file explictly lists source files.
The tarball on hackage will include only the files needed for cabal install;
it is NOT the full git-annex source tree. While it's totally obnoxious that
cabal files need every file listed out when basic wildcard support could
avoid hundreds of lines, and have to be maintained when files are added,
this does get the tarball size back down to 1 mb.

This also stops stack from complaining that it found modules not listed in
the cabal file.

debian/changelog, debian/NEWS, debian/copyright: Converted to symlinks
to CHANGELOG, NEWS, and COPYRIGHT, which used to symlink to these instead.
This avoids needing to include debian/ in the hackage tarball.

Setup.hs: Build man pages at install time using make and mdwn2man.
If it fails, which it probably will on windows, just skip installing
them.
2016-05-24 01:28:07 -04:00
Joey Hess
d47fb4c11e symlinks 2010-10-27 15:14:59 -04:00