232 commits

Author SHA1 Message Date
Joey Hess
f07af03018
Run ssh with -n whenever input is not being piped into it
... to avoid it consuming stdin that it shouldn't.

This fixes git-annex-checkpresentkey --batch remote, which didn't output
results for all keys passed into it.

Other git-annex commands that communicate with a remote over ssh may also
have been consuming stdin that they shouldn't have, which could have
impacted using them in eg, shell scripts. For example, a shell script
reading files from stdin and passing them to git annex drop would be
impacted by this bug, whenever git annex drop ran git-annex-shell
checkpresent, it would consume part/all of the stdin that the shell script
was supposed to consume.

Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which
is used throughout git-annex to run ssh (in order for ssh connection
caching to work). Every call site was checked to see if it used
CreatePipe for stdin, and if not was marked NoConsumeStdin.
2017-02-15 15:08:46 -04:00
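The ConsumeStdin idea in commit f07af03018 can be illustrated outside Haskell. This is only a rough Python sketch, not git-annex's code: `StdinUse`, `ssh_command`, and the host name in the usage note are hypothetical. The one real detail is ssh's `-n` option, which redirects ssh's stdin from /dev/null.

```python
from enum import Enum


class StdinUse(Enum):
    """Hypothetical stand-in for a ConsumeStdin / NoConsumeStdin parameter."""
    CONSUME = "consume"        # the caller pipes data into ssh
    NO_CONSUME = "no-consume"  # ssh must leave the caller's stdin alone


def ssh_command(host, remote_command, stdin_use):
    """Build an ssh argv, adding -n unless input is being piped into ssh."""
    argv = ["ssh"]
    if stdin_use is StdinUse.NO_CONSUME:
        # -n redirects ssh's stdin from /dev/null, so ssh cannot eat
        # input that was meant for the calling script.
        argv.append("-n")
    argv.append(host)
    argv.extend(remote_command)
    return argv
```

For example, `ssh_command("example.com", ["git-annex-shell", "checkpresent"], StdinUse.NO_CONSUME)` yields an argv that starts with `ssh -n`, so a shell loop reading file names from stdin keeps its input.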
Joey Hess
69baa45f14
sync, merge: Fail when the current branch has no commits yet, instead of not merging in anything from remotes and appearing to succeed.
At first I wanted to make it go ahead and merge into the newborn branch,
so made it use Git.Branch.currentUnsafe to get the current branch. But that
failed:

fatal: ambiguous argument 'refs/heads/master..refs/heads/synced/master':
unknown revision or path not in the working tree.

A whole nother code path to handle merging into newborn branches seemed
excessive, so went with displaying a warning and propagating failure
status.

This commit was sponsored by Brock Spratlen on Patreon.
2017-02-14 16:09:55 -04:00
Joey Hess
95390f0c27
releasing package git-annex version 6.20170214 2017-02-14 14:56:11 -04:00
Joey Hess
3b22ad9f47
Work around sqlite's incorrect handling of umask when creating databases.
Refactored some common code into initDb.

This only deals with the problem when creating new databases. If a repo
got bad permissions into it, it's up to the user to deal with it.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2017-02-13 17:39:16 -04:00
Joey Hess
976676a7b0
S3: Fix check of uuid file stored in bucket, which was not working.
The check was broken in two ways.. First, nowhere did it error out when
checkUUIDFile found a different UUID already in the file. Instead,
it overwrote the uuid file.

And, checkUUIDFile's implementation was for some reason always failing with
a ConnectionClosed exception. Apparently something to do with using two
different runResourceT's and a response getting GCed in between. I'm pretty
sure that used to work, but changed to a more obviously correct
implementation.

This commit was sponsored by Peter Hogg on Patreon.
2017-02-13 15:35:24 -04:00
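The intended behaviour of the uuid file check in commit 976676a7b0 (error out on a different UUID instead of overwriting it) might look roughly like this. It is a Python sketch under stated assumptions: a plain dict stands in for the S3 bucket, and `check_uuid_file`, `UUIDMismatch`, and the key name are hypothetical, not the real implementation.

```python
class UUIDMismatch(Exception):
    """Raised when the bucket already belongs to a different repository UUID."""


def check_uuid_file(bucket, uuid_key, our_uuid):
    """Record our UUID in the bucket, refusing to overwrite a different one.

    `bucket` is any dict-like object; a real S3 bucket is not modelled here.
    """
    existing = bucket.get(uuid_key)
    if existing is None:
        # First use of this bucket: claim it.
        bucket[uuid_key] = our_uuid
    elif existing != our_uuid:
        # The first bug described above: this case must fail, not overwrite.
        raise UUIDMismatch(
            "bucket already in use by %s, not %s" % (existing, our_uuid)
        )
```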
Edward Betts
0750913136
correct spelling mistakes 2017-02-12 17:30:23 -04:00
Joey Hess
5e6ced7d0f
Improve pid locking code to work on filesystems that don't support hard links.
Probing for hard link support in the pid locking code is redundant since
git-annex init already probes that. But, it didn't seem worth threading
that data through; the pid locking code runs at most once per git-annex
process, and only on unusual filesystems. Optimising a single hard link
and unlink isn't worth it.

This commit was sponsored by Francois Marier on Patreon.
2017-02-10 15:22:28 -04:00
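The "single hard link and unlink" probe mentioned in commit 5e6ced7d0f can be sketched as follows. This is a Python illustration with a hypothetical function name, not the code git-annex uses; what the locking code does when the probe fails is not shown.

```python
import os
import tempfile


def supports_hard_links(directory):
    """Probe whether the filesystem holding `directory` allows hard links."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "probe")
        dst = os.path.join(tmp, "probe.link")
        # Create an empty file to link to.
        with open(src, "w"):
            pass
        try:
            os.link(src, dst)
        except OSError:
            # The filesystem (or platform) refused the hard link.
            return False
        # Both files are removed when the temporary directory is cleaned up.
        return True
```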
Joey Hess
e2c98f5788
Added git template directory to Linux standalone tarball and OSX app bundle.
Git does not provide a switch to find out where this directory is, and
while the git-init man page says it will always be in
/usr/share/git-core/templates, that's not the case on OSX with git
installed from homebrew. So, I used a hack taking the --man-path and
constructing a path from that. Works on both Debian and OSX at least.
2017-02-10 13:55:54 -04:00
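One way the path construction in commit e2c98f5788 could work is sketched below in Python. `git --man-path` is a real git option, but the exact construction git-annex's build uses is not given in the message; taking the parent of the man directory and appending `git-core/templates` is an assumption, and the function name is hypothetical.

```python
import posixpath


def template_dir_from_man_path(man_path):
    """Guess git's template directory from the output of `git --man-path`.

    Assumes a layout like <prefix>/share/man next to
    <prefix>/share/git-core/templates.
    """
    share_dir = posixpath.dirname(man_path.strip().rstrip("/"))
    return posixpath.join(share_dir, "git-core", "templates")
```

With `/usr/share/man` as input this gives `/usr/share/git-core/templates`, the location the git-init man page names.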
Joey Hess
c1ece47ea0
import --reinject-duplicates
This is the same as running git annex reinject --known, followed by
git-annex import. The advantage to having it in one command is that it
only has to hash each file once; the two commands have to
hash the imported files a second time.

This commit was sponsored by Shane-o on Patreon.
2017-02-09 15:41:00 -04:00
Joey Hess
f617988a29
Make import --deduplicate and --skip-duplicates only hash once, not twice
import: --deduplicate and --skip-duplicates were implemented inefficiently;
they unnecessarily hashed each file twice. They have been improved to only
hash once.

The new approach is to lock down (minimally) and hash files, and then
reuse that information when importing them.

This was rather tricky, especially in detecting changes to files while
they are being imported.

The output of import changed slightly. While before it silently skipped
over files with eg --skip-duplicates, now it shows each file as it starts
to act on it. Since every file is hashed first thing, it would otherwise
not be clear what file import is chewing on. (Actually, it wasn't clear
before when any of the duplicates switches were used.)

This commit was sponsored by Alexander Thompson on Patreon.
2017-02-09 15:32:22 -04:00
Joey Hess
e7e36b6e72
import: Changed how --deduplicate, --skip-duplicates, and --clean-duplicates determine if a file is a duplicate
Before, only content known to be present somewhere was considered a
duplicate. Now, any content that has been annexed before will be considered
a duplicate, even if all annexed copies of the data have been lost.

Note that --clean-duplicates and --deduplicate still check numcopies,
so won't delete duplicate files unless there's an annexed copy.

This makes import use the same method as reinject --known.

The man page already said that duplicate meant "its content is either
present in the local repository already, or git-annex knows of another
repository that contains it, or it was present in the annex before but has
been removed now". So, this is really only bringing the implementation into
line with the man page.

This commit was sponsored by Jochen Bartl on Patreon.
2017-02-07 17:41:58 -04:00
Joey Hess
27e89aeffc
initremote: When a uuid= parameter is passed, use the specified UUID for the new special remote, instead of generating a UUID.
This can be useful in some situations, eg when the same data can be
accessed via two different special remote backends.
2017-02-07 15:10:41 -04:00
Joey Hess
3439f3cc87
assistant: Make --autostart --foreground wait for the children it starts.
Before, the --foreground was ignored when autostarting.

This commit was sponsored by Denis Dzyubenko on Patreon.
2017-02-07 13:31:45 -04:00
Joey Hess
655f707990
Fix build with aws 0.16. Thanks, aristidb. 2017-02-07 13:01:57 -04:00
Joey Hess
3fe9d99f24
wormhole pairing appid flag day 2021-12-31
Wormhole pairing will start to provide an appid to wormhole on 2021-12-31.
An appid can't be provided now because Debian stable is going to ship an
older version of git-annex that does not provide an appid. Assumption is
that by 2021-12-31, this version of git-annex will be shipped in a Debian
stable release. If that turns out to not be the case, this change will need
to be cherry-picked into the git-annex in Debian stable, or its wormhole
pairing will break.

This commit was sponsored by Thomas Hochstein on Patreon.
2017-02-03 15:06:40 -04:00
Joey Hess
06f307ad13
lost a changelog entry; put back 2017-02-03 14:40:53 -04:00
Joey Hess
b77903af48
New annex.synccontent config setting
.. which can be set to true to make git annex sync default to --content.

This may become the default at some point in the future.

As well as being configurable by git config, it can be configured by
git-annex config to control the default behavior in all clones of a
repository.

Had to add a separate --no-content switch so we can tell if it's been
explicitly set, and should override annex.synccontent. If --content was the
default, this complication would not be necessary.

This commit was sponsored by Jake Vosloo on Patreon.
2017-02-03 14:31:17 -04:00
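Why commit b77903af48 needs a separate --no-content switch is easiest to see as a three-state option: unset, explicitly on, explicitly off. The Python sketch below uses argparse purely for illustration; `build_parser` and `want_content` are hypothetical names, and git-annex's own option parsing is Haskell.

```python
import argparse


def build_parser():
    """Parser where --content/--no-content share one three-state value."""
    parser = argparse.ArgumentParser(prog="sync-sketch")
    # None means "not given on the command line".
    parser.add_argument("--content", dest="content",
                        action="store_const", const=True, default=None)
    parser.add_argument("--no-content", dest="content",
                        action="store_const", const=False, default=None)
    return parser


def want_content(cli_value, config_value):
    """An explicit switch wins; otherwise fall back to the config setting."""
    if cli_value is not None:
        return cli_value
    return bool(config_value)
```

With only a boolean --content flag, "not given" and "off" would be indistinguishable, so a config value of true could never be overridden from the command line.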
Joey Hess
ed56dba868
annex.autocommit can be configured via git-annex config
... to control the default behavior in all clones of a repository.

This includes a new Configurable data type, so the GitConfig type indicates
which values can be configured this way.

The implementation should be quite efficient; the config log is only read
once, and only when a Configurable value has not already been set by
git-config.

Indeed, it would be nice in the future to extend this, so that git-config
is itself only read on demand. Some commands may not need to look at the
git configuration at all.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-02-03 13:58:53 -04:00
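The efficiency claim in commit ed56dba868 (the config log is read at most once, and only when git-config has not already set the value) corresponds to a lazy lookup like the one below. This is a Python sketch; `LazyConfig` and its parameters are hypothetical and do not mirror the Configurable type.

```python
class LazyConfig:
    """Look up settings in git-config first, then in a lazily read log."""

    def __init__(self, git_config, read_config_log):
        self._git_config = git_config            # dict of values from git-config
        self._read_config_log = read_config_log  # callable returning a dict
        self._log_cache = None                   # filled on first real need

    def get(self, name, default=None):
        if name in self._git_config:
            # Set locally: the (possibly expensive) log is never touched.
            return self._git_config[name]
        if self._log_cache is None:
            # Read the log once and remember the result.
            self._log_cache = self._read_config_log()
        return self._log_cache.get(name, default)
```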
Joey Hess
ed60f60e9b
unused: Improved memory use significantly when there are a lot of differences between branches.
Argh, didn't need an accumulator here!

I think I use accumulators a lot more than I need to when recursively
processing lists..

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-01-31 19:42:00 -04:00
Joey Hess
062286135c
unused: When large files are checked right into git, avoid buffering their contents in memory.
This makes it a little bit slower since it has to check file size,
but worth it to fix a potential memory use problem.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-01-31 19:09:37 -04:00
Joey Hess
9eb10caa27
Some optimisations to string splitting code.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.

As well as being an optimisation, this helps move toward eliminating use of
missingh.

(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)

I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2017-01-31 19:06:22 -04:00
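The shape of a single-character splitter like the one commit 9eb10caa27 describes can be shown with a one-pass loop. This Python sketch is an analogy only: `split_char` is a hypothetical name, it is checked against Python's `str.split` rather than Data.List.Utils.split, and the corner-case behaviour of the Haskell functions is not reproduced here.

```python
def split_char(sep, text):
    """Split `text` on a single character in one pass over the string."""
    if len(sep) != 1:
        raise ValueError("separator must be exactly one character")
    pieces = []
    start = 0
    for index, char in enumerate(text):
        if char == sep:
            # Close the current piece and start the next one after sep.
            pieces.append(text[start:index])
            start = index + 1
    # Whatever follows the last separator (possibly empty) is the final piece.
    pieces.append(text[start:])
    return pieces
```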
Joey Hess
3300911b14
lts-7.18 finally!
esqueleto finally got fixed, thanks to @bitemyapp

Since XMPP was removed, the previous build failures related to it should
no longer be a problem either.

Meanwhile, lts-5.18 fails to build anymore on Debian due to linker
hardening breaking the version of ghc stack uses with that version.

This commit was sponsored by Francois Marier on Patreon.
2017-01-31 12:27:08 -04:00
Joey Hess
339464e847
config: New command for storing configuration in the git-annex branch.
Any config names can be set using this; git-annex commands will only look
at specific ones that make sense and are worth the overhead of querying the
branch.

This might also be useful for storing whatever other config-type stuff the
user might want to shove into the git-annex branch.

This commit was sponsored by Jochen Bartl on Patreon.
2017-01-30 16:46:38 -04:00
Joey Hess
26d23e38f1
vicfg: Include the numcopies configuration.
Docs say vicfg can configure everything from git-annex branch,
so it ought to configure numcopies.

Note that commenting out existing numcopies does not unset it.

This commit was sponsored by Thom May on Patreon.
2017-01-30 15:27:25 -04:00
Joey Hess
280442ca2c
Remove -j short option for --json-progress; that option was already taken for --json.
This commit was sponsored by Trenton Cronholm.
2017-01-30 12:46:42 -04:00
Joey Hess
f275caf732
Increase default cost for p2p remotes from 200 to 1000. This makes git-annex prefer transferring data from special remotes when possible. 2017-01-06 15:23:30 -04:00
Joey Hess
8740cd9716
releasing package git-annex version 6.20170101 2016-12-31 23:59:56 -04:00
Joey Hess
10e4d93212
Support all common locations of the torrc file. 2016-12-28 15:12:31 -04:00
Joey Hess
b68d2a4b68
webapp: full wormhole pairing UI (untested)
This commit was sponsored by Riku Voipio.
2016-12-27 16:41:35 -04:00
Joey Hess
8484c0c197
Always use filesystem encoding for all file and handle reads and writes.
This is a big scary change. I have convinced myself it should be safe. I
hope!
2016-12-24 14:46:31 -04:00
Joey Hess
e08691b393
enable-tor: When run as a regular user, test a connection back to the hidden service over tor.
This way we know that after enable-tor, the tor hidden service is fully
published and working, and so there should be no problems with it at
pairing time.

It has to start up its own temporary listener on the hidden service. It
would be nice to have it start the remotedaemon running, so that extra
step is not needed afterwards. But, there may already be a remotedaemon
running, in communication with the assistant and we don't want to start
another one. I thought about trying to HUP any running remotedaemon, but
Windows does not make it easy to do that. In any case, having the user
start the remotedaemon themselves lets them know it needs to be running
to serve the hidden service.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2016-12-24 12:50:23 -04:00
Joey Hess
22252e8e4c
Revert "close"
This reverts commit 3aaabc906b.

Commit contained incomplete work.
2016-12-24 12:07:15 -04:00
Joey Hess
3aaabc906b
close 2016-12-22 13:59:21 -04:00
Joey Hess
f7ca2b92fb
enable-tor: No longer needs to be run as root.
When run as a non-root user, it su's to root automatically.

This commit was sponsored by Brock Spratlen on Patreon.
2016-12-20 17:40:36 -04:00
Joey Hess
944a6503b9
relocate tor socket out of /etc
weasel explained that apparmor limits on what files tor can read do not
apply to sockets (because they're not files). And apparently the
problems I was seeing with hidden services not being accessible had to
do with onion address propagation and not the location of the socket
file.

remotedaemon looks up the HiddenServicePort in torrc, so if it was
previously configured with the socket in /etc, that will still work.

This commit was sponsored by Denis Dzyubenko on Patreon.
2016-12-20 16:24:46 -04:00
Joey Hess
8f3b2c206c
Debian: Suggest tor and magic-wormhole.
Suggests, not recommends, because tor is not for everyone.
2016-12-20 15:26:14 -04:00
Joey Hess
e312ec3750
Fix build with directory-1.3.
See https://github.com/haskell/directory/issues/66
2016-12-20 15:23:59 -04:00
Joey Hess
a171e576b2
rekey --force: Incorrectly marked the new key's content as being present in the local repo even when it was not. 2016-12-19 18:18:57 -04:00
Joey Hess
95c8b37544
Linux standalone: Improve generation of locale definition files, supporting locales such as en_GB.UTF-8. 2016-12-19 17:03:52 -04:00
Joey Hess
ccde0932a5
p2p --pair with magic wormhole (untested)
It builds. I have not tried to run it yet. :)

This commit was sponsored by Jake Vosloo on Patreon.
2016-12-18 16:51:41 -04:00
Joey Hess
38f9337e16
Revert "p2p --link now defaults to setting up a bi-directional link"
This reverts commit 3037feb1bf.

On second thought, this was an overcomplication of what should be the
lowest-level primitive. Let's build bi-directional links at the pairing
level with eg magic wormhole.
2016-12-16 18:26:07 -04:00
Joey Hess
bd811d3853
p2p: Added --one-way option.
This commit was sponsored by Fernando Jimenez on Patreon.
2016-12-16 16:43:37 -04:00
Joey Hess
3037feb1bf
p2p --link now defaults to setting up a bi-directional link
Both the local and remote git repositories get remotes added
pointing at one-another.

Makes pairing twice as easy!

Security: The new LINK command in the protocol can be sent repeatedly,
but only by a peer who has authenticated with us. So, it's entirely safe to
add a link back to that peer, or to some other peer it knows about.
Anything we receive over such a link, the peer could send us over the
current connection.

There is some risk of being flooded with LINKs, and adding too many
remotes. To guard against that, there's a hard cap on the number of remotes
that can be set up this way. This will only be a problem if setting up
large p2p networks that have exceptional interconnectedness.

A new, dedicated authtoken is created when sending LINK.

This also allows, in theory, using a p2p network like tor, to learn about
links on other networks, like telehash.

This commit was sponsored by Bruno BEAUFILS on Patreon.
2016-12-16 16:38:06 -04:00
Joey Hess
e67a310da1
p2p: --link no longer takes a remote name, instead the --name option can be used. 2016-12-16 15:37:50 -04:00
Joey Hess
469bfa7ff3
Make all --batch input, as well as fromkey and registerurl stdin be processed without requiring it to be in the current encoding. 2016-12-13 15:35:04 -04:00
Joey Hess
48d9624a2d
Revert ServerAliveInterval
Revert ServerAliveInterval change in 6.20161111, which caused problems
with too many old versions of ssh and unusual ssh configurations.

It should not have been needed anyway since ssh is supposed to
have TCPKeepAlive enabled by default.
2016-12-13 12:12:38 -04:00
Joey Hess
59fead6da3
Pass annex.web-options to wget and curl after other options, so that eg --no-show-progress can be set by the user to disable the default --show-progress. 2016-12-13 11:56:23 -04:00
Joey Hess
d9490685fd
metadata --batch: Fix bug when conflicting metadata changes were made in the same batch run.
1 microsecond delay is ugly.. but, maintaining a queue of a list of timestamps
and taking a new one from the queue each time around, or maintaining a timestamp
counter, would probably be slower.
2016-12-13 11:07:49 -04:00
Joey Hess
a52c011581
Debian: Build webapp on armel. 2016-12-11 21:30:07 -04:00
Joey Hess
bb66e098b1
linux standalone builds should have "unable to decommit memory" bug fixed 2016-12-11 15:37:52 -04:00
Joey Hess
73a79147b1
releasing package git-annex version 6.20161210 2016-12-10 12:23:18 -04:00
Joey Hess
749623df86
fixed 2016-12-10 10:47:16 -04:00
Joey Hess
15be5c04a6
git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks.
The attacker could just send a very large amount of data, with no \n and it would
all be buffered in memory until the kernel killed git-annex or perhaps OOM
killed some other more valuable process.

This is a low impact security hole, only affecting communication between
local git-annex and git-annex-shell on the remote system. (With either
able to be the attacker). Only those with the right ssh key can do it. And,
there are probably lots of ways to construct git repositories that make git
use a lot of memory in various ways, which would have similar impact as
this attack.

The fix in P2P/IO.hs would have been higher impact, if it had made it to a
released version, since it would have allowed DOSing the tor hidden
service without needing to authenticate.

(The LockContent and NotifyChanges instances may not be really
exploitable; since the line is read and ignored, it probably gets read
lazily and does not end up staying buffered in memory.)
2016-12-09 13:34:32 -04:00
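The general defence against the attack in commit 15be5c04a6 is to stop buffering once a line exceeds some limit. Below is a minimal Python sketch of that idea; `read_line_limited`, `LineTooLong`, and the choice of limit are hypothetical, and the actual fix in git-annex may be structured differently.

```python
class LineTooLong(Exception):
    """Raised when a peer sends more than the allowed bytes without a newline."""


def read_line_limited(stream, max_len):
    """Read one newline-terminated line from a binary stream.

    Returns the line without its newline, or None at end of input.
    Never buffers more than max_len + 1 bytes.
    """
    buf = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            # End of input: return a final unterminated line, if any.
            return bytes(buf) if buf else None
        if byte == b"\n":
            return bytes(buf)
        buf += byte
        if len(buf) > max_len:
            raise LineTooLong("no newline within %d bytes" % max_len)
```

Reading a byte at a time is slow; a real implementation would read in chunks, but the bound on buffered data is the point here.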
Joey Hess
2fb6fd7434
Merge branch 'master' into tor 2016-12-07 14:32:25 -04:00
Joey Hess
f61508aed4
add: Stage modified non-large files when running in indirect mode.
(This was already done in v6 mode and direct mode.)
2016-12-05 14:10:21 -04:00
Joey Hess
82d01f5619
rekey: Added --batch mode.
Would have liked to make the Parser parse the file and key pairs, but it
seems that optparse-applicative is unable to handle eg:

	many ((,) <$> argument <*> argument)

This commit was sponsored by Thomas Hochstein on Patreon.
2016-12-05 12:55:50 -04:00
Joey Hess
e65c31e56b
changelog 2016-12-05 12:16:35 -04:00
Joey Hess
93852dd7e8
rmurl: --batch
* rmurl: Multiple pairs of files and urls can be provided on the
  command line.
* rmurl: Added --batch mode.

This commit was sponsored by Trenton Cronholm on Patreon.
2016-12-05 12:10:07 -04:00
Joey Hess
bfc8305814
implement p2p command 2016-11-30 14:35:24 -04:00
Joey Hess
24593aaa32
Merge branch 'master' into tor 2016-11-30 14:16:36 -04:00
Joey Hess
8354612131
prefer xdot over dot
* map: Run xdot if it's available in PATH. On OSX, the dot command
  does not support graphical display, while xdot does.
* Debian: xdot is a better interactive viewer than dot, so Suggest
  xdot, rather than graphviz.
2016-11-30 12:50:49 -04:00
Joey Hess
398345cb26
Merge branch 'master' into tor 2016-11-29 15:45:29 -04:00
Joey Hess
ae9f99f342
Relicense 5 source files that are not part of the webapp from AGPL to GPL.
Building w/o the webapp is not supposed to pull in any AGPLed files.

I appear to have written all the code in these files;
the only commit by anyone else is 64e844e1fe
and is a spelling fix that is not copyrightable.
2016-11-21 23:46:59 -04:00
Joey Hess
070fb9e624
Added git-remote-tor-annex, which allows git pull and push to the tor hidden service.
Almost working, but there's a bug in the relaying.

Also, made tor hidden service setup pick a random port, to make it harder
to port scan.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2016-11-21 17:27:38 -04:00
Joey Hess
6e6d1a8c15
addurl: Fix bug in checking annex.largefiles expressions using largerthan, mimetype, and smallerthan; the first two always failed to match, and the latter always matched. 2016-11-21 11:30:53 -04:00
Joey Hess
74691ddf0e
remotedaemon: serve tor hidden service 2016-11-20 15:48:12 -04:00
Joey Hess
a101b8de37
remotedaemon: Fork to background by default. Added --foreground switch to enable old behavior.
Groundwork for tor hidden services, which the remotedaemon will serve.
2016-11-20 14:50:36 -04:00
Joey Hess
5680565122
releasing package git-annex version 6.20161118 2016-11-18 11:59:49 -04:00
Joey Hess
aedec5d08d
arm build uses 32kb page size
(Change was made in gitannexbuilder scripts not here.)
2016-11-16 18:05:42 -04:00
Joey Hess
2577f1c0a2
fsck --all --from was checking the content of files in the local repository, rather than on the special remote.
Straight up forgot to handle this case!

This commit was sponsored by Fernando Jimenez on Patreon.
2016-11-16 15:33:57 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for an error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
556b2ded2b
sync: Pass --allow-unrelated-histories to git merge when used with git 2.9.0 or newer.
This makes merging a remote into a freshly created direct mode repository
work the same as it works in indirect mode.

The git-annex branches would get merged in any case by a sync,
since that doesn't use git merge.

This might need to be revisited later to better mirror git's behavior.
2016-11-15 18:26:17 -04:00
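The version gate in commit 556b2ded2b can be sketched as below. This is illustrative Python with hypothetical function names: the version parsing assumes output of the form `git version X.Y.Z`, and only the `--allow-unrelated-histories` option and the 2.9.0 threshold come from the commit message.

```python
import re


def parse_git_version(version_output):
    """Extract (major, minor, patch) from `git --version` style output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no version number in %r" % version_output)
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def merge_args(git_version, branch):
    """Build a git merge argv, allowing unrelated histories on git >= 2.9.0."""
    args = ["git", "merge"]
    if git_version >= (2, 9, 0):
        args.append("--allow-unrelated-histories")
    args.append(branch)
    return args
```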
Joey Hess
6416ae9c09
unbreak all the autobuilders
git-annex.cabal: Loosen bounds on persistent to allow 2.5, which on Debian
has been patched to work with esqueleto. This may break cabal's resolver on
non-Debian systems; if so, either use stack to build, or run cabal with
--constraint='persistent ==2.2.4.1' Hopefully this mess with esqueleto will
be resolved soon.

https://github.com/prowdsponsor/esqueleto/issues/137
2016-11-15 11:19:57 -04:00
Joey Hess
e544cf7a31
releasing package git-annex version 6.20161111 2016-11-11 14:47:31 -04:00
Joey Hess
d48f4caaef
Linux standalone: Avoid using hard links in the tarball so it can be untarred on eg, afs which does not support them. 2016-11-10 15:12:30 -04:00
Joey Hess
4643470537
webapp: Explicitly avoid checking for auth in static subsite requests.
Yesod didn't used to do auth checks for that, but this may have changed.
I don't have a way to reproduce the reported problem yet, but this change
certainly won't hurt anything.

This commit was sponsored by Thom May on Patreon.
2016-11-10 13:48:54 -04:00
Joey Hess
c44ac268be
OSX: Remove RPATHs from git-annex binary, which are not needed, slow down startup, and break the OSX Sierra linker.
ghc 8.0.2 may make this unnecessary, but it's not in a stackage version yet,
so put in a workaround.

Note that the linux builds already delete the RPATHs for similar reasons.

This commit was sponsored by Josh Taylor on Patreon.
2016-11-07 14:22:14 -04:00
Joey Hess
5afc2eaa54
reinject --known: Avoid second, unnecessary checksum of file. 2016-11-07 12:07:36 -04:00
Joey Hess
5343544822
S3: Support the special case endpoint needed for the cn-north-1 region.
* S3: Support the special case endpoint needed for the cn-north-1 region.
* Webapp: Don't list the Frankfurt region, as this (and some other new
  regions) need V4 authorization which the aws library does not yet use.

This commit was sponsored by Nick Daly on Patreon.
2016-11-07 11:49:34 -04:00
Joey Hess
7ed96a2405
Make .git/annex/ssh.config file work with versions of ssh older than 7.3, which don't support Include.
When used with an older version of ssh, any ServerAliveInterval in
~/.ssh/config will be overridden by .git/annex/ssh.config.

This commit was sponsored by Josh Taylor on Patreon.
2016-11-07 10:32:57 -04:00
Joey Hess
e23028d19b
restart coprocess in raw mode
Restarting a crashing git process could result in filename encoding issues
when not in a unicode locale, as the restarted process's handles were not
read in raw mode.

Since rawMode is always used when starting a coprocess, didn't bother
to parameterise it and just always enable it for simplicity.

This commit was sponsored by Jake Vosloo on Patreon.
2016-11-01 14:03:59 -04:00
Joey Hess
5b5dcabdcf
releasing package git-annex version 6.20161031 2016-10-31 18:56:18 -04:00
Joey Hess
b530e43285
Fix reversion in 6.20161012 that prevented adding files with a space in their name. 2016-10-31 18:39:37 -04:00
Joey Hess
ec2e1569e6
Linux standalone: Fix location of locale files in the bundle.
The Makefile was putting them in git-annex.linux/i18n/i18n, and so I18NPATH
did not point to the files. I think that on systems close enough to Debian,
localedef then fell back to using the system-wide locale files, while on
other systems it would fail to generate locales.
2016-10-31 14:31:46 -04:00
Joey Hess
2ad7b00e29
Assistant, repair: Fix ignoring of git fsck errors due to duplicate file entries in tree objects. 2016-10-31 14:00:37 -04:00
Joey Hess
5b84f367e4
prep release 2016-10-27 15:25:05 -04:00
Joey Hess
0ae08947ac
Run ssh with ServerAliveInterval 60
So that stalled transfers will be noticed within about 3 minutes,
even if TCPKeepAlive is disabled or doesn't work.

Rather than setting with -o, use -F with another config file,
so that any settings in ~/.ssh/config or /etc/ssh/ssh_config override this.
2016-10-26 16:41:34 -04:00
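The "about 3 minutes" figure in commit 0ae08947ac follows from ServerAliveInterval combined with OpenSSH's ServerAliveCountMax, whose default is 3. As a small worked calculation in Python (the function name is hypothetical, and the result is an approximation of when ssh gives up on an unresponsive server):

```python
def stall_detection_seconds(server_alive_interval=60, server_alive_count_max=3):
    """Approximate time before ssh drops a connection to a silent server.

    ssh sends a keepalive every `server_alive_interval` seconds and
    disconnects after `server_alive_count_max` of them go unanswered.
    """
    return server_alive_interval * server_alive_count_max
```

With the values above this is 60 * 3 = 180 seconds, so a user who sets a different ServerAliveCountMax in ~/.ssh/config changes the detection time accordingly.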
Joey Hess
8dcf79694d
enable forwardRetry for command-line transfers
If a transfer fails for some reason, but some data managed to be sent, the
transfer will be retried. (The assistant already did this.)

Possible impacts:

* More ssh prompts if ssh needs to prompt for a password to connect to a
  host, or is prompting about some other problem like a ssh key mismatch.

* More data transfer due to retrying, especially when a remote does not
  support resuming a transfer.

  In the worst case, a lot of data will be transferred but it fails before
  the end, and then all that data gets transferred again plus one byte more;
  repeat until it manages to get the whole file.
2016-10-26 15:38:27 -04:00
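The retry rule in commit 8dcf79694d, including its worst case, can be sketched as "retry only while each failed attempt gets further than the one before". The Python below is an assumption-laden illustration: `run_with_forward_retry` is a hypothetical name, and `attempt` is a hypothetical callable returning `(succeeded, bytes_transferred)`.

```python
def run_with_forward_retry(attempt):
    """Run `attempt` until it succeeds or stops making forward progress.

    `attempt()` must return a (succeeded, bytes_transferred) pair.
    """
    previous = 0
    while True:
        succeeded, transferred = attempt()
        if succeeded:
            return True
        if transferred <= previous:
            # No more data got through than last time: give up.
            return False
        # Some forward progress was made, so try again.
        previous = transferred
```

The worst case from the message shows up directly: attempts that fail after 10, 11, 12, ... bytes each count as progress, so the loop keeps retrying until the whole file is transferred or an attempt does no better than the previous one.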
Joey Hess
1a8ba7eab4
Improve ssh socket cleanup code to skip over the cruft that NFS sometimes puts in a directory when a file is being deleted. 2016-10-26 13:16:41 -04:00
Joey Hess
6e4fee1faf
test: Deal with gpg-agent behavior change that broke the test suite.
gpg-agent started deleting its socket file on shutdown, and this tickled an
ugly behavior in removeDirectoryRecursive,
https://github.com/haskell/directory/issues/60

Running removeDirectoryRecursive again on exception avoids the problem.
2016-10-18 16:56:38 -04:00
Joey Hess
090a922a98
Assistant, repair: Improved filtering out of git fsck lines about duplicate file entries in tree objects. 2016-10-18 11:19:41 -04:00
Joey Hess
0b1c061382
importfeed: Drop URL parameters from file extension.
Thanks, James MacMahon.
2016-10-17 16:02:05 -04:00
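Dropping URL parameters before taking the file extension, as commit 0b1c061382 describes, amounts to looking only at the path component of the URL. A minimal Python sketch using the standard library (the function name and example URL are hypothetical, and this is not importfeed's actual code):

```python
import posixpath
from urllib.parse import urlsplit


def extension_from_url(url):
    """Return the file extension of a URL's path, ignoring ?query and #fragment."""
    path = urlsplit(url).path
    return posixpath.splitext(path)[1]
```

For `http://example.com/feed/episode1.mp3?token=abc` this returns `.mp3` rather than `.mp3?token=abc`.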
Joey Hess
10ca4b9788
Improve style of offline html build of website. 2016-10-17 15:55:49 -04:00
Joey Hess
8e22114735
upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present.
Closes https://github.com/datalad/datalad/issues/1020

The use of runWriter in scanUnlockedFiles broke due to this change;
it failed with blocked indefinitely in mvar, because the database write
handle was taken while linkFromAnnex needed to also write to it (to update
the inode cache). So, switched to using a separate runWriter for each call
to addAssociatedFileFast. A little less efficient, but not greatly; the
writes should all still be cached.
2016-10-17 15:19:47 -04:00
Joey Hess
ee309d6941
lock: Fix edge cases where data loss could occur in v6 mode.
In the case where the pointer file is in place, and not the content
of the object, lock's  performNew was called with filemodified=True,
which caused it to try to repopulate the object from an unmodified
associated file, of which there were none. So, the content of the object
got thrown away incorrectly. This was the cause (although not the root
cause) of data loss in https://github.com/datalad/datalad/issues/1020

The same problem could also occur when the work tree file is modified,
but the object is not, and lock is called with --force. Added a test case
for this, since it's exercising the same code path and is easier to set up
than the problem above.

Note that this only occurred when the keys database did not have an inode
cache recorded for the annex object. Normally, the annex object would be in
there, but there are of course circumstances where the inode cache is out
of sync with reality, since it's only a cache.

Fixed by checking if the object is unmodified; if so we don't need to
try to repopulate it. This does add an additional checksum to the unlock
path, but it's already checksumming the worktree file in another case,
so it doesn't slow it down overall.

Further investigation found a similar problem occurred when smudge --clean
is called on a file and the inode cache is not populated. cleanOldKeys
deleted the unmodified old object file in this case. This was also
fixed by checking if the object is unmodified.

In general, use of getInodeCaches and sameInodeCache is potentially
dangerous if the inode cache has not gotten populated for some reason.
Better to use isUnmodified. I briefly audited other places that check the
inode cache, and did not see any immediate problems, but it would be easy
to miss this kind of problem.
2016-10-17 13:58:43 -04:00
Joey Hess
c0cdac5c4a
releasing package git-annex version 6.20161012 2016-10-12 09:38:03 -04:00
Joey Hess
b82c3e0783
sync: Fix bug in adjusted branch merging that could cause recently added files to be lost when updating the adjusted branch.
The modification flag was not being set when making modifications deep
in a tree, so parent trees were not updated to contain the modified tree.

Seems to have exposed another bug where the wrong filename gets grafted in.

This commit was sponsored by Brock Spratlen on Patreon.
2016-10-10 15:00:45 -04:00
Joey Hess
933bc5c917
Support using v3 repositories without upgrading them to v5.
An easy change now that supportedVersions is a list. Since v3 and v5 are
identical other than version number, just add v3 to the list.

This commit was sponsored by andrea rota.
2016-10-05 16:53:09 -04:00
Joey Hess
f867fc157f
When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5.
Fixes a bug introduced with v6 mode that I didn't notice until now.
Probably not many v3 repos left out there, and upgrading them to v6 mode
is not disastrous, only a little premature.

This commit was sponsored by Riku Voipio
2016-10-05 16:23:09 -04:00
Joey Hess
34530e59d9
Avoid using a lot of memory when large objects are present in the git repository
.. and have to be checked to see if they are a pointer to an annexed file.

Cases where such memory use could occur included, but were not limited to:
  - git commit -a of a large unlocked file (in v5 mode)
  - git-annex adjust when a large file was checked into git directly
Generally, any use of catKey was a potential problem.

Fix by using git cat-file --batch-check to check size before catting.
This adds another git batch process, which is included in the CatFileHandle
for simplicity.

There could be performance impact, anywhere catKey is used. Particularly
likely to affect adjusted branch generation speed, and operations on
unlocked files in v6 mode. Hopefully since the --batch-check and
--batch read the same data, disk buffering will avoid most overhead.
Leaving only the overhead of talking to the process over the pipe and
whatever computation --batch-check needs to do.

This commit was sponsored by Bruno BEAUFILS on Patreon.
2016-10-05 15:24:13 -04:00
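The size check in commit 34530e59d9 relies on `git cat-file --batch-check`, which by default prints one `<object> <type> <size>` line per object, or `<object> missing`. A Python sketch of deciding whether an object is small enough to be worth catting follows; the function names and any size limit passed in are hypothetical, since the commit message does not state the threshold.

```python
def parse_batch_check(line):
    """Parse one line of default `git cat-file --batch-check` output.

    Returns (object_type, size), or None if the object is missing.
    """
    text = line.rstrip("\n")
    if text.endswith(" missing"):
        return None
    _name, object_type, size = text.rsplit(" ", 2)
    return object_type, int(size)


def worth_catting(line, max_pointer_size):
    """True if the object is a blob small enough to possibly be a pointer."""
    info = parse_batch_check(line)
    if info is None:
        return False
    object_type, size = info
    # Large blobs cannot be pointers, so their content is never read.
    return object_type == "blob" and size <= max_pointer_size
```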