Fuzz tests have shown that git cat-file --batch sometimes stops running.
It's not yet known why (no error message; repo seems ok). But this is
something we can deal with in the CoProcess framework, since all 3 types of
long-running git processes should be restartable if they fail.
Note that, as implemented, only IO errors are caught. So an error thrown
by the reveiver, when it sees something that is not valid output from
git cat-file (etc) will not cause a restart. I don't want it to retry
if git commands change their output or are just outputting garbage.
This does mean that if the command did a partial output and crashed in the
middle, it would still not be restarted.
There is currently no guard against restarting a command repeatedly, if,
for example, it crashes repeatedly on startup.
The checkpresent hook can return either True or, False, or fail with a message
if it cannot successfully check the remote. Currently for glacier, when
--trust-glacier is not set, it always returns False. Crucially, in the case
when a file is in glacier, this is telling git-annex it's not there, so copy
re-uploads it. This is not desirable; it breaks using glacier-cli to retreive
that file later, and it wastes money/bandwidth.
What if it instead, when the glacier inventory is missing a
file, it returns False. And when the glacier inventory has a file, unless
--trust-glacier is set, it *fails*.
The result would be:
* `git annex copy --to glacier` would only send things not listed in inventory. If a file is listed in the inventory, `copy`
would complain that --trust-glacier` is not set, and not re-upload the file.
* `git annex drop` would only trust that glacier has a file when --trust-glacier is set. Behavior unchanged.
* `git annex move --to glacier`, when the file is not listed in inventory, would send the file, and delete it locally. Behavior unchanged.
* `git annex move --to glacier`, when the file is listed in inventory, would only trust that glacier has the file when --trust-glacier is set
* `git annex copy --from glacier` / `git annex get`, when the file is located in glacier, would trust the location log, and attempt to get the file from glacier.
Ie, when there'a a conflicted merge we may get foo.variant-xxxx
created in a merge. If a second merge conflict occurs on that new file,
it was not falling back to putting in the whole key (which should stop
the merge conflicts happening for good, but is ugly).
The current manual mode preferred content expression is:
"present and (((exclude=*/archive/* and exclude=archive/*) or (not (copies=archive:1 or copies=smallarchive:1))) or (not copies=semitrusted+:1))"
The old matcher misparsed this, to basically:
OR (present and (...)) (not copies=semitrusted+:1))
The paren handling and indeed the whole conversion from tokens to the
matcher was just wrong. The new way may not be the cleverest, but I think
it is correct, and you can see how it pattern matches structurally against
the expressions when parsing them.
That expression is now parsed to:
MAnd (MOp <function>)
(MOr (MOr (MAnd (MOp <function>) (MOp <function>)) (MNot (MOr (MOp <function>) (MOp <function>))))
(MNot (MOp <function>)))
Which appears correct, and behaves correct in testing.
Also threw in a simplifier, so the final generated Matcher has less
unnecessary clutter in it. Mostly so that I could more easily read &
confirm them.
Also, added a simple test of the Matcher to the test suite.
There is a small chance of badly formed preferred content expressions
behaving differently than before due to this rewrite.
I noticed that when my modem hung up and redialed, my xmpp client was left
sending messages into the void. This will also handle any idle
disconnection issues.
This bug was turned up by the test suite, running fsck in direct mode.
A repository was cloned, was put into direct mode, was fscked, and fsck
incorrectly said that no copy existed of a file, that was actually present
in origin.
This turned out to occur because fsck first did a Annex.Branch.change,
recording that it did not locally have the file. That was recorded in the
journal. Since neither the git annex direct not the fsck had yet needed to
read any info from the branch, but had only made changes to it, the
origin/git-annex branch was not yet merged in. So the journal got a
location log entry written to it, but this did not include
the location log info for the origin. When fsck then did a
Annex.Branch.get, it trusted the journal was cosnsitent, and returned it,
again w/o merging from origin/git-annex. This latter behavior is the
actual bug.
Refer to commit e9bfa8eaed for the thinking
behind it being ok to make a change to a file on the branch, without
first merging the branch. That thinking still stands. However, it means
that files in the journal cannot be trusted to be consistent if the branch
has not been merged. So, to fix, just enure the branch gets merged, even
when reading from the journal.
In tests, this does not seem to cause any extra merging. Except, of course,
in the one case described above. But git annex add, etc, are able to make
changes w/o first merging the branch.
The bug was in movein, which just replaceFile'd the file with a symlink,
even if it already had the desired content, before trying to pull the
content out of the annex and replace the symlink with it.
That was ok-ish for non conflicted merges, where if the file existed it would
be an old version of the content. But for conflicted merges, the automatic
merge resolver has already run, and will have already put the desired
content into the file for the local variant.
Also, made removeDirect not trust that the associated files map is correct.
Only if it can verify that another file has the content will it not move it
into .git/annex/objects.
As seen in this bug report, the lifted exception handling using the StateT
monad throws away state changes when an action throws an exception.
http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/
.. Which can result in cached values being redundantly calculated, or other
possibly worse bugs when the annex state gets out of sync with reality.
This switches from a StateT AnnexState to a ReaderT (MVar AnnexState).
All changes to the state go via the MVar. So when an Annex action is
running inside an exception handler, and it makes some changes, they
immediately go into affect in the MVar. If it then throws an exception
(or even crashes its thread!), the state changes are still in effect.
The MonadCatchIO-transformers change is actually only incidental.
I could have kept on using lifted-base for the exception handling.
However, I'd have needed to write a new instance of MonadBaseControl
for the new monad.. and I didn't write the old instance.. I begged Bas
and he kindly sent it to me. Happily, MonadCatchIO-transformers is
able to derive a MonadCatchIO instance for my monad.
This is a deep level change. It passes the test suite! What could it break?
Well.. The most likely breakage would be to code that runs an Annex action
in an exception handler, and *wants* state changes to be thrown away.
Perhaps the state changes leaves the state inconsistent, or wrong. Since
there are relatively few places in git-annex that catch exceptions in the
Annex monad, and the AnnexState is generally just used to cache calculated
data, this is unlikely to be a problem.
Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit
redundant. It's now entirely possible to run concurrent Annex actions in
different threads, all sharing access to the same state! The ThreadedMonad
just adds some extra work on top of that, with its own MVar, and avoids
such actions possibly stepping on one-another's toes. I have not gotten
rid of it, but might try that later. Being able to run concurrent Annex
actions would simplify parts of the Assistant code.
Run the same code git-annex used to get the sha, including its sanity
checking. Much better than old grep. Should detect FreeBSD systems with
sha commands that output in stange format.
This after fielding a bug where git-annex was built with a sha256 program
whose output checked out, but was then run with one that output lines
like:
SHA256 (file) = <sha here>
Which it then parsed as having a SHA256 of "SHA256"!
Now the output of the command is required to be of the right length,
and contain only the right characters.
It's possible for files in indirect mode to have a direct mode mapping
file. Probably from when they were in direct mode. In this case,
toDirectGen tried to copy the content from the direct mode file that the
mapping said had it. But, being in indirect mode, it didn't really have the
content. So it did nothing. This fix makes it always move the content from
.git/annex/objects/ when it's there.
Observed that the pushed refs were received, but not merged into master.
The merger never saw an add event for these refs. Either git is not writing
to a new file and renaming it into place, or the inotify code didn't notice
that. Changed it to also watch for modify events and that seems to have
fixed it!
Note that the note on df88c51334 turned out
to be wrong. Multiple repository pairing over XMPP does work, because the
annex-uuid of the xmpp remote is updated to the uuid of the new repo
when pairing takes place. So the push from it is accepted. (And the other
UUIDs are listed in uuid.log, so pushes from those repositories also are
accepted of course.)
The root of the problem is that toInodeCache sees a non-symlink, and so
goes on and generates a new inode cache for the dummy symlink.
Any place that toInodeCache, or sameFileStatus, or genInodeCache are called
may need to deal with this case. Although many of them are ok. For example,
prepSendAnnex calls sameInodeCache, which calls genInodeCache.. but if
the file content is not present, the InodeCache generated for its standin
file is appropriately not the same, and so it returns Nothing.
I've audited some, but have to say I'm not happy with this; it should be
handled at the type level somehow, or a toInodeCache wrapper be used that
is aware of dummy symlinks.
(The Watcher already dealt with it, via the guardSymlinkStandin function.)
This is so git remotes on servers without git-annex installed can be used
to keep clients' git repos in sync.
This is a behavior change, but since annex-sync can be set to disable
syncing with a remote, I think it's acceptable.
assistant: Work around horrible, terrible, very bad behavior of
gnome-keyring, by not storing special-purpose ssh keys in ~/.ssh/*.pub.
Apparently gnome-keyring apparently will load and indiscriminately use such
keys in some cases, even if they are not using any of the standard ssh key
names. Instead store the keys in ~/.ssh/annex/, which gnome-keyring will
not check.
Note that neither I nor #debian-devel were able to quite reproduce this
problem, but I believe it exists, and that this fixes it. And it certianly
won't hurt anything..
Most remotes have meters in their implementations of retrieveKeyFile
already. Simply hooking these up to the transfer log makes that information
available. Easy peasy.
This is particularly valuable information for encrypted remotes, which
otherwise bypass the assistant's polling of temp files, and so don't have
good progress bars yet.
Still some work to do here (see progressbars.mdwn changes), but this
is entirely an improvement from the lack of progress bars for encrypted
downloads.
* addurl: Register transfer so the webapp can see it.
* addurl: Automatically retry downloads that fail, as long as some
additional content was downloaded.
Fixed by storing a list of cached inodes for a key, instead of just one.
Backwards compatability note: An old git-annex version will fail to parse
an inode cache file that has been written by a new version, and has
multiple items. It will succees if just one. So old git-annexes will have
even worse behavior when there are duplicated files, if that is possible.
I don't think it will be a problem. (Famous last words.)
Also, note that it doesn't expire old and unused inode caches for a key.
It would be possible to add this if needed; just look through the
associated files for a key and if there are more cached inodes, throw out
any not corresponding to associated files. Unless a file is being copied
repeatedly and the old copy deleted, this lack of expiry should not be a
problem.
* since this is a crippled filesystem anyway, git-annex doesn't use
symlinks on it
* so there's no reason to use the mixed case hash directories that we're
stuck using to avoid breaking everyone's symlinks to the content
* so we can do what is already done for all bare repos, and make non-bare
repos on crippled filesystems use the all-lower case hash directories
* which are, happily, all 3 letters long, so they cannot conflict with
mixed case hash directories
* so I was able to 100% fix this and even resuming `git annex add` in the
test case will recover and it will all just work.
This avoids commit churn by the assistant when eg,
replacing a file with a symlink.
But, just as importantly, it prevents the working tree being left with a
deleted file if git-annex, or perhaps the whole system, crashes at the
wrong time.
(It also probably avoids confusing displays in file managers.)
My test case for this bug is to have the assistant running and syncing to
a remote, and create a file in the annex. Then at the command line run
git annex drop. The assistant sees that the file is gone, sees it's a wanted
file, and downloads it from the remote.
With a directory special remote and a small file, I was seeing around 1
time in 3, a race where the file got unstaged from git after it got
downloaded.
Looking at what direct mode content managing code does in this case, it
deletes the symlink, and then adds the file content back. It would be
possible, sometimes, to avoid removing the symlink and do this atomically.
And I probably should.. but in some cases, particularly where the file
needs to be run through `cp` (multiple direct mode files with same
content), there's no way to atomically replace the symlink with the
content.
Anyway, the bug turns out to be something that the watcher does right for
indirect mode, but not for direct mode. When it got an add event, it
checked to see if this was a new file, or one we've already added. In the
latter case, no add event was queued. But that means that only the rm event
is queued, and so it unstages the file.
Fixed by queueing an add event even when the file is already in git.
Tested by running hundreds of drops in a loop; file remained staged.
I would have sort of liked to put this in .gitattributes, but it seems
it does not support multi-word attribute values. Also, making this a single
config setting makes it easy to only parse the expression once.
A natural next step would be to make the assistant `git add` files that
are not annex.largefiles. OTOH, I don't think `git annex add` should
`git add` such files, because git-annex command line tools are
not in the business of wrapping git command line tools.
There was confusion in different parts of the progress bar code about
whether an update contained the total number of bytes transferred, or the
number of bytes transferred since the last update. One way this bug
showed up was progress bars that seemed to stick at zero for a long time.
In order to fix it comprehensively, I add a new BytesProcessed data type,
that is explicitly a total quantity of bytes, not a delta.
Note that this doesn't necessarily fix every problem with progress bars.
Particularly, buffering can now cause progress bars to seem to run ahead
of transfers, reaching 100% when data is still being uploaded.
When a page is loaded, the javascript requests an notification url, and
does long polling on the url to be informed of changes. But if a change
occured before the notification url was requested, it would not be notified
of that change, and so the page display would not update.
I fixed this by *always* updating the page display after it gets
the notification url. This is extra work, but the overhead is not noticable
in the other overhead of loading a page.
(A nicer way would be to somehow record the version of a page initially
loaded, and then compare it with the current version when getting the
notification url, and only force an update if it's changed. But getting
the "version" of the different parts of the page that use long polling
is difficult.)
Move all the binaries and libraries under a bundle/ subdirectory;
so when it's in PATH only git-annex, runshell, and git-annex-webapp
will be available.
A long time ago I made Remote be an instance of the Ord typeclass, with an
implementation that compared the costs of Remotes. That seemed like a good
idea at the time, as it saved typing.. But at the time I was still making
custom Read and Show instances too. I've since learned that this is *not* a
good idea, and neither is making custom Ord instances, without deep thought
about the possible sets of values in a type. Haskell typeclasses are not a
toy.
This Ord instance came around and bit me when I put Remotes into a Set,
because now remotes with the same cost appeared to be in the Set even if
they were not. Also affected putting Remotes into a Map.
Rarely does a bug go this deep. I've fixed it comprehensively, first
removing the Ord instance entirely, and fixing the places that wanted to
order remotes by cost to do it explicitly. Then adding back an Ord instance
that is much more sane. Also by checking the rest of the Ord instances in
the code base (which were all ok).
While doing that, I found lots of places that kept remotes in Maps and
Sets. All of it was probably subtly broken in one way or another before
this fix, but it would be hard to say exactly how the bugs would
manifest.
This may work around google talk's horrible presence handling, in which
clients often don't learn about other clients, at least when using the same
account. This way, every time we start a git push over xmpp, we'll waste
bandwidth asking clients to please try again to identify themselves.
Just before starting a transfer, do one last check that it's still
preferred content.
I was just doing this for uploads, as part of the smarter flood filling
bug, but realized it's also possible for a download that was preferred
content to change to not be before the download begins, so check that too.