Currently only implemented for local git remotes. May try to add support
to git-annex-shell for ssh remotes later. Could concevably also be
supported by some special remote, although that seems unlikely.
Cronner user this when available, and when not falls back to
fsck --fast --from remote
git annex fsck --from does not itself use this interface.
To do so, I would need to pass --fast and all other options that influence
fsck on to the git annex fsck that it runs inside the remote. And that
seems like a lot of work for a result that would be no better than
cd remote; git annex fsck
This may need to be revisited if git-annex-shell gets support, since it
may be the case that the user cannot ssh to the server to run git-annex
fsck there, but can run git-annex-shell there.
This commit was sponsored by Damien Diederen.
Once I built the basic widget, it turned out to be rather easy to replicate
it once per scheduled activity and wire it all up to a fully working UI.
This does abuse yesod's form handling a bit, but I think it's ok.
And it would be nice to have it all ajax-y, so that saving one modified
form won't lose any modifications to other forms. But for now, a nice
simple 115 line of code implementation is a win.
This late night hack session commit was sponsored by Andrea Rota.
I probably need to improve handling of the PleaseTerminate exception to
kill the fsck process. Also, if fsck finds bad files, something needs
to requeue downloads of them. Otherwise, this should work, but is probably
quite buggy since I have only tested the pure code over the past 2 days.
Extends the index.lock handling to other git lock files. I surveyed
all lock files used by git, and found more than I expected. All are
handled the same in git; it leaves them open while doing the operation,
possibly writing the new file content to the lock file, and then closes
them when done.
The gc.pid file is excluded because it won't affect the normal operation
of the assistant, and waiting for a gc to finish on startup wouldn't be
good.
All threads except the webapp thread wait on the new startup sanity checker
thread to complete, so they won't try to do things with git that fail
due to stale lock files. The webapp thread mostly avoids doing that kind of
thing itself. A few configurators might fail on lock files, but only if the
user is explicitly trying to run them. The webapp needs to start
immediately when the user has opened it, even if there are stale lock
files.
Arranging for the threads to wait on the startup sanity checker was a bit
of a bear. Have to get all the NotificationHandles set up before the
startup sanity checker runs, or they won't see its signal. Perhaps
the NotificationBroadcaster is not the best interface to have used for
this. Oh well, it works.
This commit was sponsored by Michael Jakl
However, this is not working for gcrypt repos with a mangled hostname.
Problem is that the locked down key is installed before the repo is
initialized, so git-annex-shell refuses to allow the gcrypt special remote
to do its setup.
Improved probing the remote server, so it gathers a list of the
capabilities it has. From that list, we can determine which types
of remotes are supported, and display an appropriate UI.
The new buttons for making gcrypt repos don't work yet, but the old buttons
for unencrypted git repo and encrypted rsync repo have been adapted to the
new data types and are working.
This commit was sponsored by David Schmitt.
This happened because the transferrer process did not know about the new
remote. remoteFromUUID crashed, which crashed the transferrer. When it was
restarted, the new one knew about the new remote so all further files would
transfer, but the one file would temporarily not be, until transfers retried.
Fixed by making remoteFromUUID not crash, and try reloading the remote list
if it does not know about a remote.
Note that this means that remoteFromUUID does not only return Nothing anymore
when the UUID is the UUID of the local repository. So had to change some code
that dependend on that assumption.
Overridable with --user-agent option.
Not yet done for S3 or WebDAV due to limitations of libraries used --
nether allows a user-agent header to be specified.
This commit sponsored by Michael Zehrer.
This pulls off quite a nice trick: When given a path on rsync.net, it
determines if it is an encrypted git repository that the user has
the key to decrypt, and merges with it. This is works even when
the local repository had no idea that the gcrypt remote exists!
(As previously done with local drives.)
This commit sponsored by Pedro Côrte-Real
This is motivated by a user report that the assistant was repeatedly
retrying transfers of files that had been deleted (in direct mode, so
removing the only copy).
Note that the glacier code retries failed transfers after a while to retry
downloads that have aged long enough to be available. This is ok; if we're
doing a full transfer scan we'll retry on every file that is still in the
git tree.
Also note that this makes the assistant less likely to get every file
referenced by old revs of the git tree. Not something the assistant tries
to ensure anyway, so I feel this is acceptable.
Now can tell if a repo uses gcrypt or not, and whether it's decryptable
with the current gpg keys.
This closes the hole that undecryptable gcrypt repos could have before been
combined into the repo in encrypted mode.
When adding a removable drive, it's now detected if the drive contains
a gcrypt special remote, and that's all handled nicely. This includes
fetching the git-annex branch from the gcrypt repo in order to find
out how to set up the special remote.
Note that gcrypt repos that are not git-annex special remotes are not
supported. It will attempt to detect such a gcrypt repo and refuse
to use it. (But this is hard to do any may fail; see
https://github.com/blake2-ppc/git-remote-gcrypt/issues/6)
The problem with supporting regular gcrypt repos is that we don't know
what the gcrypt.participants setting is intended to be for the repo.
So even if we can decrypt it, if we push changes to it they might not be
visible to other participants.
Anyway, encrypted sneakernet (or mailnet) is now fully possible with the
git-annex assistant! Assuming that the gpg key distribution is handled
somehow, which the assistant doesn't yet help with.
This commit was sponsored by Navishkar Rao.
Now the webapp can generate a gpg key that is dedicated for use by
git-annex. Since the key is single use, much of the complexity of
generating gpg keys is avoided.
Note that the key has no password, because gpg-agent is not available
everywhere the assistant is installed. This is not a big security problem
because the key is going to live on the same disk as the git annex
repository, so an attacker with access to it can look directly in the
repository to see the same files that get stored in the encrypted
repository on the removable drive.
There is no provision yet for backing up keys.
This commit sponsored by Robert Beaty.
I noticed that adding a removable drive repo, then trying to add the same
drive again resulted in the question about whether repos should be
combined. This was because the uuid.log was not updated. Which happened
because the new uuid did not get committed on the removable drive.
This fixes that.
To support this, a core.gcrypt-id is stored by git-annex inside the git
config of a local gcrypt repository, when setting it up.
That is compared with the remote's cached gcrypt-id. When different, a
drive has been changed. git-annex then looks up the remote config for
the uuid mapped from the core.gcrypt-id, and tweaks the configuration
appropriately. When there is no known config for the uuid, it will refuse to
use the remote.
This is a git-remote-gcrypt encrypted special remote. Only sending files
in to the remote works, and only for local repositories.
Most of the work so far has involved making initremote work. A particular
problem is that remote setup in this case needs to generate its own uuid,
derivied from the gcrypt-id. That required some larger changes in the code
to support.
For ssh remotes, this will probably just reuse Remote.Rsync's code, so
should be easy enough. And for downloading from a web remote, I will need
to factor out the part of Remote.Git that does that.
One particular thing that will need work is supporting hot-swapping a local
gcrypt remote. I think it needs to store the gcrypt-id in the git config of the
local remote, so that it can check it every time, and compare with the
cached annex-uuid for the remote. If there is a mismatch, it can change
both the cached annex-uuid and the gcrypt-id. That should work, and I laid
some groundwork for it by already reading the remote's config when it's
local. (Also needed for other reasons.)
This commit was sponsored by Daniel Callahan.
Requires git 1.8.4 or newer. When it's installed, a background
git check-ignore process is run, and used to efficiently check ignores
whenever a new file is added.
Thanks to Adam Spiers, for getting the necessary support into git for this.
A complication is what to do about files that are gitignored but have
been checked into git anyway. git commands assume the ignore has been
overridden in this case, and not need any more overriding to commit a
changed version.
However, for the assistant to do the same, it would have to run git ls-files
to check if the ignored file is in git. This is somewhat expensive. Or it
could use the running git-cat-file process to query the file that way,
but that requires transferring the whole file content over a pipe, so it
can be quite expensive too, for files that are not git-annex
symlinks.
Now imagine if the user knows that a file or directory tree will be getting
frequent changes, and doesn't want the assistant to sync it, so gitignores
it. The assistant could overload the system with repeated ls-files checks!
So, I've decided that the assistant will not automatically commit changes
to files that are gitignored. This is a tradeoff. Hopefully it won't be a
problem to adjust .gitignore settings to not ignore files you want the
assistant to autocommit, or to manually git annex add files that are listed
in .gitignore.
(This could be revisited if git-annex gets access to an interface to check
the content of the index w/o forking a git command. This could be libgit2,
or perhaps a separate git cat-file --batch-check process, so it wouldn't
need to ship over the whole file content.)
This commit was sponsored by Francois Marier. Thanks!
This includes recovery from the ssh-agent problem that led to many reporting
http://git-annex.branchable.com/bugs/Internal_Server_Error:_Unknown_UUID/
(Including fixing up .ssh/config to set IdentitiesOnly.)
Remotes that have no known uuid are now displayed in the webapp as
"unfinished". There's a link to check their status, and if the remote
has been set annex-ignore, a retry button can be used to unset that and
try again to set up the remote.
As this bug has shown, the process of adding a ssh remote has some failure
modes that are not really ideal. It would certianly be better if, when
setting up a ssh remote it would detect if it's failed to get the UUID,
and handle that in the remote setup process, rather than waiting until
later and handling it this way.
However, that's hard to do, particularly for local pairing, since the
PairListener runs as a background thread. The best it could do is pop up an
alert if there's a problem. This solution is not much different.
Also, this solution handles cases where the user has gotten their repo into
a mess manually and let's the assistant help with cleaning it up.
This commit was sponsored by Chia Shee Liang. Thanks!
A ssh remote will breifly have NoUUID when it's just being set up and
git-annex-shell has not yet been queried for the UUID. So it doesn't make
sense to display any kind of error message in this case. The UI doesn't
work when there's NoUUID, and it can even crash the ajax long polling code.
So hiding NoUUID repositories is the right thing to do.
I've tested and the automatic refresh of the repolist causes the remote
to show up as soon as a UUID is recorded, when doing local pairing, and
when adding a ssh remote.
When setting up a dedicated ssh key to access the annex on a host,
set IdentitiesOnly to prevent the ssh-agent from forcing use of a different
ssh key.
That behavior could result in unncessary password prompts. I remember
getting a message or two from people who got deluged with password
prompts and I couldn't at the time see why.
Also, it would prevent git-annex-shell from being run on the remote host,
when git-annex was installed there by unpacking the standalone tarball,
since the authorized_keys line for the dedicated ssh key, which sets
up calling git-annex-shell when it's not in path, wouldn't be used.
This fixes
http://git-annex.branchable.com/bugs/Internal_Server_Error:_Unknown_UUID
but I've not closed that bug yet since I should still:
1. Investigate why the ssh remote got set up despite being so broken.
2. Make the webapp not handle the NoUUID state in such an ugly way.
3. Possibly add code to fix up systems that encountered the problem.
Although since it requires changes to .ssh/config this may be one for
the release notes.
Thanks to TJ for pointing me in the right direction to understand what
was happening here.
This bug was introduced in 82a6db8fe8,
which improved handling of adding very large numbers of files by ensuring
that a minimum number of max size commits (5000 files each) were done.
I accidentially made it wait for another change to appear after such a max
size commit, even if a lot of queued changes were already accumulated.
That resulted in a stall when it got to the end. Now fixed to not wait
any longer than necessary to ensure the watcher has had time to wake back
up after the max size commit.
This commit was sponsored by Michael Linksvayer. Thanks!
This is a laziness problem. Despite the bang pattern on newfiles, the list
was not being fully evaluated before cleanup was called. Moving cleanup out
to after the list is actually used fixes this.
More evidence that I should be using ResourceT or pipes, if any was needed.
This affected both the hourly NetWatcherFallback thread and the syncing
when network connection is detected.
It was a reversion of sorts, introduced in
8861e270be, when annex-ignore was changed to
not control git syncing. I forgot to make it check annex-sync at that
point.
I only added this to the presense messages that are really intended for
presence. The ones used for tunneling git etc don't have the tag, because
that would waste bandwidth.
gmail.com has some XMPP SRV records, but does not itself respond to XMPP
traffic, although it does accept connections on port 5222. So if a user
entered the wrong password, it would try all the SRVs and fall back to
trying gmail, and hang at that point.
This seems the right thing to do, not just a workaround.
The icon files will be installed when running make install or cabal
install. Did not try to run update-icon-caches, since I think it's debian
specific, and dh_icons will take care of that for the Debian package.
Using the favicon as a 16x16 icon. At 24x24 the svg displays pretty well,
although the dotted lines are rather faint. The svg is ok at all higher
resolutions.
The standalone linux build auto-installs the desktop and autostart files
when run. I have not made it auto-install the icon file too, because
a) that would take more work to include them in the tarball and find them
b) it would need to be an install to ~/.icons/, and I don't know if that
really works!
The ssh setup first runs ssh to the real hostname, to probe if a ssh key is
needed. If one is, it generates a mangled hostname that uses a key. This
mangled hostname was being used to ssh into the server to set up the key.
But if the server already had the key set up, and it was locked down, the
setup would fail. This changes it to use the real hostname when sshing in
to set up the key, which avoids the problem.
Note that it will redundantly set up the key on the ssh server. But it's
the same key; the ssh key generation code uses the key if it already
exists.
maximum is partial, so need to ensure the list is not empty.
This lack of associated files would most likely be a problem -- and fsck
fixes it. It could also occur legitimately due to a race.
This is a compromise. I would like to nice every thread except for the
webapp thread, but it's not practical to do so. That would need every
thread to run as a bound thread, which could add significant overhead.
And any forkIO would escape the nice level.