Leverage the new chunked remotes to automatically resume downloads.
Sort of like rsync, although of course not as efficient since this
needs to start at a chunk boundry.
But, unlike rsync, this method will work for S3, WebDAV, external
special remotes, etc, etc. Only directory special remotes so far,
but many more soon!
This implementation will also properly handle starting a download
from one remote, interrupting, and resuming from another one, and so on.
(Resuming interrupted chunked uploads is similarly doable, although
slightly more expensive.)
This commit was sponsored by Thomas Djärv.
The repair code assumed that if fsck found no broken objects, after
removing bad objects and possibly pulling replacements from remote, all was
well.. but this is not really true. Removing bad objects could leave some
branches broken. fsck doesn't report any missing objects in this case,
and its messages about broken branches are ignored by the fsck output
parser.
To deal with this, added a separate scan of all refs to find broken ones
and remove them when --forced. This will also let anyone who ran into this
bug run repair again to fix up the incomplete repair done before.
This commit was sponsored by Aaron Whitehouse.
Based on the example from the tip, but modified to cd into the repo before
running git-annex, since konqueror does not. Also, at least on my system,
the directory is ~/.kde, not ~/.kde4. (konqueror 4.12.4)
This commit was sponsored by Jürgen Peters.
This is a security/usability tradeoff. To avoid exposing the gpg key ids
who can decrypt the repository, users can unset
gcrypt-publish-participants.
The gcrypt-publish-participants option is available in my fork of
git-remote-gcrypt.
This commit was sponsored by Christopher Kernahan.
Catch an exception when ensureInitialized is run in a non-initted
repository. In this case, just read the git config, so that the Git.Repo
object is not LocalUnknown, which is what is used to represent remotes
on eg, drives that are not connected.
The assistant already got this right, and like with the assistant, this
causes an implicit git-annex init of the local remote on the second sync,
once the git-annex branch has been pushed to it.
See this comment for more analysis:
http://git-annex.branchable.com/todo/Recovering_from_a_bad_sync/#comment-64e469a2c1969829ee149cbb41b1c138
This commit was sponsored by jscit.
I think this is a git behavior change, but have not checked to be sure.
Conflict cruft used to look like $foo~HEAD, but now just $foo is left
behind as conflict cruft.
With test case.
Running `git annex direct` would cause loss of data, because the object
was moved to a temp file, which it then tried to replace the work tree file
with, and on failure, the temp file got deleted. Now it's instead moved
back into the annex object location.
Minor because normally only 1 FD is leaked per git-annex run. However,
the test suite leaks a few hundred FDs, and this broke it on the Debian
autobuilders, which seem to have a tigher than usual ulimit.
The leak was introduced by the lazy getDirectoryContents' that was
introduced in e6330988dd in order to scale to
millions of journal files -- if the lazy list was never fully consumed, the
directory handle did not get closed.
Instead, pull in openDirectory/readDirectory/closeDirectory code that I
already developed and submitted in a patch to the haskell directory library
earlier. Using this in journalDirty avoids the place that the lazy list
caused a problem. And using it in stageJournal eliminates the need for
getDirectoryContents'.
The getJournalFiles* functions are switched back to using the regular
strict getDirectoryContents. I'm not sure if those always consume the whole
list, so this avoids any leak. And the things that call those are things
like git annex unused, which also look at every file committed to the
git-annex branch, so would need more work to scale to insane numbers of
files anyway.
When one side is an annexed symlink, and the other side is a non-annexed symlink.
In this case, git-merge does not replace the annexed symlink in the work
tree with the non-annexed symlink, which is different from it's handling of
conflicts between annexed symlinks and regular files or directories.
So, while git-annex generated the correct merge commit, the work tree
didn't get updated to reflect it.
See comments on bug for additional analysis.
Did not add this to the test suite yet; just unloaded a truckload of firewood
and am feeling lazy.
This commit was sponsored by Adam Spiers.
Eg after git-annex add has run on 2 million files in one go.
Slightly unhappy with the neeed to use a temp file here, but I cannot see
any other alternative (see comments on the bug report).
This commit was sponsored by Hamish Coleman.
Support users who have set commit.gpgsign, by disabling gpg signatures for
git-annex branch commits and commits made by the assistant.
The thinking here is that a user sets commit.gpgsign intending the commits
that they manually initiate to be gpg signed. But not commits made in the
background, whether by a deamon or implicitly to the git-annex branch.
gpg signing those would be at best a waste of CPU and at worst would fail,
or flood the user with gpg passphrase prompts, or put their signature on
changes they did not directly do.
See Debian bug #753720.
Also makes all commits done by git-annex go through a few central control
points, to make such changes easier in future.
Also disables commit.gpgsign in the test suite.
This commit was sponsored by Antoine Boegli.
When annex.genmetadata is set, metadata from the feed is added to files
that are imported from it.
Reused the same feedtitle and itemtitle, feedauthor, itemauthor, etc names
that are used in --template.
Also added title and author, which are the item title/author if available,
falling back to the feed title/author. These are more likely to be common
metadata fields.
(There is a small bit of dupication here, but once git gets
around to packing the object, it will compress it away.)
The itempubdate field is not included in the metadata as a string; instead
it is used to generate year and month fields, same as is done when adding
files with annex.genmetadata set.
This commit was sponsored by Amitai Schlair, who cooincidentially
is responsible for ikiwiki generating nice feed metadata!
I had thought that this was already done, but apparently not. There may
have been a reversion around version 5.20140606. Anna's laptop had its
desktop menu file etc having that version despite having upgraded git-annex
to a newer version. However, I could not find any commits that removed a
call to ensureInstalled.
The bug caused the size of the queue to be miscalculted; it was doubled
each time an item was added. Commands run after approx 140 items rather
than the intended 10240!
Yes, this means that git annex webapp on windows execs git-annex, which
execs itself to set env, and the execs itself again to redirect logs.
This is disgusting. This is Windows(TM).
Using the crazy but apparently best approach of a VB Script that runs
git-annex, in a hidden DOS window.
Note that currently the git-annex messages are not directed to daemon.log.
Would probably need another layer of script. Also problimatic since the
repository may not exist yet.
When in direct mode, update the master branch after committing to the
annex/direct/master branch. Also, update the synced/master branch.
This fixes a topology A->B where both A and B are in direct mode and
running the assistant, and a change is made to B. Before this fix, A pulled
the changes from B, but since they were only on the annex/direct/master
branch, it did not merge them.
Note that I considered making the assistant merge the
remotes/B/annex/direct/master, but decided to keep it simple and only merge
the sync branches as before.
Rather than calculating the TSDelta once, and caching it, this now
reads the inode sential file's InodeCache file once, and then each time a
new InodeCache is generated, looks at the sentinal file to get the current
delta.
This way, if the time zone changes while git-annex is running, it will
adapt.
This adds some inneffiency, but only on Windows, and only 1 stat per new
file added. The worst innefficiency is that `git annex status` and
`git annex sync` will now (on Windows) stat the inode sentinal file once per
file in the repo.
It would be more efficient to use getCurrentTimeZone, rather than needing
to stat the sentinal file. This should be easy to do, once the time
package gets my bugfix patch.
This commit was sponsored by Jürgen Lüters.
Deal with FAT's low resolution timestamps, which in combination with
Linux's caching of higher res timestamps while a FAT is mounted, caused
direct mode repositories on FAT to seem to have modified files after they
were unmounted and remounted.
This commit was sponsored by Fabrice Rossi.
This version of wai changed the type of Middleware, so I cannot seem
to liftIO inside it. So, got rid of a lot of not really needed
complexity to use System.Log.Logger's logging stuff, and just use
the standard wai stdout logger when debug logging is enabled.
Format may change some, and it logs http to stdout instead of stderr
now. Doesn't matter for the webapp since both go to the same log anyway.
It was possible for a interrupted sync or merge in direct mode to
leave the work tree out of sync with the last recorded commit.
This would result in the next commit seeing files missing from the work
tree, and committing their removal.
Now, a direct mode merge happens not only in a throwaway work tree, but using
a temporary index file, and without any commits or index changes
being made until the real work tree has been updated. If the merge is
interrupted, the work tree may have some updated files, but worst case a
commit will redundantly commit changes that come from the merge.
This commit was sponsored by Tony Cantor.