Merge branch 'master' into smudge

Joey Hess 2015-12-09 17:58:59 -04:00
commit bf98d2bd77
11 changed files with 148 additions and 24 deletions

debian/changelog

@@ -1,3 +1,4 @@
* annex.version increased to 6, but version 5 is also still supported.
* The upgrade to version 6 is not done fully automatically, because
upgrading a direct mode repository to version 6 will prevent old
@@ -8,7 +9,7 @@
filter. Note that this changes the default behavior of git add in a
newly initialized repository; it will add files to the annex.
git-annex (5.20151117) UNRELEASED; urgency=medium
git-annex (5.20151208) unstable; urgency=medium
* Build with -j1 again to get reproducible build.
* Display progress meter in -J mode when copying from a local git repo,
@@ -33,7 +34,7 @@ git-annex (5.20151117) UNRELEASED; urgency=medium
* Fix reversion in handling of long filenames, particularly when using
addurl/importfeed, which was introduced in the previous release.
-- Joey Hess <id@joeyh.name> Mon, 16 Nov 2015 16:49:34 -0400
-- Joey Hess <id@joeyh.name> Tue, 08 Dec 2015 11:14:03 -0400
git-annex (5.20151116) unstable; urgency=medium


@@ -0,0 +1,9 @@
[[!comment format=mdwn
username="wsha.code+ga@b38779424f41c5701bbe5937340be43ff1474b2d"
subject="comment 3"
date="2015-12-08T09:44:49Z"
content="""
It seems like aws-0.13.0 is released now. Will it be used in the next release of git-annex? I am using the prebuilt binary for 5.20151116 and STANDARD_IA does not work for me.
Is it possible to print out the version numbers of git-annex dependencies (something like `git-annex version --verbose`)?
"""]]


@@ -0,0 +1,41 @@
[[!comment format=mdwn
username="sts"
subject="comment 1"
date="2015-12-07T08:12:26Z"
content="""
OK, I found the source of the problem: they use sabredav as the WebDAV server, and sabredav does not support chunked transfers:
// Intercepting the Finder problem
if (($expected = $request->getHeader('X-Expected-Entity-Length')) && $expected > 0) {
/*
Many webservers will not cooperate well with Finder PUT requests,
because it uses 'Chunked' transfer encoding for the request body.
The symptom of this problem is that Finder sends files to the
server, but they arrive as 0-length files in PHP.
If we don't do anything, the user might think they are uploading
files successfully, but they end up empty on the server. Instead,
we throw back an error if we detect this.
The reason Finder uses Chunked, is because it thinks the files
might change as it's being uploaded, and therefore the
Content-Length can vary.
Instead it sends the X-Expected-Entity-Length header with the size
of the file at the very start of the request. If this header is set,
but we don't get a request body we will fail the request to
protect the end-user.
*/
// Only reading first byte
$firstByte = fread($body, 1);
if (strlen($firstByte) !== 1) {
throw new Exception\Forbidden('This server is not compatible with OS/X finder. Consider using a different WebDAV client or webserver.');
}
...
However, I did not tell git-annex to chunk the transfer :-/ (I did not add a 'chunk' parameter). Any ideas how to fix that?
"""]]
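
To make the framing difference discussed in the quoted sabredav comment concrete, here is a minimal Python sketch. It is not taken from git-annex or sabredav; the function names `frame_with_content_length` and `frame_chunked` are hypothetical, and only the header relevant to body framing is shown. It builds the same body once with a `Content-Length` header and once with HTTP/1.1 chunked transfer encoding.

```python
def frame_with_content_length(body: bytes) -> bytes:
    """Frame a request body when the client knows the size up front."""
    headers = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return headers + body


def frame_chunked(pieces) -> bytes:
    """Frame a request body using HTTP/1.1 chunked transfer encoding.

    Each chunk is: hex length, CRLF, data, CRLF.
    A zero-length chunk terminates the body.
    """
    out = [b"Transfer-Encoding: chunked\r\n\r\n"]
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out.append(f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n")
    out.append(b"0\r\n\r\n")  # terminating chunk
    return b"".join(out)


if __name__ == "__main__":
    print(frame_with_content_length(b"hello"))
    print(frame_chunked([b"hello", b"world!"]))
```

A server that only reads `Content-Length` finds no length in the second form, which would match the 0-length files the quoted comment describes. HTTP chunked framing is a property of the request itself, so it is presumably independent of whether a `chunk` parameter was given to the remote.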


@@ -0,0 +1,30 @@
Made a lot of progress today. Implemented the database mapping a key to its
associated files. As expected this database, when updated by the
smudge/clean filters, is not always consistent with the current git work tree.
In particular, commands like `git mv` don't update the database with the
new filename. So queries of the database will need to do some additional
work first to get it updated with any staged changes. But the database is
good enough for a proof of concept, I hope.
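
The staleness described above can be sketched as follows. This is a minimal Python illustration, not git-annex's actual database code: `AssociatedFiles`, `apply_staged_renames` and `files_for` are hypothetical names, an in-memory dict stands in for the database, and the staged renames are assumed to be supplied by the caller.

```python
from collections import defaultdict


class AssociatedFiles:
    """Hypothetical stand-in for a database mapping a key to its files."""

    def __init__(self):
        self._files = defaultdict(set)

    def add(self, key, path):
        # What a clean filter might record when a file is added.
        self._files[key].add(path)

    def apply_staged_renames(self, renames):
        # renames: iterable of (old_path, new_path) pairs that happened
        # in the index without the filters being run (e.g. after `git mv`).
        for old, new in renames:
            for paths in self._files.values():
                if old in paths:
                    paths.discard(old)
                    paths.add(new)

    def files_for(self, key, staged_renames=()):
        # Bring the mapping up to date before answering the query.
        self.apply_staged_renames(staged_renames)
        return sorted(self._files.get(key, ()))


if __name__ == "__main__":
    db = AssociatedFiles()
    db.add("SHA256E-s191213--example.mp3", "some.mp3")
    print(db.files_for("SHA256E-s191213--example.mp3"))
    print(db.files_for("SHA256E-s191213--example.mp3",
                       [("some.mp3", "music/some.mp3")]))
```

The point of the sketch is only the ordering: a query first folds in the staged changes, then reads the mapping.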
Then I got git-annex commands treating smudged files as annexed files.
So this works:
joey@darkstar:~/tmp/new>git annex init
init ok
(recording state in git...)
joey@darkstar:~/tmp/new>cp ~/some.mp3 .
joey@darkstar:~/tmp/new>git add some.mp3
joey@darkstar:~/tmp/new>git diff --cached
diff --git a/some.mp3 b/some.mp3
new file mode 100644
index 0000000..2df8868
--- /dev/null
+++ b/some.mp3
@@ -0,0 +1 @@
+/annex/objects/SHA256E-s191213--e4b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.mp3
joey@darkstar:~/tmp/new>git annex whereis some.mp3
whereis some.mp3 (1 copy)
7de17427-329a-46ec-afd0-0a088f0d0b1b -- joey@darkstar:~/tmp/new [here]
ok
get/drop don't yet update the smudged files, and that's the next step.
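
As an illustration of what the staged pointer in the `git diff --cached` output above contains, here is a small Python sketch that splits such a line into its parts. It is an assumption-laden sketch, not git-annex code: `parse_pointer` and `POINTER_RE` are hypothetical names, and the pattern only handles the shape shown above (a backend name, an `s` size field, and a name after `--`).

```python
import re

# Matches only pointers of the shape shown in the diff above:
#   /annex/objects/BACKEND-sSIZE--NAME
POINTER_RE = re.compile(
    r"^/annex/objects/(?P<backend>[A-Z0-9]+)-s(?P<size>\d+)--(?P<name>\S+)$"
)


def parse_pointer(line):
    """Return the backend, size and name of a pointer line, or None."""
    m = POINTER_RE.match(line.strip())
    if m is None:
        return None
    return {
        "backend": m.group("backend"),
        "size": int(m.group("size")),
        "name": m.group("name"),
    }


if __name__ == "__main__":
    pointer = ("/annex/objects/SHA256E-s191213--"
               "e4b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.mp3")
    print(parse_pointer(pointer))
```

For the pointer above this yields the backend `SHA256E`, the size 191213, and the hash with its `.mp3` extension as the name.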


@@ -1,9 +1,9 @@
I am interested in using `git annex` to manage encrypted backups to Amazon S3/Glacier. `git annex` will be used with the main file directory in direct mode and an encrypted S3 or Glacier remote set up in archive mode, and then `git annex add .` and `git annex sync` will be run periodically. The intent is for this setup to be a backup for catastrophic failure, so I want to make sure I take care of future-proofing and disaster recovery properly. My basic question is what I would need to have backed up, and what I would have to do, if the computer with the main repository died. I try to break that out into more specific questions below.
0. Do the S3/Glacier remotes just store the contents of `.git/annex/objects` in encrypted form and nothing else? So if I was left with nothing but the AWS bucket and couldn't get `git annex` to work for whatever reason, I could recover my files by hand if I had the encryption key (though I wouldn't know the file names or directory structure)?
0. S3/Glacier remotes store the contents of `.git/annex/objects` in encrypted form with hashes for file names and nothing else (other than a uuid). The hashes do not match the keys in the main repo. Are they the same keys encrypted? Is there a way to look up the S3 file name corresponding to a file in the repo?
1. For `shared` encryption, I see the cipher text in `remote.log` in the `git-annex` branch. Assuming I didn't have access to `git annex`, what would I need to do to convert that cipher text into a form that I could use with `gpg` to decrypt files?
2. Same question but for `hybrid` encryption rather than `shared`. I assume the answer is similar but I need to decrypt the cipher first with my gpg key? How do I do that?
3. Assuming I did have access to `git annex`, what would I need to create a new repo on a new computer with access to all of the files in the S3/Glacier bucket? I think I would need my Amazon credentials, my gpg key if using hybrid or public key encryption, and the `.git` folder as it was the last time files were pushed to the S3/Glacier remote (which would have the necessary decryption information for shared encryption). Is that right? I guess mainly I am checking that the remote does not store any metadata about the repo, so for `git annex` to be able to pull files back out I would need a backup of the `.git` directory and that back up would need to be up to date (can't just copy remote.log and have `git annex` work out the rest from the remote's contents). So for a full backup, my script would need to `tar` the `.git` directory, encrypt it, and push it to S3/Glacier separately after `git annex` does a sync. Then I could recover everything as long as I had a secure backup of my Amazon credentials and my encryption key(s).
3. Assuming I did have access to `git annex`, what would I need to create a new repo on a new computer with access to all of the files in the S3/Glacier bucket? I think I would need my Amazon credentials (possibly already embedded in the git repo), my gpg key if using hybrid or public key encryption, and the `.git` folder as it was the last time files were pushed to the S3/Glacier remote (which would have the necessary decryption information for shared encryption). Is that right? I guess mainly I am checking that the remote does not store any metadata about the repo, so for `git annex` to be able to pull files back out I would need a backup of the `.git` directory and that back up would need to be up to date (can't just copy remote.log and have `git annex` work out the rest from the remote's contents). So for a full backup, my script would need to `tar` the `.git` directory, encrypt it, and push it to S3/Glacier separately after `git annex` does a sync. Then I could recover everything as long as I had a secure backup of my Amazon credentials and my encryption key(s).


@@ -1,19 +0,0 @@
git-annex 5.20150930 released with [[!toggle text="these changes"]]
[[!toggleable text="""
* Added new linux standalone "ancient" build to support kernels
like 2.6.32.
* info: Don't allow use in a non-git-annex repository, since it
uses the git-annex branch and would create it if it were missing.
* assistant: When updating ~/.ssh/config, preserve any symlinks.
* webapp: Remove the "disable remote" feature from the UI.
* S3: When built with aws-0.13.0, supports using more storage classes.
In particular, storageclass=STANDARD\_IA to use Amazon's
new Infrequently Accessed storage, and storageclass=NEARLINE
to use Google's NearLine storage.
* Improve ~/.ssh/config modification code to not add trailing spaces
to lines it cannot parse.
* Fix a crash at direct mode merge time when .git/index doesn't exist
yet. Triggered by eg, git-annex sync --no-commit in a fresh clone of
a repository.
* status: Show added but not yet committed files.
* Added stack.yaml to support easy builds from source with stack."""]]


@@ -0,0 +1,24 @@
git-annex 5.20151208 released with [[!toggle text="these changes"]]
[[!toggleable text="""
* Build with -j1 again to get reproducible build.
* Display progress meter in -J mode when copying from a local git repo,
to a local git repo, and from a remote git repo.
* Display progress meter in -J mode when downloading from the web.
* map: Improve display of git remotes with non-ssh urls, including http
and gcrypt.
* When core.sharedRepository is set, annex object files are not made mode
444, since that prevents a user other than the file owner from locking
them. Instead, a mode such as 664 is used in this case.
* tahoe: Include tahoe capabilities in whereis display.
* import: Changed to honor annex.largefiles settings.
* addurl, importfeed: Changed to honor annex.largefiles settings,
when the content of the url is downloaded. (Not when using --fast or
--relaxed.)
* webapp: Fix bugs that could result in a relative path such as "."
being written to ~/.config/git-annex/autostart, and ignore any such
relative paths in the file.
This was a reversion caused by the relative path changes in 5.20150113.
* dropunused: Make more robust when trying to drop an object that has
already been dropped.
* Fix reversion in handling of long filenames, particularly when using
addurl/importfeed, which was introduced in the previous release."""]]


@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="cantora@432fae6be728a32ac472387df86a8922f059d4a6"
nickname="cantora"
subject="How to view configuration of special remotes?"
date="2015-12-08T08:29:12Z"
content="""
I don't remember which gpg key my s3 remote is using, but I can't seem to get git annex to tell me about the configuration of my s3 remote, which has the gpg key ID that I need to find (I need to restore it from many backed up keys, but need to know which one). Is there a way to view the remote metadata? I was hoping to see a command like `git annex remoteinfo NAME`. (git annex version: `5.20140408-gb37d538`).
"""]]


@@ -0,0 +1,22 @@
[[!comment format=mdwn
username="ben"
subject="Problems initializing glacier remote"
date="2015-12-08T10:39:30Z"
content="""
Hi, when I try to create a glacier remote, the command freezes without further output:
$ git init
$ git annex init
$ git annex initremote glacier type=glacier keyid=xxxxxxxx
initremote glacier (encryption setup)
I can see the following processes in sleep state:
11438 pts/0 S+ 0:00 git --git-dir=/home/b/Documents/annex/.git --work-tree=/home/b/Documents/annex cat-file --batch
11440 pts/0 SL+ 0:00 gpg2 --batch --no-tty --use-agent --quiet --trust-model always --gen-random --armor 2 512
I'm on fedora 22, git-annex version: 5.20140717. Any suggestions appreciated, thanks!
"""]]


@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 8"""
date="2015-12-08T15:12:59Z"
content="""
@ben, it's generating the encryption key, and is blocked waiting on entropy.
You can pass --fast to use lower-quality randomness.
"""]]


@@ -1,5 +1,5 @@
Name: git-annex
Version: 5.20151116
Version: 5.20151208
Cabal-Version: >= 1.8
License: GPL-3
Maintainer: Joey Hess <id@joeyh.name>