Merge branch 'master' into smudge

Joey Hess 2015-12-10 14:07:11 -04:00
commit 3d936fdb59
11 changed files with 196 additions and 0 deletions


@ -37,6 +37,12 @@ buildFlags = filter (not . null)
#endif
#ifdef WITH_S3
, "S3"
#if MIN_VERSION_aws(0,10,6)
++ "(multipartupload)"
#endif
#if MIN_VERSION_aws(0,13,0)
++ "(storageclasses)"
#endif
#else
#warning Building without S3.
#endif

6
debian/changelog vendored

@ -12,6 +12,12 @@ git-annex (6.20151225) unstable; urgency=medium
-- Joey Hess <id@joeyh.name> Tue, 08 Dec 2015 11:14:03 -0400
git-annex (5.20151209) UNRELEASED; urgency=medium
* Add S3 features to git-annex version output.
-- Joey Hess <id@joeyh.name> Thu, 10 Dec 2015 11:39:34 -0400
git-annex (5.20151208) unstable; urgency=medium
* Build with -j1 again to get reproducible build.


@ -0,0 +1,57 @@
### Please describe the problem.
I have a README file in my repository, which is an ordinary text file added with `git add` (not `git annex add`). This seems fine with all the Linux machines, including ones running the assistant. However, when I start the assistant on Android, it converts it to an annexed file. I don't have any other direct mode repositories to check if it's a direct mode problem or an Android problem.
My setup is basically a star topology with a Debian GNU/Linux (jessie) box in the middle. All the clients are Debian as well, mostly testing, except for an Android tablet.
### What steps will reproduce the problem?
On a Linux box:
    git add README
    git commit -m 'README'
    git annex sync --content
then on Android, start git-annex and let the assistant sync.
You'll get a commit like this:
    $ git show 4f1c76374c75a11702c14ea6a5dbe82c99c6dd08
    commit 4f1c76374c75a11702c14ea6a5dbe82c99c6dd08
    Author: android <git-annex@android>
    Date: Wed Dec 9 15:49:01 2015 -0500
    git-annex in Smoot /sdcard/Westerley-Board
    diff --git a/Contracts/Archive/README b/Contracts/Archive/README
    deleted file mode 100644
    index 8fe1349..0000000
    --- a/Contracts/Archive/README
    +++ /dev/null
    @@ -1,3 +0,0 @@
    -These are old, no longer active contracts. The year is the year of
    -archival (typically when the contract ended, or the last year covered by
    -the contract).
    diff --git a/Contracts/Archive/README b/Contracts/Archive/README
    new file mode 120000
    index 0000000..38ba43f
    --- /dev/null
    +++ b/Contracts/Archive/README
    @@ -0,0 +1 @@
    +../../.git/annex/objects/0v/9K/SHA256E-s155--d0e49ec7e493366a5afea5bc12629ba579fd8407162795c22a6346c25bafbb6e/SHA256E-s155--d0e49ec7e493366a5afea5bc12629ba579fd8407162795c22a6346c25bafbb6e
    \ No newline at end of file
### What version of git-annex are you using? On what operating system?
The Android version is fairly recent; unfortunately the battery is currently
dead, making it hard to check :-( It must be at least 5.20151019, but is
probably 20151116.
### Please provide any additional information below.
If needed, I'll grab the assistant log from the tablet once the battery
is charged.
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes! I've been using git-annex quite a bit over the past year, for everything from my music collection to my personal files. Using it for a not-for-profit too. Even trying to get some Mac and Windows users to use it for our HOA's files. I'm looking forward to smudge mode to make direct mode work better.


@ -0,0 +1,16 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2015-12-10T15:23:12Z"
content="""
I've added something to `git annex version` about this today:
"S3(storageclasses)"
The prebuilt linux builds currently use package versions from Debian
unstable, so we have to wait for the aws package to get updated there.
(I could hand-hack it, but that would make the build break later.)
The Windows and OSX builds have already been updated; the latter only just
now, so it'll be in the next daily build. Updating the Android build
environment is a massive pain and I try to only do that bi-annually.
"""]]


@ -0,0 +1,13 @@
[[!comment format=mdwn
username="https://openid.stackexchange.com/user/27ceb3c5-0762-42b8-8f8a-ed21c284748f"
nickname="g"
subject="The downside"
date="2015-12-10T03:45:09Z"
content="""
If I'm understanding correctly, that one downside (requiring all checkouts to have all files be direct if any filesystems require it) seems to be a fairly major limitation, no? Changing the concept of locked/unlocked files from being a local, per-repo concern to a global one seems like quite a major change.
For instance, it would mean that any public repo using git annex for distributing a set of data files would either have to have all files be unlocked, or else no one would be able to clone onto a FAT32-formatted external hdd?
FWIW, the particular use case I'm concerned about personally is having my annexes on my android device.
"""]]


@ -0,0 +1,30 @@
[[!comment format=mdwn
username="joey"
subject="""comment 5"""
date="2015-12-10T15:00:52Z"
content="""
I'm concerned about that too. But it may be possible to finesse it:
when git-annex is running on a crippled filesystem, it may be able to
unlock all files as it gets content for them, producing a local fork.
The first difficulty would be avoiding or autoresolving conflicts
between locked and unlocked when merging changes into that fork. I think
this is very tractable; such a conflict comes down mostly to the symlink
bit in the tree object.
The real difficulty would be that any pushes from that fork would include
its change converting all files to unlocked. Although it's fairly mechanical
to convert such a commit into one that doesn't unlock files, so perhaps
that could be automated somehow on push or merge.
There's also a small and probably easy to implement git change that
would avoid all this complexity: If git's smudge filters were optionally
able to run on the link-text of symlinks, then a file could be unlocked
locally without changing what's in the repo and all the smudge stuff
would still work on it.
Crippled filesystems aside, I think there's value in being able to unlock
files across clones of a repo. For example, a repo could have a workflow
where the files for the current episode/experiment/whatever start out
unlocked and are locked once it's complete.
"""]]

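The locked/unlocked conflict that comment 5 calls tractable could be autoresolved roughly as follows. This is only a sketch under stated assumptions: `Mode`, `Entry`, `Resolution`, and `resolveLockConflict` are hypothetical names, not git-annex's own types, and it assumes that a locked file appears in the tree as a symlink (mode 120000) and an unlocked one as a regular file (mode 100644), with both referring to the same annex key.

```haskell
-- Hypothetical types for illustration; these are not git-annex's own.

-- | How a path is stored in a git tree object.
data Mode
    = Symlink      -- tree mode 120000: assumed to be a locked annexed file
    | RegularFile  -- tree mode 100644: assumed to be an unlocked file
    deriving (Eq, Show)

-- | One side of a merge for a single path.
data Entry = Entry
    { entryMode :: Mode
    , entryKey  :: String  -- annex key the entry refers to
    } deriving (Eq, Show)

data Resolution
    = Resolved Entry  -- safe to pick this entry automatically
    | RealConflict    -- the two sides refer to different content
    deriving (Eq, Show)

-- | If both sides refer to the same key, any difference is only the
-- locked/unlocked representation, so keep the local side's choice.
resolveLockConflict :: Entry -> Entry -> Resolution
resolveLockConflict ours theirs
    | entryKey ours == entryKey theirs = Resolved ours
    | otherwise = RealConflict
```

Keeping the local side is what would let a fork on a crippled filesystem stay unlocked while still merging upstream changes; it does nothing about the harder problem of pushes from that fork.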

@ -0,0 +1,26 @@
Well, another day working on smudge filters, or unlocked files as the
feature will be known when it's ready. Got both `git annex get` and `git
annex drop` working for these files today.
Get was the easy part; it just has to hard link or copy the object to the
work tree file(s) that point to it.
Handling dropping was hard. If the user drops a file, but it's unlocked and
modified, it shouldn't reset it to the pointer file. For this, I reused the
InodeCache stuff that was built for direct mode. So the sqlite database
tracks the InodeCaches of unlocked files, and when a key is dropped it can
check if the file is modified.
But that's not a complete solution, because when git uses a clean filter,
it will write the file itself, and git-annex won't have an InodeCache for
it. To handle this case, git-annex will fall back to verifying the content
of the file when dropping it if its InodeCache isn't known.
Bit of a shame to need an expensive checksum to drop an unlocked file;
maybe the git clean filter interface will eventually be improved to let
git-annex use it more efficiently.
Anyway, smudged aka unlocked files are working now well enough to be a
proof of concept. I have several missing safety checks that need to be
added to get the implementation to be really correct, and quite a lot
of polishing still to do, including making `unlock`, `lock`, `fsck`,
and `merge` handle them, and finishing repository upgrade code.
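The drop decision described above might look something like this. It is a minimal sketch, not the actual implementation: the `InodeCache` record here is a simplified stand-in with assumed fields, and `DropAction` and `dropAction` are hypothetical names. The content check is passed in as a lazy `Bool` so that the expensive checksum is only forced when no InodeCache is known.

```haskell
-- Simplified, hypothetical stand-in for the InodeCache git-annex records.
data InodeCache = InodeCache
    { inode :: Integer
    , size  :: Integer
    , mtime :: Integer
    } deriving (Eq, Show)

data DropAction
    = ResetToPointer  -- file is unmodified; replace it with the pointer file
    | LeaveModified   -- file was changed by the user; leave it alone
    deriving (Eq, Show)

-- | Decide what to do with an unlocked work tree file when its key is dropped.
dropAction
    :: Maybe InodeCache  -- InodeCache recorded in the database, if any
    -> InodeCache        -- InodeCache of the file as it is now
    -> Bool              -- does the content still match the key? (lazy, expensive)
    -> DropAction
dropAction (Just recorded) current _
    | recorded == current = ResetToPointer
    | otherwise = LeaveModified
-- No InodeCache known (e.g. git wrote the file itself): fall back to
-- verifying the content.
dropAction Nothing _ contentMatches
    | contentMatches = ResetToPointer
    | otherwise = LeaveModified
```

In real code the checksum would be an IO action rather than a pure value; the pure version just keeps the branching easy to see.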


@ -0,0 +1,9 @@
[[!comment format=mdwn
username="joey"
subject="""comment 21"""
date="2015-12-10T15:20:43Z"
content="""
@cantora with a recent enough version of git-annex, `git annex info
$theremotename` will show quite a lot of information about a special
remote, including encryption details.
"""]]


@ -0,0 +1,7 @@
[[!comment format=mdwn
username="https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4"
subject="Has anyone seen or worked on a backend for the watchdox service? (not a free one, but needed :-/)"
date="2015-12-08T19:45:02Z"
content="""
subject
"""]]


@ -0,0 +1,7 @@
[[!comment format=mdwn
username="openmedi"
subject="comment 31"
date="2015-12-09T20:18:48Z"
content="""
How does git-annex handle space issues with special remotes? For example my Owncloud instance has 100 GB space. What happens if I run out of space on that remote? Does git-annex handle that gracefully? Do I have to do something? Can I set a sort of \"quota\"?
"""]]


@ -0,0 +1,19 @@
[[!comment format=mdwn
username="joey"
subject="""comment 32"""
date="2015-12-10T15:15:42Z"
content="""
@openmedi git-annex doesn't currently keep track of how much space it's
using on a special remote. It's actually quite a difficult problem to do
that in general, since multiple distributed clones of a repository can be
uploading to the same special remote at the same time.
If it runs out of space and transfers fail, git-annex will handle the
failures semi-gracefully, which is to say nothing will stop it from trying
again or trying to send other data, but it will certainly be aware that
files are not reaching the special remote.
If a particular storage service has a way to check free space, it would not
be hard to make git-annex's special remote implementation check it and
avoid trying transfers when it's full.
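
A free space check of that kind could be sketched as below. Everything here is hypothetical: `TransferDecision` and `decideTransfer` are made-up names, and nothing like this exists in git-annex as described above. The free space is a `Maybe` because only some storage services can report it.

```haskell
-- Hypothetical sketch; not part of git-annex.
data TransferDecision
    = Attempt         -- go ahead and try the upload
    | SkipRemoteFull  -- the remote reports too little free space
    deriving (Eq, Show)

-- | Decide whether to try sending a key of the given size (in bytes).
decideTransfer
    :: Maybe Integer  -- free bytes reported by the service, if it can say
    -> Integer        -- size of the key to send
    -> TransferDecision
decideTransfer Nothing _ = Attempt  -- no way to check, so just try
decideTransfer (Just free) keySize
    | keySize <= free = Attempt
    | otherwise = SkipRemoteFull
```

Since several clones can upload to the same remote at once, such a check can only be advisory; a transfer that passes it may still fail.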
"""]]