Merge remote branch 'branchable/master'

Joey Hess 2011-02-22 14:45:46 -04:00
commit 53282f4f67
2 changed files with 40 additions and 4 deletions

@ -3,7 +3,7 @@ This is a rough sketch of a modification of git-annex to rely more on git commit
Summary
=========
Currently, the location tracking is only used for informational purposes unless a repository is [[trust]]ed, in which case there is no checking at all. It is proposed to use the location tracking information as a commitment to keep track of a file without any promise that it might not be dropped if another repository takes over responsibility.
Currently, [[location tracking]] is only used for informational purposes unless a repository is [[trust]]ed, in which case there is no checking at all. It is proposed to use the location tracking information as a commitment to keep track of a file until another repository takes over responsibility.
git's semantics for atomic commits are proposed to be used, which makes sure that before files are actually deleted, another repository has accepted the deletion.
@ -20,18 +20,18 @@ The new behavior would be to
* revert if that fails,
* otherwise really drop the files from the backend.
Unlike explicit checking, this never looks at the remote backend if the file is really present -- otoh, git-annex already relies on the files in the backend to not be touched by anyone but git-annex, and git-annex would only drop them if they were derefed and committed, in which case git would not accept the push. (git by itself would accept a merged push, but even if the reverting step failed due to a power outage or similar, git-annex would, before really deleting files from the backend, check again if the numcopies restraint is still met, and revert its own delete commit as the files are still present anyway.)
Unlike explicit checking, this never looks at the remote backend if the file is really present -- otoh, git-annex already relies on the files in the backend to not be touched by anyone but git-annex itself, and git-annex would only drop them if they were derefed and committed, in which case git would not accept the push. (git by itself would accept a merged push, but even if the reverting step failed due to a power outage or similar, git-annex would, before really deleting files from the backend, check again if the numcopies restraint is still met, and revert its own delete commit as the files are still present anyway.)
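The drop sequence described above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not git-annex code: `Repo`, `try_drop`, and the `push` callback are hypothetical names, the location log is modeled as a plain dictionary, and a push is reduced to a function that reports whether the remote accepted the commit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class Repo:
    """Toy stand-in for a repository; not git-annex's real data model."""
    name: str
    # Keys whose content is physically stored in this repository.
    backend: Set[str] = field(default_factory=set)
    # key -> names of repositories that have committed to keeping it.
    location_log: Dict[str, Set[str]] = field(default_factory=dict)


def try_drop(repo: Repo, key: str, numcopies: int,
             push: Callable[[Dict[str, Set[str]]], bool]) -> bool:
    """Drop `key` from `repo` only if the dereferencing commit is accepted."""
    holders = repo.location_log.get(key, set())
    if key not in repo.backend or repo.name not in holders:
        return False  # nothing to drop, or we never committed to keeping it

    # Commit the dereference locally: remove ourselves from the log.
    holders.discard(repo.name)

    # Push the commit; a rejected push means we must not delete anything.
    if not push(repo.location_log):
        holders.add(repo.name)  # revert the dereferencing commit
        return False

    # Check numcopies again right before deleting; revert if it is not met.
    if len(holders) < numcopies:
        holders.add(repo.name)
        return False

    # Really drop the content from the backend.
    repo.backend.discard(key)
    return True
```

For example, with a log of `{"k1": {"laptop", "server"}}` and `numcopies=1`, calling `try_drop` on the laptop with an accepting `push` removes the content, while a rejecting `push` leaves both the backend and the log unchanged.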
Implications for trust
==============
The proposed change also changes the semantics of trust. Trust can now be controlled in a finer-grained way between untrusted and semi-trusted, as best illustrated by a use case:
> Alice takes her netbook with her on a trip through Spain, and will fill most of its disk up with pictures she takes. As she expects to meet some old friends during the first days, she wants to take older pictures with her, which are safely backed up at home.
> Alice takes her netbook with her on a trip through Spain, and will fill most of its disk up with pictures she takes. As she expects to meet some old friends during the first days, she wants to take older pictures with her, which are safely backed up at home, so they can be deleted on demand.
>
> She tells her netbook's repository to dereference the old images (but not other parts of the repository she has not copied anywhere yet) and pushes to the server before leaving. When she adds pictures from her camera to the repository, git-annex can now free up space as needed.
Dereferencing could be implemented as `git annex drop --not-yet`, freeing space is similar to `dropunused`.
Dereferencing could be implemented as `git annex drop --no-rm` (or `move --no-rm`), freeing space is similar to `dropunused`.
A trusted repository with the new semantics would mean that the repository would not accept dropping anything, just as before.

@ -0,0 +1,36 @@
[[!comment format=mdwn
username="http://joey.kitenet.net/"
nickname="joey"
subject="comment 1"
date="2011-02-22T18:44:28Z"
content="""
I see the following problems with this scheme:
- Disallows removal of files when disconnected. It's currently safe to force that, as long as
git-annex tells you enough other repos are believed to have the file. Just as long as you
only force on one machine (say your laptop). With your scheme, if you drop a file while
disconnected, any other host could see that the counter is still at N, because your
laptop had the file last time it was online, and can decide to drop the file, and lose the last
version.
- pushing a changed counter commit to other repos is tricky, because they're not bare, and
the network topology to get the commit pulled into the other repo could vary.
- Issues merging counter files. If the counter file doesn't automerge, two repos dropping the same file will conflict. But if it does automerge, it breaks the counter conflict detection.
- Needing to revert commits is going to be annoying. An actual git revert
could probably not reliably be done. It'd need to construct a revert
and commit it as a new commit. And then try to push that to remotes, and
what if *that* push conflicts?
- I do like the pre-removal dropping somewhat as an alternative to
trust checking. I think that can be done with current git-annex though,
just remove the files from the location log, but keep them in-annex.
Dropping a file only looks at repos that the location log says have a
file; so other repos can have retained a copy of a file secretly like
this, and can safely remove it at any time. I'd need to look into this a bit more to be 100% sure it's safe, but have started [[todo/hidden_files]].
- I don't see any reduced round trips. It still has to contact N other
repos on drop. Now, rather than checking that they have a file, it needs
to push a change to them.
"""]]
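
The pre-removal idea from the comment above (remove a file from the location log but keep it in the annex) can be illustrated with a small toy model. This is a hedged sketch, not git-annex's implementation: `can_drop` and `dereference` are hypothetical helpers, and both the location log and the actual repository contents are modeled as dictionaries.

```python
from typing import Dict, Set


def can_drop(key: str, dropper: str, numcopies: int,
             location_log: Dict[str, Set[str]],
             actual_content: Dict[str, Set[str]]) -> bool:
    """Return True if `dropper` may drop `key`.

    Only repositories listed in the location log are consulted, so a
    repository that holds the content without being listed is never counted.
    """
    listed = location_log.get(key, set()) - {dropper}
    verified = {r for r in listed if key in actual_content.get(r, set())}
    return len(verified) >= numcopies


def dereference(key: str, repo: str,
                location_log: Dict[str, Set[str]]) -> None:
    """Remove `repo` from the log for `key`; its content is left untouched."""
    location_log.get(key, set()).discard(repo)
```

In this model, once the laptop has been dereferenced, no other repository counts its copy towards `numcopies`, which is why the laptop could later delete that copy without coordinating with anyone.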