file map analysis
parent d2e83db759
commit 3f63666727
1 changed files with 64 additions and 15 deletions
@@ -162,8 +162,8 @@ Data:
 be a message saying that the file's content is not currently available.
 An annex pointer file is checked into the git repository the same way
 that an annex symlink is checked in.
-* file2key maps are maintained by git-annex, to keep track of
-what files are pointers at keys.
+* A file map is maintained by git-annex, to keep track of the keys
+that are used by files in the working tree.

 Configuration:

@@ -206,16 +206,16 @@ git-annex clean:
 also drop that copy once the object gets uploaded to another repo ...
 But that gets complicated quickly.

-Update file2key map.
+Update file map.

 Output the pointer file content to stdout.

 git-annex smudge:

-* Run by eg `git checkout` and passed the filename, as well as fed
-the pointer file content on stdin.
+* Run by eg `git checkout`
+and passed the filename, as well as fed the pointer file content on stdin.

-Updates file2key map.
+Update file map.

 When an object is present in the annex, outputs its content to stdout.
 Otherwise, outputs the file pointer content.
@@ -242,16 +242,65 @@ git annex lock/unlock:
 itself to break such a hard link. Always finish by locking down the
 permissions of the annex object.

-All other git-annex commands that look at annex symlinks to get keys will
-need fall back to checking if a given work tree file is stored in git as
-pointer file. This can be done by checking the file2key map (or by looking
-it up in the index).
+#### file map

-Note that I have not verified if file2key maps can be maintained
-consistently using the smudge/clean filters. Seems likely to work,
-based on when I see smudge/clean filters being run. The file2key
-optimisation may not be needed though, looking at the index
-might be fast enough.
+The file map needs to map from `Key -> [File]`. `File -> Key`
+seems useful to have, but in practice is not worthwhile.
+
+Drop and get operations need to know what files in the work tree use a
+given key in order to update the work tree.
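
One way to picture the `Key -> [File]` map is a dictionary from each key to the set of work tree paths that use it. This is only a sketch: the class name `FileMap`, its methods, and the in-memory storage are hypothetical, and the text here does not say how git-annex actually stores the map.

```python
from collections import defaultdict


class FileMap:
    """Hypothetical in-memory Key -> [File] map (not git-annex's real storage)."""

    def __init__(self):
        self._files = defaultdict(set)

    def add(self, key, path):
        # Record that the work tree file `path` uses `key`.
        self._files[key].add(path)

    def remove(self, key, path):
        # Forget a stale association; drop the key once no file uses it.
        self._files[key].discard(path)
        if not self._files[key]:
            del self._files[key]

    def files_for(self, key):
        # The files that a drop or get of `key` would need to look at.
        return sorted(self._files.get(key, ()))
```

Storing a set per key makes it cheap to answer the question that drop and get ask (which files use this key?), which is the direction the text says is needed.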
+
+git-annex commands that look at annex symlinks to get keys to act on will
+need to fall back to either consulting the file map, or looking at the staged
+file to see if it's a pointer to a key. So a `File -> Key` map is a possible
+optimisation.
+
+Question: If the smudge/clean filters update the file map incrementally
+based on the pointer files they generate/see, will the result
+always be consistent with the content of the working tree?
+
+This depends on when git calls the smudge/clean filters and on what.
+In particular:
+
+* Does the clean filter always get called when adding a relevant
+file to git? Yes.
+* Is the clean filter called at any other time? Yes, for example
+git diff will clean relevant modified files to generate the diff.
+So, the clean filter may see file versions that have not yet been staged
+in git.
+* Is the clean filter ever passed content not in the work tree?
+I don't think so, but not 100% sure.
+* Is the smudge filter always called when git updates a relevant file
+in the work tree? Yes.
+* Is the smudge filter called at any other time? Seems unlikely but then
+there could be situations with a detached work tree or such.
+* Does git call any useful hooks when removing a file from the work tree,
+or converting it to not be annexed?
+No!
+
+From this analysis, any file map generated by the smudge/clean filters
+is necessarily potentially inaccurate. It may list deleted files.
+It may or may not reflect current unstaged changes from the work tree.
+
+It follows that any use of the file map needs to verify the info from it,
+and throw out bad cached info (updating the map to match reality).
+
+When downloading a key, check if the files listed in the file map are
+still pointer files in the work tree, and only replace them with the
+content if so.
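
The download-time check could look roughly like the sketch below. It assumes a pointer file holds exactly `/annex/objects/<key>` followed by a newline; that format, and the names `pointer_content` and `populate_on_get`, are assumptions made for illustration and are not taken from this text. Copying bytes is also a simplification of however the content would really be placed in the work tree.

```python
def pointer_content(key):
    # Assumed pointer file format, for illustration only.
    return ("/annex/objects/" + key + "\n").encode()


def populate_on_get(key, listed_files, object_path):
    """Replace files that are still pointers to `key` with the object content.

    Returns (populated, stale). Stale entries are files the map listed but
    that are deleted or no longer pointers to `key`; the caller would remove
    them from the file map.
    """
    expected = pointer_content(key)
    with open(object_path, "rb") as f:
        content = f.read()
    populated, stale = [], []
    for path in listed_files:
        try:
            with open(path, "rb") as f:
                # Read one byte more than a pointer so longer files differ.
                current = f.read(len(expected) + 1)
        except FileNotFoundError:
            stale.append(path)  # deleted since the map was last updated
            continue
        if current == expected:
            with open(path, "wb") as f:
                f.write(content)
            populated.append(path)
        else:
            stale.append(path)  # modified, or a pointer to some other key
    return populated, stale
```

Returning the stale paths rather than silently skipping them gives the caller what it needs to update the map to match reality.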
+
+When dropping a key, check if the files listed for it in the file map are
+unmodified in the work tree, and are staged as pointers to the key,
+and only reset them to the pointers if so. Note that this means that
+a modified work tree file that has not yet been staged, but that
+corresponds to a key, won't be reset when the key is dropped.
+This is probably not a big deal; the user will either add the
+file, which will add the key back, or reset the file.
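
The drop-time rule amounts to a filter over the files the map lists for a key. In this sketch the two checks are passed in as callbacks, because how to test "unmodified in the work tree" and "staged as a pointer to the key" is not specified here; `files_to_reset_on_drop`, `is_unmodified`, and `staged_key` are all hypothetical names.

```python
def files_to_reset_on_drop(key, listed_files, is_unmodified, staged_key):
    """Split the listed files into those safe to reset to pointers and the rest.

    is_unmodified(path, key) -> bool and staged_key(path) -> key or None are
    hypothetical callbacks standing in for real work tree and index checks.
    """
    reset, skipped = [], []
    for path in listed_files:
        if is_unmodified(path, key) and staged_key(path) == key:
            reset.append(path)
        else:
            # Modified, deleted, or staged as something else: leave it alone.
            skipped.append(path)
    return reset, skipped
```

A file that is modified but not yet staged lands in `skipped`, which matches the caveat in the text: it is left as is until the user adds or resets it.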
+
+Does the `File -> Key` map have any benefits given this inaccuracy?
+Answer seems to be no; any answer that map gives may be inaccurate and
+needs to be verified by looking at actual repo content, so might as well
+just look at the repo content in the first place.

 #### Upgrading
