file map analysis
parent
d2e83db759
commit
3f63666727
1 changed files with 64 additions and 15 deletions
|
@ -162,8 +162,8 @@ Data:
|
||||||
be a message saying that the file's content is not currently available.
|
be a message saying that the file's content is not currently available.
|
||||||
An annex pointer file is checked into the git repository the same way
|
An annex pointer file is checked into the git repository the same way
|
||||||
that an annex symlink is checked in.
|
that an annex symlink is checked in.
|
||||||
* file2key maps are maintained by git-annex, to keep track of
|
* A file map is maintained by git-annex, to keep track of the keys
|
||||||
what files are pointers at keys.
|
that are used by files in the working tree.
|
||||||
|
|
||||||
Configuration:
|
Configuration:
|
||||||
|
|
||||||
|
@ -206,16 +206,16 @@ git-annex clean:
|
||||||
also drop that copy once the object gets uploaded to another repo ...
|
also drop that copy once the object gets uploaded to another repo ...
|
||||||
But that gets complicated quickly.
|
But that gets complicated quickly.
|
||||||
|
|
||||||
Update file2key map.
|
Update file map.
|
||||||
|
|
||||||
Output the pointer file content to stdout.
|
Output the pointer file content to stdout.
|
||||||
|
|
||||||
git-annex smudge:
|
git-annex smudge:
|
||||||
|
|
||||||
* Run by eg `git checkout` and passed the filename, as well as fed
|
* Run by eg `git checkout`
|
||||||
the pointer file content on stdin.
|
and passed the filename, as well as fed the pointer file content on stdin.
|
||||||
|
|
||||||
Updates file2key map.
|
Update file map.
|
||||||
|
|
||||||
When an object is present in the annex, outputs its content to stdout.
|
When an object is present in the annex, outputs its content to stdout.
|
||||||
Otherwise, outputs the file pointer content.
|
Otherwise, outputs the file pointer content.
|
||||||
|
@ -242,16 +242,65 @@ git annex lock/unlock:
|
||||||
itself to break such a hard link. Always finish by locking down the
|
itself to break such a hard link. Always finish by locking down the
|
||||||
permissions of the annex object.
|
permissions of the annex object.
|
||||||
|
|
||||||
All other git-annex commands that look at annex symlinks to get keys will
|
#### file map
|
||||||
need fall back to checking if a given work tree file is stored in git as
|
|
||||||
pointer file. This can be done by checking the file2key map (or by looking
|
|
||||||
it up in the index).
|
|
||||||
|
|
||||||
Note that I have not verified if file2key maps can be maintained
|
The file map needs to map from `Key -> [File]`. `File -> Key`
|
||||||
consistently using the smudge/clean filters. Seems likely to work,
|
seems useful to have, but in practice is not worthwhile.
|
||||||
based on when I see smudge/clean filters being run. The file2key
|
|
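A minimal sketch of what a `Key -> [File]` map could look like, to make the shape concrete. The `FileMap` class, its method names and the in-memory storage are all assumptions made for illustration; nothing here says how git-annex would actually store the map.

```python
from collections import defaultdict


class FileMap:
    """Hypothetical in-memory Key -> [File] map (illustration only)."""

    def __init__(self):
        # key -> set of work tree paths believed to use that key
        self._files = defaultdict(set)

    def add(self, key, path):
        """Record that path uses key, e.g. when a filter sees a pointer file."""
        self._files[key].add(path)

    def remove(self, key, path):
        """Forget that path uses key; drop the key once no files are left."""
        files = self._files.get(key)
        if files is None:
            return
        files.discard(path)
        if not files:
            del self._files[key]

    def files_for(self, key):
        """Return the paths listed for key, sorted for stable output."""
        return sorted(self._files.get(key, ()))
```

Drop and get would call something like `files_for(key)` to find the work tree files to update. There is no lookup by file in this sketch, matching the view that `File -> Key` is not worthwhile.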
||||||
optimisation may not be needed though, looking at the index
|
Drop and get operations need to know what files in the work tree use a
|
||||||
might be fast enough.
|
given key in order to update the work tree.
|
||||||
|
|
||||||
|
git-annex commands that look at annex symlinks to get keys to act on will
|
||||||
|
need to fall back to either consulting the file map, or looking at the staged
|
||||||
|
file to see if it's a pointer to a key. So a `File -> Key` map is a possible
|
||||||
|
optimisation.
|
||||||
|
|
||||||
|
Question: If the smudge/clean filters update the file map incrementally
|
||||||
|
based on the pointer files they generate/see, will the result
|
||||||
|
always be consistent with the content of the working tree?
|
||||||
|
|
||||||
|
This depends on when git calls the smudge/clean filters and on what.
|
||||||
|
In particular:
|
||||||
|
|
||||||
|
* Does the clean filter always get called when adding a relevant
|
||||||
|
file to git? Yes.
|
||||||
|
* Is the clean filter called at any other time? Yes, for example
|
||||||
|
git diff will clean relevant modified files to generate the diff.
|
||||||
|
So, the clean filter may see file versions that have not yet been staged
|
||||||
|
in git.
|
||||||
|
* Is the clean filter ever passed content not in the work tree?
|
||||||
|
I don't think so, but not 100% sure.
|
||||||
|
* Is the smudge filter always called when git updates a relevant file
|
||||||
|
in the work tree? Yes.
|
||||||
|
* Is the smudge filter called at any other time? Seems unlikely but then
|
||||||
|
there could be situations with a detached work tree or such.
|
||||||
|
* Does git call any useful hooks when removing a file from the work tree,
|
||||||
|
or converting it to not be annexed?
|
||||||
|
No!
|
||||||
|
|
||||||
|
From this analysis, any file map generated by the smudge/clean filters
|
||||||
|
is necessarily potentially inaccurate. It may list deleted files.
|
||||||
|
It may or may not reflect current unstaged changes from the work tree.
|
||||||
|
|
||||||
|
It follows that any use of the file map needs to verify the info from it,
|
||||||
|
and throw out bad cached info (updating the map to match reality).
|
||||||
|
|
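The verify-and-prune idea can be sketched roughly as follows. This assumes the map is a plain dict from key to a set of paths, and that the caller supplies a `still_valid(path, key)` check against the real repo content; both the function name and that callback are hypothetical.

```python
def verified_files(file_map, key, still_valid):
    """Return the listed files for key that pass still_valid.

    Entries that fail the check are treated as bad cached info and
    removed, so the map is updated to match reality as a side effect.
    """
    listed = file_map.get(key, set())
    good = {path for path in listed if still_valid(path, key)}
    if good:
        file_map[key] = good
    else:
        # Nothing left for this key; drop the entry entirely.
        file_map.pop(key, None)
    return sorted(good)
```

The map is only ever used as a hint here: the answer comes from `still_valid`, and the map just narrows down which files need to be looked at.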
||||||
|
When downloading a key, check if the files listed in the file map are
|
||||||
|
still pointer files in the work tree, and only replace them with the
|
||||||
|
content if so.
|
||||||
|
|
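A rough sketch of the download case, assuming a file counts as "still a pointer file" when its bytes exactly equal the expected pointer text. The function name is made up, and the pointer text is passed in by the caller because the real pointer format is not assumed here.

```python
def populate_pointers(paths, pointer_text, content):
    """Replace each listed file with content, but only if it is still a pointer.

    paths: files the file map lists for the key (may be stale).
    pointer_text: bytes the pointer file for this key is expected to hold.
    Returns the paths that were actually replaced.
    """
    replaced = []
    for path in paths:
        try:
            with open(path, "rb") as f:
                current = f.read()
        except FileNotFoundError:
            # Deleted from the work tree since the map was last updated.
            continue
        if current == pointer_text:
            with open(path, "wb") as f:
                f.write(content)
            replaced.append(path)
        # Anything else has been modified or is no longer annexed: leave it.
    return replaced
```

Files that were deleted or edited since the map was updated are skipped, so a stale map entry cannot clobber user changes. This sketch does not deal with races against concurrent writers, or with permissions and hard links.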
||||||
|
When dropping a key, check if the files listed for it in the file map are
|
||||||
|
unmodified in the work tree, and are staged as pointers to the key,
|
||||||
|
and only reset them to the pointers if so. Note that this means that
|
||||||
|
a modified work tree file that has not yet been staged, but that
|
||||||
|
corresponds to a key, won't be reset when the key is dropped.
|
||||||
|
This is probably not a big deal; the user will either add the
|
||||||
|
file, which will add the key back, or reset the file.
|
||||||
|
|
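The drop rule could be sketched like this, assuming the caller has already gathered, for each work tree file, whether it is modified and which key (if any) its staged pointer refers to. `FileState` and `files_to_reset` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """Hypothetical snapshot of one work tree file, gathered by the caller."""
    modified_in_work_tree: bool   # differs from what is staged
    staged_key: Optional[str]     # key the staged pointer refers to, if any


def files_to_reset(listed, states, key):
    """Pick the listed files that may safely be reset to pointers.

    A file qualifies only if it still exists (has a state), is unmodified
    in the work tree, and is staged as a pointer to this key.
    """
    result = []
    for path in listed:
        state = states.get(path)
        if state is None:
            continue  # listed in the map but no longer in the work tree
        if state.modified_in_work_tree:
            continue  # unstaged changes: leave the file alone
        if state.staged_key != key:
            continue  # staged as something else
        result.append(path)
    return result
```

A modified but unstaged file is skipped even when its content corresponds to the key, which is the case noted above as probably not a big deal.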
||||||
|
Does the `File -> Key` map have any benefits given this inaccuracy?
|
||||||
|
Answer seems to be no; any answer that map gives may be inaccurate and
|
||||||
|
needs to be verified by looking at actual repo content, so might as well
|
||||||
|
just look at the repo content in the first place.
|
||||||
|
|
||||||
#### Upgrading
|
#### Upgrading
|
||||||
|
|
||||||
|
|