fully specify the pointer file format
This format is designed to detect accidental appends, while having some room for future expansion.

Detect when an unlocked file whose content is not present has gotten some other content appended to it, and avoid treating it as a pointer file, so that the appended content will not be checked into git, but will be annexed like any other file.

Dropped the max size of a pointer file down to 32kb. It was around 80kb, but without any good reason, and certainly there are no valid pointer files anywhere that are larger than 8kb, because it has only just been specified what a pointer file with additional data even looks like.

I assume 32kb will be good enough for anyone. ;-) Really though, it needs to be some smallish number, because that much of a file in git gets read into memory when, e.g., catting pointer files. And since we have no use cases for the extra lines of a pointer file yet, except possibly to add some human-visible explanation that it is a git-annex pointer file, 32k seems as reasonable an arbitrary number as anything. Increasing it would be possible, e.g. to 64k, as long as users of such jumbo pointer files didn't mind upgrading all their git-annex installations to one that supports the new larger size.

Sponsored-by: Dartmouth College's Datalad project
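As a rough illustration of the idea in this message, here is a minimal Python sketch of an append-detecting pointer check. It is not git-annex's implementation (git-annex is written in Haskell). The names `looks_like_pointer` and `is_pointer_file` are hypothetical, and the rule applied to extra lines (each must start with `/annex/`) is an assumption made only for this example. The authoritative rules are the ones in internals/pointer_file. Only the `/annex/objects/` prefix and the 32kb cap are taken from this commit.

```python
POINTER_PREFIX = b"/annex/objects/"
MAX_POINTER_SIZE = 32 * 1024  # the 32kb cap described in the commit message


def looks_like_pointer(data: bytes) -> bool:
    """Hypothetical check: does `data` look like an unmodified pointer file?"""
    # Anything over the cap is never treated as a pointer file.
    if len(data) > MAX_POINTER_SIZE:
        return False
    lines = data.split(b"\n")
    first, rest = lines[0], lines[1:]
    # The first line must be the prefix followed by a non-empty key.
    if not first.startswith(POINTER_PREFIX) or len(first) == len(POINTER_PREFIX):
        return False
    # ASSUMPTION (illustrative only): every later non-empty line must also
    # start with "/annex/", so arbitrary appended data is rejected.
    return all(line == b"" or line.startswith(b"/annex/") for line in rest)


def is_pointer_file(path: str) -> bool:
    """Read at most one byte more than the cap, so memory use stays bounded."""
    with open(path, "rb") as f:
        data = f.read(MAX_POINTER_SIZE + 1)
    return looks_like_pointer(data)
```

Reading `MAX_POINTER_SIZE + 1` bytes is enough to tell an oversized file apart from one that fits, which is why a small cap keeps the check cheap.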
parent 649464619e
commit 67245ae00f
5 changed files with 113 additions and 9 deletions
@@ -0,0 +1,30 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2022-02-23T16:45:24Z"
content="""
I've now specified a format in [[internals/pointer_file]], which is
designed to allow detecting accidental appends.

And git-annex will now treat a pointer file that has been appended to as
not a pointer file any longer.

So, for example:

    joey@darkstar:/tmp/r>echo oops >> foo
    joey@darkstar:/tmp/r>cat foo
    /annex/objects/SHA256E-s14169--bdcf6188db530bc3af79c898208ce2a56df6197f59b3872b03613a248ac8faf4
    oops
    joey@darkstar:/tmp/r>git add foo
    joey@darkstar:/tmp/r>git diff --cached foo | tail -n 2
    -/annex/objects/SHA256E-s14169--bdcf6188db530bc3af79c898208ce2a56df6197f59b3872b03613a248ac8faf4
    +/annex/objects/SHA256E-s101--b7da3d6b0ad2f6a2a263e783e59efb60f2520f03bb36cea35a556a684b0d5c9d

Since the file is not a valid pointer file after being appended to,
git add does what it would do with any file, in this case adding the
content to the annex.

So at least it keeps the possibly large appended content out of git now.
I think that's the most important thing. Detecting and warning about
pointer files that are not valid due to appends should be easy from here.
"""]]
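The `s101` field in the new key is consistent with the transcript in the comment: the old pointer line is 96 bytes including its newline, and `echo oops` appends 5 more. The short Python check below does that arithmetic. It assumes the pointer file ended with a single newline before the append. `key_size` is a hypothetical helper written for this example, not part of git-annex.

```python
# The pointer file as it was before the append (one line plus newline).
pointer_line = (
    b"/annex/objects/SHA256E-s14169--"
    b"bdcf6188db530bc3af79c898208ce2a56df6197f59b3872b03613a248ac8faf4\n"
)
# What `echo oops >> foo` leaves on disk.
appended = pointer_line + b"oops\n"


def key_size(key: str) -> int:
    """Hypothetical helper: pull the size out of a key's -s<size> field."""
    # Fields before the "--" separator are BACKEND, then optional fields
    # such as s<size>. Skip the backend name and look for the size field.
    fields = key.split("--", 1)[0].split("-")
    for field in fields[1:]:
        if field.startswith("s") and field[1:].isdigit():
            return int(field[1:])
    raise ValueError("key has no size field")


new_key = "SHA256E-s101--b7da3d6b0ad2f6a2a263e783e59efb60f2520f03bb36cea35a556a684b0d5c9d"
print(len(pointer_line), len(appended), key_size(new_key))  # 96 101 101
```

So the key that `git add` staged describes the whole 101-byte file, old pointer text included, which is what one would expect when the file is annexed as ordinary content.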