git-annex/doc/internals/pointer_file.mdwn
Joey Hess 67245ae00f
fully specify the pointer file format
This format is designed to detect accidental appends, while having some
room for future expansion.

Detect when an unlocked file whose content is not present has gotten some
other content appended to it, and avoid treating it as a pointer file, so
that appended content will not be checked into git, but will be annexed
like any other file.

Dropped the max size of a pointer file down to 32kb. It was around 80 kb,
but without any good reason, and certainly there are no valid pointer files
anywhere that are larger than 8kb, because what a pointer file with
additional data even looks like has only just been specified.

I assume 32kb will be good enough for anyone. ;-) Really though, it needs
to be some smallish number, because that much of a file in git gets read
into memory when eg, catting pointer files. And since we have no use cases
for the extra lines of a pointer file yet, except possibly to add
some human-visible explanation that it is a git-annex pointer file, 32k
seems as reasonable an arbitrary number as anything. Increasing it would be
possible, eg to 64k, as long as users of such jumbo pointer files didn't
mind upgrading all their git-annex installations to one that supports the
new larger size.

Sponsored-by: Dartmouth College's Datalad project
2022-02-23 14:20:31 -04:00

A pointer file is one of two ways that an annex object can be checked into
git. The other is a symbolic link pointing to a file in the
.git/annex/objects/ directory.

A pointer file starts with "/annex/objects/", which is followed
by the key (see [[key_format]]). (In some situations a pointer file
might instead contain the content of a symlink target.)

Pointer files usually have a newline after the key. This is not required.
A carriage return followed by a newline is also accepted, as is end of file.

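
The first-line rules above can be sketched in Python roughly as follows. This is an illustration only, not git-annex's own code: the function name `first_line_key` and the key used in the usage note are made up, the CR is only stripped when a newline follows it (an assumption about how strictly to read the rule), and the symlink-target case mentioned earlier is not handled.

```python
POINTER_PREFIX = b"/annex/objects/"


def first_line_key(data: bytes):
    """Return the key bytes from the first line of a would-be pointer
    file, or None if the first line does not look like a pointer.

    Hypothetical helper for illustration; not part of git-annex.
    """
    if not data.startswith(POINTER_PREFIX):
        return None
    # The first line ends at the first newline, or at end of file
    # when there is no newline at all.
    first_line, newline, _rest = data.partition(b"\n")
    # A carriage return followed by a newline is also accepted.
    # Assumption: a bare trailing CR with no newline is left in place.
    if newline and first_line.endswith(b"\r"):
        first_line = first_line[:-1]
    key = first_line[len(POINTER_PREFIX):]
    # An empty key is not a usable pointer.
    return key or None
```

With a made-up key such as `SHA256E-s11--abc.txt`, the same key comes back whether the line ends in a newline, in a carriage return plus newline, or at end of file.
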
After that, there is usually nothing more in a pointer file, but git-annex
does support pointer files with additional text on subsequent lines.

Every such subsequent line has to contain "/annex/" somewhere in it,
and end in a newline. Otherwise it is not considered to be a valid pointer file.

The maximum size of a pointer file is 32 kb. If it is any longer, it is not
considered to be a valid pointer file.

The possibility exists that a pointer file is in a working tree,
representing an annex object that is not present, and something appends
data onto it accidentally. The limitations that each line of a valid
pointer file contains "/annex/" and that it cannot be larger than 32kb
let such a situation be detected.
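
As a rough sketch of how those two limitations catch an accidental append, here is a small Python check. It is not git-annex's implementation: the name `is_valid_pointer` is hypothetical, and it assumes "32 kb" means 32 * 1024 bytes.

```python
# Assumption: "32 kb" is taken to mean 32 * 1024 bytes.
MAX_POINTER_SIZE = 32 * 1024


def is_valid_pointer(data: bytes) -> bool:
    """Check the size limit and the per-line "/annex/" rule.

    Hypothetical helper for illustration; not part of git-annex.
    """
    if len(data) > MAX_POINTER_SIZE:
        return False
    if not data.startswith(b"/annex/objects/"):
        return False
    # Split off the first line; it may end in a newline or at end of file.
    _first, _newline, rest = data.partition(b"\n")
    if not rest:
        # Nothing after the key line, which is the usual case.
        return True
    # Every subsequent line has to end in a newline ...
    if not rest.endswith(b"\n"):
        return False
    # ... and contain "/annex/" somewhere in it.
    return all(b"/annex/" in line for line in rest[:-1].split(b"\n"))
```

For example, with a made-up key, `b"/annex/objects/SHA256E-s11--abc.txt\n"` passes, and it still passes with an extra line such as `b"git-annex pointer file, see /annex/\n"`. Appending `b"some log output\n"` makes the check fail, so the file would be treated as ordinary content rather than as a pointer.
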