git-annex/Annex
Joey Hess 67245ae00f
fully specify the pointer file format
This format is designed to detect accidental appends, while having some
room for future expansion.

Detect when an unlocked file whose content is not present has gotten some
other content appended to it, and avoid treating it as a pointer file, so
that appended content will not be checked into git, but will be annexed
like any other file.
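
The three outcomes described above (a valid pointer file, a pointer file that has had something appended, and ordinary content) can be sketched roughly as follows. This is only an illustration and not the code in Annex/Link.hs: the names `classify`, `Classification`, `objectMarker` and `extraLineMarker` are hypothetical, the key in `main` is made up, and the rule that every additional line must contain `/annex/` is an assumption made for the sake of the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Char8 as B

-- | Outcome of inspecting the content of a candidate pointer file.
data Classification
  = Pointer B.ByteString -- ^ looks like a pointer file; carries its first line
  | AppendedTo           -- ^ starts like a pointer file, but has unexpected extra data
  | NotPointer           -- ^ ordinary content
  deriving (Eq, Show)

-- Assumption: "32kb" is taken to mean 32 * 1024 bytes.
maxPointerSize :: Int
maxPointerSize = 32 * 1024

-- Assumption: the first line refers to an object under this prefix.
objectMarker :: B.ByteString
objectMarker = "/annex/objects/"

-- Assumption (illustrative only): each additional line must contain this marker.
extraLineMarker :: B.ByteString
extraLineMarker = "/annex/"

classify :: B.ByteString -> Classification
classify content
  | B.length content > maxPointerSize = NotPointer
  | not (objectMarker `B.isInfixOf` firstLine) = NotPointer
  | all validExtra extraLines = Pointer firstLine
  | otherwise = AppendedTo
  where
    -- Split at the first newline; 'rest' still starts with that newline.
    (firstLine, rest) = B.break (== '\n') content
    extraLines = B.lines (B.drop 1 rest)
    validExtra l = extraLineMarker `B.isInfixOf` l

main :: IO ()
main = do
  print (classify "/annex/objects/SHA256E-s30--abc\n")
  print (classify "/annex/objects/SHA256E-s30--abc\nappended data\n")
  print (classify "just some ordinary file content\n")
```

With a classification like this, a caller would annex anything that is `AppendedTo` or `NotPointer` as regular content, and only treat `Pointer` results as pointer files.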

Dropped the max size of a pointer file down to 32kb. It was around 80kb,
but without any good reason, and certainly there are no valid pointer files
anywhere that are larger than 8kb, because what a pointer file with
additional data even looks like has only just been specified.

I assume 32kb will be good enough for anyone. ;-) Really though, it needs
to be some smallish number, because that much of a file in git gets read
into memory when, e.g., catting pointer files. And since we have no use cases
for the extra lines of a pointer file yet, except possibly to add
some human-visible explanation that it is a git-annex pointer file, 32kb
seems as reasonable an arbitrary number as any. Increasing it would be
possible, e.g. to 64kb, as long as users of such jumbo pointer files didn't
mind upgrading all their git-annex installations to a version that supports
the new larger size.
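
The memory concern can be illustrated with a bounded read: never pull more than the cap (plus one byte, to notice oversized files) into memory. This is a minimal sketch under stated assumptions, not how git-annex reads pointer files. `readCandidate` is a hypothetical helper that reads from a working-tree path rather than from git, and 32kb is again assumed to mean 32 * 1024 bytes.

```haskell
import qualified Data.ByteString as BS
import System.Environment (getArgs)
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Assumption: "32kb" is taken to mean 32 * 1024 bytes.
maxPointerSize :: Int
maxPointerSize = 32 * 1024

-- | Read at most maxPointerSize + 1 bytes of a file.
-- Returns Nothing when the file is larger than the cap, so it cannot be a
-- pointer file; otherwise returns the whole (small) content.
readCandidate :: FilePath -> IO (Maybe BS.ByteString)
readCandidate path = withBinaryFile path ReadMode $ \h -> do
  -- hGet is strict and reads up to the requested number of bytes.
  chunk <- BS.hGet h (maxPointerSize + 1)
  pure $ if BS.length chunk > maxPointerSize then Nothing else Just chunk

main :: IO ()
main = do
  paths <- getArgs
  mapM_ report paths
  where
    report p = do
      r <- readCandidate p
      putStrLn $ p ++ ": " ++ maybe "too large to be a pointer file"
        (\c -> show (BS.length c) ++ " bytes read") r
```

Reading one byte past the cap is what lets the caller tell "exactly at the limit" apart from "over the limit" without reading the rest of a large file.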

Sponsored-by: Dartmouth College's Datalad project
2022-02-23 14:20:31 -04:00
AdjustedBranch annex.adjustedbranchrefresh 2020-11-16 14:27:28 -04:00
Branch handle transitions with read-only unmerged git-annex branches 2021-12-28 13:23:32 -04:00
Concurrent differentiate between concurrency enabled at command line and by git config 2020-09-16 11:47:12 -04:00
Content detect v10 upgrade while running 2022-01-21 12:56:38 -04:00
Debug implement fastDebug 2021-04-06 15:24:28 -04:00
LockPool fine-grained locking when annex.pidlock is enabled 2021-12-03 17:20:21 -04:00
MetaData update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
SpecialRemote remove redundant imports 2020-06-22 11:05:34 -04:00
VectorClock deal better with clock skew situations, using vector clocks 2021-08-04 12:33:46 -04:00
View Fix test suite failure on Windows 2021-08-24 14:03:29 -04:00
Action.hs fix cat-file leak in get with -J 2021-11-19 12:51:08 -04:00
AdjustedBranch.hs sync --quiet 2021-07-19 11:28:47 -04:00
AutoMerge.hs support git 2.34.0's handling of merge conflict between annexed and non-annexed file 2021-11-22 16:10:24 -04:00
BloomFilter.hs Revert "data type that starts off using a set but converts to a bloom filter when large" 2020-07-01 20:12:19 -04:00
Branch.hs handle transitions with read-only unmerged git-annex branches 2021-12-28 13:23:32 -04:00
BranchState.hs handle transitions with read-only unmerged git-annex branches 2021-12-28 13:23:32 -04:00
CatFile.hs read a consistent amount from pointer file 2022-02-23 12:52:34 -04:00
ChangedRefs.hs more RawFilePath conversion 2020-10-29 14:20:57 -04:00
CheckAttr.hs mincopies 2021-01-06 14:15:19 -04:00
CheckIgnore.hs more RawFilePath conversion 2020-11-03 10:11:04 -04:00
Common.hs use fastDebug everywhere it can be used 2021-04-06 15:41:24 -04:00
Concurrent.hs remove unused import 2021-11-23 16:15:57 -04:00
Content.hs info: Allow using matching options in more situations 2022-02-21 14:46:07 -04:00
CopyFile.hs factor out IncrementalHasher from IncrementalVerifier 2021-11-09 12:33:22 -04:00
CurrentBranch.hs refactor getCurrentBranch 2018-10-19 17:29:18 -04:00
Debug.hs fix fastDebug to check if debugging is actually enabled 2021-04-06 16:28:37 -04:00
Difference.hs include git-annex-shell back in 2019-12-02 11:51:52 -04:00
DirHashes.hs Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https. 2020-09-01 15:16:35 -04:00
Drop.hs dropping unused marks as dead 2021-06-25 15:22:26 -04:00
Environment.hs include git-annex-shell back in 2019-12-02 11:51:52 -04:00
Export.hs convert Key to ShortByteString 2021-10-05 20:20:08 -04:00
ExternalAddonProcess.hs use fastDebug everywhere it can be used 2021-04-06 15:41:24 -04:00
FileMatcher.hs prep for fixing find --branch --unlocked 2021-03-02 13:39:31 -04:00
Fixup.hs fix a bug that prevented git-annex init from working in a submodule 2021-01-21 15:33:15 -04:00
GitOverlay.hs add: Significantly speed up adding lots of non-large files to git 2021-01-04 13:12:28 -04:00
HashObject.hs more RawFilePath conversion 2020-10-28 17:25:59 -04:00
Hook.hs don't try to remove pre-commit-annex and post-update-annex-hooks 2020-10-19 13:13:49 -04:00
Import.hs ImportableContentsChunkable 2021-10-08 13:15:22 -04:00
Ingest.hs fix error message 2021-12-09 15:25:59 -04:00
Init.hs shared repository content file permissions for v9 2022-01-11 16:50:50 -04:00
InodeSentinal.hs add debugging in sameInodeCache 2021-07-26 10:58:07 -04:00
Journal.hs when private journal file exists, still read from git-annex branch 2021-10-26 13:43:50 -04:00
Link.hs fully specify the pointer file format 2022-02-23 14:20:31 -04:00
Locations.hs fix failing readonly test case 2022-01-21 13:49:31 -04:00
LockFile.hs more RawFilePath conversion 2020-10-29 10:50:29 -04:00
LockPool.hs update licenses from GPL to AGPL 2019-03-13 15:48:14 -04:00
Magic.hs Serialize use of C magic library, which is not thread safe. 2020-09-17 17:27:42 -04:00
MetaData.hs fix error message 2021-12-09 15:25:59 -04:00
Multicast.hs use programPath consistently, not readProgramFile 2020-03-30 16:06:27 -04:00
Notification.hs wip RawFilePath 2x git-annex find speedup 2019-11-26 16:01:58 -04:00
NumCopies.hs drop, move, mirror: when two files have the same content, honor the max numcopies and requiredcopies 2021-06-15 11:38:44 -04:00
Path.hs assistant: Fix a crash on startup by avoiding using forkProcess 2021-05-12 15:08:03 -04:00
Perms.hs v9 upgrade implemented 2022-01-13 13:25:10 -04:00
PidLock.hs close pid lock only once no threads use it 2021-12-06 15:01:39 -04:00
Queue.hs Avoid git status taking a long time after git-annex unlock of many files. 2022-02-18 15:06:40 -04:00
RemoteTrackingBranch.hs refactor 2019-11-11 19:10:52 -04:00
ReplaceFile.hs fix test suite 2021-08-02 13:59:23 -04:00
SpecialRemote.hs renameremote: Better handling of case where there are multiple special remotes with a name 2022-01-05 15:24:02 -04:00
Ssh.hs Added annex.adviceNoSshCaching config. 2021-05-27 12:37:49 -04:00
StallDetection.hs bwlimit 2021-09-21 16:58:10 -04:00
TaggedPush.hs Ref ByteString conversion done 2020-04-07 17:41:09 -04:00
Tmp.hs propagate signals to the transferrer process group 2020-12-11 15:32:00 -04:00
Transfer.hs add a comment about checkSaneLock 2021-10-27 14:55:30 -04:00
TransferrerPool.hs avoid using temp file size when deciding whether to retry failed transfer 2021-06-25 12:04:23 -04:00
UntrustedFilePath.hs importfeed: Fix reversion that caused some '.' in filenames to be replaced with '_' 2020-08-05 11:35:00 -04:00
UpdateInstead.hs v7 for all repositories 2019-08-30 14:09:14 -04:00
Url.hs incremental verification for web special remote 2021-08-18 15:02:22 -04:00
UUID.hs simplify and speed up Utility.FileSystemEncoding 2021-08-11 12:13:31 -04:00
VariantFile.hs more RawFilePath 2019-12-18 17:10:28 -04:00
VectorClock.hs deal better with clock skew situations, using vector clocks 2021-08-04 12:33:46 -04:00
Verify.hs factor out IncrementalHasher from IncrementalVerifier 2021-11-09 12:33:22 -04:00
Version.hs have v9 autoupgrade to v10 2022-01-26 13:16:06 -04:00
View.hs Fix a bug in view filename generation when a metadata value ended with "/" 2021-01-22 14:05:14 -04:00
Wanted.hs prevent dropping required content of other file using same content 2021-05-25 11:34:06 -04:00
WorkerPool.hs start splitting out readonly values from AnnexState 2021-04-02 15:51:44 -04:00
WorkTree.hs work around strange auto-init bug 2021-07-30 18:36:03 -04:00
YoutubeDl.hs fix comment typo 2021-11-17 13:03:37 -04:00