git-annex/doc/todo/Allow_globally_limiting_filename_length.mdwn

2016-09-09 18:00:41 +00:00
I have some git-annex repos that I keep copies of on both NTFS and ecryptfs (which Ubuntu uses for home directory encryption), both on Linux. Now, ecryptfs allows each path component of a filename to be only up to 140-ish characters, because it has to encrypt that filename, add some encryption info to it, and store the result as a filename on a backing ext4 filesystem (which limits path components to 255 bytes).
Several times now I've added a bunch of stuff to my annex on the NTFS checkout, where path components are allowed to be longer than 140 characters, synced it over to my other annex checkout on ecryptfs, and then had git-annex fail during the sync while trying to create these empty symlinks with path components too long for the filesystem they are on. Once the repo is in this state, I don't really know how to fix it. I can't just "git mv" the offending file to a valid name, for two reasons: "git mv" needs the source file to be on disk in the first place, and the failed "git checkout" leaves my repo thinking it has thousands of untracked files (some files did get created, but git refused to officially move to the commit it was trying to check out, because the checkout failed).
I am looking for a solution for this inside git-annex. The simplest thing, I think, would be to set a max path component length for the whole set of repos, so that "git annex add" on the NTFS checkout would fail with an error when the filenames being added are too long for some of the repos that will eventually want to check them out. Is it possible to do this with a pre-commit hook somehow?
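
As a rough illustration of the pre-commit idea, here is a minimal Python sketch of such a check. It is not something git-annex provides; the names `MAX_COMPONENT_BYTES`, `overlong_components`, `staged_paths`, and `main` are all made up for this example, and the limit of 140 is just the "140-ish" figure from above, counted in UTF-8 bytes. It asks git for the paths that are added, copied, or renamed in the index and reports any whose path components exceed the limit.

```python
#!/usr/bin/env python3
"""Hypothetical pre-commit check: reject staged paths with overlong components."""
import subprocess
import sys

# Assumed limit, taken from the "140-ish" figure above; adjust per filesystem.
MAX_COMPONENT_BYTES = 140


def overlong_components(path, limit=MAX_COMPONENT_BYTES):
    """Return the components of `path` whose UTF-8 encoding exceeds `limit` bytes."""
    return [
        part
        for part in path.split("/")
        if len(part.encode("utf-8", "surrogateescape")) > limit
    ]


def staged_paths():
    """Return paths added, copied, or renamed in the index (NUL-separated output)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "-z", "--diff-filter=ACR"],
        check=True,
        capture_output=True,
    )
    return [
        raw.decode("utf-8", "surrogateescape")
        for raw in result.stdout.split(b"\0")
        if raw
    ]


def main():
    """Print offending paths to stderr; return 1 if any were found, else 0."""
    bad = [path for path in staged_paths() if overlong_components(path)]
    for path in bad:
        print(f"path component too long: {path}", file=sys.stderr)
    return 1 if bad else 0
```

To use this as a hook, the script would end with `sys.exit(main())` so that a non-zero exit status aborts the commit. A repository initialized by git-annex normally already has a pre-commit hook of its own, so a check like this would presumably have to be called from that existing hook rather than replace it. It also only runs at commit time, not at "git annex add" time, and every clone would need the hook installed separately.
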
The next simplest thing would be for git-annex to look at the filesystem it is running on and do something smarter than exploding and leaving my repo in a weird out-of-sync state when some of the filenames it wants to create can't be created. Maybe it should fail the sync earlier, in git-annex itself rather than in git checkout. Maybe it should just leave those files out of the checkout, or force/allow me to rename them right then.
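
To make "look at the filesystem it is running on" concrete, here is a small Python sketch that asks the operating system for the maximum filename length of a directory via `os.pathconf`. The helper names `name_max` and `fits` are hypothetical, and git-annex itself is written in Haskell, so this only illustrates the idea. Whether ecryptfs reports its reduced limit through this interface is an assumption I have not verified.

```python
import os


def name_max(directory="."):
    """Return the filename length limit reported for `directory`, or None if unknown."""
    try:
        return os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, OSError, ValueError):
        # os.pathconf is Unix-only, and the query may be unsupported.
        return None


def fits(path, directory="."):
    """Check every component of `path` against the limit reported for `directory`."""
    limit = name_max(directory)
    if limit is None:
        return True  # No limit available, so nothing can be checked.
    return all(
        len(part.encode("utf-8", "surrogateescape")) <= limit
        for part in path.split("/")
    )
```

A sync could run a check like `fits(path, checkout_dir)` over the incoming tree before touching the working copy, and refuse or skip the offending files up front instead of failing halfway through the checkout.
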
The most complex thing would be to somehow make it work anyway and check out the symlinks under different, valid names. Perhaps it could just truncate those path components in the symlink view? There's already support for different metadata views; this would be sort of like that. You get a special view of the repo subject to the constraints of your filesystem.