git-annex/Backend/WORM.hs

67 lines
1.9 KiB
Haskell
Raw Normal View History

2010-10-15 20:42:36 +00:00
{- git-annex "WORM" backend -- Write Once, Read Many
2010-10-27 20:53:54 +00:00
-
- Copyright 2010 Joey Hess <id@joeyh.name>
2010-10-27 20:53:54 +00:00
-
- Licensed under the GNU AGPL version 3 or higher.
2010-10-27 20:53:54 +00:00
-}
2010-10-15 20:42:36 +00:00
module Backend.WORM (backends) where
2010-10-15 20:42:36 +00:00
import Annex.Common
import Types.Key
import Types.Backend
import Types.KeySource
Better sanitization of problem characters when generating URL and WORM keys. FAT has a lot of characters it does not allow in filenames, like ? and * It's probably the worst offender, but other filesystems also have limitiations. In 2011, I made keyFile escape : to handle FAT, but missed the other characters. It also turns out that when I did that, I was also living dangerously; any existing keys that contained a : had their object location change. Oops. So, adding new characters to escape to keyFile is out. Well, it would be possible to make keyFile behave differently on a per-filesystem basis, but this would be a real nightmare to get right. Consider that a rsync special remote uses keyFile to determine the filenames to use, and we don't know the underlying filesystem on the rsync server.. Instead, I have gone for a solution that is backwards compatable and simple. Its only downside is that already generated URL and WORM keys might not be able to be stored on FAT or some other filesystem that dislikes a character used in the key. (In this case, the user can just migrate the problem keys to a checksumming backend. If this became a big problem, fsck could be made to detect these and suggest a migration.) Going forward, new keys that are created will escape all characters that are likely to cause problems. And if some filesystem comes along that's even worse than FAT (seems unlikely, but here it is 2013, and people are still using FAT!), additional characters can be added to the set that are escaped without difficulty. (Also, made WORM limit the part of the filename that is embedded in the key, to deal with filesystem filename length limits. This could have already been a problem, but is more likely now, since the escaping of the filename can make it longer.) This commit was sponsored by Ian Downes
2013-10-05 19:01:49 +00:00
import Backend.Utilities
import Git.FilePath
import Utility.Metered
2010-10-16 20:20:49 +00:00
import qualified Data.ByteString.Char8 as S8
2020-02-21 13:34:59 +00:00
import qualified Utility.RawFilePath as R
2011-12-31 08:11:39 +00:00
backends :: [Backend]
backends = [backend]
2011-12-31 08:11:39 +00:00
backend :: Backend
backend = Backend
{ backendVariety = WORMKey
, genKey = Just keyValue
, verifyKeyContent = Nothing
, verifyKeyContentIncrementally = Nothing
, canUpgradeKey = Just needsUpgrade
, fastMigrate = Just removeProblemChars
, isStableKey = const True
, isCryptographicallySecure = const False
}
2010-10-15 20:42:36 +00:00
{- The key includes the file size, modification time, and the
- original filename relative to the top of the git repository.
-}
keyValue :: KeySource -> MeterUpdate -> Annex Key
keyValue source _ = do
let f = contentLocation source
2020-02-21 13:34:59 +00:00
stat <- liftIO $ R.getFileStatus f
sz <- liftIO $ getFileSize' f stat
relf <- fromRawFilePath . getTopFilePath
2020-02-21 13:34:59 +00:00
<$> inRepo (toTopFilePath $ keyFilename source)
return $ mkKey $ \k -> k
{ keyName = genKeyName relf
, keyVariety = WORMKey
, keySize = Just sz
Better sanitization of problem characters when generating URL and WORM keys. FAT has a lot of characters it does not allow in filenames, like ? and * It's probably the worst offender, but other filesystems also have limitiations. In 2011, I made keyFile escape : to handle FAT, but missed the other characters. It also turns out that when I did that, I was also living dangerously; any existing keys that contained a : had their object location change. Oops. So, adding new characters to escape to keyFile is out. Well, it would be possible to make keyFile behave differently on a per-filesystem basis, but this would be a real nightmare to get right. Consider that a rsync special remote uses keyFile to determine the filenames to use, and we don't know the underlying filesystem on the rsync server.. Instead, I have gone for a solution that is backwards compatable and simple. Its only downside is that already generated URL and WORM keys might not be able to be stored on FAT or some other filesystem that dislikes a character used in the key. (In this case, the user can just migrate the problem keys to a checksumming backend. If this became a big problem, fsck could be made to detect these and suggest a migration.) Going forward, new keys that are created will escape all characters that are likely to cause problems. And if some filesystem comes along that's even worse than FAT (seems unlikely, but here it is 2013, and people are still using FAT!), additional characters can be added to the set that are escaped without difficulty. (Also, made WORM limit the part of the filename that is embedded in the key, to deal with filesystem filename length limits. This could have already been a problem, but is more likely now, since the escaping of the filename can make it longer.) This commit was sponsored by Ian Downes
2013-10-05 19:01:49 +00:00
, keyMtime = Just $ modificationTime stat
}
{- Old WORM keys could contain spaces and carriage returns,
- and can be upgraded to remove them. -}
needsUpgrade :: Key -> Bool
needsUpgrade key = any (`S8.elem` fromKey keyName key) [' ', '\r']
removeProblemChars :: Key -> Backend -> AssociatedFile -> Annex (Maybe Key)
removeProblemChars oldkey newbackend _
| migratable = return $ Just $ alterKey oldkey $ \d -> d
{ keyName = encodeBS $ reSanitizeKeyName $ decodeBS $ keyName d }
| otherwise = return Nothing
where
migratable = oldvariety == newvariety
oldvariety = fromKey keyVariety oldkey
newvariety = backendVariety newbackend