git-annex/Logs/UUID.hs
Joey Hess 38e9ea8497
one-way escaping of newlines in uuid.log
A repository can have a newline in its description due to being in a
directory containing a newline, or due to git-annex describe being
passed a string with a newline in it for some reason. Putting that
newline in uuid.log breaks its format.

So, escape the newline when it enters uuid.log, to \n

This is a one-way escaping, it is not converted back to a newline
when reading the log. If it were, commands like git-annex info and
whereis would display a multi-line description, which could be confusing
to read.

And, implementing roundtripping would necessarily cause problems if an
old version of git-annex were used to set a description that contained
whatever special character is used to escape the \n. Eg, a \ or if
it used the ! prefix before base64 data that is used in some other logs,
the ! character. Then the description set by the old git-annex would not
roundtrip.

There just doesn't seem to be any benefit of roundtripping newlines through,
so why bother? And, git often displays \n for newline when a filename
contains a newline, so git-annex doing it in this case seems sorta ok
by analogy to git.

(Some other git-annex logs can also have newlines put into them if the
user really wants to break git-annex. For example:
git-annex config annex.largefiles "foo
bar"
The full list is probably config.log, remote.log, group.log,
preferred-content.log, required-content.log,
group-preferred-content.log, schedule.log. Probably there is no
good reason to use a newline in any of these, and the breakage is
probably limited to the bad data the user put in not coming back out.
And users can write any garbage to log files themselves manually in any
case. So, I am not going to address all of those at this time. If a
problem such as this one with the newline in the repository path comes
up, it can be dealt with on a case by case basis.)

Sponsored-by: Dartmouth College's Datalad project
2023-03-13 14:19:32 -04:00

72 lines
2.1 KiB
Haskell

{- git-annex uuid log
-
- uuid.log stores a list of known uuids, and their descriptions.
-
- Copyright 2010-2023 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU AGPL version 3 or higher.
-}
{-# LANGUAGE OverloadedStrings #-}
module Logs.UUID (
uuidLog,
describeUUID,
uuidDescMap,
uuidDescMapLoad,
uuidDescMapRaw,
) where
import Types.UUID
import Annex.Common
import Annex.VectorClock
import qualified Annex
import qualified Annex.Branch
import Logs
import Logs.UUIDBased
import qualified Annex.UUID
import qualified Data.Map.Strict as M
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as L
import qualified Data.Attoparsec.ByteString.Lazy as A
import Data.ByteString.Builder
import Data.Char
{- Records a description for a uuid in the log. -}
describeUUID :: UUID -> UUIDDesc -> Annex ()
describeUUID uuid desc = do
c <- currentVectorClock
Annex.Branch.change (Annex.Branch.RegardingUUID [uuid]) uuidLog $
buildLogOld builder . changeLog c uuid desc . parseUUIDLog
where
builder (UUIDDesc b) = byteString (escnewline b)
-- Escape any newline in the description, since newlines cannot
-- be present in the logged value. This is a one-way escaping.
escnewline = B.intercalate "\\n" . B.split newline
newline = fromIntegral (ord '\n')
{- The map is cached for speed. -}
uuidDescMap :: Annex UUIDDescMap
uuidDescMap = maybe uuidDescMapLoad return =<< Annex.getState Annex.uuiddescmap
{- Read the uuidLog into a map, and cache it for later use.
-
- If the current repository has not been described, it is still included
- in the map with an empty description. -}
uuidDescMapLoad :: Annex UUIDDescMap
uuidDescMapLoad = do
m <- uuidDescMapRaw
u <- Annex.UUID.getUUID
let m' = M.insertWith preferold u mempty m
Annex.changeState $ \s -> s { Annex.uuiddescmap = Just m' }
return m'
where
preferold = flip const
{- Read the uuidLog into a map. Includes only actually set descriptions. -}
uuidDescMapRaw :: Annex UUIDDescMap
uuidDescMapRaw = simpleMap . parseUUIDLog <$> Annex.Branch.get uuidLog
parseUUIDLog :: L.ByteString -> Log UUIDDesc
parseUUIDLog = parseLogOld (UUIDDesc <$> A.takeByteString)