addurl --preserve-filename and a few related changes
* addurl --preserve-filename: New option, uses server-provided filename without any sanitization, but with some security checking. Not yet implemented for remotes other than the web. * addurl, importfeed: Avoid adding filenames with leading '.', instead it will be replaced with '_'. This might be considered a security fix, but a CVE seems unwattanted. It was possible for addurl to create a dotfile, which could change behavior of some program. It was also possible for a web server to say the file name was ".git" or "foo/.git". That would not overrwrite the .git directory, but would cause addurl to fail; of course git won't add "foo/.git". sanitizeFilePath is too opinionated to remain in Utility, so moved it. The changes to mkSafeFilePath are because it used sanitizeFilePath. In particular: isDrive will never succeed, because "c:" gets munged to "c_" ".." gets sanitized now ".git" gets sanitized now It will never be null, because sanitizeFilePath keeps the length the same, and splitDirectories never returns a null path. Also, on the off chance a web server suggests a filename of "", ignore that, rather than trying to save to such a filename, which would fail in some way.
This commit is contained in:
parent
54599207f7
commit
6952060665
9 changed files with 132 additions and 39 deletions
56
Annex/UntrustedFilePath.hs
Normal file
56
Annex/UntrustedFilePath.hs
Normal file
|
@ -0,0 +1,56 @@
|
||||||
|
{- handling untrusted filepaths
|
||||||
|
-
|
||||||
|
- Copyright 2010-2020 Joey Hess <id@joeyh.name>
|
||||||
|
-
|
||||||
|
- Licensed under the GNU AGPL version 3 or higher.
|
||||||
|
-}
|
||||||
|
|
||||||
|
module Annex.UntrustedFilePath where
|
||||||
|
|
||||||
|
import Data.Char
|
||||||
|
import System.FilePath
|
||||||
|
|
||||||
|
{- Given a string that we'd like to use as the basis for FilePath, but that
|
||||||
|
- was provided by a third party and is not to be trusted, returns the closest
|
||||||
|
- sane FilePath.
|
||||||
|
-
|
||||||
|
- All spaces and punctuation and other wacky stuff are replaced
|
||||||
|
- with '_', except for '.'
|
||||||
|
-
|
||||||
|
- "../" becomes ".._", which is safe.
|
||||||
|
- "/foo" becomes "_foo", which is safe.
|
||||||
|
- "c:foo" becomes "c_foo", which is safe even on windows.
|
||||||
|
-
|
||||||
|
- Leading '.' is also replaced with '_', so ".git/foo" becomes "_git_foo"
|
||||||
|
- and so no dotfiles that might control a program are inadvertently created.
|
||||||
|
-}
|
||||||
|
sanitizeFilePath :: String -> FilePath
|
||||||
|
sanitizeFilePath = leadingdot . map sanitize
|
||||||
|
where
|
||||||
|
sanitize c
|
||||||
|
| c == '.' = c
|
||||||
|
| isSpace c || isPunctuation c || isSymbol c || isControl c || c == '/' = '_'
|
||||||
|
| otherwise = c
|
||||||
|
|
||||||
|
leadingdot ('.':s) = '_':s
|
||||||
|
leadingdot s = s
|
||||||
|
|
||||||
|
escapeSequenceInFilePath :: FilePath -> Bool
|
||||||
|
escapeSequenceInFilePath f = '\ESC' `elem` f
|
||||||
|
|
||||||
|
{- ../ is a path traversal, no matter where it appears.
|
||||||
|
-
|
||||||
|
- An absolute path is, of course.
|
||||||
|
-}
|
||||||
|
pathTraversalInFilePath :: FilePath -> Bool
|
||||||
|
pathTraversalInFilePath f
|
||||||
|
| isAbsolute f = True
|
||||||
|
| any (== "..") (splitPath f) = True
|
||||||
|
-- On windows, C:foo with no directory is not considered absolute
|
||||||
|
| hasDrive f = True
|
||||||
|
| otherwise = False
|
||||||
|
|
||||||
|
gitDirectoryInFilePath :: FilePath -> Bool
|
||||||
|
gitDirectoryInFilePath = any (== ".git")
|
||||||
|
. map dropTrailingPathSeparator
|
||||||
|
. splitPath
|
|
@ -16,6 +16,10 @@ git-annex (8.20200502) UNRELEASED; urgency=medium
|
||||||
* git-lfs repos that encrypt the annexed content but not the git repo
|
* git-lfs repos that encrypt the annexed content but not the git repo
|
||||||
only need --force passed to initremote, allow enableremote and
|
only need --force passed to initremote, allow enableremote and
|
||||||
autoenable of such remotes without forcing again.
|
autoenable of such remotes without forcing again.
|
||||||
|
* addurl, importfeed: Avoid adding filenames with leading '.', instead
|
||||||
|
it will be replaced with '_'.
|
||||||
|
* addurl --preserve-filename: New option, uses server-provided filename
|
||||||
|
without any sanitization, but with some security checking.
|
||||||
|
|
||||||
-- Joey Hess <id@joeyh.name> Mon, 04 May 2020 12:46:11 -0400
|
-- Joey Hess <id@joeyh.name> Mon, 04 May 2020 12:46:11 -0400
|
||||||
|
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
{- git-annex command
|
{- git-annex command
|
||||||
-
|
-
|
||||||
- Copyright 2011-2017 Joey Hess <id@joeyh.name>
|
- Copyright 2011-2020 Joey Hess <id@joeyh.name>
|
||||||
-
|
-
|
||||||
- Licensed under the GNU AGPL version 3 or higher.
|
- Licensed under the GNU AGPL version 3 or higher.
|
||||||
-}
|
-}
|
||||||
|
@ -23,6 +23,7 @@ import Annex.CheckIgnore
|
||||||
import Annex.Perms
|
import Annex.Perms
|
||||||
import Annex.UUID
|
import Annex.UUID
|
||||||
import Annex.YoutubeDl
|
import Annex.YoutubeDl
|
||||||
|
import Annex.UntrustedFilePath
|
||||||
import Logs.Web
|
import Logs.Web
|
||||||
import Types.KeySource
|
import Types.KeySource
|
||||||
import Types.UrlContents
|
import Types.UrlContents
|
||||||
|
@ -52,6 +53,7 @@ data DownloadOptions = DownloadOptions
|
||||||
{ relaxedOption :: Bool
|
{ relaxedOption :: Bool
|
||||||
, rawOption :: Bool
|
, rawOption :: Bool
|
||||||
, fileOption :: Maybe FilePath
|
, fileOption :: Maybe FilePath
|
||||||
|
, preserveFilenameOption :: Bool
|
||||||
}
|
}
|
||||||
|
|
||||||
optParser :: CmdParamsDesc -> Parser AddUrlOptions
|
optParser :: CmdParamsDesc -> Parser AddUrlOptions
|
||||||
|
@ -77,7 +79,7 @@ optParser desc = AddUrlOptions
|
||||||
)
|
)
|
||||||
|
|
||||||
parseDownloadOptions :: Bool -> Parser DownloadOptions
|
parseDownloadOptions :: Bool -> Parser DownloadOptions
|
||||||
parseDownloadOptions withfileoption = DownloadOptions
|
parseDownloadOptions withfileoptions = DownloadOptions
|
||||||
<$> switch
|
<$> switch
|
||||||
( long "relaxed"
|
( long "relaxed"
|
||||||
<> help "skip size check"
|
<> help "skip size check"
|
||||||
|
@ -86,12 +88,18 @@ parseDownloadOptions withfileoption = DownloadOptions
|
||||||
( long "raw"
|
( long "raw"
|
||||||
<> help "disable special handling for torrents, youtube-dl, etc"
|
<> help "disable special handling for torrents, youtube-dl, etc"
|
||||||
)
|
)
|
||||||
<*> if withfileoption
|
<*> (if withfileoptions
|
||||||
then optional (strOption
|
then optional (strOption
|
||||||
( long "file" <> metavar paramFile
|
( long "file" <> metavar paramFile
|
||||||
<> help "specify what file the url is added to"
|
<> help "specify what file the url is added to"
|
||||||
))
|
))
|
||||||
else pure Nothing
|
else pure Nothing)
|
||||||
|
<*> (if withfileoptions
|
||||||
|
then switch
|
||||||
|
( long "preserve-filename"
|
||||||
|
<> help "use filename provided by server as-is"
|
||||||
|
)
|
||||||
|
else pure False)
|
||||||
|
|
||||||
seek :: AddUrlOptions -> CommandSeek
|
seek :: AddUrlOptions -> CommandSeek
|
||||||
seek o = startConcurrency commandStages $ do
|
seek o = startConcurrency commandStages $ do
|
||||||
|
@ -207,16 +215,35 @@ startWeb addunlockedmatcher o urlstring = go $ fromMaybe bad $ parseURI urlstrin
|
||||||
file <- adjustFile o <$> case fileOption (downloadOptions o) of
|
file <- adjustFile o <$> case fileOption (downloadOptions o) of
|
||||||
Just f -> pure f
|
Just f -> pure f
|
||||||
Nothing -> case Url.urlSuggestedFile urlinfo of
|
Nothing -> case Url.urlSuggestedFile urlinfo of
|
||||||
Nothing -> pure $ url2file url (pathdepthOption o) pathmax
|
Just sf | not (null sf) -> if preserveFilenameOption (downloadOptions o)
|
||||||
Just sf -> do
|
then do
|
||||||
let f = truncateFilePath pathmax $
|
checkPreserveFileNameSecurity sf
|
||||||
sanitizeFilePath sf
|
return sf
|
||||||
ifM (liftIO $ doesFileExist f <||> doesDirectoryExist f)
|
else do
|
||||||
( pure $ url2file url (pathdepthOption o) pathmax
|
let f = truncateFilePath pathmax $
|
||||||
, pure f
|
sanitizeFilePath sf
|
||||||
)
|
ifM (liftIO $ doesFileExist f <||> doesDirectoryExist f)
|
||||||
|
( pure $ url2file url (pathdepthOption o) pathmax
|
||||||
|
, pure f
|
||||||
|
)
|
||||||
|
_ -> pure $ url2file url (pathdepthOption o) pathmax
|
||||||
performWeb addunlockedmatcher o urlstring file urlinfo
|
performWeb addunlockedmatcher o urlstring file urlinfo
|
||||||
|
|
||||||
|
-- sanitizeFilePath avoids all these security problems
|
||||||
|
-- (and probably others, but at least this catches the most egrarious ones).
|
||||||
|
checkPreserveFileNameSecurity :: FilePath -> Annex ()
|
||||||
|
checkPreserveFileNameSecurity f = do
|
||||||
|
checksecurity escapeSequenceInFilePath False "escape sequence"
|
||||||
|
checksecurity pathTraversalInFilePath True "path traversal"
|
||||||
|
checksecurity gitDirectoryInFilePath True "contains a .git directory"
|
||||||
|
where
|
||||||
|
checksecurity p canshow d = when (p f) $
|
||||||
|
giveup $ concat
|
||||||
|
[ "--preserve-filename was used, but the filename "
|
||||||
|
, if canshow then "(" ++ f ++ ") " else ""
|
||||||
|
, "has a security problem (" ++ d ++ "), not adding."
|
||||||
|
]
|
||||||
|
|
||||||
performWeb :: AddUnlockedMatcher -> AddUrlOptions -> URLString -> FilePath -> Url.UrlInfo -> CommandPerform
|
performWeb :: AddUnlockedMatcher -> AddUrlOptions -> URLString -> FilePath -> Url.UrlInfo -> CommandPerform
|
||||||
performWeb addunlockedmatcher o url file urlinfo = ifAnnexed (toRawFilePath file) addurl geturl
|
performWeb addunlockedmatcher o url file urlinfo = ifAnnexed (toRawFilePath file) addurl geturl
|
||||||
where
|
where
|
||||||
|
|
|
@ -40,6 +40,7 @@ import Logs.MetaData
|
||||||
import Annex.MetaData
|
import Annex.MetaData
|
||||||
import Annex.FileMatcher
|
import Annex.FileMatcher
|
||||||
import Command.AddUrl (addWorkTree)
|
import Command.AddUrl (addWorkTree)
|
||||||
|
import Annex.UntrustedFilePath
|
||||||
|
|
||||||
cmd :: Command
|
cmd :: Command
|
||||||
cmd = notBareRepo $
|
cmd = notBareRepo $
|
||||||
|
|
|
@ -13,7 +13,7 @@ module Types.UrlContents (
|
||||||
) where
|
) where
|
||||||
|
|
||||||
import Utility.Url
|
import Utility.Url
|
||||||
import Utility.Path
|
import Annex.UntrustedFilePath
|
||||||
|
|
||||||
import System.FilePath
|
import System.FilePath
|
||||||
|
|
||||||
|
@ -35,13 +35,7 @@ newtype SafeFilePath = SafeFilePath FilePath
|
||||||
mkSafeFilePath :: FilePath -> SafeFilePath
|
mkSafeFilePath :: FilePath -> SafeFilePath
|
||||||
mkSafeFilePath p = SafeFilePath $ if null p' then "file" else p'
|
mkSafeFilePath p = SafeFilePath $ if null p' then "file" else p'
|
||||||
where
|
where
|
||||||
p' = joinPath $ filter safe $ map sanitizeFilePath $ splitDirectories p
|
p' = joinPath $ map sanitizeFilePath $ splitDirectories p
|
||||||
safe s
|
|
||||||
| isDrive s = False
|
|
||||||
| s == ".." = False
|
|
||||||
| s == ".git" = False
|
|
||||||
| null s = False
|
|
||||||
| otherwise = True
|
|
||||||
|
|
||||||
fromSafeFilePath :: SafeFilePath -> FilePath
|
fromSafeFilePath :: SafeFilePath -> FilePath
|
||||||
fromSafeFilePath (SafeFilePath p) = p
|
fromSafeFilePath (SafeFilePath p) = p
|
||||||
|
|
|
@ -1,6 +1,6 @@
|
||||||
{- path manipulation
|
{- path manipulation
|
||||||
-
|
-
|
||||||
- Copyright 2010-2014 Joey Hess <id@joeyh.name>
|
- Copyright 2010-2020 Joey Hess <id@joeyh.name>
|
||||||
-
|
-
|
||||||
- License: BSD-2-clause
|
- License: BSD-2-clause
|
||||||
-}
|
-}
|
||||||
|
@ -24,7 +24,6 @@ module Utility.Path (
|
||||||
inPath,
|
inPath,
|
||||||
searchPath,
|
searchPath,
|
||||||
dotfile,
|
dotfile,
|
||||||
sanitizeFilePath,
|
|
||||||
splitShortExtensions,
|
splitShortExtensions,
|
||||||
|
|
||||||
prop_upFrom_basics,
|
prop_upFrom_basics,
|
||||||
|
@ -35,7 +34,6 @@ module Utility.Path (
|
||||||
import System.FilePath
|
import System.FilePath
|
||||||
import Data.List
|
import Data.List
|
||||||
import Data.Maybe
|
import Data.Maybe
|
||||||
import Data.Char
|
|
||||||
import Control.Applicative
|
import Control.Applicative
|
||||||
import Prelude
|
import Prelude
|
||||||
|
|
||||||
|
@ -276,22 +274,6 @@ dotfile file
|
||||||
where
|
where
|
||||||
f = takeFileName file
|
f = takeFileName file
|
||||||
|
|
||||||
{- Given a string that we'd like to use as the basis for FilePath, but that
|
|
||||||
- was provided by a third party and is not to be trusted, returns the closest
|
|
||||||
- sane FilePath.
|
|
||||||
-
|
|
||||||
- All spaces and punctuation and other wacky stuff are replaced
|
|
||||||
- with '_', except for '.'
|
|
||||||
- "../" will thus turn into ".._", which is safe.
|
|
||||||
-}
|
|
||||||
sanitizeFilePath :: String -> FilePath
|
|
||||||
sanitizeFilePath = map sanitize
|
|
||||||
where
|
|
||||||
sanitize c
|
|
||||||
| c == '.' = c
|
|
||||||
| isSpace c || isPunctuation c || isSymbol c || isControl c || c == '/' = '_'
|
|
||||||
| otherwise = c
|
|
||||||
|
|
||||||
{- Similar to splitExtensions, but knows that some things in FilePaths
|
{- Similar to splitExtensions, but knows that some things in FilePaths
|
||||||
- after a dot are too long to be extensions. -}
|
- after a dot are too long to be extensions. -}
|
||||||
splitShortExtensions :: FilePath -> (FilePath, [String])
|
splitShortExtensions :: FilePath -> (FilePath, [String])
|
||||||
|
|
|
@ -0,0 +1,18 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 3"""
|
||||||
|
date="2020-05-08T19:56:27Z"
|
||||||
|
content="""
|
||||||
|
Implemented git-annex addurl --preserve-filename, which will do what you
|
||||||
|
want.
|
||||||
|
|
||||||
|
Leaving this bug open because I only implemented it for web urls, not yet
|
||||||
|
for torrents and other special remotes that have their own url scheme.
|
||||||
|
The sanitization for those is currently done at a lower level than addurl,
|
||||||
|
and so that will take a bit more work to implement.
|
||||||
|
|
||||||
|
(importfeed does not, I think, need to implement this option, because
|
||||||
|
the filenames are based on information from the rss feed, and it's
|
||||||
|
perfectly fine to sanitize eg a podcast episode title to get a reasonable
|
||||||
|
filename.)
|
||||||
|
"""]]
|
|
@ -57,6 +57,16 @@ be used to get better filenames.
|
||||||
If the file already exists, addurl will record that it can be downloaded
|
If the file already exists, addurl will record that it can be downloaded
|
||||||
from the specified url(s).
|
from the specified url(s).
|
||||||
|
|
||||||
|
* `--preserve-filename`
|
||||||
|
|
||||||
|
When the web server (or torrent, etc) provides a filename, use it as-is,
|
||||||
|
avoiding sanitizing unusual characters, or truncating it to length, or any
|
||||||
|
other modifications.
|
||||||
|
|
||||||
|
git-annex will still check the filename for safety, and if the filename
|
||||||
|
has a security problem such as path traversal or an escape sequence,
|
||||||
|
it will refuse to add it.
|
||||||
|
|
||||||
* `--pathdepth=N`
|
* `--pathdepth=N`
|
||||||
|
|
||||||
Rather than basing the filename on the whole url, this causes a path to
|
Rather than basing the filename on the whole url, this causes a path to
|
||||||
|
|
|
@ -661,6 +661,7 @@ Executable git-annex
|
||||||
Annex.TaggedPush
|
Annex.TaggedPush
|
||||||
Annex.Tmp
|
Annex.Tmp
|
||||||
Annex.Transfer
|
Annex.Transfer
|
||||||
|
Annex.UntrustedFilePath
|
||||||
Annex.UpdateInstead
|
Annex.UpdateInstead
|
||||||
Annex.UUID
|
Annex.UUID
|
||||||
Annex.Url
|
Annex.Url
|
||||||
|
|
Loading…
Add table
Reference in a new issue