addurl --preserve-filename and a few related changes
* addurl --preserve-filename: New option, uses server-provided filename without any sanitization, but with some security checking. Not yet implemented for remotes other than the web. * addurl, importfeed: Avoid adding filenames with leading '.', instead it will be replaced with '_'. This might be considered a security fix, but a CVE seems unwattanted. It was possible for addurl to create a dotfile, which could change behavior of some program. It was also possible for a web server to say the file name was ".git" or "foo/.git". That would not overrwrite the .git directory, but would cause addurl to fail; of course git won't add "foo/.git". sanitizeFilePath is too opinionated to remain in Utility, so moved it. The changes to mkSafeFilePath are because it used sanitizeFilePath. In particular: isDrive will never succeed, because "c:" gets munged to "c_" ".." gets sanitized now ".git" gets sanitized now It will never be null, because sanitizeFilePath keeps the length the same, and splitDirectories never returns a null path. Also, on the off chance a web server suggests a filename of "", ignore that, rather than trying to save to such a filename, which would fail in some way.
This commit is contained in:
parent
54599207f7
commit
6952060665
9 changed files with 132 additions and 39 deletions
56
Annex/UntrustedFilePath.hs
Normal file
56
Annex/UntrustedFilePath.hs
Normal file
|
@ -0,0 +1,56 @@
|
|||
{- handling untrusted filepaths
|
||||
-
|
||||
- Copyright 2010-2020 Joey Hess <id@joeyh.name>
|
||||
-
|
||||
- Licensed under the GNU AGPL version 3 or higher.
|
||||
-}
|
||||
|
||||
module Annex.UntrustedFilePath where
|
||||
|
||||
import Data.Char
|
||||
import System.FilePath
|
||||
|
||||
{- Given a string that we'd like to use as the basis for FilePath, but that
|
||||
- was provided by a third party and is not to be trusted, returns the closest
|
||||
- sane FilePath.
|
||||
-
|
||||
- All spaces and punctuation and other wacky stuff are replaced
|
||||
- with '_', except for '.'
|
||||
-
|
||||
- "../" becomes ".._", which is safe.
|
||||
- "/foo" becomes "_foo", which is safe.
|
||||
- "c:foo" becomes "c_foo", which is safe even on windows.
|
||||
-
|
||||
- Leading '.' is also replaced with '_', so ".git/foo" becomes "_git_foo"
|
||||
- and so no dotfiles that might control a program are inadvertently created.
|
||||
-}
|
||||
sanitizeFilePath :: String -> FilePath
|
||||
sanitizeFilePath = leadingdot . map sanitize
|
||||
where
|
||||
sanitize c
|
||||
| c == '.' = c
|
||||
| isSpace c || isPunctuation c || isSymbol c || isControl c || c == '/' = '_'
|
||||
| otherwise = c
|
||||
|
||||
leadingdot ('.':s) = '_':s
|
||||
leadingdot s = s
|
||||
|
||||
escapeSequenceInFilePath :: FilePath -> Bool
|
||||
escapeSequenceInFilePath f = '\ESC' `elem` f
|
||||
|
||||
{- ../ is a path traversal, no matter where it appears.
|
||||
-
|
||||
- An absolute path is, of course.
|
||||
-}
|
||||
pathTraversalInFilePath :: FilePath -> Bool
|
||||
pathTraversalInFilePath f
|
||||
| isAbsolute f = True
|
||||
| any (== "..") (splitPath f) = True
|
||||
-- On windows, C:foo with no directory is not considered absolute
|
||||
| hasDrive f = True
|
||||
| otherwise = False
|
||||
|
||||
gitDirectoryInFilePath :: FilePath -> Bool
|
||||
gitDirectoryInFilePath = any (== ".git")
|
||||
. map dropTrailingPathSeparator
|
||||
. splitPath
|
|
@ -16,6 +16,10 @@ git-annex (8.20200502) UNRELEASED; urgency=medium
|
|||
* git-lfs repos that encrypt the annexed content but not the git repo
|
||||
only need --force passed to initremote, allow enableremote and
|
||||
autoenable of such remotes without forcing again.
|
||||
* addurl, importfeed: Avoid adding filenames with leading '.', instead
|
||||
it will be replaced with '_'.
|
||||
* addurl --preserve-filename: New option, uses server-provided filename
|
||||
without any sanitization, but with some security checking.
|
||||
|
||||
-- Joey Hess <id@joeyh.name> Mon, 04 May 2020 12:46:11 -0400
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
{- git-annex command
|
||||
-
|
||||
- Copyright 2011-2017 Joey Hess <id@joeyh.name>
|
||||
- Copyright 2011-2020 Joey Hess <id@joeyh.name>
|
||||
-
|
||||
- Licensed under the GNU AGPL version 3 or higher.
|
||||
-}
|
||||
|
@ -23,6 +23,7 @@ import Annex.CheckIgnore
|
|||
import Annex.Perms
|
||||
import Annex.UUID
|
||||
import Annex.YoutubeDl
|
||||
import Annex.UntrustedFilePath
|
||||
import Logs.Web
|
||||
import Types.KeySource
|
||||
import Types.UrlContents
|
||||
|
@ -52,6 +53,7 @@ data DownloadOptions = DownloadOptions
|
|||
{ relaxedOption :: Bool
|
||||
, rawOption :: Bool
|
||||
, fileOption :: Maybe FilePath
|
||||
, preserveFilenameOption :: Bool
|
||||
}
|
||||
|
||||
optParser :: CmdParamsDesc -> Parser AddUrlOptions
|
||||
|
@ -77,7 +79,7 @@ optParser desc = AddUrlOptions
|
|||
)
|
||||
|
||||
parseDownloadOptions :: Bool -> Parser DownloadOptions
|
||||
parseDownloadOptions withfileoption = DownloadOptions
|
||||
parseDownloadOptions withfileoptions = DownloadOptions
|
||||
<$> switch
|
||||
( long "relaxed"
|
||||
<> help "skip size check"
|
||||
|
@ -86,12 +88,18 @@ parseDownloadOptions withfileoption = DownloadOptions
|
|||
( long "raw"
|
||||
<> help "disable special handling for torrents, youtube-dl, etc"
|
||||
)
|
||||
<*> if withfileoption
|
||||
<*> (if withfileoptions
|
||||
then optional (strOption
|
||||
( long "file" <> metavar paramFile
|
||||
<> help "specify what file the url is added to"
|
||||
))
|
||||
else pure Nothing
|
||||
else pure Nothing)
|
||||
<*> (if withfileoptions
|
||||
then switch
|
||||
( long "preserve-filename"
|
||||
<> help "use filename provided by server as-is"
|
||||
)
|
||||
else pure False)
|
||||
|
||||
seek :: AddUrlOptions -> CommandSeek
|
||||
seek o = startConcurrency commandStages $ do
|
||||
|
@ -207,16 +215,35 @@ startWeb addunlockedmatcher o urlstring = go $ fromMaybe bad $ parseURI urlstrin
|
|||
file <- adjustFile o <$> case fileOption (downloadOptions o) of
|
||||
Just f -> pure f
|
||||
Nothing -> case Url.urlSuggestedFile urlinfo of
|
||||
Nothing -> pure $ url2file url (pathdepthOption o) pathmax
|
||||
Just sf -> do
|
||||
let f = truncateFilePath pathmax $
|
||||
sanitizeFilePath sf
|
||||
ifM (liftIO $ doesFileExist f <||> doesDirectoryExist f)
|
||||
( pure $ url2file url (pathdepthOption o) pathmax
|
||||
, pure f
|
||||
)
|
||||
Just sf | not (null sf) -> if preserveFilenameOption (downloadOptions o)
|
||||
then do
|
||||
checkPreserveFileNameSecurity sf
|
||||
return sf
|
||||
else do
|
||||
let f = truncateFilePath pathmax $
|
||||
sanitizeFilePath sf
|
||||
ifM (liftIO $ doesFileExist f <||> doesDirectoryExist f)
|
||||
( pure $ url2file url (pathdepthOption o) pathmax
|
||||
, pure f
|
||||
)
|
||||
_ -> pure $ url2file url (pathdepthOption o) pathmax
|
||||
performWeb addunlockedmatcher o urlstring file urlinfo
|
||||
|
||||
-- sanitizeFilePath avoids all these security problems
|
||||
-- (and probably others, but at least this catches the most egrarious ones).
|
||||
checkPreserveFileNameSecurity :: FilePath -> Annex ()
|
||||
checkPreserveFileNameSecurity f = do
|
||||
checksecurity escapeSequenceInFilePath False "escape sequence"
|
||||
checksecurity pathTraversalInFilePath True "path traversal"
|
||||
checksecurity gitDirectoryInFilePath True "contains a .git directory"
|
||||
where
|
||||
checksecurity p canshow d = when (p f) $
|
||||
giveup $ concat
|
||||
[ "--preserve-filename was used, but the filename "
|
||||
, if canshow then "(" ++ f ++ ") " else ""
|
||||
, "has a security problem (" ++ d ++ "), not adding."
|
||||
]
|
||||
|
||||
performWeb :: AddUnlockedMatcher -> AddUrlOptions -> URLString -> FilePath -> Url.UrlInfo -> CommandPerform
|
||||
performWeb addunlockedmatcher o url file urlinfo = ifAnnexed (toRawFilePath file) addurl geturl
|
||||
where
|
||||
|
|
|
@ -40,6 +40,7 @@ import Logs.MetaData
|
|||
import Annex.MetaData
|
||||
import Annex.FileMatcher
|
||||
import Command.AddUrl (addWorkTree)
|
||||
import Annex.UntrustedFilePath
|
||||
|
||||
cmd :: Command
|
||||
cmd = notBareRepo $
|
||||
|
|
|
@ -13,7 +13,7 @@ module Types.UrlContents (
|
|||
) where
|
||||
|
||||
import Utility.Url
|
||||
import Utility.Path
|
||||
import Annex.UntrustedFilePath
|
||||
|
||||
import System.FilePath
|
||||
|
||||
|
@ -35,13 +35,7 @@ newtype SafeFilePath = SafeFilePath FilePath
|
|||
mkSafeFilePath :: FilePath -> SafeFilePath
|
||||
mkSafeFilePath p = SafeFilePath $ if null p' then "file" else p'
|
||||
where
|
||||
p' = joinPath $ filter safe $ map sanitizeFilePath $ splitDirectories p
|
||||
safe s
|
||||
| isDrive s = False
|
||||
| s == ".." = False
|
||||
| s == ".git" = False
|
||||
| null s = False
|
||||
| otherwise = True
|
||||
p' = joinPath $ map sanitizeFilePath $ splitDirectories p
|
||||
|
||||
fromSafeFilePath :: SafeFilePath -> FilePath
|
||||
fromSafeFilePath (SafeFilePath p) = p
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
{- path manipulation
|
||||
-
|
||||
- Copyright 2010-2014 Joey Hess <id@joeyh.name>
|
||||
- Copyright 2010-2020 Joey Hess <id@joeyh.name>
|
||||
-
|
||||
- License: BSD-2-clause
|
||||
-}
|
||||
|
@ -24,7 +24,6 @@ module Utility.Path (
|
|||
inPath,
|
||||
searchPath,
|
||||
dotfile,
|
||||
sanitizeFilePath,
|
||||
splitShortExtensions,
|
||||
|
||||
prop_upFrom_basics,
|
||||
|
@ -35,7 +34,6 @@ module Utility.Path (
|
|||
import System.FilePath
|
||||
import Data.List
|
||||
import Data.Maybe
|
||||
import Data.Char
|
||||
import Control.Applicative
|
||||
import Prelude
|
||||
|
||||
|
@ -276,22 +274,6 @@ dotfile file
|
|||
where
|
||||
f = takeFileName file
|
||||
|
||||
{- Given a string that we'd like to use as the basis for FilePath, but that
|
||||
- was provided by a third party and is not to be trusted, returns the closest
|
||||
- sane FilePath.
|
||||
-
|
||||
- All spaces and punctuation and other wacky stuff are replaced
|
||||
- with '_', except for '.'
|
||||
- "../" will thus turn into ".._", which is safe.
|
||||
-}
|
||||
sanitizeFilePath :: String -> FilePath
|
||||
sanitizeFilePath = map sanitize
|
||||
where
|
||||
sanitize c
|
||||
| c == '.' = c
|
||||
| isSpace c || isPunctuation c || isSymbol c || isControl c || c == '/' = '_'
|
||||
| otherwise = c
|
||||
|
||||
{- Similar to splitExtensions, but knows that some things in FilePaths
|
||||
- after a dot are too long to be extensions. -}
|
||||
splitShortExtensions :: FilePath -> (FilePath, [String])
|
||||
|
|
|
@ -0,0 +1,18 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2020-05-08T19:56:27Z"
|
||||
content="""
|
||||
Implemented git-annex addurl --preserve-filename, which will do what you
|
||||
want.
|
||||
|
||||
Leaving this bug open because I only implemented it for web urls, not yet
|
||||
for torrents and other special remotes that have their own url scheme.
|
||||
The sanitization for those is currently done at a lower level than addurl,
|
||||
and so that will take a bit more work to implement.
|
||||
|
||||
(importfeed does not, I think, need to implement this option, because
|
||||
the filenames are based on information from the rss feed, and it's
|
||||
perfectly fine to sanitize eg a podcast episode title to get a reasonable
|
||||
filename.)
|
||||
"""]]
|
|
@ -57,6 +57,16 @@ be used to get better filenames.
|
|||
If the file already exists, addurl will record that it can be downloaded
|
||||
from the specified url(s).
|
||||
|
||||
* `--preserve-filename`
|
||||
|
||||
When the web server (or torrent, etc) provides a filename, use it as-is,
|
||||
avoiding sanitizing unusual characters, or truncating it to length, or any
|
||||
other modifications.
|
||||
|
||||
git-annex will still check the filename for safety, and if the filename
|
||||
has a security problem such as path traversal or an escape sequence,
|
||||
it will refuse to add it.
|
||||
|
||||
* `--pathdepth=N`
|
||||
|
||||
Rather than basing the filename on the whole url, this causes a path to
|
||||
|
|
|
@ -661,6 +661,7 @@ Executable git-annex
|
|||
Annex.TaggedPush
|
||||
Annex.Tmp
|
||||
Annex.Transfer
|
||||
Annex.UntrustedFilePath
|
||||
Annex.UpdateInstead
|
||||
Annex.UUID
|
||||
Annex.Url
|
||||
|
|
Loading…
Add table
Reference in a new issue