Serialize use of C magic library, which is not thread safe.

This fixes failures uploading to S3 when using -J.

This commit was sponsored by Denis Dzyubenko on Patreon.
Joey Hess 2020-09-17 17:27:42 -04:00
parent 59f5d6509c
commit 922621301a
3 changed files with 38 additions and 2 deletions

@@ -1,6 +1,6 @@
 {- Interface to libmagic
  -
- - Copyright 2019 Joey Hess <id@joeyh.name>
+ - Copyright 2019-2020 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -21,6 +21,8 @@ import Control.Monad.IO.Class
 #ifdef WITH_MAGICMIME
 import Magic
 import Utility.Env
+import Control.Concurrent
+import System.IO.Unsafe (unsafePerformIO)
 import Common
 #else
 type Magic = ()
@@ -41,7 +43,7 @@ initMagicMime = return Nothing
 getMagicMime :: Magic -> FilePath -> IO (Maybe (MimeType, MimeEncoding))
 #ifdef WITH_MAGICMIME
-getMagicMime m f = Just . parse <$> magicFile m f
+getMagicMime m f = Just . parse <$> magicConcurrentSafe (magicFile m f)
   where
     parse s =
         let (mimetype, rest) = separate (== ';') s
@@ -58,3 +60,15 @@ getMagicMimeType m f = liftIO $ fmap fst <$> getMagicMime m f
 getMagicMimeEncoding :: MonadIO m => Magic -> FilePath -> m(Maybe MimeEncoding)
 getMagicMimeEncoding m f = liftIO $ fmap snd <$> getMagicMime m f
+#ifdef WITH_MAGICMIME
+{-# NOINLINE mutex #-}
+mutex :: MVar ()
+mutex = unsafePerformIO $ newMVar ()
+-- Work around a bug, the library is not concurrency safe and will
+-- sometimes access the wrong memory if multiple ones are called at the
+-- same time.
+magicConcurrentSafe :: IO a -> IO a
+magicConcurrentSafe = bracket_ (takeMVar mutex) (putMVar mutex ())
+#endif
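
The locking pattern behind magicConcurrentSafe can be tried in isolation. What follows is only a minimal sketch, not code from git-annex: it does not call libmagic at all, and globalLock, serialized, fakeUnsafeCall and runWorkers are hypothetical names made up for illustration. fakeUnsafeCall stands in for a C call that must never run in two threads at once, and it records whether it ever saw a second caller inside it.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
import Control.Exception (bracket_)
import Control.Monad (forM, forM_, when)
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Process-wide lock. NOINLINE keeps GHC from inlining the
-- unsafePerformIO call, which could otherwise create several MVars.
{-# NOINLINE globalLock #-}
globalLock :: MVar ()
globalLock = unsafePerformIO (newMVar ())

-- Run an action while holding the lock. bracket_ puts the lock back
-- even if the action throws an exception.
serialized :: IO a -> IO a
serialized = bracket_ (takeMVar globalLock) (putMVar globalLock ())

-- Hypothetical stand-in for a call into a non-thread-safe C library.
-- It counts the callers currently inside it and sets the overlap flag
-- if it ever finds more than one.
fakeUnsafeCall :: IORef Int -> IORef Bool -> IO ()
fakeUnsafeCall active overlap = do
    n <- atomicModifyIORef' active (\a -> (a + 1, a + 1))
    when (n > 1) $ writeIORef overlap True
    threadDelay 1000
    atomicModifyIORef' active (\a -> (a - 1, ()))

-- Fork k threads that each make one serialized call, wait for all of
-- them, and report whether any two calls overlapped.
runWorkers :: Int -> IO Bool
runWorkers k = do
    active <- newIORef 0
    overlap <- newIORef False
    dones <- forM [1 .. k] $ \_ -> do
        done <- newEmptyMVar
        _ <- forkIO $ serialized (fakeUnsafeCall active overlap) >> putMVar done ()
        return done
    forM_ dones takeMVar
    readIORef overlap

main :: IO ()
main = do
    overlapped <- runWorkers 8
    putStrLn $ if overlapped then "calls overlapped" else "calls were serialized"
```

Because every call goes through serialized, at most one thread is inside fakeUnsafeCall at a time, so the program prints "calls were serialized". Calling fakeUnsafeCall directly from the forked threads instead would allow overlap. The trade-off is that one process-wide lock removes any parallelism in the wrapped calls, which is acceptable when they are short.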

@@ -9,6 +9,8 @@ git-annex (8.20200909) UNRELEASED; urgency=medium
     to determine which batch request each response corresponds to.
   * aws-0.22 improved its support for setting etags, which improves
     support for versioned S3 buckets.
+  * Serialize use of C magic library, which is not thread safe.
+    This fixes failures uploading to S3 when using -J.
  -- Joey Hess <id@joeyh.name> Mon, 14 Sep 2020 18:34:37 -0400

@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+username="joey"
+subject="""comment 7"""
+date="2020-09-17T21:03:27Z"
+content="""
+I instrumented git-annex to output the content-type it's sending,
+and it IS corrupted before hitting the http stack. It seems that libmagic
+is returning garbage.
+Perhaps it's not thread safe? If so, it might be *writing* to the wrong
+location too in some cases, so could explain all this weird behavior.
+And S3 remote is almost the only part of git-annex that uses libmagic, and the
+only part that uses it concurrently...
+So, I added a mutex around uses of it. Problem went away. Bug sent to
+library maintainer, and for now I hope this fixes the problem.
+If you were seeing hangs, crashes, etc, please try with a git-annex
+autobuild made after this comment..
+"""]]