Switch from using regex-compat to regex-tdfa, as the C regex library is rather buggy.

Joey Hess 2013-03-08 15:29:01 -04:00
parent 0dbea8a9a1
commit a2d94bd627
7 changed files with 29 additions and 7 deletions

@@ -13,7 +13,8 @@ import Data.Time.Clock.POSIX
 import qualified Data.Set as S
 import qualified Data.Map as M
 import System.Path.WildMatch
-import Text.Regex
+import Text.Regex.TDFA
+import Text.Regex.TDFA.String
 import Common.Annex
 import qualified Annex
@@ -83,12 +84,17 @@ limitExclude :: MkLimit
 limitExclude glob = Right $ const $ return . not . matchglob glob
 {- Could just use wildCheckCase, but this way the regex is only compiled
- - once. -}
+ - once. Also, we use regex-TDFA because it's less buggy in its support
+ - of non-unicode characters. -}
 matchglob :: String -> Annex.FileInfo -> Bool
 matchglob glob (Annex.FileInfo { Annex.matchFile = f }) =
-	isJust $ matchRegex cregex f
+	case cregex of
+		Right r -> case execute r f of
+			Right (Just _) -> True
+			_ -> False
+		Left _ -> error $ "failed to compile regex: " ++ regex
   where
-	cregex = mkRegex regex
+	cregex = compile defaultCompOpt defaultExecOpt regex
 	regex = '^':wildToRegex glob
 {- Adds a limit to skip files not believed to be present
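
For readers unfamiliar with regex-tdfa, the compile-once pattern used in the hunk above can be sketched in isolation roughly as follows. This is only an illustrative sketch, not code from the commit: the names `compileOrDie` and `mkMatcher` are hypothetical, and the sketch assumes the `compile` and `execute` functions from `Text.Regex.TDFA.String` together with the default options re-exported by `Text.Regex.TDFA`.

```haskell
import Text.Regex.TDFA (defaultCompOpt, defaultExecOpt)
import Text.Regex.TDFA.String (Regex, compile, execute)

-- Hypothetical helper: compile a pattern, failing loudly if it is invalid.
-- compile returns Left with an error message on a bad pattern.
compileOrDie :: String -> Regex
compileOrDie pat = case compile defaultCompOpt defaultExecOpt pat of
    Right r  -> r
    Left err -> error $ "failed to compile regex: " ++ pat ++ " (" ++ err ++ ")"

-- Hypothetical helper: build a matcher from a pattern. The compiled regex
-- is bound outside the lambda, so a partially applied matcher reuses it
-- instead of recompiling on every call.
mkMatcher :: String -> (String -> Bool)
mkMatcher pat = \s -> case execute r s of
    Right (Just _) -> True   -- a match was found
    _              -> False  -- no match, or an execution error
  where
    r = compileOrDie pat
```

A caller would bind the matcher once, for example `let isTxt = mkMatcher "^foo.*\\.txt$"`, and then apply `isTxt` to each file name; the regex should only be compiled on first use.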

@@ -19,7 +19,7 @@ fast: dist/caballog
 	@$$(grep 'ghc --make' dist/caballog | head -n 1)
 	@ln -sf dist/build/git-annex/git-annex git-annex
-dist/caballog:
+dist/caballog: git-annex.cabal
 	$(CABAL) configure -f"-Production" -O0
 	$(CABAL) build -v2 | tee $@

debian/changelog (2 lines changed)

@@ -34,6 +34,8 @@ git-annex (4.20130228) UNRELEASED; urgency=low
     should add files in the current directory, but not act on unlocked files
     elsewhere in the tree.
   * assistant: Sync with all git remotes on startup.
+  * Switch from using regex-compat to regex-tdfa, as the C regex library
+    is rather buggy.
  -- Joey Hess <joeyh@debian.org> Wed, 27 Feb 2013 23:20:40 -0400

debian/control (1 line changed)

@@ -9,6 +9,7 @@ Build-Depends:
 	libghc-hslogger-dev,
 	libghc-pcre-light-dev,
 	libghc-sha-dev,
+	libghc-regex-tdfa-dev,
 	libghc-dataenc-dev,
 	libghc-utf8-string-dev,
 	libghc-hs3-dev (>= 0.5.6),

@@ -33,3 +33,16 @@ git-annex 4.20130227, on Debian GNU/Linux (sid, i386).
 LC_ALL=
 </code></pre>
+> Tracked this back to a bug in either the C library or the haskell
+> regex-posix wrapper around it. I'm not sure which, but I emailed the
+> maintainer of the haskell library. It just doesn't think these
+> things are characters; even `.` fails to match them! Everything should
+> match that...
+>
+> There are apparently quite a lot of bugs on POSIX regex libraries
+> as implemented on different systems:
+> <http://www.haskell.org/haskellwiki/Regex_Posix>
+>
+> It seemed best to jettison this dependency entirely; I've switched it to
+> haskell's pure regex-tdfa library, which works nicely. [[done]]
+> --[[Joey]]

@@ -19,7 +19,7 @@ quite a lot.
   * [DAV](http://hackage.haskell.org/package/DAV) (optional)
   * [SafeSemaphore](http://hackage.haskell.org/package/SafeSemaphore)
   * [UUID](http://hackage.haskell.org/package/uuid)
-  * [Glob](http://hackage.haskell.org/package/Glob)
+  * [regex-tdfa](http://hackage.haskell.org/package/regex-tdfa)
 * Optional haskell stuff, used by the [[assistant]] and its webapp (edit Makefile to disable)
   * [stm](http://hackage.haskell.org/package/stm)
     (version 2.3 or newer)

@@ -70,7 +70,7 @@ Executable git-annex
    extensible-exceptions, dataenc, SHA, process, json,
    base (>= 4.5 && < 4.8), monad-control, transformers-base, lifted-base,
    IfElse, text, QuickCheck >= 2.1, bloomfilter, edit-distance, process,
-   SafeSemaphore, uuid, random, regex-compat
+   SafeSemaphore, uuid, random, regex-tdfa
   -- Need to list these because they're generated from .hsc files.
   Other-Modules: Utility.Touch Utility.Mounts
   Include-Dirs: Utility