Switch from using regex-compat to regex-tdfa, as the C regex library is rather buggy.

Joey Hess 2013-03-08 15:29:01 -04:00
parent 0dbea8a9a1
commit a2d94bd627
7 changed files with 29 additions and 7 deletions


@ -13,7 +13,8 @@ import Data.Time.Clock.POSIX
import qualified Data.Set as S
import qualified Data.Map as M
import System.Path.WildMatch
-import Text.Regex
+import Text.Regex.TDFA
+import Text.Regex.TDFA.String
import Common.Annex
import qualified Annex
@ -83,12 +84,17 @@ limitExclude :: MkLimit
limitExclude glob = Right $ const $ return . not . matchglob glob
{- Could just use wildCheckCase, but this way the regex is only compiled
- - once. -}
+ - once. Also, we use regex-TDFA because it's less buggy in its support
+ - of non-unicode characters. -}
matchglob :: String -> Annex.FileInfo -> Bool
matchglob glob (Annex.FileInfo { Annex.matchFile = f }) =
-    isJust $ matchRegex cregex f
+    case cregex of
+        Right r -> case execute r f of
+            Right (Just _) -> True
+            _ -> False
+        Left _ -> error $ "failed to compile regex: " ++ regex
  where
-    cregex = mkRegex regex
+    cregex = compile defaultCompOpt defaultExecOpt regex
    regex = '^':wildToRegex glob
{- Adds a limit to skip files not believed to be present
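
The hunk above moves from `mkRegex`/`matchRegex` to regex-tdfa's `compile` and `execute`, which report failures through `Either` instead of hiding them. A minimal standalone sketch of that pattern follows. `mkMatcher` is a hypothetical helper name, not part of git-annex, and it takes a plain regex string rather than the output of `wildToRegex`, so it only assumes the regex-tdfa package is installed.

```haskell
import Text.Regex.TDFA (defaultCompOpt, defaultExecOpt)
import Text.Regex.TDFA.String (compile, execute)

-- Hypothetical helper (not from git-annex): compile a pattern once and
-- return a reusable predicate, or the compiler's error message.
mkMatcher :: String -> Either String (String -> Bool)
mkMatcher pat = do
    r <- compile defaultCompOpt defaultExecOpt pat
    return $ \s -> case execute r s of
        Right (Just _) -> True   -- some match was found
        _              -> False  -- no match, or an execution error
```

Returning the predicate inside `Either` lets the caller decide what to do with a pattern that fails to compile, whereas the commit's `matchglob` calls `error` in that case. Because regex-tdfa works on Haskell `Char`s, a pattern such as `^.$` should accept a single non-ASCII character, which is the behaviour the bug report found missing in the C library.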


@ -19,7 +19,7 @@ fast: dist/caballog
@$$(grep 'ghc --make' dist/caballog | head -n 1)
@ln -sf dist/build/git-annex/git-annex git-annex
-dist/caballog:
+dist/caballog: git-annex.cabal
$(CABAL) configure -f"-Production" -O0
$(CABAL) build -v2 | tee $@

debian/changelog

@ -34,6 +34,8 @@ git-annex (4.20130228) UNRELEASED; urgency=low
should add files in the current directory, but not act on unlocked files
elsewhere in the tree.
* assistant: Sync with all git remotes on startup.
+* Switch from using regex-compat to regex-tdfa, as the C regex library
+is rather buggy.
-- Joey Hess <joeyh@debian.org> Wed, 27 Feb 2013 23:20:40 -0400

debian/control

@ -9,6 +9,7 @@ Build-Depends:
libghc-hslogger-dev,
libghc-pcre-light-dev,
libghc-sha-dev,
+libghc-regex-tdfa-dev,
libghc-dataenc-dev,
libghc-utf8-string-dev,
libghc-hs3-dev (>= 0.5.6),


@ -33,3 +33,16 @@ git-annex 4.20130227, on Debian GNU/Linux (sid, i386).
LC_ALL=
</code></pre>
+> Tracked this back to a bug in either the C library or the haskell
+> regex-posix wrapper around it. I'm not sure which, but I emailed the
+> maintainer of the haskell library. It just doesn't think these
+> things are characters; even `.` fails to match them! Everything should
+> match that...
+>
+> There are apparently quite a lot of bugs on POSIX regex libraries
+> as implemented on different systems:
+> <http://www.haskell.org/haskellwiki/Regex_Posix>
+>
+> It seemed best to jettison this dependency entirely; I've switched it to
+> haskell's pure regex-tdfa library, which works nicely. [[done]]
+> --[[Joey]]


@ -19,7 +19,7 @@ quite a lot.
* [DAV](http://hackage.haskell.org/package/DAV) (optional)
* [SafeSemaphore](http://hackage.haskell.org/package/SafeSemaphore)
* [UUID](http://hackage.haskell.org/package/uuid)
-* [Glob](http://hackage.haskell.org/package/Glob)
+* [regex-tdfa](http://hackage.haskell.org/package/regex-tdfa)
* Optional haskell stuff, used by the [[assistant]] and its webapp (edit Makefile to disable)
* [stm](http://hackage.haskell.org/package/stm)
(version 2.3 or newer)


@ -70,7 +70,7 @@ Executable git-annex
extensible-exceptions, dataenc, SHA, process, json,
base (>= 4.5 && < 4.8), monad-control, transformers-base, lifted-base,
IfElse, text, QuickCheck >= 2.1, bloomfilter, edit-distance, process,
-SafeSemaphore, uuid, random, regex-compat
+SafeSemaphore, uuid, random, regex-tdfa
-- Need to list these because they're generated from .hsc files.
Other-Modules: Utility.Touch Utility.Mounts
Include-Dirs: Utility