p2phttp: Scan multilevel directories with --directory

This allows for, e.g., a dir/user/repo structure, but also other layouts.
It still does not look for repositories that are nested inside other
repositories.

The check for symlinks is mostly to avoid cycles that would prevent
findRepos from returning. Eg, foo/bar/baz being a symlink to foo/bar.
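
The scan described above can be sketched outside of git-annex as follows.
This is only an illustration under stated assumptions, not the code from
this commit: `scanRepos` and `looksLikeRepo` are hypothetical names, the
"is this a repository" test is simplified to checking for a `.git`
subdirectory, and it uses the `directory` and `filepath` packages rather
than git-annex's own utilities.

```haskell
import Control.Exception (IOException, try)
import System.Directory
import System.FilePath ((</>))

-- Hypothetical stand-in for a real repository check: a directory counts
-- as a repository if it has a ".git" subdirectory.
looksLikeRepo :: FilePath -> IO Bool
looksLikeRepo dir = doesDirectoryExist (dir </> ".git")

-- Run an IO action, turning any IOException into a Left.
tryIO :: IO a -> IO (Either IOException a)
tryIO = try

-- Recursively collect repositories below a top directory.
scanRepos :: FilePath -> IO [FilePath]
scanRepos top = do
    entries <- listDirectory top
    concat <$> mapM (go . (top </>)) entries
  where
    go path = do
        isRepo <- looksLikeRepo path
        if isRepo
            -- Stop here, so nested repositories are never found.
            then return [path]
            else do
                -- Do not descend through symlinks, which avoids
                -- cycles such as foo/bar/baz -> foo/bar. If the
                -- check itself fails, skip the entry.
                isLink <- either (const True) id
                    <$> tryIO (pathIsSymbolicLink path)
                if isLink
                    then return []
                    -- Ignore errors listing a subdirectory (this
                    -- also covers plain files).
                    else either (const []) id
                        <$> tryIO (scanRepos path)
```

With a layout like `/foo/user/repo/.git`, `scanRepos "/foo"` would return
`/foo/user/repo` and would not look inside it for further repositories.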

If the directory is writable by someone else, they can still race it and
get it to follow a symlink to some other directory. I don't think p2phttp
needs to worry about that kind of situation though, and I doubt it avoids
such problems when operating on files in a git-annex repository either.
Joey Hess 2025-07-07 16:07:13 -04:00
parent 0ad937f230
commit 66b009a0f6
GPG key ID: DB12DB0FF05F8F38
3 changed files with 30 additions and 6 deletions


@@ -1,3 +1,9 @@
+git-annex (10.20250631) UNRELEASED; urgency=medium
+
+  * p2phttp: Scan multilevel directories with --directory.
+
+ -- Joey Hess <id@joeyh.name>  Mon, 07 Jul 2025 15:59:42 -0400
+
 git-annex (10.20250630) upstream; urgency=medium
 
   * Work around git 2.50 bug that caused it to crash when there is a merge


@@ -1,6 +1,6 @@
 {- git-annex command
  -
- - Copyright 2024 Joey Hess <id@joeyh.name>
+ - Copyright 2024-2025 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -21,11 +21,13 @@ import qualified Git
 import qualified Git.Construct
 import qualified Annex
 import Types.Concurrency
+import qualified Utility.RawFilePath as R
 import Servant
 import qualified Network.Wai.Handler.Warp as Warp
 import qualified Network.Wai.Handler.WarpTLS as Warp
 import Network.Socket (PortNumber)
+import System.PosixCompat.Files (isSymbolicLink)
 import qualified Data.Map as M
 import Data.String
 import Control.Concurrent.STM
@@ -268,6 +270,20 @@ findRepos :: Options -> IO [Git.Repo]
 findRepos o = do
 	files <- concat
 		<$> mapM (dirContents . toOsPath) (directoryOption o)
-	map Git.Construct.newFrom . catMaybes
-		<$> mapM Git.Construct.checkForRepo files
+	concat <$> mapM go files
+  where
+	go f = Git.Construct.checkForRepo f >>= \case
+		Just loc -> return [Git.Construct.newFrom loc]
+		Nothing ->
+			-- Avoid following symlinks, both to avoid
+			-- cycles and in case there is an unexpected
+			-- symlink to some other directory we are not
+			-- supposed to serve.
+			ifM (isSymbolicLink <$> R.getSymbolicLinkStatus f)
+				( return []
+				-- Ignore any errors getting the contents of a
+				-- subdirectory.
+				, catchNonAsync
+					(concat <$> (mapM go =<< dirContents f))
+					(const (return []))
+				)


@@ -41,8 +41,10 @@ convenient way to download the content of any key, by using the path
 * `--directory=path`
 
-  Serve each git-annex repository found in immediate
-  subdirectories of a directory.
+  Serve each git-annex repository found in subdirectories of the directory.
+  For example, `--directory=/foo` will find git-annex repositories
+  in `/foo/bar`, `/foo/user/bar`, and so on. Note that a git-annex
+  repository located within another git-annex repository will not be found.
 
   This option can be provided more than once to serve several directories
   full of git-annex repositories.