{- git-annex checking whether content is wanted
 -
fix longstanding indeterminate preferred content for duplicated file problem
* drop: When two files have the same content, and a preferred content
expression matches one but not the other, do not drop the file.
* sync --content, assistant: Fix an edge case where a file that is not
preferred content did not get dropped.
The sync --content edge case is that handleDropsFrom loaded associated files
and used them without verifying that the information from the database was
not stale.
It seemed best to avoid changing --want-drop's behavior, this way when
debugging a preferred content expression with it, the files matched will
still reflect the expression. So added a note to the --want-drop documentation,
to make clear it may not behave identically to git-annex drop --auto.
While it would be possible to introspect the preferred content
expression to see if it matches on filenames, and only look up the
associated files when it does, it's generally fairly rare for 2 files to
have the same content, and the database lookup is already avoided when
there's only 1 file, so I did not implement that further optimisation.
Note that there are still some situations where the associated files
database does not get locked files recorded in it, which will prevent
this fix from working.
Sponsored-by: Dartmouth College's Datalad project
2021-05-24 18:02:50 +00:00
 - Copyright 2012-2021 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Annex.Wanted where

import Annex.Common
import Logs.PreferredContent
import Annex.UUID
import Annex.CatFile
import Git.FilePath
import qualified Database.Keys
import Types.FileMatcher

import qualified Data.Set as S

{- Check if a file is preferred content for the local repository. -}
wantGet :: LiveUpdate -> Bool -> Maybe Key -> AssociatedFile -> Annex Bool
wantGet lu d key file = isPreferredContent lu Nothing S.empty key file d

{- Check if a file is preferred content for a repository. -}
wantGetBy :: LiveUpdate -> Bool -> Maybe Key -> AssociatedFile -> UUID -> Annex Bool
wantGetBy lu d key file to = isPreferredContent lu (Just to) S.empty key file d

{- Check if a file is not preferred or required content, and can be
 - dropped. When a UUID is provided, checks for that repository.
 -
 - The AssociatedFile is the one that the user requested to drop.
 - There may be other files that use the same key, and preferred content
 - may match some of those and not others. If any are preferred content,
 - that will prevent dropping. When the other associated files are known,
 - they can be provided, otherwise this looks them up.
 -}
wantDrop :: LiveUpdate -> Bool -> Maybe UUID -> Maybe Key -> AssociatedFile -> (Maybe [AssociatedFile]) -> Annex Bool
wantDrop lu d from key file others =
    isNothing <$> checkDrop isPreferredContent lu d from key file others

{- Generalization of wantDrop that can also be used with isRequiredContent.
 -
 - When the content should not be dropped, returns Just the file that
 - the checker matches.
 -}
checkDrop :: (LiveUpdate -> Maybe UUID -> AssumeNotPresent -> Maybe Key -> AssociatedFile -> Bool -> Annex Bool) -> LiveUpdate -> Bool -> Maybe UUID -> Maybe Key -> AssociatedFile -> (Maybe [AssociatedFile]) -> Annex (Maybe AssociatedFile)
checkDrop checker lu d from key file others = do
    u <- maybe getUUID (pure . id) from
    let s = S.singleton u
    let checker' f = checker lu (Just u) s key f d
    ifM (checker' file)
        ( return (Just file)
        , do
            others' <- case others of
                Just afs -> pure (filter (/= file) afs)
                Nothing -> case key of
                    Just k ->
                        mapM (\f -> AssociatedFile . Just <$> fromRepo (fromTopFilePath f))
                            =<< Database.Keys.getAssociatedFiles k
                    Nothing -> pure []
            l <- filterM checker' others'
            if null l
                then return Nothing
                else checkassociated l
        )
  where
    -- Some associated files that are in the keys database may no
    -- longer correspond to files in the repository, and should
    -- not prevent dropping.
    checkassociated [] = return Nothing
    checkassociated (af@(AssociatedFile (Just f)):fs) =
        catKeyFile f >>= \case
            Just k | Just k == key -> return (Just af)
            _ -> checkassociated fs
    checkassociated (AssociatedFile Nothing:fs) = checkassociated fs