honor preferred content settings of cluster nodes

Except when no nodes want a file, it has to be stored somewhere, so
store it on all. Which is not really desirable, but neither is having to
pick one.

ProtoAssociatedFile deserialization is rather broken, and this could
possibly affect preferred content expressions that match on filenames.

The inability to roundtrip whitespace like tabs and newlines through is
not a problem because preferred content expressions can't be written
that match on whitespace such as a tab. For example:

joey@darkstar:~/tmp/bench/z>git-annex wanted  origin-node2 'exclude=*CTRL-VTab*'
wanted origin-node2
git-annex: Parse error: Parse failure: near "*"

But, the filtering of control characters could perhaps be a problem. I think
that filtering is now obsolete, git-annex has comprehensive filtering of
control characters when displaying filenames, that happens at a higher level.
However, I don't want to risk a security hole so am leaving in that filtering
in ProtoAssociatedFile deserialization for now.
This commit is contained in:
Joey Hess 2024-06-25 11:35:41 -04:00
parent a23b0abf28
commit 1bfe7f8a53
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 48 additions and 28 deletions

View file

@ -173,12 +173,15 @@ instance Proto.Serializable Service where
-- its serialization cannot contain any whitespace. This is handled
-- by replacing whitespace with '%' (and '%' with '%%')
--
-- When deserializing an AssociatedFile from a peer, it's sanitized,
-- to avoid any unusual characters that might cause problems when it's
-- displayed to the user.
-- When deserializing an AssociatedFile from a peer, that escaping is
-- reversed. Unfortunately, an input tab will be deescaped to a space
-- though. And it's sanitized, to avoid any control characters that might
-- cause problems when it's displayed to the user.
--
-- These mungings are ok, because a ProtoAssociatedFile is only ever displayed
-- to the user and does not need to match a file on disk.
-- These mungings are ok, because a ProtoAssociatedFile is normally
-- only displayed to the user and so does not need to match a file on disk.
-- It may also be used in checking preferred content, which is very
-- unlikely to care about spaces vs tabs or control characters.
instance Proto.Serializable ProtoAssociatedFile where
serialize (ProtoAssociatedFile (AssociatedFile Nothing)) = ""
serialize (ProtoAssociatedFile (AssociatedFile (Just af))) =