git-annex/doc/todo/terminal_escapes_in_filenames.mdwn
Joey Hess 3290a09a70
filter out control characters in warning messages
Converted warning and similar to use StringContainingQuotedPath. Most
warnings are static strings, some do refer to filepaths that need to be
quoted, and others don't need quoting.

Note that, since quote filters out control characters of even
UnquotedString, this makes all warnings safe, even when an attacker
sneaks in a control character in some other way.

When json is being output, no quoting is done, since json gets its own
quoting.

This does, as a side effect, make warning messages in json output not
be indented. The indentation is only needed to offset warning messages
underneath the display of the file they apply to, so that's ok.

Sponsored-by: Brett Eisenberg on Patreon
2023-04-10 15:55:44 -04:00

67 lines
2.7 KiB
Markdown

touch $(echo -e "\e[31mfoo\e[0m")
git-annex add
git-annex whereis
That displays "foo" in red twice. Compare with behavior of git commands that
display that filename, which display it escaped.
git-annex should probably do the same, when displaying filenames that it's
working on or in messages.
`git-annex find` is an interesting case because it's expected to be
pipeable, and so should have raw filenames. Note that `find` actually
escapes such filenames when outputting to a terminal, but not a pipe.
git porcelain also accepts the escaped form of files as input, necessary for
round-tripping though. git-annex currently does not. (git plumbing doesn't
either)
While terminals mostly protect against escape sequences doing very bad
things, there are security holes in terminals still being found.
Of course, such files in git repos can also be exploited by other commands
eg `echo *`.
So this does not seem like a security hole in git-annex, but it would be
useful defense in depth against terminal security holes, and also good to
behave more like git.
--[[Joey]]
> Git.Filename.encode is implemented, and only needs to be used.
> Note that core.quotePath controls whether git quotes unicode characters
> (by default it does), so once this gets implemented, some users may want
> to set that config to false. --[[Joey]]
> Update: Most git-annex commands now quote filenames, due to work on
> ActionItem display. `git-annex find`, `git-annex info $file`,
> and everywhere filenames get
> embedded in info messages still need to be done.
----
Also:
It's possible that keys can also contain an escape sequence, eg in the
extension of a SHA-E key. So commands like `git-annex lookupkey`
and `git-annex find` that output keys might need to handle
that, when outputting to a terminal?
Also:
`git-annex metadata` could also contain an escape sequence. So could
`git-annex config --get` and `git-annex schedule` and `git-annex wanted`
and `git-annex required` and `git-annex group`. And so could the
description of a repository. It seems that git-annex could just filter out
control characters from all of these, since they are not filenames, and
any control characters in them are surely malicious.
Also: git-annex importfeed displays urls from the feed, and should filter
out control characters. If such an url even can be parsed?
Also: git-annex initremote with autoenable may be able to cause a remote
with a malicious name to be set up?
Also: Any place that an exception is thrown with an attacker-controlled value.
`giveup` has been made to filter out control characters, but that leave
other exceptions, including ones thrown by libraries. Catch all exceptions
at top-level (of program and/or worker threads) and filter out control
characters?