addurl --preserve-filename and a few related changes

* addurl --preserve-filename: New option, uses server-provided filename
  without any sanitization, but with some security checking.

  Not yet implemented for remotes other than the web.

* addurl, importfeed: Avoid adding filenames with a leading '.'; instead,
  the leading '.' is replaced with '_'.

  This might be considered a security fix, but a CVE seems unwarranted.
  It was possible for addurl to create a dotfile, which could change
  behavior of some program. It was also possible for a web server to say
  the file name was ".git" or "foo/.git". That would not overwrite the
  .git directory, but would cause addurl to fail; of course git won't
  add "foo/.git".
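
  As an illustration of the replacement described above, here is a minimal
  Python sketch. It is not git-annex's code (git-annex is written in
  Haskell); the function name replace_leading_dot is hypothetical, and
  applying the rule to every '/'-separated component is an assumption made
  so that "foo/.git" is covered.

```python
def replace_leading_dot(path: str) -> str:
    """Replace a leading '.' in each '/'-separated component with '_'.

    Assumption: the rule is applied per path component, which is a guess
    based on the "foo/.git" case mentioned in the commit message.
    """
    parts = path.split("/")
    fixed = ["_" + p[1:] if p.startswith(".") else p for p in parts]
    return "/".join(fixed)


print(replace_leading_dot(".git"))       # _git
print(replace_leading_dot("foo/.git"))   # foo/_git
print(replace_leading_dot("notes.txt"))  # notes.txt
```

  Under this sketch a server-suggested ".git" can no longer produce a
  dotfile, and the string keeps its original length.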

sanitizeFilePath is too opinionated to remain in Utility, so moved it.

The changes to mkSafeFilePath are because it used sanitizeFilePath.
In particular:

	isDrive will never succeed, because "c:" gets munged to "c_"
	".." gets sanitized now
	".git" gets sanitized now
	It will never be null, because sanitizeFilePath keeps the length
	the same, and splitDirectories never returns a null path.
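
The length-preserving behaviour that the points above rely on can be
sketched in Python as a character-by-character mapping. This is only an
illustration: the name sanitize_chars and the set of characters kept are
assumptions, and the real sanitizeFilePath rules may differ.

```python
def sanitize_chars(path: str) -> str:
    """Map every disallowed character to '_', one character at a time.

    Assumption: letters, digits, '.', '_', '-' and '/' are kept. Because
    each input character yields exactly one output character, the result
    always has the same length as the input.
    """
    keep = "._-/"
    return "".join(c if c.isalnum() or c in keep else "_" for c in path)


print(sanitize_chars("c:"))        # c_
print(sanitize_chars("a b?.txt"))  # a_b_.txt
```

Since "c:" becomes "c_", a later drive-letter check never sees a drive, and
a non-empty input can never become empty.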

Also, on the off chance a web server suggests a filename of "",
ignore that, rather than trying to save to such a filename, which would
fail in some way.
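
A minimal Python sketch of ignoring an empty suggestion follows. The
function name choose_filename is hypothetical, and falling back to the last
path segment of the URL is an assumption; the commit message only says that
the empty name is ignored.

```python
import posixpath
from urllib.parse import urlparse


def choose_filename(suggested, url):
    """Use the server-suggested name unless it is missing or empty.

    Assumption: when the suggestion is ignored, the name is derived from
    the last segment of the URL path.
    """
    if suggested:  # None and "" are both treated as "no suggestion"
        return suggested
    return posixpath.basename(urlparse(url).path)


url = "http://example.com/dir/file.tar.gz"
print(choose_filename("", url))            # file.tar.gz
print(choose_filename("report.pdf", url))  # report.pdf
```
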
Joey Hess 2020-05-08 16:09:29 -04:00
parent 54599207f7
commit 6952060665
GPG key ID: DB12DB0FF05F8F38
9 changed files with 132 additions and 39 deletions


@@ -0,0 +1,18 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2020-05-08T19:56:27Z"
content="""
Implemented git-annex addurl --preserve-filename, which will do what you
want.
Leaving this bug open because I only implemented it for web urls, not yet
for torrents and other special remotes that have their own url scheme.
The sanitization for those is currently done at a lower level than addurl,
and so that will take a bit more work to implement.
(importfeed does not, I think, need to implement this option, because
the filenames are based on information from the rss feed, and it's
perfectly fine to sanitize eg a podcast episode title to get a reasonable
filename.)
"""]]