importfeed: Look at not only permalinks, but now also guids to identify previously downloaded files.

I've seen rss feeds that have no permalinks, only guids (which are
sometimes in the form of permalinks, argh/sigh).

I had previously avoided trusting guids to be globally unique, because my
survey of rss feeds that I subscribe to shows a lot of pretty bad
"guids" like "2 at http://serialpodcast.org" or even worse "oth20150401-hq".
The worry was that with podcasts generating guids so badly,
there's no guarantee they're actually globally unique.

But, I'm seeing too many url changes that result in redundant files, so
let's try this. If feeds are so broken that guids overlap, they could just
as well incorrectly call them permalinks too.
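
A minimal sketch of the behaviour change, not the actual importfeed code. It assumes the item id is a pair whose Bool records whether the feed marked the guid as a permalink, as the pattern match in the diff suggests. The names `ItemId`, `knownBefore`, and `knownAfter` are hypothetical and exist only for illustration.

```haskell
import qualified Data.Set as S

-- Hypothetical stand-in for the value returned by getItemId:
-- the Bool says whether the feed marked the guid as a permalink.
type ItemId = (Bool, String)

-- Old check: an item counts as known only if its guid is a permalink.
knownBefore :: S.Set String -> Maybe ItemId -> Bool
knownBefore known (Just (True, itemid)) = S.member itemid known
knownBefore _ _ = False

-- New check: any guid is looked up, permalink or not.
knownAfter :: S.Set String -> Maybe ItemId -> Bool
knownAfter known (Just (_, itemid)) = S.member itemid known
knownAfter _ Nothing = False
```

With `S.fromList ["oth20150401-hq"]` as the set of known items, `knownBefore` rejects `Just (False, "oth20150401-hq")` while `knownAfter` accepts it, so the file is not downloaded a second time when only its url changed.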
Joey Hess 2015-07-20 14:56:57 -04:00
parent 3c134ee21a
commit f95a8c8672
2 changed files with 6 additions and 2 deletions

@@ -219,8 +219,7 @@ performDownload opts cache todownload = case location todownload of
     | otherwise = a
   knownitemid = case getItemId (item todownload) of
-    -- only when it's a permalink
-    Just (True, itemid) -> S.member itemid (knownitems cache)
+    Just (_, itemid) -> S.member itemid (knownitems cache)
     _ -> False
   rundownload url extension getter = do