From 17c013aa92d094adb670cb9ca1bea7d1e6dbfff4 Mon Sep 17 00:00:00 2001 From: "https://id.koumbit.net/anarcat" Date: Sat, 30 May 2015 06:15:25 +0000 Subject: [PATCH] weird rss feeds with youtube enclosures and unicorns --- ..._can__39__t_deal_with_pycon_rss_feeds.mdwn | 25 +++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 doc/bugs/importfeed_can__39__t_deal_with_pycon_rss_feeds.mdwn diff --git a/doc/bugs/importfeed_can__39__t_deal_with_pycon_rss_feeds.mdwn b/doc/bugs/importfeed_can__39__t_deal_with_pycon_rss_feeds.mdwn new file mode 100644 index 0000000000..1cfe85c0f8 --- /dev/null +++ b/doc/bugs/importfeed_can__39__t_deal_with_pycon_rss_feeds.mdwn @@ -0,0 +1,25 @@ +### Please describe the problem. + +[[git-annex-importfeed]] cannot reliably import feeds from + +### What steps will reproduce the problem? + + git annex importfeed pyvideo.org/category/65/pycon-us-2015/rss + +### What version of git-annex are you using? On what operating system? + +5.20141125 + +### Please provide any additional information below. + +It seems related to the fact that the `enclosure` XML item links to the youtube page instead of the actual video, which sounds a little crazy... so maybe the feed is just wrong and this is invalid. + +In [this pull request to the conference proceedings repo](https://github.com/RichiH/conference_proceedings/pull/42), i hacked around that problem with a [simple loop](https://github.com/anarcat/conference_proceedings/blob/pycon/PyCon/README.md): + + curl -s http://pyvideo.org/category/65/pycon-us-2015/rss | xmllint --xpath '//enclosure/@url' - | sed 's/url=/\n/g' | xargs git annex addurl --fast + +xargs doesn't actually work so well because every once in a while, youtube will return a 403 and the whole pipeline crumbles to the ground. I found this to work better: + + curl -s http://pyvideo.org/category/65/pycon-us-2015/rss | xmllint --xpath '//enclosure/@url' - | sed 's/url="/\n/g;s/"//' | while read url ; do git annex addurl --fast $url ; done + +but it's pretty ugly! --[[anarcat]]