Merge branch 'master' of ssh://git-annex.branchable.com

This commit is contained in:
Joey Hess 2013-10-11 17:35:54 -04:00
commit 36f8bfdf37
3 changed files with 79 additions and 0 deletions

View file

@ -0,0 +1,61 @@
### Please describe the problem.
`addurl` doesn't support the internet archive:
1. it doesn't actually accept the proper URL as a secondary source of content
2. it doesn't parse the HTML from the video page (the "details page")
### What steps will reproduce the problem?
# download eben moglen's excellent re:publica presentation from youtube
git annex addurl https://www.youtube.com/watch?v=sKOk4Y4inVY
# copy that file aside
cp -L re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm-bak
# drop it so we can try again
git annex drop re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
# add the IA URL for the same video, failing
git annex addurl --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
# try again with --relaxed to skip some checks
git annex addurl --relaxed --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
# observe both files are the same size and checksum
The files should look like this:
[[!format txt """
anarcat@angela:presentations$ ls -alL re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm*
-r--r--r-- 1 anarcat anarcat 419359123 oct 9 23:41 re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
-r--r--r-- 1 anarcat anarcat 419359123 oct 11 19:40 re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm-bak
anarcat@angela:presentations$ md5sum re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm*
7892df24a9e1c40e2587be1035728ef0 re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
7892df24a9e1c40e2587be1035728ef0 re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm-bak
"""]]
There are two separate bugs here: one is the above need to use --relaxed even though the file is the same.
The second is probably simply that quvi doesn't support the internet archive, and maybe that one should be moved to a separate [[todo]]/[[wishlist]]. I was expecting this to "do the right thing" (ie. download the video):
git annex addurl http://archive.org/details/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia
... but instead it downloads the HTML.
### What version of git-annex are you using? On what operating system?
my good old faithful `4.20130921-g434dc22` i compiled manually some time ago. :)
### Please provide any additional information below.
[[!format sh """
anarcat@marcos:presentations$ git annex addurl --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://archive.org/download/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
addurl re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
failed to verify url exists: http://archive.org/download/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
failed
git-annex: addurl: 1 failed
anarcat@marcos:presentations$ git annex addurl --debug --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
[2013-10-09 18:26:30 EDT] call: quvi ["-v","mute","--support","http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm"]
addurl re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm [2013-10-09 18:26:30 EDT] read: curl ["-s","--head","-L","http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm","-w","%{http_code}"]
failed to verify url exists: http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
failed
git-annex: addurl: 1 failed
"""]]
Originally reported in [[tips/Internet_Archive_via_S3]]. --[[anarcat]]

View file

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.4.22"
subject="comment 1"
date="2013-10-11T18:49:25Z"
content="""
Afaik this was fixed in 747f5b123cb3c6b3b87d4e79f8767e69d842b96b.
Probably noone has bothered to add IA support to quvi, but it should be doable.
"""]]

View file

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://id.koumbit.net/anarcat"
ip="72.0.72.144"
subject="still a bug, filed separately!"
date="2013-10-11T18:49:06Z"
content="""
Aaah, of course, sorry for the noise here. It turns out that this is *not* because the filesize (or even the checksum, for that matter) are different, so there's clearly a bug there, and i filed it in [[bugs/addurl_fails_on_the_internet_archive]]. Thanks!
"""]]