Merge branch 'master' into youtube-dl
commit 640cb36a5c
12 changed files with 207 additions and 2 deletions
4
Makefile
@@ -299,12 +299,12 @@ dist/caballog: git-annex.cabal
 # TODO should be possible to derive this from caballog.
 hdevtools:
 	hdevtools --stop-server || true
-	hdevtools check git-annex.hs -g -cpp -g -i -g -idist/build/git-annex/git-annex-tmp -g -i. -g -idist/build/autogen -g -Idist/build/autogen -g -Idist/build/git-annex/git-annex-tmp -g -IUtility -g -DWITH_TESTSUITE -g -DWITH_S3 -g -DWITH_ASSISTANT -g -DWITH_INOTIFY -g -DWITH_DBUS -g -DWITH_PAIRING -g -g -optP-include -g -optPdist/build/autogen/cabal_macros.h -g -odir -g dist/build/git-annex/git-annex-tmp -g -hidir -g dist/build/git-annex/git-annex-tmp -g -stubdir -g dist/build/git-annex/git-annex-tmp -g -threaded -g -Wall -g -XHaskell98 -g -XPackageImports
+	hdevtools check git-annex.hs -g -cpp -g -i -g -idist/build/git-annex/git-annex-tmp -g -i. -g -idist/build/autogen -g -Idist/build/autogen -g -Idist/build/git-annex/git-annex-tmp -g -IUtility -g -DWITH_TESTSUITE -g -DWITH_S3 -g -DWITH_ASSISTANT -g -DWITH_INOTIFY -g -DWITH_DBUS -g -DWITH_PAIRING -g -g -optP-include -g -optPdist/build/autogen/cabal_macros.h -g -odir -g dist/build/git-annex/git-annex-tmp -g -hidir -g dist/build/git-annex/git-annex-tmp -g -stubdir -g dist/build/git-annex/git-annex-tmp -g -threaded -g -Wall -g -XHaskell98 -g -XPackageImports -g -XLambdaCase
 
 distributionupdate:
 	git pull
 	cabal configure
-	ghc -Wall -fno-warn-tabs --make Build/DistributionUpdate -XPackageImports -optP-include -optPdist/build/autogen/cabal_macros.h
+	ghc -Wall -fno-warn-tabs --make Build/DistributionUpdate -XLambdaCase -XPackageImports -optP-include -optPdist/build/autogen/cabal_macros.h
 	./Build/DistributionUpdate
 
 .PHONY: git-annex git-union-merge tags
15
doc/bugs/Adjust_--unlock_not_using_--reflink__63__.mdwn
Normal file
@@ -0,0 +1,15 @@
### Please describe the problem.

Running adjust --unlock is unexpectedly slow and seems to use a lot of space, even on BTRFS, suggesting it probably does not use --reflink=auto like most other commands.

### What steps will reproduce the problem?

Run adjust --unlock with very large files.

### What version of git-annex are you using? On what operating system?

6.20170101-1+deb9u1 on Debian Stretch

### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

Yes I have! I've used it to manage lots of video editing disks before, and am now migrating several slightly different copies of 15TB sized documentary footage from random USB3 disks and LTO tapes to a RAID server with BTRFS.
30
doc/devblog/youtube-dl.mdwn
Normal file
@@ -0,0 +1,30 @@
Working on [[todo/switch_from_quvi_to_youtube-dl]], because
quvi is not being maintained and youtube-dl can download a lot more stuff.

Unfortunately, youtube-dl's interface is not a good fit for git-annex,
compared with quvi's interface which was a near-perfect fit. Two things
git-annex relied on quvi for are a way to check if a url has embedded media
without downloading the url, and a way to get the url from which the
embedded media can be downloaded. Youtube-dl supports neither. Also it has
some other warts that make it unnecessarily hard to interface with, like not
always [storing the download in the location specified by --output](https://github.com/rg3/youtube-dl/issues/14864),
and [sometimes crashing when downloading non-media urls (eg over my satellite internet)](http://bugs.debian.org/874321).

I've found ways to avoid all these problems. For example, to make
`git annex addurl` avoid the unnecessary overhead of running youtube-dl
in the common case of downloading some non-web-page file, I'll have it
download the url content, and check if it looks like an html page.
Only then will it use youtube-dl. So addurl of html pages without
embedded media will get slower, but addurl of everything else
will be as fast as before.
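
A minimal sketch of the kind of "does it look like html" check described above. This is not git-annex's actual code (which is Haskell and is not shown in this post); the function name `looks_like_html`, the marker list, and the 4 KiB prefix size are all assumptions made for illustration.

```python
import re

# Hypothetical heuristic: tags that commonly appear near the start of an html page.
_HTML_MARKERS = re.compile(
    rb"<!doctype\s+html|<html[\s>]|<head[\s>]|<body[\s>]",
    re.IGNORECASE,
)


def looks_like_html(content: bytes, prefix_size: int = 4096) -> bool:
    """Guess whether downloaded content is an html page by inspecting its prefix."""
    prefix = content[:prefix_size]
    # Binary media usually has NUL bytes early on; html text normally does not.
    if b"\x00" in prefix:
        return False
    return _HTML_MARKERS.search(prefix) is not None
```

Under this sketch, only content for which `looks_like_html` returns true would be handed to youtube-dl; everything else is kept as an ordinary download.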

But there's an unavoidable change to `addurl --relaxed`. It will not check
for embedded media any more, because that would make it a lot slower, since
it would have to hit the network. `addurl --fast` will have to be used for
such urls instead. I hope this behavior change won't affect workflows
badly.

Today was all coding groundwork, and I just got to the point that I'm
ready to have it run youtube-dl. Hope to finish it tomorrow.

Today's work was sponsored by Jake Vosloo [on Patreon](https://www.patreon.com/joeyh).
13
doc/devblog/youtube-dl_day_2.mdwn
Normal file
@@ -0,0 +1,13 @@
It's mostly working now. Still need to fix --fast and --relaxed, and avoid
youtube-dl running out of the annex.diskreserve.

The first hour or two was spent adding support for per-key temp
directories. youtube-dl is run inside such a directory, to let it write
whatever files it needs. Like the per-key temp files, these temp directories
are not cleaned up when a download fails or is interrupted, so resuming can
pick up where it left off. Taught `git annex dropunused` and everything
else that cleans up per-key temp files to also clean up the temp
directories.
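
The per-key temp directory idea can be sketched roughly as follows. This is an illustration only, not git-annex's implementation: the hashed directory naming, the `download` callback, and the names `key_tmp_dir`, `download_with_resume` and `cleanup_key` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def key_tmp_dir(tmp_root: Path, key: str) -> Path:
    """Return (creating it if needed) the work directory that belongs to one key."""
    # Hypothetical naming scheme: hash the key so any key is a safe directory name.
    name = hashlib.sha256(key.encode("utf-8")).hexdigest()
    path = tmp_root / name
    path.mkdir(parents=True, exist_ok=True)
    return path


def download_with_resume(tmp_root: Path, key: str, dest: Path, download) -> None:
    """Run download(workdir) inside the key's directory and keep it on failure."""
    workdir = key_tmp_dir(tmp_root, key)
    # If download raises, workdir is deliberately left in place so that a
    # later call for the same key can pick up the partial files.
    produced = download(workdir)
    shutil.move(str(produced), str(dest))
    shutil.rmtree(workdir)  # only reached after a successful download


def cleanup_key(tmp_root: Path, key: str) -> None:
    """Explicit cleanup, in the spirit of what dropunused does for temp files."""
    name = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shutil.rmtree(tmp_root / name, ignore_errors=True)
```

The point of the design is that a failed or interrupted run leaves its directory behind, and only an explicit cleanup or a successful download removes it.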

Today's work was sponsored by Trenton Cronholm on
[Patreon](https://patreon.com/joeyh/)
32
doc/forum/Rename_local_repository_for_git-annex-info.mdwn
Normal file
@@ -0,0 +1,32 @@
I initialized a local repository with

    git-annex init $HOSTNAME --version=6

Unfortunately I didn't change HOSTNAME on a new machine and it was 'localhost.localdomain'. I didn't notice that before I cloned a git-annex repository.

Now in the remote repository when I run `git-annex info` (and same in local repository), I see

    $ git-annex info
    ...
    semitrusted repositories: 6
    00000000-0000-0000-0000-000000000001 -- web
    00000000-0000-0000-0000-000000000002 -- bittorrent
    0085(maybe it's not secure to write it on the internet)-e8f803a -- localhost.localdomain
    ...

(and other repositories. By the way, I never initialized 'web' and 'bittorrent'; where did they come from?)

I would like 'localhost.localdomain' to become my real $HOSTNAME, so that I can distinguish that machine. How could I do that?

I found [How to rename a remote](https://git-annex.branchable.com/forum/How_to_rename_a_remote__63__/), but my 'localhost' is not listed in git-remotes.

I grep-ed .git for 'localhost.localdomain', and changed `.git/COMMIT_EDITMSG`. However, after running git-annex sync it returns to 'localhost.localdomain'.

    $ more .git/COMMIT_EDITMSG
    git-annex in Acer
    $ git-annex sync
    ...
    $ more .git/COMMIT_EDITMSG
    git-annex in localhost.localdomain

I would like to change 'localhost' to my real machine name both on the remote repository from which I cloned and on local repository. Thank you.
@@ -0,0 +1,13 @@
Hello. Am a newbie to Git Annex(ga), but love it already. I kept trying to index my own important files for the past long time, but ended up all tangled up. With ga I now see a light at the end of the tunnel! (Hope it's not a train heading my way :)

So thanks a bucket for writing Git Annex!

I am an "archiver": Every file I add to ga repo is a never-to-be-changed file (its checksum stays same throughout eternity, only metadata keeps changing). All I need ga for atm is to tag all files. Unfortunately we are talking about a few hundred thousand files and the performance with the master git-annex-6.20170519 is not quite what one might hope for.

From your design/caching_database doc I gather that the outlook with metadata is positive ( "For metadata, the story is much nicer. Querying for 30000 keys that all have a particular tag in their metadata takes 0.65s. So fast enough to be used in views." ), but is not in a db (sqlite) yet in the master (git-annex-6.20170519) . I tried to dig through some of the Links there to find out which commit I could check out and build to try out a cached metadata, but to no avail.

Since I don't ever change any file once it gets checked into the ga repo, does that simplify my possible use of current metadata cache code, or will I have to try to learn haskell and will I need to code stuff to get performance (creating views and such).

TIA for any pointers, tips and caveats and THANKS AGAIN FOR WRITING GIT-ANNEX.

ganewbie01
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="ganewbie01"
avatar="http://cdn.libravatar.org/avatar/a3b7d6e560486cb87c51cb0cf3328c8e"
subject="development branches inaccessible?"
date="2017-11-26T13:12:16Z"
content="""
To not sit idle, I've been looking for development branches (specifically the one containing code that gave rise to Joey's claim \"Querying for 30000 keys that all have a particular tag in their metadata takes 0.65s.\"), but could find only repos with the one branch - the master branch, which doesn't (naturally seem to) include the code for SQLite metadata tinkering.

Is there someplace I could find such development branches please?
"""]]
@@ -0,0 +1,39 @@
[[!comment format=mdwn
username="olaf"
avatar="http://cdn.libravatar.org/avatar/4ae498d3d6ee558d6b65caa658f72572"
subject="comment 2"
date="2017-11-27T05:39:04Z"
content="""
Did you clone the repository?

    $ git clone git://git-annex.branchable.com/ git-annex

I see lots of branches (remember they are *remote* branches so you will need the `-a` flag):

    $ git branch -a
    * master
      remotes/origin/HEAD -> origin/master
      remotes/origin/atomic-store-test
      remotes/origin/debian
      remotes/origin/debian-jessie-backport
      remotes/origin/debian-squeeze-backport
      remotes/origin/debian-stable-security-fix
      remotes/origin/debian-wheezy-backport
      remotes/origin/ghc7.0
      remotes/origin/improved-smudge-filters
      remotes/origin/master
      remotes/origin/newwinrelease
      remotes/origin/no-direct-mode
      remotes/origin/p2p-map
      remotes/origin/setup
      remotes/origin/smudge
      remotes/origin/tweak-fetch
      remotes/origin/uuid-type-rework
      remotes/origin/winsplicehack

You can checkout one of the branches like:

    $ git checkout remotes/origin/setup

Does that help?
"""]]
@@ -0,0 +1,15 @@
[[!comment format=mdwn
username="ganewbie01"
avatar="http://cdn.libravatar.org/avatar/a3b7d6e560486cb87c51cb0cf3328c8e"
subject="found it! ( I think ... or should I be still looking for "database" branch? )"
date="2017-11-28T01:03:05Z"
content="""
hi, thanks for your reply;
I've spent several hours today looking through the git-annex repo. I think it was a great idea to place the forums and everything in one repo! It provides sort of a \"running commentary\" on what was going on and why ...

After a couple of hours looking through the repo using tig, I checked out the key commit \"bb242bdd82a438ebfc937609d8d13b512cb49943\" and found the foo.hs and fooes.hs files which are most likely the ones that Joey was writing about when he expressed hopes for metadata in an sqlite file. ( I didn't find a way to see \"old branches\" though, e.g. the one named `database`. Maybe if I study git more ... )

Thanks for your reply to a silly newbie question anyway! I'll study this some more and see if I have some on-topic questions (hopefully they will be more educated by then :) )

g'day!
"""]]
@@ -0,0 +1,22 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2017-11-28T21:47:54Z"
content="""
Yeah, you found the stuff. That's as far as the metadata cache idea has
gotten yet. I've restored the missing "database" branch, which was just
that commit you found.

I do hope to circle back around to this eventually to speed up generating
views and other metadata queries.

But, as a programmer, you could create your own sqlite database and put
metadata about your git-annex repository in it. Using
`git annex metadata --batch --json` you can query git-annex
for metadata about your files as fast as it can pull it out of git,
and shove it into your database, and then write your own sql queries.

That would be a good first step, because working with real-world
data would help develop the sql schema and see if it'll be fast enough to
bother with putting into git-annex.
"""]]
|
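
A minimal sketch of the approach suggested in the comment above, in Python. It assumes that each line of `git annex metadata --json` output is a JSON object with `file`, `key`, and a `fields` map from field name to a list of values; check that against the output of your git-annex version. The table layout and the function names are made up for this illustration and are not part of git-annex.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """One row per (file, field, value); hypothetical schema for experiments."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metadata "
        "(file TEXT, key TEXT, field TEXT, value TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS metadata_field_value "
        "ON metadata (field, value)"
    )


def load_metadata(conn: sqlite3.Connection, json_lines) -> int:
    """Insert metadata from an iterable of JSON lines; return rows inserted."""
    rows = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed shape: {"file": ..., "key": ..., "fields": {"tag": ["a", "b"]}}
        fields = record.get("fields") or {}
        for field, values in fields.items():
            for value in values:
                rows.append((record.get("file"), record.get("key"), field, value))
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def files_with_tag(conn: sqlite3.Connection, tag: str) -> list:
    """Example query: every file that carries the given tag."""
    cur = conn.execute(
        "SELECT DISTINCT file FROM metadata "
        "WHERE field = 'tag' AND value = ? ORDER BY file",
        (tag,),
    )
    return [row[0] for row in cur]
```

The lines passed to `load_metadata` could come from the standard output of a `git annex metadata --json` process. Keeping the parsing separate from running the command makes the schema easy to try out on saved output first.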
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="sunny256"
avatar="http://cdn.libravatar.org/avatar/8a221001f74d0e8f4dadee3c7d1996e4"
subject="Version missing from the annex"
date="2017-11-29T16:15:03Z"
content="""
It seems this version is missing from https://downloads.kitenet.net/.git/ ; the newest version there is v6.20171109.
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2017-11-29T21:38:26Z"
content="""
Indeed it was. I must have forgotten to push out the files for that
release. Done so now.
"""]]