Merge branch 'master' of git://git-annex.branchable.com

Richard Hartmann 2014-02-21 21:22:20 +01:00
commit 3ddb4bd08d
268 changed files with 5260 additions and 910 deletions

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="stp"
ip="188.193.207.34"
subject="Update on forgetting keys no longer present"
date="2014-02-17T23:21:49Z"
content="""
I have some repos where, due to some hiccups, file versions (no longer in the working tree) were lost, and now they come up again and again whenever fsck runs.
So I would be happy if I could make my repos forget these unavailable files via \"git annex forget $key\", and perhaps even have a better way to show all objects with numcopies=0.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.172"
subject="comment 3"
date="2014-02-20T18:54:48Z"
content="""
@stp, It seems to me if you just delete the symlinks in your git repository that point to the lost files, `git annex fsck` will shut up.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="stp"
ip="84.56.21.11"
subject="comment 4"
date="2014-02-20T21:07:41Z"
content="""
Yeah, true: if I remove the symlinks from the history (as I understand your suggestion), it would work. I just wanted to suggest that this could be a useful addition to the git annex forget function, as it already cleans out old dead repos and other things.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.172"
subject="comment 5"
date="2014-02-20T21:14:21Z"
content="""
You don't need to delete them from the history, just from the branch you're running `git annex fsck` in.
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="stp"
ip="84.56.21.11"
subject="comment 6"
date="2014-02-20T22:19:12Z"
content="""
As discussed on IRC:
`git annex fsck --all` checks more than the working tree, so for fsck to stop complaining about lost keys, `git annex forget $key` would be a worthy feature to add.
"""]]

@ -0,0 +1,25 @@
Last night I tracked down and fixed a bug in the DAV library that has been
affecting WebDAV remotes. I've been deploying the fix for that today,
including to the android and arm autobuilders. While I finished a clean
reinstall of the android autobuilder, I ran into problems getting a clean
reinstall of the arm autobuilder (some type mismatch error building
yesod-core), so manually fixed its DAV for now.
The WebDAV fix and other recent fixes make me want to make a release soon,
probably Monday.
ObWindows: Fixed git-annex to not crash when run on Windows
in a git repository that has a remote with a unix-style path
like "/foo/bar". Seems that not everything agrees on whether such a path
is absolute; even sometimes different parts of the same library disagree!
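[[!format haskell """
import System.FilePath.Windows

-- Holds unless "/foo/bar" is reported as not absolute while </> still
-- treats it as absolute and discards the left-hand path.
prop_windows_is_sane :: Bool
prop_windows_is_sane = isAbsolute upath || ("C:\\STUFF" </> upath /= upath)
  where upath = "/foo/bar"
"""]]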
Perhaps more interestingly, I've been helping dxtrish port git-annex to
OpenBSD and it seems most of the way there.

@ -0,0 +1,6 @@
Pushed out the new release. This is the first one where I consider the
git-annex command line beta quality on Windows.
Did some testing of the webapp on Windows, trying out every part of the UI.
I now have eleven todo items involving the webapp listed in
[[todo/windows_support]]. Most of them don't look too bad to fix.

@ -0,0 +1,18 @@
There's a new design document for letting git-annex store arbitrary
metadata. The really neat thing about this is the user can check out only
files matching the tags or values they care about, and get an automatically
structured file tree layout that can be dynamically filtered. It's going to
be awesome! [[design/metadata]]
In the meantime, spent most of today working on Windows. Very good
progress, possibly motivated by wanting to get it over with so I can spend
some time this month on the above. ;)
* webapp can make box.com and S3 remotes. This just involved fixing a hack
where the webapp set environment variables to communicate creds to
initremote. Can't change environment on Windows (or I don't know how to).
* webapp can make repos on removable drives.
* `git annex assistant --stop` works, although this is not likely to really
be useful
* The source tree now has 0 `func = error "Windows TODO"` type stubbed out
functions to trip over.

@ -0,0 +1,9 @@
Built the core data types, and log for metadata storage. Making metadata
union merge well is tricky, but I have a design I'm happy with, that will
allow distributed changes to metadata.
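To make the union-merge requirement concrete, here is a minimal sketch of one log shape that merges cleanly. It is an assumption for illustration only, not the design referred to above: the types `LogLine` and `Change` and the functions `mergeLogs` and `currentValues` are hypothetical names. Each line records a timestamped set or unset of one field value, merging two logs is a plain union of their lines, and the current metadata is found by replaying the merged lines in a fixed order.

```haskell
import Data.List (nub, sort)

-- Hypothetical log format, not git-annex's actual one.
data Change = Set | Unset
  deriving (Eq, Ord, Show)

data LogLine = LogLine
  { stamp  :: Integer  -- when the change was made
  , field  :: String
  , value  :: String
  , change :: Change
  } deriving (Eq, Ord, Show)

-- | Union merge: keep every line from both logs, dropping duplicates.
mergeLogs :: [LogLine] -> [LogLine] -> [LogLine]
mergeLogs a b = nub (a ++ b)

-- | Replay the lines in sorted (timestamp-first) order to get the
-- field/value pairs that are currently set.
currentValues :: [LogLine] -> [(String, String)]
currentValues = foldl apply [] . sort
  where
    apply acc l = case change l of
      Set   -> pair l : filter (/= pair l) acc
      Unset -> filter (/= pair l) acc
    pair l = (field l, value l)
```

Because the merged lines are sorted before they are replayed, `currentValues (mergeLogs a b)` and `currentValues (mergeLogs b a)` give the same answer, which is the property a union merge driver needs.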
Finished up the day with a `git annex metadata` command to get/set metadata
for a file.
This is all the groundwork needed to begin experimenting with generating
git branches that display different metadata-driven views of annexed files.

@ -0,0 +1,8 @@
Windows porting all day. Fixed a lot of issues with the webapp,
so quite productive. Except for the 2 hours wasted finding a way to kill a
process by PID from Haskell on Windows.
Last night, made `git annex metadata` able to set metadata on a whole
directory or list of files if desired. And added a `--metadata field=value`
switch (and corresponding preferred content terminal) which limits
git-annex to acting on files with the specified metadata.

@ -0,0 +1,17 @@
More Windows porting.. Seem to be getting near the end of the easy stuff,
and the webapp is getting pretty usable on Windows now; the only
really important thing lacking is XMPP support.
Made git-annex on Windows set HOME when it's not already set. Several of
the bundled cygwin tools only look at HOME. This was made a lot harder and
uglier due to there not being any way to modify the environment of the
running process.. git-annex has to re-run itself with the fixed
environment.
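The re-run approach can be sketched as follows. This is a minimal illustration under assumptions, not git-annex's actual code: `addHome` and `ensureHome` are hypothetical names, the caller is assumed to supply a suitable home directory, and the lookup of `HOME` is case-sensitive even though Windows variable names are not.

```haskell
import System.Environment (getArgs, getEnvironment, getExecutablePath)
import System.Exit (exitWith)
import System.Process (CreateProcess(..), createProcess, proc, waitForProcess)

-- | Pure part: add HOME to an environment that lacks it.
-- Returns Nothing when HOME is already set, so no re-run is needed.
addHome :: FilePath -> [(String, String)] -> Maybe [(String, String)]
addHome home environ = case lookup "HOME" environ of
  Just _  -> Nothing
  Nothing -> Just (("HOME", home) : environ)

-- | Run the action directly when HOME is set; otherwise re-run the
-- current program with the same arguments and a fixed environment,
-- then exit with the child's exit code.
ensureHome :: FilePath -> IO () -> IO ()
ensureHome home action = do
  environ <- getEnvironment
  case addHome home environ of
    Nothing -> action
    Just environ' -> do
      exe <- getExecutablePath
      args <- getArgs
      (_, _, _, ph) <- createProcess (proc exe args) { env = Just environ' }
      waitForProcess ph >>= exitWith
```

Keeping the decision in the pure `addHome` means the child process, which does find `HOME` set, takes the `Nothing` branch and cannot loop.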
Got rsync.net working in the webapp. Although with an extra rsync.net
password prompt on Windows, which I cannot find a way to avoid.
While testing that, I discovered that openssh 6.5p1 has broken support for
~/.ssh/config Host lines that contain upper case letters! I have filed a
bug about this and put a quick fix in git-annex, which sometimes generated
such lines.

@ -0,0 +1,54 @@
Working on building [[design/metadata]] filtered branches.
Spent most of the day on types and pure code. Finally at the end
I wrote down two actions that I still need to implement to make
it all work:
[[!format haskell """
applyView' :: MkFileView -> View -> Annex Git.Branch
updateView :: View -> Git.Ref -> Git.Ref -> Annex Git.Branch
"""]]
I know how to implement these, more or less. And in most cases
they will be pretty fast.
The more interesting part is already done. That was the issue of how to
generate filenames in the filter branches. That depends on the `View` being
used to filter and organize the branch, but also on the original filename used
in the reference branch. Each filter branch has a reference branch (such as
"master"), and displays a filtered and metadata-driven reorganized tree
of files from its reference branch.
[[!format haskell """
fileViews :: View -> (FilePath -> FileView) -> FilePath -> MetaData -> Maybe [FileView]
"""]]
So, a view that matches files tagged "haskell" or "git-annex"
and with an author of "J\*" will generate filenames like
"haskell/Joachim/interesting_theoretical_talk.ogg" and
"git-annex/Joey/mytalk.ogg".
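As a rough illustration of how such filenames could be generated, here is a toy model. Everything in it is an assumption made for the example: `ViewComponent`, `viewPaths` and `talksView` are hypothetical names, metadata is a plain association list, the glob "J\*" is approximated by a prefix test, and the result is a plain list rather than the `Maybe [FileView]` of the real signature.

```haskell
import Data.List (intercalate, isPrefixOf)

-- Toy stand-ins for the real types; all names here are hypothetical.
type MetaData = [(String, [String])]  -- field name and its values

data ViewComponent = ViewComponent
  { viewField :: String          -- metadata field to organize by
  , viewMatch :: String -> Bool  -- which values of that field to show
  }

type View = [ViewComponent]

-- | Every path at which a file shows up in the view: one per combination
-- of matching values, and none if some component matches no value.
viewPaths :: View -> FilePath -> MetaData -> [FilePath]
viewPaths view file meta =
  [ intercalate "/" (dirs ++ [file]) | dirs <- mapM matching view ]
  where
    matching c = filter (viewMatch c) (maybe [] id (lookup (viewField c) meta))

-- | Files tagged "haskell" or "git-annex", with an author matching "J*".
talksView :: View
talksView =
  [ ViewComponent "tag" (`elem` ["haskell", "git-annex"])
  , ViewComponent "author" ("J" `isPrefixOf`)
  ]
```

In this sketch, `viewPaths talksView "mytalk.ogg" [("tag", ["git-annex"]), ("author", ["Joey"])]` evaluates to `["git-annex/Joey/mytalk.ogg"]`. A file carrying both tags gets one path per tag, and a file with no matching author gets none.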
It can also work backwards from these
filenames to derive the MetaData that is encoded in them.
[[!format haskell """
fromView :: View -> FileView -> MetaData
"""]]
So, copying a file to "haskell/Joey/mytalk.ogg" lets it know that
it's gained a "haskell" tag. I knew I was on the right track when
`fromView` turned out to be only 6 lines of code!
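The backwards direction is short even in a toy model. The sketch below is not the real `fromView`: `metaFromPath` is a hypothetical helper that assumes the view is just an ordered list of field names and that each directory level of a path in the view branch holds the value of the corresponding field.

```haskell
import System.FilePath.Posix (splitDirectories, takeDirectory)

-- | Toy inverse of view path generation (hypothetical, not the real
-- fromView): pair the view's field names, in order, with the directory
-- components of a path inside the view branch.
metaFromPath :: [String] -> FilePath -> [(String, String)]
metaFromPath fields path = zip fields dirs
  where
    -- A file at the top level has directory ".", which carries no metadata.
    dirs = filter (/= ".") (splitDirectories (takeDirectory path))
```

Here `metaFromPath ["tag", "author"] "haskell/Joey/mytalk.ogg"` evaluates to `[("tag", "haskell"), ("author", "Joey")]`, so a file copied into that location would be seen as having gained the "haskell" tag.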
The trickiest part of all this, which I spent most of yesterday thinking
about, is what to do if the master branch has files in subdirectories. It
probably does not make sense to retain that hierarchical directory
structure in the filtered branch, because we instead have a
non-hierarchical metadata structure to express. (And there would probably
be a lot of deep directory structures containing only one file.) But
throwing away the subdirectory information entirely means that two files
with the same basename and same metadata would have colliding names.
I eventually decided to embed the subdirectory information into the filenames
used on the filter branch. Currently that is done by converting
`dir/subdir/file.foo` to `file(dir)(subdir).foo`. We'll see how this works
out in practice..
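The conversion described above can be sketched in a few lines. This is an illustrative helper under assumptions, not the code in git-annex: the name `viewedFileName` is hypothetical, and it uses POSIX path rules so the result does not depend on the build platform.

```haskell
import System.FilePath.Posix
  (splitDirectories, splitExtension, takeDirectory, takeFileName)

-- | Hypothetical helper: embed the directory components of a relative
-- path into its file name, placing them before the extension.
viewedFileName :: FilePath -> FilePath
viewedFileName path = base ++ concatMap wrap dirs ++ ext
  where
    (base, ext) = splitExtension (takeFileName path)
    -- A top-level file has directory ".", which adds nothing to the name.
    dirs = filter (/= ".") (splitDirectories (takeDirectory path))
    wrap d = "(" ++ d ++ ")"
```

With this sketch, `viewedFileName "dir/subdir/file.foo"` gives `"file(dir)(subdir).foo"`, and a top-level `file.foo` is left unchanged, so two files that share a basename and metadata but live in different directories no longer collide.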

@ -0,0 +1,76 @@
Today I built `git annex view`, and `git annex vadd` and a few related
commands. A quick demo:
<pre>
joey@darkstar:~/lib/talks>ls
Chaos_Communication_Congress/ FOSDEM/ Linux_Conference_Australia/
Debian/ LibrePlanet/ README.md
joey@darkstar:~/lib/talks>git annex view tag=*
view (searching...)
Switched to branch 'views/_'
ok
joey@darkstar:~/lib/talks#_>tree -d
.
|-- Debian
|-- android
|-- bigpicture
|-- debhelper
|-- git
|-- git-annex
`-- seen
7 directories
joey@darkstar:~/lib/talks#_>git annex vadd author=*
vadd
Switched to branch 'views/author=_;_'
ok
joey@darkstar:~/lib/talks#author=_;_>tree -d
.
|-- Benjamin Mako Hill
|   `-- bigpicture
|-- Denis Carikli
|   `-- android
|-- Joey Hess
|   |-- Debian
|   |-- bigpicture
|   |-- debhelper
|   |-- git
|   `-- git-annex
|-- Richard Hartmann
|   |-- git
|   `-- git-annex
`-- Stefano Zacchiroli
    `-- Debian
15 directories
joey@darkstar:~/lib/talks#author=_;_>git annex vpop
vpop 1
Switched to branch 'views/_'
ok
joey@darkstar:~/lib/talks#_>git annex vadd tag=git-annex
vadd
Switched to branch 'views/(git-annex)'
ok
joey@darkstar:~/lib/talks#(git-annex)>ls
1025_gitify_your_life_{Debian;2013;DebConf13;high}.ogv@
git_annex___manage_files_with_git__without_checking_their_contents_into_git_{FOSDEM;2012;lightningtalks}.webm@
mirror.linux.org.au_linux.conf.au_2013_mp4_gitannex_{Linux_Conference_Australia;2013}.mp4@
joey@darkstar:~/lib/talks#_>git annex vpop 2
vpop 2
Switched to branch 'master'
ok
</pre>
Not 100% happy with the speed -- the generation of the view branch is close
to optimal, and fast enough (unless the branch has very many matching
files). And `vadd` can be quite fast if the view has already limited the
total number of files to a smallish amount. But `view` has to look at every
file's metadata, and this can take a while in a large repository. Needs indexes.
It also needs integration with `git annex sync`, so the view branches
update when files are added to the master branch. And moving files around
inside a view and committing them does not yet update their metadata.
---
Today's work was sponsored by Daniel Atlas.

@ -0,0 +1,7 @@
Still working on views. The most important addition today is that
`git annex precommit` notices when files have been moved/copied/deleted
in a view, and updates the metadata to reflect the changes.
Also wrote some walkthrough documentation: [[tips/metadata_driven_views]].
And, recorded a screencast demoing views, which I will upload next time
I have bandwidth.

@ -0,0 +1,15 @@
Spent the day catching up on the last week or so's traffic. Ended up
making numerous small bug fixes and improvements. Message backlog stands at
44.
Here's the [[screencast demoing views|videos/git-annex_views_demo]]!
Added to the design today the idea of
automatically deriving metadata from the location of files in the master
branch's directory tree. Eg, `git annex view tag=* podcasts/=*` in a
repository that has a `podcasts/` directory would make a tree like
"$tag/$podcast". Seems promising.
So much still to do with views.. I have belatedly added them to
the roadmap for this month; doing Windows and Android in the same month was
too much to expect.

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawlJEI45rGczFAnuM7gRSj4C6s9AS9yPZDc"
nickname="Kevin"
subject="Neat!"
date="2014-02-20T21:59:05Z"
content="""
When the [[metadata design|day_112__metadata_design]] stuff appeared on the blog I didn't understand what you meant by automatically creating new tree layouts. I'm really liking these views and can already imagine how useful it would be to tag my photos by person/place/time. This is awesome! Keep up the good work.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawneiQ3iR9VXOPEP34u7m_L3Qr28H1nEfE0"
nickname="Ethan"
subject="LFS"
date="2014-02-21T00:03:59Z"
content="""
You might be interested in the Logic File System at http://www.padator.org/wiki/wiki-LFS/doku.php or http://www.padator.org/papers/ which has a similar idea with views and metadata.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawmZgZuUhZlHpd_AbbcixY0QQiutb2I7GWY"
nickname="Jimmy"
subject="comment 3"
date="2014-02-21T07:05:34Z"
content="""
I agree with Kevin as to the potential usefulness for photos. Particularly if there's some way of automatically extracting and using tags or other EXIF metadata.
"""]]