git-annex/doc/tips/metadata_driven_views.mdwn
Joey Hess 079b35a1a8 views: add automatically constructed file location metadata
When constructing views, metadata is available about the location of the
file in the view's reference branch. Allows incorporating parts of the
directory hierarchy in a view.

For example `git annex view tag=* podcasts/=*` makes a view in the form
tag/showname.

Performance impact: I benchmarked git annex view tag=* in the conference
proceedings repo to take 6.459s before this change, and 6.544s after.

FWIW, I considered making the syntax for this be podcasts/*, which might
be easier for the user to learn. However, I think it's not as good:

* The user has to then juggle two different syntaxes, and podcasts/* will
  be expanded by the shell so they also need to quote it, while podcasts/=*
  is unlikely to be expanded by the shell.
* It would allow for things like podcasts/*/* and *.mp3 which do not
  map well into views.

This commit was sponsored by Aurélien Pinceaux.
2014-02-22 16:27:53 -04:00

144 lines
4.4 KiB
Markdown

git-annex now has support for storing
[[arbitrary metadata|design/metadata]] about annexed files. For example, this can be
used to tag files, to record the author of a file, etc. The metadata is
synced around between repositories with the other information git-annex
keeps track of.
One nice way to use the metadata is through **views**. You can ask
git-annex to create a view of files in the currently checked out branch
that have certian metadata. Once you're in a view, you can move and copy
files to adjust their metadata further. Rather than the traditional
hierarchical directory structure, views are dynamic; you can easily
refine or reorder a view.
Let's get started by setting some tags on files. No views yet, just some
metadata:
# git annex metadata --tag todo work/2014/*
# git annex metadata --untag todo work/2014/done/*
# git annex metadata --tag urgent work/2014/presentation_for_tomorrow.odt
# git annex metadata --tag done work/2013/* work/2014/done/*
# git annex metadata --tag work work
# git annex metadata --tag video videos
# git annex metadata --tag work videos/operating_heavy_machinery.mov
# git annex metadata --tag done videos/old
# git annex metadata --tag new videos/lotsofcats.ogv
# git annex metadata --tag sound podcasts
# git annex metadata --tag done podcasts/*/old
# git annex metadata --tag new podcasts/*/recent
So, you had a bunch of different kinds of files sorted into a directory
structure. But that didn't really reflect how you approach the files.
Adding some tags lets you categorize the files in different ways.
Ok, metadata is in place, but how to use it? Time to change views!
# git annex view tag=*
view (searching...)
Switched to branch 'views/_'
ok
This searched for all files with any tag, and created a new git branch
that sorts the files according to their tags.
# tree -d
work
todo
urgent
done
new
video
sound
Notice that a single file may appear in multiple directories
depending on its tags. For example, `lotsofcats.ogv` is in
both `new/` and `video/`.
Ah, but you're at work now, and don't want to be distracted by cat videos.
Time to filter the view:
# git annex vfilter tag=work
vfilter
Switched to branch 'views/(work)/_'
ok
Now only the work files are in the view, and they're otherwise categorized
according to their other tags. So you can check the `urgent/` directory
to see what's next, and look in `todo/` for other work related files.
Now that you're in a tag based view, you can move files around between the
directories, and when you commit your changes to git, their tags will be
updated.
# git mv urgent/presentation_for_tomorrow_{work;2014}.odt ../done
# git commit -m "a good day's work"
metadata tag-=urgent
metadata tag+=done
You can return to a previous view by running `git annex vpop`. If you pop
all the way out of all views, you'll be back on the regular git branch you
originally started from. You can also use `git checkout` to switch between
views and other branches.
## fields
Beyond simple tags and directories, you can add whatever kinds of metadata
you like, and use that metadata in more elaborate views. For example, let's
add a year field.
# git checkout master
# git annex metadata --set year=2014 work/2014
# git annex metadata --set year=2013 work/2013
# git annex view year=* tag=*
Now you're in a view with two levels of directories, first by year and then
by tag.
# tree -d
2014
|-- work
|-- todo
|-- urgent
`-- done
2013
|-- work
`-- done
Oh, did you want it the other way around? Easy!
# git annex vcycle
# tree -d
work
|-- 2014
`-- 2013
todo
`-- 2014
urgent
`-- 2014
done
|-- 2014
`-- 2013
## location fields
Let's switch to a view containing only new podcasts. And since the
podcasts are organized into one subdirectory per show, let's
include those subdirectories in the view.
# git checkout master
# git annex view tag=new podcasts/=*
# tree -d
This_Developers_Life
Escape_Pod
GitMinutes
The_Haskell_Cast
StarShipSofa
That's an example of using part of the directory layout of the original
branch to inform the view. Every file gets fields automatically set up
corresponding to the directory it's in. So a file"foo/bar/baz/file" has
fields "/=foo", "foo/=bar", and "foo/bar/=baz". These location fields
can be used the same as other metadata to construct the view.
This has probably only scratched the surface of what you can do with views.