385 commits

Author SHA1 Message Date
Joey Hess
8bcd67b9d8 metadata: Add --get (from bremner) 2014-03-15 17:29:40 -04:00
Joey Hess
417aea25be vicfg: Allows editing preferred content expressions for groups.
This is stored in the git-annex branch, but not yet actually hooked up and
used.
2014-03-15 16:17:01 -04:00
Joey Hess
7ea0b82cf9 note that webapp starts the assistant if it's not already running 2014-03-12 15:40:20 -04:00
Joey Hess
a3fe8270ca annex.startupscan can be set to false to disable the assistant's startup scan. 2014-03-05 17:44:14 -04:00
Joey Hess
d0fce426c4 pre-commit-annex hook script to automatically extract metadata from lots of types of files
Using the extract(1) program to do the heavy lifting.

Decided to make git-annex run pre-commit-annex when committing. Since
git-annex pre-commit also runs it, it'll be run when git commit is run too,
via the pre-commit hook. This basically gives back the pre-commit hook
that git-annex took away. The implementation avoids repeatedly looking
for the hook script when the assistant is running and committing
repeatedly; only checks if the hook is available once.

To make the script simpler, made git-annex metadata -s field?=value
only set a field when it's not already got a value.

This commit was sponsored by bak.
2014-03-02 20:11:58 -04:00
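The `field?=value` behaviour described in commit d0fce426c4 above can be illustrated with a small Python sketch. It assumes metadata is modelled as a mapping from field name to a set of values; `set_if_unset` is a hypothetical helper for illustration, not git-annex code.

```python
def set_if_unset(metadata, field, value):
    """Mimic `field?=value`: set the field only when it has no value yet.

    `metadata` maps field names to sets of values (an assumed model).
    Returns True when the value was stored, False when it was skipped.
    """
    if metadata.get(field):
        return False  # the field already has a value; leave it untouched
    metadata[field] = {value}
    return True


meta = {"artist": {"someone"}}
set_if_unset(meta, "artist", "other")  # skipped, artist is already set
set_if_unset(meta, "year", "2014")     # stored, year had no value
```

This is what lets a hook script run repeatedly over the same files without overwriting values a user has already set.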
Joey Hess
4643a0120c doc improvements 2014-03-02 15:46:58 -04:00
Joey Hess
7d9486a709 vadd: Allow listing multiple desired values for a field. 2014-03-02 15:36:45 -04:00
Joey Hess
c2e8c21ca6 view, vfilter: Add support for filtering tags and values out of a view, using !tag and field!=value.
Note that negated globs are not supported. Would have complicated the code
to add them, without changing the data type serialization in a
non-backwards-compatible way.

This commit was sponsored by Denver Gingerich.
2014-03-02 14:53:19 -04:00
Joey Hess
6a355686ff annex.listen can be configured, instead of using --listen 2014-03-01 00:31:17 -04:00
Joey Hess
0bc8dabb54 docs for remote webapp, securely 2014-02-28 22:39:06 -04:00
Joey Hess
aa39457f5f update 2014-02-25 17:26:04 -04:00
Joey Hess
fb4e1ebfbe metadata: Support --json 2014-02-23 13:58:16 -04:00
Joey Hess
7498c5dd96 annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files 2014-02-23 00:08:29 -04:00
Joey Hess
079b35a1a8 views: add automatically constructed file location metadata
When constructing views, metadata is available about the location of the
file in the view's reference branch. Allows incorporating parts of the
directory hierarchy in a view.

For example `git annex view tag=* podcasts/=*` makes a view in the form
tag/showname.

Performance impact: I benchmarked git annex view tag=* in the conference
proceedings repo to take 6.459s before this change, and 6.544s after.

FWIW, I considered making the syntax for this be podcasts/*, which might
be easier for the user to learn. However, I think it's not as good:

* The user has to then juggle two different syntaxes, and podcasts/* will
  be expanded by the shell so they also need to quote it, while podcasts/=*
  is unlikely to be expanded by the shell.
* It would allow for things like podcasts/*/* and *.mp3 which do not
  map well into views.

This commit was sponsored by Aurélien Pinceaux.
2014-02-22 16:27:53 -04:00
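As a rough illustration of the location metadata described in commit 079b35a1a8 above, the following Python sketch derives the value of a field such as `podcasts/` from a file's path in the reference branch. The function name `location_value` is hypothetical, and the rule that a file sitting directly inside the directory has no value for that field is an assumption, not something the commit message states.

```python
from pathlib import PurePosixPath


def location_value(path, field):
    """Return the subdirectory name found just below `field`, or None.

    `field` is a directory prefix with a trailing slash, e.g. "podcasts/".
    """
    parts = PurePosixPath(path).parts
    prefix = PurePosixPath(field.rstrip("/")).parts
    # Require the prefix to match and a further directory (not just the
    # file name) to exist below it.
    if len(parts) > len(prefix) + 1 and parts[:len(prefix)] == prefix:
        return parts[len(prefix)]
    return None


show = location_value("podcasts/showname/episode1.mp3", "podcasts/")
```

With a value like this available per file, a view such as `tag=* podcasts/=*` can place each file under tag/showname.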
Joey Hess
2a65f07621 note case insensitive matching 2014-02-21 18:36:36 -04:00
Joey Hess
24f8136504 --metadata field=value can now use globs to match, and matches case insensitively, the same as git annex view field=value does.
Also refactored glob code into its own module.
2014-02-21 18:34:34 -04:00
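The matching behaviour described in commit 24f8136504 above (globs, compared case insensitively) can be approximated in Python. This sketch uses the standard library's `fnmatch` rather than git-annex's own glob module, so the exact glob syntax supported may differ; `metadata_matches` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def metadata_matches(pattern, value):
    """Glob-match a metadata value, ignoring case on both sides."""
    # Lowercasing both arguments and using fnmatchcase keeps the result
    # independent of the platform's filename case rules.
    return fnmatchcase(value.lower(), pattern.lower())


matches_glob = metadata_matches("Wor*", "work")
matches_case = metadata_matches("work", "WORK")
```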
Joey Hess
1428390300 tweak wording 2014-02-20 16:00:41 -04:00
Joey Hess
d209566dfa Revert "Fix command to match fsck description"
This reverts commit 9e8370d1b9.

No, --incremental and --more are not needed when using
--incremental-schedule. The --incremental-schedule option
implies the other ones.
2014-02-20 15:36:59 -04:00
Joey Hess
134fdefb8c fsck: When run with --all or --unused, while .gitattributes annex.numcopies cannot be honored since it's operating on keys instead of files, make it honor the global numcopies setting, and the annex.numcopies git config setting. 2014-02-20 14:45:17 -04:00
Joey Hess
dd7b99c860 add tip about metadata driven views (and more flexible view filtering)
While writing this documentation, I realized that there needed to be a way
to stay in a view like tag=* while adding a filter like tag=work that
applies to the same field.

So, there are really two ways a view can be refined. It can have a new
"field=explicitvalue" filter added to it, which does not change the
"shape" of the view, but narrows the files it shows.
Or, it can have a new view added, which adds another level of
subdirectories.

So, added a vfilter command, which takes explicit values to add to the
filter, and rejects changes that would change the shape of the view.

And, made vadd only accept changes that change the shape of the view.

And, changed the View data type slightly; now components that can match
multiple metadata values can be visible, or not visible.

This commit was sponsored by Stelian Iancu.
2014-02-19 16:29:56 -04:00
Joey Hess
d8ce6cac36 metadata: add --tag and --untag shorthand options 2014-02-19 15:04:12 -04:00
Joey Hess
e7672f197e new section for metadata 2014-02-19 14:55:34 -04:00
Joey Hess
39ebfa1a2e pre-commit: Update metadata when committing changes to annexed files within a view.
So the user can now switch to a view and then move files around within it
to manage metadata. For example, moving a file into a new directory
when in the tags=* view adds a tag to it.

Implementation is fairly efficient. One diff-index, which is no more
expensive than the first stage of a git commit, followed by possibly
some cat-file --batch traffic to find the key (when deleting a file).
Very similar to what's done in direct mode when committing. And like
direct mode when updating the WC after a merge, it has to buffer the
diff-tree values in order to make 2 passes over them.

When not in a view, pre-commit now does one extra git symbolic-ref,
which is tiny overhead.

This commit was sponsored by Andrew Eskridge.
2014-02-19 14:17:58 -04:00
Joey Hess
1a53c87057 vpop N 2014-02-18 21:57:21 -04:00
Joey Hess
67a5f02a0b add vcycle command 2014-02-18 20:16:28 -04:00
Joey Hess
f603692a72 add vadd command 2014-02-18 20:02:09 -04:00
Joey Hess
67fd06af76 add git annex view command
(And a vpop command, which is still a bit buggy.)

Still need to do vadd and vrm, though this also adds their documentation.

Currently not very happy with the view log data serialization. I had to
lose the TDFA regexps temporarily, so I can have Read/Show instances of
View. I expect the view log format will change in some incompatible way
later, probably adding last known refs for the parent branch to View
or something like that.

Anyway, it basically works, although it's a bit slow looking up the
metadata. The actual git branch construction is about as fast as it can be
using the current git plumbing.

This commit was sponsored by Peter Hogg.
2014-02-18 18:22:20 -04:00
stp
9e8370d1b9 Fix command to match fsck description 2014-02-17 15:53:46 +00:00
Joey Hess
2075cdeb59 limiting files based on metadata
Note that there is currently no caching, so
	--metadata foo=bar --metadata tag=blah
will currently read the log 2x per file.
2014-02-13 02:24:30 -04:00
Joey Hess
0e9a72b356 metadata command can now operate on many files at once 2014-02-13 01:49:38 -04:00
Joey Hess
9f7e76130e add metadata command to get/set metadata
Adds metadata log, and command.

Note that unsetting field values seems to currently be broken.
And in general this has had all of 2 minutes worth of testing.

This commit was sponsored by Julien Lefrique.
2014-02-12 21:30:33 -04:00
Joey Hess
b9e6cb07ad remove dropkey example 2014-02-08 15:25:58 -04:00
Joey Hess
a44e01c29c --in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}" 2014-02-06 12:43:56 -04:00
Joey Hess
1858c1f44a Document in man page that sshcaching uses ssh ControlMaster. Closes: #737476 2014-02-02 19:27:47 -04:00
Joey Hess
089c0109a2 Added ways to configure rsync options to be used only when uploading or downloading from a remote. Useful to eg limit upload bandwidth. 2014-02-02 16:06:34 -04:00
Joey Hess
ec7443eb06 All commands that support --all also support a --key option, which limits them to acting on a single key. 2014-01-26 14:59:47 -04:00
Joey Hess
5fc2d760ea Optimise non-bare http remotes; no longer does a 404 to the wrong url every time before trying the right url. Needs annex-bare to be set to false, which is done when initially probing the uuid of a http remote. 2014-01-26 13:03:25 -04:00
Joey Hess
b93e485ef1 added annex.secure-erase-command config option. 2014-01-24 12:58:52 -04:00
Joey Hess
3da0064657 assistant unused file handling
Make sanity checker run git annex unused daily, and queue up transfers
of unused files to any remotes that will have them. The transfer retrying
code works for us here, so eg when a backup disk remote is plugged in,
any transfers to it are done. Once the unused files reach a remote,
they'll be removed locally as unwanted.

If the setup does not cause unused files to go to a remote, they'll pile
up, and the sanity checker detects this using some heuristics that are
pretty good -- 1000 unused files, or 10% of disk used by unused files,
or more disk wasted by unused files than is left free. Once it detects
this, it pops up an alert in the webapp, with a button to take action.

TODO: Webapp UI to configure this, and also the ability to launch an
immediate cleanup of all unused files.

This commit was sponsored by Simon Michael.
2014-01-22 22:53:18 -04:00
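The pile-up heuristics listed in commit 3da0064657 above can be written out as a short Python sketch. The function and parameter names are hypothetical. Reading "10% of disk" as 10% of the total disk size, and the choice of inclusive versus strict comparisons, are assumptions, since the commit message does not spell them out.

```python
def unused_files_piling_up(unused_count, unused_bytes,
                           disk_total_bytes, disk_free_bytes):
    """Return True when any of the three heuristics fires."""
    # Heuristic 1: 1000 unused files.
    too_many = unused_count >= 1000
    # Heuristic 2: unused files take up 10% of the disk (assumed: of the
    # total size). Integer arithmetic avoids floating point rounding.
    too_large = unused_bytes * 10 >= disk_total_bytes
    # Heuristic 3: more space wasted by unused files than is left free.
    exceeds_free = unused_bytes > disk_free_bytes
    return too_many or too_large or exceeds_free


alert = unused_files_piling_up(
    unused_count=12,
    unused_bytes=50,
    disk_total_bytes=1000,
    disk_free_bytes=40,
)
```

Any single condition is enough to raise the alert, which matches the "or" wording in the commit message.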
Joey Hess
f2713a3bb9 benchmarked numcopies .gitattributes in preferred content
Checking .gitattributes adds a full minute to a git annex find looking for
files that don't have enough copies. 2:25 increases to 3:27. I feel this is
too much of a slowdown to justify making it the default. So, exposed two
versions of the preferred content expression, a slow one and a fast but
approximate one.

I'm using the approximate one in the default preferred content expressions
to avoid slowing down the assistant.
2014-01-21 18:49:25 -04:00
Joey Hess
d1bf61464f expose tasty test suite's option parser 2014-01-21 00:08:43 -04:00
Joey Hess
3159da2693 Add and use numcopiesneeded preferred content expression.
* Add numcopiesneeded preferred content expression.
* Client, transfer, incremental backup, and archive repositories
  now want to get content that does not yet have enough copies.

This means the assistant will make copies of files that don't yet
meet the configured numcopies, even to places that would not normally want
the file.

For example, if numcopies is 4, and there are 2 client repos and
2 transfer repos, and 2 removable backup drives, the file will be sent
to both transfer repos in order to make 4 copies. Once a removable drive
gets a copy of the file, it will be dropped from one transfer repo or the
other (but not both).

Another example, numcopies is 3 and there is a client that has a backup
removable drive and two small archive repos. Normally once one of the small
archives has a file, it will not be put into the other one. But, to satisfy
numcopies, the assistant will duplicate it into the other small archive
too, if the backup repo is not available to receive the file.

I notice that these examples are fairly unlikely setups .. the old behavior
was not too bad, but it's nice to finally have it really correct.

.. Almost. I have skipped checking the annex.numcopies .gitattributes
out of fear it will be too slow.

This commit was sponsored by Florian Schlegel.
2014-01-20 17:35:29 -04:00
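The first example in commit 3159da2693 above can be traced with a simplified Python model. This is a sketch only: it reduces preferred content to a single boolean, it assumes both client repos already hold the file, and the function names are hypothetical rather than taken from git-annex.

```python
def copies_needed(numcopies, current_copies):
    """How many more copies are required to satisfy numcopies."""
    return max(0, numcopies - current_copies)


def wants_content(numcopies, current_copies, normally_wanted):
    """A repo wants a file if it normally would, or if copies are lacking."""
    return normally_wanted or copies_needed(numcopies, current_copies) > 0


def can_drop(numcopies, current_copies):
    """Dropping is allowed only if numcopies still holds afterwards."""
    return current_copies - 1 >= numcopies


# numcopies is 4 and the 2 client repos hold the file (assumed).
missing = copies_needed(4, 2)              # both transfer repos must take it
first_drop = can_drop(4, 5)                # after a backup drive gets a copy
second_drop = can_drop(4, 4)               # the other transfer repo keeps it
```

Under these assumptions the model reproduces the "one transfer repo or the other (but not both)" outcome described in the commit message.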
Joey Hess
d66535f065 global numcopies setting
* numcopies: New command, sets global numcopies value that is seen by all
  clones of a repository.
* The annex.numcopies git config setting is deprecated. Once the numcopies
  command is used to set the global number of copies, any annex.numcopies
  git configs will be ignored.
* assistant: Make the prefs page set the global numcopies.

This global numcopies setting is needed to let preferred content
expressions operate on numcopies.

It's also convenient, because typically if you want git-annex to preserve N
copies of files in a repo, you want it to do that no matter which repo it's
running in. Making it global avoids needing to warn the user about gotchas
involving inconsistent annex.numcopies settings.
(See changes to doc/numcopies.mdwn.)

Added a new variety of git-annex branch log file, that holds only 1 value.
Will probably be useful for other stuff later.

This commit was sponsored by Nicolas Pouillard.
2014-01-20 16:47:56 -04:00
Joey Hess
b6ba0bd556 sync --content: New option that makes the content of annexed files be transferred.
Similar to the assistant, this honors any configured preferred content
expressions.

I am not entirely happy with the implementation. It would be nicer if
the seek function returned a list of actions which included the individual
file gets and copies and drops, rather than the current list of calls to
syncContent. This would allow getting rid of the somewhat redundant display
of "sync file [ok|failed]" after the get/put display.

But, to do that, withFilesInGit would need to somehow be able to construct
such a mixed action list. And it would be less efficient than the current
implementation, which is able to reuse several values between eg get and
drop.

Note that currently this does not try to satisfy numcopies when
getting/putting files (numcopies are of course checked when dropping
files!) This makes it like the assistant, and unlike get --auto
and copy --auto, which do duplicate files when numcopies is not yet
satisfied. I don't know if this is the right decision; it only seemed to
make sense to have this parallel the assistant as far as possible to start
with, since I know the assistant works.

This commit was sponsored by Øyvind Andersen Holm.
2014-01-19 17:49:54 -04:00
Joey Hess
85185b8f50 Allow --all to be mixed with matching options like --copies and --in (but not --include and --exclude). 2014-01-18 14:58:56 -04:00
Joey Hess
a135bbd5a2 note that --all can't be mixed with eg --copies 2014-01-18 13:52:35 -04:00
Joey Hess
939eb666fe clarify sync 2014-01-18 13:26:47 -04:00
Yaroslav Halchenko
0bf41b335b Minor git-annex.mdwn tune ups (trailing spaces, typos, more consistency in tense)
Conflicts:
	doc/git-annex.mdwn -- I have managed to work on an old copy, so overlapped a bit
2014-01-18 13:06:15 -04:00
Joey Hess
c20f31a1ad add GETAVAILABILITY to external special remote protocol
And some reworking of types, and added an annex-availability git config
setting.
2014-01-13 14:41:10 -04:00
Joey Hess
85272d8a98 Added tahoe special remote.
Known problems:

1. Tries to tahoe start when daemon is already running.

2. If multiple tahoe remotes are set up on the same computer,
   they will have the same node.url configured by default,
   and this confuses tahoe commands.

This commit was sponsored by LeastAuthority.com
2014-01-08 16:14:41 -04:00