415 commits

Author SHA1 Message Date
Joey Hess
d8ce6cac36 metadata: add --tag and --untag shorthand options 2014-02-19 15:04:12 -04:00
Joey Hess
e7672f197e new section for metadata 2014-02-19 14:55:34 -04:00
Joey Hess
39ebfa1a2e pre-commit: Update metadata when committing changes to annexed files within a view.
So the user can now switch to a view and then move files around within it
to manage metadata. For example, moving a file into a new directory
when in the tags=* view adds a tag to it.

Implementation is fairly efficient. One diff-index, which is no more
expensive than the first stage of a git commit, followed by possibly
some cat-file --batch traffic to find the key (when deleting a file).
Very similar to what's done in direct mode when committing. And like
direct mode when updating the WC after a merge, it has to buffer the
diff-tree values in order to make 2 passes over them.

When not in a view, pre-commit now does one extra git symbolic-ref,
which is tiny overhead.

This commit was sponsored by Andrew Eskridge.
2014-02-19 14:17:58 -04:00
Joey Hess
1a53c87057 vpop N 2014-02-18 21:57:21 -04:00
Joey Hess
67a5f02a0b add vcycle command 2014-02-18 20:16:28 -04:00
Joey Hess
f603692a72 add vadd command 2014-02-18 20:02:09 -04:00
Joey Hess
67fd06af76 add git annex view command
(And a vpop command, which is still a bit buggy.)

Still need to do vadd and vrm, though this also adds their documentation.

Currently not very happy with the view log data serialization. I had to
lose the TDFA regexps temporarily, so I can have Read/Show instances of
View. I expect the view log format will change in some incompatible way
later, probably adding last known refs for the parent branch to View
or something like that.

Anyway, it basically works, although it's a bit slow looking up the
metadata. The actual git branch construction is about as fast as it can be
using the current git plumbing.

This commit was sponsored by Peter Hogg.
2014-02-18 18:22:20 -04:00
stp
9e8370d1b9 Fix command to match fsck description 2014-02-17 15:53:46 +00:00
Joey Hess
2075cdeb59 limiting files based on metadata
Note that there is currently no caching, so
	--metadata foo=bar --metadata tag=blah
will currently read the log 2x per file.
2014-02-13 02:24:30 -04:00
Joey Hess
0e9a72b356 metadata command can now operate on many files at once 2014-02-13 01:49:38 -04:00
Joey Hess
9f7e76130e add metadata command to get/set metadata
Adds metadata log, and command.

Note that unsetting field values seems to currently be broken.
And in general this has had all of 2 minutes worth of testing.

This commit was sponsored by Julien Lefrique.
2014-02-12 21:30:33 -04:00
Joey Hess
b9e6cb07ad remove dropkey example 2014-02-08 15:25:58 -04:00
Joey Hess
a44e01c29c --in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}" 2014-02-06 12:43:56 -04:00
Joey Hess
1858c1f44a Document in man page that sshcaching uses ssh ControlMaster. Closes: #737476 2014-02-02 19:27:47 -04:00
Joey Hess
089c0109a2 Added ways to configure rsync options to be used only when uploading or downloading from a remote. Useful to eg limit upload bandwidth. 2014-02-02 16:06:34 -04:00
Joey Hess
ec7443eb06 All commands that support --all also support a --key option, which limits them to acting on a single key. 2014-01-26 14:59:47 -04:00
Joey Hess
5fc2d760ea Optimise non-bare http remotes; no longer does a 404 to the wrong url every time before trying the right url. Needs annex-bare to be set to false, which is done when initially probing the uuid of a http remote. 2014-01-26 13:03:25 -04:00
Joey Hess
b93e485ef1 added annex.secure-erase-command config option. 2014-01-24 12:58:52 -04:00
Joey Hess
3da0064657 assistant unused file handling
Make sanity checker run git annex unused daily, and queue up transfers
of unused files to any remotes that will have them. The transfer retrying
code works for us here, so eg when a backup disk remote is plugged in,
any transfers to it are done. Once the unused files reach a remote,
they'll be removed locally as unwanted.

If the setup does not cause unused files to go to a remote, they'll pile
up, and the sanity checker detects this using some heuristics that are
pretty good -- 1000 unused files, or 10% of disk used by unused files,
or more disk wasted by unused files than is left free. Once it detects
this, it pops up an alert in the webapp, with a button to take action.
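
As an illustration only, the three thresholds described above could be combined as in the following Python sketch. The function and parameter names are hypothetical, and reading "10% of disk" as 10% of the total disk size is an assumption; the real check is part of the assistant's sanity checker and may differ in detail.

```python
def too_many_unused(unused_count, unused_bytes, disk_total_bytes, disk_free_bytes):
    """Return True when unused files look like they are piling up.

    Hypothetical sketch of the heuristics named in the commit message,
    not the actual implementation.
    """
    # Assumption: "1000 unused files" means 1000 or more.
    many_files = unused_count >= 1000
    # Assumption: "10% of disk" means 10% of the total disk size.
    big_share = unused_bytes * 10 >= disk_total_bytes
    # More space wasted by unused files than is left free.
    more_than_free = unused_bytes > disk_free_bytes
    return many_files or big_share or more_than_free
```

Any one of the three conditions is enough to trigger the alert, which matches the "or" wording of the message.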

TODO: Webapp UI to configure this, and also the ability to launch an
immediate cleanup of all unused files.

This commit was sponsored by Simon Michael.
2014-01-22 22:53:18 -04:00
Joey Hess
f2713a3bb9 benchmarked numcopies .gitattributes in preferred content
Checking .gitattributes adds a full minute to a git annex find looking for
files that don't have enough copies. 2:25 increases to 3:27. I feel this is
too much of a slowdown to justify making it the default. So, exposed two
versions of the preferred content expression, a slow one and a fast but
approximate one.

I'm using the approximate one in the default preferred content expressions
to avoid slowing down the assistant.
2014-01-21 18:49:25 -04:00
Joey Hess
d1bf61464f expose tasty test suite's option parser 2014-01-21 00:08:43 -04:00
Joey Hess
3159da2693 Add and use numcopiesneeded preferred content expression.
* Add numcopiesneeded preferred content expression.
* Client, transfer, incremental backup, and archive repositories
  now want to get content that does not yet have enough copies.

This means the assistant will make copies of files that don't yet
meet the configured numcopies, even to places that would not normally want
the file.

For example, if numcopies is 4, and there are 2 client repos and
2 transfer repos, and 2 removable backup drives, the file will be sent
to both transfer repos in order to make 4 copies. Once a removable drive
gets a copy of the file, it will be dropped from one transfer repo or the
other (but not both).
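
The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the real preferred content expressions take more into account than a bare copy count.

```python
def copies_needed(numcopies, current_copies):
    """How many more copies are needed to satisfy numcopies (never negative)."""
    return max(0, numcopies - current_copies)


def can_drop(numcopies, current_copies):
    """Dropping one copy is acceptable only if numcopies still holds afterwards."""
    return current_copies - 1 >= numcopies


# numcopies is 4 and only the 2 client repos hold the file:
print(copies_needed(4, 2))  # 2, so both transfer repos receive it
# A removable drive then gets a copy, for 5 copies in total:
print(can_drop(4, 5))       # True: one transfer repo may drop it
print(can_drop(4, 4))       # False: the other transfer repo must keep it
```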

Another example, numcopies is 3 and there is a client that has a backup
removable drive and two small archive repos. Normally once one of the small
archives has a file, it will not be put into the other one. But, to satisfy
numcopies, the assistant will duplicate it into the other small archive
too, if the backup repo is not available to receive the file.

I notice that these examples are fairly unlikely setups .. the old behavior
was not too bad, but it's nice to finally have it really correct.

.. Almost. I have skipped checking the annex.numcopies .gitattributes
out of fear it will be too slow.

This commit was sponsored by Florian Schlegel.
2014-01-20 17:35:29 -04:00
Joey Hess
d66535f065 global numcopies setting
* numcopies: New command, sets global numcopies value that is seen by all
  clones of a repository.
* The annex.numcopies git config setting is deprecated. Once the numcopies
  command is used to set the global number of copies, any annex.numcopies
  git configs will be ignored.
* assistant: Make the prefs page set the global numcopies.

This global numcopies setting is needed to let preferred content
expressions operate on numcopies.

It's also convenient, because typically if you want git-annex to preserve N
copies of files in a repo, you want it to do that no matter which repo it's
running in. Making it global avoids needing to warn the user about gotchas
involving inconsistent annex.numcopies settings.
(See changes to doc/numcopies.mdwn.)

Added a new variety of git-annex branch log file, that holds only 1 value.
Will probably be useful for other stuff later.

This commit was sponsored by Nicolas Pouillard.
2014-01-20 16:47:56 -04:00
Joey Hess
b6ba0bd556 sync --content: New option that makes the content of annexed files be transferred.
Similar to the assistant, this honors any configured preferred content
expressions.

I am not entirely happy with the implementation. It would be nicer if
the seek function returned a list of actions which included the individual
file gets and copies and drops, rather than the current list of calls to
syncContent. This would allow getting rid of the somewhat redundant display
of "sync file [ok|failed]" after the get/put display.

But, to do that, withFilesInGit would need to somehow be able to construct
such a mixed action list. And it would be less efficient than the current
implementation, which is able to reuse several values between eg get and
drop.

Note that currently this does not try to satisfy numcopies when
getting/putting files (numcopies are of course checked when dropping
files!) This makes it like the assistant, and unlike get --auto
and copy --auto, which do duplicate files when numcopies is not yet
satisfied. I don't know if this is the right decision; it only seemed to
make sense to have this parallel the assistant as far as possible to start
with, since I know the assistant works.

This commit was sponsored by Øyvind Andersen Holm.
2014-01-19 17:49:54 -04:00
Joey Hess
85185b8f50 Allow --all to be mixed with matching options like --copies and --in (but not --include and --exclude). 2014-01-18 14:58:56 -04:00
Joey Hess
a135bbd5a2 note that --all can't be mixed with eg --copies 2014-01-18 13:52:35 -04:00
Joey Hess
939eb666fe clarify sync 2014-01-18 13:26:47 -04:00
Yaroslav Halchenko
0bf41b335b Minor git-annex.mdwn tune ups (trailing spaces, typos, more consistency in tense)
Conflicts:
	doc/git-annex.mdwn -- I have managed to work on an old copy, so overlapped a bit
2014-01-18 13:06:15 -04:00
Joey Hess
c20f31a1ad add GETAVAILABILITY to external special remote protocol
And some reworking of types, and added an annex-availability git config
setting.
2014-01-13 14:41:10 -04:00
Joey Hess
85272d8a98 Added tahoe special remote.
Known problems:

1. Tries to tahoe start when daemon is already running.

2. If multiple tahoe remotes are set up on the same computer,
   they will have the same node.url configured by default,
   and this confuses tahoe commands.

This commit was sponsored by LeastAuthority.com
2014-01-08 16:14:41 -04:00
Joey Hess
f7727d2df1 Remotes can now be made read-only, by setting remote.<name>.annex-readonly 2014-01-02 13:12:32 -04:00
Joey Hess
079f463d51 mirror: Support --all (and --unused). 2014-01-01 17:39:33 -04:00
Joey Hess
81f498559a importfeed: Support youtube playlists. 2013-12-29 15:52:20 -04:00
Joey Hess
01f11c6432 document annex-externaltype and annex-hooktype 2013-12-29 13:41:53 -04:00
https://openid.stackexchange.com/user/e65e6d0e-58ba-41de-84cc-1f2ba54cf574
9824019222 corrected typo in status command 2013-12-26 21:08:43 +00:00
Richard Hartmann
598f3ee0b9 doc/git-annex.mdwn: Forgot Oxford comma 2013-12-23 18:50:51 +01:00
Richard Hartmann
25a37b0e73 doc/git-annex.mdwn: Improve docs for annex.diskreserve 2013-12-23 18:49:54 +01:00
Øyvind A. Holm
4de79befe6 doc/git-annex.mdwn: Typo fix
ab33f2e2-6aa1-11e3-8f66-001f3b596ec9
2013-12-22 01:39:49 +01:00
Joey Hess
557de5f07e update 2013-12-19 16:51:57 -04:00
Joey Hess
769ad80486 improve docs 2013-12-15 14:54:52 -04:00
Joey Hess
2b5b4dcd78 Add plumbing-level lookupkey examinekey command.
find --format: Added hashdirlower, hashdirmixed, keyname, and mtime format
variables.
2013-12-15 14:52:09 -04:00
Joey Hess
7d5b25515c Add plumbing-level lookupkey command. 2013-12-15 14:02:23 -04:00
Joey Hess
f0cf4d1861 update plumbing command docs 2013-12-15 13:48:26 -04:00
Joey Hess
64160a9679 import: Add --skip-duplicates option.
Note that the hash backends were made to stop printing a (checksum..)
message as part of this, since it showed up without a file when deciding
whether to act on a file. Should have probably removed that message a while
ago anyway, I suppose.
2013-12-04 13:13:30 -04:00
Yaroslav Halchenko
9639142df9 minor typos/trailing spaces fixes in git-annex.mdwn 2013-12-04 11:11:22 -04:00
Joey Hess
66285ca3d1 copy --from, get --from: When --force is used, ignore the location log and always try to get the file from the remote. 2013-12-02 15:41:20 -04:00
Joey Hess
31d43c63a4 annex.autoupgrade setting 2013-11-22 16:04:20 -04:00
Joey Hess
0d0e21ea57 dropunused, addunused: Allow "all" instead of a range to act on all unused data. 2013-11-18 17:24:18 -04:00
https://id.koumbit.net/anarcat
8329f6ec37 i believe the file is copied out of the git/annex directory now, so it's not a hardlink anymore 2013-11-12 01:51:19 +00:00
Joey Hess
09abd29469 improve docs 2013-11-07 15:22:45 -04:00
Joey Hess
59ecc804cd add new status command
This works for both direct and indirect mode.

It may need some performance tuning.

Note that unlike git status, it only shows the status of the work tree, not
the status of the index. So only one status letter, not two .. and since
files that have been added and not yet committed do not differ between the
work tree and the index, they are not shown. Might want to add display of
the index vs the last commit eventually.

This commit was sponsored by an unknown bitcoin contributor, whose
contribution has been going up lately! ;)
2013-11-07 14:07:25 -04:00
Joey Hess
eed2ed4fdb rename status to info, and update docs 2013-11-07 12:45:59 -04:00
Joey Hess
4510819215 v5 for direct mode, with automatic upgrade
This includes storing the current state of the HEAD ref, which git annex
sync is going to need, but does not make sync use it.
2013-11-05 17:05:03 -04:00
Joey Hess
8820091b4c webapp: remind user when using repositories that lack consistency checks
When starting up the assistant, it'll remind about the current
repository, if it doesn't have checks. And when a removable drive
is plugged in, it will remind if a repository on it lacks checks.

Since that might be annoying, the reminders can be turned off.

This commit was sponsored by Nedialko Andreev.
2013-10-29 16:50:38 -04:00
Joey Hess
0eff0dd910 unannex: New, much slower, but more safe behavior
Copies files out of the annex. This avoids an unannex of one file breaking
other files that link to the same content. Also, it means that the content
remains in the annex using up space until cleaned up with "git annex
unused".

(The behavior of unannex --fast has not changed; it still hard
links to content in the annex. --fast was not made the default because it
is potentially unsafe; editing such a hard linked file can unexpectedly
change content stored in the annex.)
2013-10-28 16:56:01 -04:00
Joey Hess
230bfa9688 add --want-get and --want-drop options
New --want-get and --want-drop options which can be used to test preferred
content settings. For example, "git annex find --in . --want-drop"
2013-10-28 14:50:17 -04:00
Joey Hess
afddbfd7e9 The "git annex content" command is renamed to "git annex wanted". 2013-10-28 14:08:38 -04:00
Joey Hess
d5eb85acf4 add repair command 2013-10-23 12:21:59 -04:00
Joey Hess
8cf0e5b105 update schedule docs 2013-10-15 13:42:33 -04:00
Joey Hess
296e21b381 add schedule command
Mostly because it gives me an excuse and a hook to document the schedule
expression format.
2013-10-13 15:40:38 -04:00
Joey Hess
336f4b5e2e mention preferred content standard 2013-10-06 13:02:17 -04:00
Joey Hess
12f6b9693a Send a git-annex user-agent when downloading urls.
Overridable with --user-agent option.

Not yet done for S3 or WebDAV due to limitations of libraries used --
neither allows a user-agent header to be specified.

This commit sponsored by Michael Zehrer.
2013-09-28 14:35:21 -04:00
Joey Hess
c923c981b9 import: Preserve top-level directory structure. 2013-09-25 13:16:55 -04:00
Joey Hess
4c954661a1 git-annex-shell: Added support for operating inside gcrypt repositories.
* Note that the layout of gcrypt repositories has changed, and
  if you created one you must manually upgrade it.
  See http://git-annex.branchable.com/upgrades/gcrypt/
2013-09-24 17:25:47 -04:00
Joey Hess
55636bf92f list --allrepos 2013-09-19 21:42:03 -04:00
Joey Hess
51d5c1d032 remove possibly confusing mention of git commit -a in sync documentation
http://git-annex.branchable.com/forum/git-annex_pre-commit_eats_all_my_4GB_of_ram/#comment-f7523e3779794a03680dbf48a488abc0
2013-09-19 20:13:58 -04:00
Joey Hess
3f8151d469 Merge remote-tracking branch 'anarcat/master' 2013-09-19 14:23:19 -04:00
Joey Hess
03729dc2a5 Merge remote-tracking branch 'anarcat/bold' 2013-09-19 14:22:46 -04:00
Antoine Beaupré
f4e8b70bba rename remotes to list 2013-09-19 14:16:55 -04:00
Joey Hess
a3bbda5bed status: In local mode, displays information about variance from configured numcopies levels. 2013-09-15 19:10:38 -04:00
https://www.google.com/accounts/o8/id?id=AItOawmxUEoLxEHC0qavQnoGStxpjbbszn87POQ
c34b0bf2af Fix typos. 2013-09-14 02:27:25 +00:00
Joey Hess
e3c7b505cd format 2013-09-12 12:33:22 -04:00
Joey Hess
82759b6a5d remotes: New command, displays a compact table of remotes that contain files. (Thanks, anarcat for display code and mastensg for inspiration.)
Note that it would be possible to extend the display to show all
repositories. But there can be a lot of repositories that are not set up as
remotes, and it would significantly clutter the display to show them all.

Since we're not showing all repositories, it's not worth trying to show
numcopies count either.

I decided to embrace these limitations and call the command remotes.
2013-09-12 12:21:21 -04:00
Antoine Beaupré
375a942dc9 add backticks to all options 2013-09-10 21:12:30 -04:00
Antoine Beaupré
1e45986430 be bold: make backticks bold in the manpage 2013-09-10 21:02:55 -04:00
Antoine Beaupré
8fbbf11233 missing backticks 2013-09-10 21:01:16 -04:00
Antoine Beaupré
6a526c7335 nitpicking: make this more readable on the web
with this, most options are now formatted as code.

this has no effect on the manpage whatsoever (unfortunately)
2013-09-10 20:47:27 -04:00
Joey Hess
b17defd43a reworded encryption stuff on man page, hopefully clearer and less jargon 2013-09-04 18:39:41 -04:00
Joey Hess
2fcae0348f Merge branch 'master' into encryption 2013-09-04 18:08:47 -04:00
guilhem
8293ed619f Allow public-key encryption of file content.
With the initremote parameters "encryption=pubkey keyid=788A3F4C".

/!\ Adding or removing a key has NO effect on files that have already
been copied to the remote. Hence using keyid+= and keyid-= with such
remotes should be used with care, and make little sense unless the point
is to replace a (sub-)key by another. /!\

Also, a test case has been added to ensure that the cipher and file
contents are encrypted as specified by the chosen encryption scheme.
2013-09-03 14:34:16 -04:00
Joey Hess
0831e18372 forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch.
Wrote nice pure transition calculator, and ugly code to stage its results
into the git-annex branch. Also had to split up several Log modules
that Annex.Branch needed to use, but that themselves used Annex.Branch.

The transition calculator is limited to looking at and changing one file at
a time. While this made the implementation relatively easy, it precludes
transitions that do stuff like deleting old url log files for keys that are
being removed because they are no longer present anywhere.
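
To make the one-file-at-a-time limitation concrete, here is a minimal Python sketch of a pure per-file transition. It is not the actual code (which is Haskell), the function name is hypothetical, and the log layout is an assumption: each line is taken to carry the UUID of the repository it describes as a whitespace-separated field.

```python
def drop_dead(log_text, dead_uuids):
    """Pure per-file transition: drop log lines that mention a dead repository.

    Assumption: the repository UUID appears as a whitespace-separated
    field on each line. Real log formats vary from file to file.
    """
    kept = [
        line
        for line in log_text.splitlines()
        if not any(field in dead_uuids for field in line.split())
    ]
    # Keep a trailing newline unless the file ends up empty.
    return "\n".join(kept) + ("\n" if kept else "")
```

Because the function sees only the contents of a single file, it has no way to decide that some other file (for example, a url log for a key that is no longer present anywhere) should be deleted as well.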
2013-08-31 17:51:13 -04:00
guilhem
53ce59021a Allow revocation of OpenPGP keys.
/!\ It is to be noted that revoking a key does NOT necessarily prevent
the owner of its private part from accessing data on the remote /!\

The only sound use of `keyid-=` is probably to replace a (sub-)key by
another, where the private part of both is owned by the same
person/entity:

    git annex enableremote myremote keyid-=2512E3C7 keyid+=788A3F4C

Reference: http://git-annex.branchable.com/bugs/Using_a_revoked_GPG_key/

* Other change introduced by this patch:

New keys now need to be added with option `keyid+=`, and the scheme
specified (upon initremote only) with `encryption=`. The motivation for
this change is to open for new schemes, e.g., strict asymmetric
encryption.

    git annex initremote myremote encryption=hybrid keyid=2512E3C7
    git annex enableremote myremote keyid+=788A3F4C
2013-08-29 14:31:33 -04:00
Joey Hess
4a915cd3cd add forget command
Works, more or less. --dead is not implemented, and so far a new branch
is made, but keys no longer present anywhere are not scrubbed.

git annex sync fails to push the synced/git-annex branch after a forget,
because it's not a fast-forward of the existing synced branch. Could be
fixed by making git-annex sync use assistant-style sync branches.
2013-08-28 16:41:13 -04:00
Joey Hess
fcd5c167ef untested transition detection on merging, and transition running code 2013-08-28 15:57:42 -04:00
Joey Hess
46b6d75274 Youtube support! (And 53 other video hosts)
When quvi is installed, git-annex addurl automatically uses it to detect
when a page is a video, and downloads the video file.

web special remote: Also support using quvi, for getting files,
or checking if files exist in the web.

This commit was sponsored by Mark Hepburn. Thanks!
2013-08-22 18:50:43 -04:00
Joey Hess
0f921307e7 mirror: New command, makes two repositories contain the same set of files.
This is a simple approach for setting up a mirroring repository.

It will work with any type of remotes.

Mirror --from is more expensive than mirror --to in general.
OTOH, mirror --from will get the file from any remote that has it, not only
the named mirror remote. And if the named mirror remote is not the fastest
available remote with a file, that can speed things up.

It would be possible to make the assistant or watch command do a more
dynamic mirroring, that didn't need to scan every time.
2013-08-20 15:46:35 -04:00
Joey Hess
e240cb99f7 Merge branch 'duplicate'
Conflicts:
	debian/changelog
2013-08-20 10:27:24 -04:00
Joey Hess
6112a5b927 wording 2013-08-15 01:29:50 +02:00
Joey Hess
1cb622d01e docs for 3 import duplicate file handling options 2013-08-11 18:56:26 +02:00
Joey Hess
03c76b5a30 improve importfeed --force; try to match existing files to avoid unnecessary duplication 2013-08-01 11:57:05 -04:00
Joey Hess
42ca8aaa61 importfeed --force: re-download urls that have been seen before 2013-07-31 12:19:00 -04:00
Joey Hess
7e66d260ea importfeed: git-annex becomes a podcatcher in 150 LOC 2013-07-28 16:55:42 -04:00
Joey Hess
e788ba8e38 fix example in man page 2013-07-16 15:35:15 -04:00
Richard Hartmann
3cc47da8ec doc/git-annex.mdwn: Reference --numcopies for fsck
While --numcopies is explained in the manpage, referencing it from
the `git annex fsck` section directly does not hurt and arguably helps.
2013-07-11 10:45:11 -04:00
Richard Hartmann
78c8e0b2a9 The dreaded whitespace commit
This fixes trailing whitespace in the manpage; nothing else.
2013-07-11 10:45:02 -04:00
Joey Hess
980e9a15e0 merge: Now also merges synced/master or similar branches, which makes it useful to put in a post-receive hook to make a repository automatically update its working copy when git annex sync or the assistant sync with it. 2013-07-03 15:42:56 -04:00
Joey Hess
1453c46349 typography 2013-07-03 15:27:14 -04:00
Joey Hess
04d07f2c1f --unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy. 2013-07-03 15:26:59 -04:00
Joey Hess
ebfd6fc2fe drop --all cannot check numcopies from .gitattributes, so don't implement it!
I spent a long time worrying about this problem with --all, that it cannot
check .gitattributes files for numcopies settings, and so would not be
entirely safe to use. The solution turns out to be simple, just don't
implement `git annex drop --all`. drop is the only command that needs to
check numcopies (move can also reduce the number of copies, but explicitly
bypasses numcopies settings).

Use cases that might need a drop --all are probably better served by using
unused and dropunused, which already work in a bare repository.
2013-07-03 14:01:31 -04:00
Joey Hess
def7cb706f Add --all option, and support it for fsck 2013-07-03 13:12:53 -04:00