Merge branch 'directguard'

Joey Hess 2013-11-07 14:12:13 -04:00
commit d99bdbbb84
34 changed files with 963 additions and 531 deletions

View file

@ -39,7 +39,7 @@ Now configure the remote and do the initial push:
git remote add origin example.com:bare-annex.git
git push origin master git-annex
Now `git annex status` should show the configured bare remote. If it does not, you may have to pull from the remote first (older versions of `git-annex`)
Now `git annex info` should show the configured bare remote. If it does not, you may have to pull from the remote first (older versions of `git-annex`)
If you wish to configure git such that you can push/pull without arguments, set the upstream branch:

View file

@ -4,8 +4,7 @@ git, and in turn point at the content of large files that is stored in
The advantage of direct mode is that you can access files directly,
including modifying them. The disadvantage is that most regular git
commands cannot safely be used, and only a subset of git-annex commands
can be used.
commands cannot be used in a direct mode repository.
Normally, git-annex repositories start off in indirect mode. With some
exceptions:
@ -21,7 +20,7 @@ exceptions:
Any repository can be converted to use direct mode at any time, and if you
decide not to use it, you can convert back to indirect mode just as easily.
Also, you can have one clone of a repository using direct mode, and another
using indirect mode; direct mode interoperates.
using indirect mode.
To start using direct mode:
@ -52,7 +51,6 @@ computers, and manage your files, this should not be a concern for you.
## use a direct mode repository
You can use most git-annex commands as usual in a direct mode repository.
A very few commands don't work in direct mode, and will refuse to do anything.
Direct mode also works well with the git-annex assistant.
@ -63,23 +61,32 @@ the changes to other repositories for `git annex sync` there to pick up,
and will pull and merge any changes made on other repositories into the
local repository.
While you generally will just use `git annex sync`, if you want to,
you can use `git commit --staged`, or plain `git commit`.
But not `git commit -a`, or `git commit <file>` ..
that'd commit whole large files into git!
## what doesn't work in direct mode
`git annex status` shows incomplete information. A few other commands,
like `git annex unlock` don't make sense in direct mode and will refuse to
run.
A very few git-annex commands don't work in direct mode, and will refuse
to do anything. For example, `git annex unlock` doesn't make sense in
direct mode.
As for git commands, you can probably use some git working tree
manipulation commands, like `git checkout` and `git revert` in useful
ways... But beware, these commands can replace files that are present in
your repository with broken symlinks. If that file was the only copy you
had of something, it'll be lost.
As for git commands, direct mode prevents using any git command that would
modify or access the work tree. So you cannot `git commit` or `git pull`
(use `git annex sync` for both instead), or run `git status`.
These git commands will complain "fatal: This operation must be run in a work tree".
This is one more reason it's wise to make git-annex untrust your direct mode
repositories. Still, you can lose data using these sort of git commands, so
use extreme caution.
The reason for this is that git doesn't understand how git-annex uses the
work tree in direct mode. Where git expects the symlinks that get checked
into git to be checked out in the work tree, direct mode instead replaces
them with the actual content of files, as managed by git-annex.
There are still lots of git commands you can use in direct mode. For
example, you can run `git log` on files, run `git push`, `git config`,
`git remote add` etc.
## forcing git to use the work tree in direct mode
This is for experts only. You can lose data doing this, or check enormous
files directly into your git repository, and it's your fault if you do!
Also, there should be no good reason to need to do this, ever.
Ok, with the warnings out of the way, all you need to do to make any
git command access the work tree in direct mode is pass it
`-c core.bare=false`
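As a minimal sketch of the above, a small shell function can wrap the override so that it only applies to one command at a time. The name `git_worktree` is made up for this example; only `git -c core.bare=false` comes from the text above, and all the same warnings apply.

```shell
# Hypothetical helper: run a single git command with core.bare overridden.
# `git -c name=value` changes the setting for that one invocation only;
# the repository's stored configuration is left untouched.
git_worktree() {
    git -c core.bare=false "$@"
}

# Example of a read-only use (nothing is committed or checked out):
#   git_worktree status --short
```

Keeping the override out of the repository's config means ordinary git commands keep refusing to touch the work tree unless you ask for it explicitly.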

View file

@ -103,6 +103,13 @@ subdirectories).
To avoid contacting the remote to check if it has every file, specify `--fast`
* `status [path ...]`
Similar to `git status --short`, displays the status of the files in the
working tree. Shows files that are not checked into git, files that
have been deleted, and files that have been modified.
Particularly useful in direct mode.
* `unlock [path ...]`
Normally, the content of annexed files is protected from being changed.
@ -563,10 +570,6 @@ subdirectories).
# QUERY COMMANDS
* `version`
Shows the version of git-annex, as well as repository version information.
* `find [path ...]`
Outputs a list of annexed files in the specified path. With no path,
@ -607,23 +610,26 @@ subdirectories).
To generate output suitable for the gource visualisation program,
specify `--gource`.
* `status [directory ...]`
* `info [directory ...]`
Displays some statistics and other information, including how much data
is in the annex and a list of all known repositories.
To only show the data that can be gathered quickly, use `--fast`.
When a directory is specified, shows a differently formatted status
When a directory is specified, shows a differently formatted info
display for that directory. In this mode, all of the file matching
options can be used to filter the files that will be included in
the status.
the information.
For example, suppose you want to run "git annex get .", but
would first like to see how much disk space that will use.
Then run:
git annex status --fast . --not --in here
git annex info --fast . --not --in here
* `version`
Shows the version of git-annex, as well as repository version information.
* `map`
@ -698,12 +704,21 @@ subdirectories).
* `pre-commit [path ...]`
This is meant to be called from git's pre-commit hook. `git annex init`
automatically creates a pre-commit hook using this.
Fixes up symlinks that are staged as part of a commit, to ensure they
point to annexed content. Also handles injecting changes to unlocked
files into the annex.
This is meant to be called from git's pre-commit hook. `git annex init`
automatically creates a pre-commit hook using this.
* `update-hook refname oldrev newrev`
This is meant to be called from git's update hook. `git annex init`
automatically creates an update hook using this.
This denies updates being pushed for the currently checked out branch.
While receive.denyCurrentBranch normally prevents that, it does
not for fake bare repositories, as used by direct mode.
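To illustrate the kind of check such a hook performs, here is a rough shell sketch. It is not git-annex's actual implementation, and the function name `update_allowed` is hypothetical. It relies on git calling an update hook with the ref name, old revision and new revision, and treating a nonzero exit as a rejection of that ref.

```shell
# Hypothetical sketch: succeed (allow the push) unless the ref being
# updated is the branch that is currently checked out.
update_allowed() {
    refname=$1
    # With -q, symbolic-ref fails quietly when HEAD is detached;
    # in that case there is no checked out branch to protect.
    current=$(git symbolic-ref -q HEAD) || return 0
    [ "$refname" != "$current" ]
}

# In a real hook script this might be followed by something like:
#   update_allowed "$1" || { echo "refusing to update checked out branch" >&2; exit 1; }
```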
* `fromkey key file`
@ -788,7 +803,7 @@ subdirectories).
Rather than the normal output, generate JSON. This is intended to be
parsed by programs that use git-annex. Each line of output is a JSON
object. Note that json output is only usable with some git-annex commands,
like status and find.
like info and find.
* `--debug`
@ -1088,7 +1103,7 @@ Here are all the supported configuration settings.
up to 500000 keys. If your repository is larger than that,
you can adjust this to avoid `git annex unused` not noticing some unused
data files. Increasing this will make `git-annex unused` consume more memory;
run `git annex status` for memory usage numbers.
run `git annex info` for memory usage numbers.
* `annex.bloomaccuracy`
@ -1096,7 +1111,7 @@ Here are all the supported configuration settings.
`git annex unused`. The default accuracy is 1000 --
1 unused file out of 1000 will be missed by `git annex unused`. Increasing
the accuracy will make `git annex unused` consume more memory;
run `git annex status` for memory usage numbers.
run `git annex info` for memory usage numbers.
* `annex.sshcaching`

View file

@ -25,7 +25,7 @@ Now you can run normal annex operations, as long as the port forwarding shell is
git annex sync
git annex get on-the-go some/big/file
git annex status
git annex info
You can add more computers by repeating with a different port, e.g. 2202 or 2203 (or any other).

View file

@ -31,7 +31,7 @@ On `angela`, we want to synchronise the git annex metadata with `marcos`. We nee
git init
git remote add marcos marcos.example.com:/srv/mp3
git fetch marcos
git annex status # this should display the two repos
git annex info # this should display the two repos
git annex add .
This will, again, checksum all files and add them to git annex. Once that is done, you can verify that the files are really the same as marcos with `whereis`:

View file

@ -4,6 +4,6 @@
subject="comment 1"
date="2013-07-12T19:36:28Z"
content="""
Ah, I just found that git annex status can do the same :)
Ah, I just found that git annex info can do the same :)
Disregard this.
"""]]

View file

@ -77,6 +77,29 @@ This seems really promising. But of course, git-annex has its own set of
behaviors in a bare repo, so will need to recognise that this repo is not
really bare, and avoid them.
> [[done]]!! --[[Joey]]
(Git may also have some bare repo behaviors that are unwanted. One example
is that git allows pushes to the current branch in a bare repo,
even when `receive.denyCurrentBranch` is set.)
> This is indeed a problem. Indeed, `git annex sync` successfully
> pushes changes to the master branch of a fake bare direct mode repo.
>
> And then, syncing in the repo that was pushed to causes the changes
> that were pushed to the master branch to get reverted! This happens
> because sync commits; commit sees that files are staged in index
> differing from the (pushed) master, and commits the "changes"
> which revert it.
>
> Could fix this using an update hook, to reject the update of the master
> branch. However, won't work on crippled filesystems! (No +x bit)
>
> Could make git annex sync detect this. It could reset the master
> branch to the last one committed, before committing. Seems very racy
> and hard to get right!
>
> Could make direct mode operate on a different branch, like
> `annex/direct/master` rather than `master`. Avoid pushing to that
> branch (`git annex sync` can map back from it to `master` and push there
> instead). A bit clumsy, but works.

View file

@ -18,10 +18,18 @@ conflicts first before upgrading git-annex.
## Upgrade events, so far
### v4 -> v5 (git-annex version 5.x)
v5 is only used for [[direct_mode]]. The upgrade from v4 to v5 is handled
automatically.
This upgrade involves changing direct mode repositories to operate with
core.bare=true.
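One way to see whether a repository has this setting is to ask git for it. This is only a minimal sketch: the function name `core_bare_enabled` is invented here, and a genuinely bare repository reports the same value, so this alone does not prove that a repository is in direct mode.

```shell
# Hypothetical helper: succeeds if core.bare is set to true in the
# repository that git finds from the current directory.
core_bare_enabled() {
    [ "$(git config --bool core.bare 2>/dev/null)" = "true" ]
}

# Example:
#   core_bare_enabled && echo "core.bare is true here"
```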
### v3 -> v4 (git-annex version 4.x)
v4 is only used for [[direct_mode]], and no upgrade needs to be done from
existing v3 repositories, they will continue to work.
v4 was only used for [[direct_mode]], to ensure that a version of git-annex
that understands direct mode was used with a direct mode repository.
### v2 -> v3 (git-annex version 3.x)