Merge branch 'smudge'
commit
72e717e14c
76 changed files with 2392 additions and 894 deletions
|
@ -9,6 +9,13 @@ understand how to update its working tree.
|
|||
|
||||
[[!toc]]
|
||||
|
||||
## deprecated
|
||||
|
||||
Direct mode is deprecated! Instead, git-annex v6 repositories can simply
|
||||
have files that are unlocked and thus can be directly accessed and
|
||||
modified. See [[upgrades]] for details about the transition to v6
|
||||
repositories.
|
||||
|
||||
## enabling (and disabling) direct mode
|
||||
|
||||
Normally, git-annex repositories start off in indirect mode. With some
|
||||
|
|
|
@ -11,12 +11,18 @@ git annex add `[path ...]`
|
|||
Adds files in the path to the annex. If no path is specified, adds
|
||||
files from the current directory and below.
|
||||
|
||||
Normally, files that are already checked into git, or that git has been
|
||||
configured to ignore will be silently skipped.
|
||||
Files that are already checked into git and are unmodified, or that
|
||||
git has been configured to ignore will be silently skipped.
|
||||
|
||||
If annex.largefiles is configured, and does not match a file that is being
|
||||
added, `git annex add` will behave the same as `git add` and add the
|
||||
non-large file directly to the git repository, instead of to the annex.
|
||||
If annex.largefiles is configured, and does not match a file, `git annex
|
||||
add` will behave the same as `git add` and add the non-large file directly
|
||||
to the git repository, instead of to the annex.
|
||||
|
||||
Large files are added to the annex in locked form, which prevents further
|
||||
modification of their content unless unlocked by [[git-annex-unlock]](1).
|
||||
(This is not the case however when a repository is in direct mode.)
|
||||
To add a file to the annex in unlocked form, `git add` can be used instead
|
||||
(that only works when the repository has annex.version 6 or higher).
|
||||
|
||||
This command can also be used to add symbolic links, both symlinks to
|
||||
annexed content, and other symlinks.
|
||||
|
|
|
@ -17,12 +17,18 @@ Note that git commands that operate on the work tree will refuse to
|
|||
run in direct mode repositories. Use `git annex proxy` to safely run such
|
||||
commands.
|
||||
|
||||
Note that the direct mode/indirect mode distinction is removed in v6
|
||||
git-annex repositories. In such a repository, you can
|
||||
use [[git-annex-unlock]](1) to make a file's content be directly present.
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
[[git-annex]](1)
|
||||
|
||||
[[git-annex-indirect]](1)
|
||||
|
||||
[[git-annex-unlock]](1)
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Joey Hess <id@joeyh.name>
|
||||
|
|
|
@ -11,9 +11,8 @@ git annex indirect
|
|||
Switches a repository back from direct mode to the default, indirect
|
||||
mode.
|
||||
|
||||
Some systems cannot support git-annex in indirect mode, because they
|
||||
do not support symbolic links. Repositories on such systems instead
|
||||
default to using direct mode.
|
||||
Note that the direct mode/indirect mode distinction is removed in v6
|
||||
git-annex repositories.
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
|
|
|
@ -24,6 +24,13 @@ mark it as dead (see [[git-annex-dead]](1)).
|
|||
This command is entirely safe, although usually pointless, to run inside an
|
||||
already initialized git-annex repository.
|
||||
|
||||
# OPTIONS
|
||||
|
||||
* `--version=N`
|
||||
|
||||
Force the repository to be initialized using a different annex.version
|
||||
than the current default.
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
[[git-annex]](1)
|
||||
|
|
|
@ -9,7 +9,7 @@ git annex lock `[path ...]`
|
|||
# DESCRIPTION
|
||||
|
||||
Use this to undo an unlock command if you don't want to modify
|
||||
the files, or have made modifications you want to discard.
|
||||
the files any longer, or have made modifications you want to discard.
|
||||
|
||||
# OPTIONS
|
||||
|
||||
|
|
|
@ -12,10 +12,14 @@ This is meant to be called from git's pre-commit hook. `git annex init`
|
|||
automatically creates a pre-commit hook using this.
|
||||
|
||||
Fixes up symlinks that are staged as part of a commit, to ensure they
|
||||
point to annexed content. Also handles injecting changes to unlocked
|
||||
files into the annex. When in a view, updates metadata to reflect changes
|
||||
point to annexed content.
|
||||
|
||||
When in a view, updates metadata to reflect changes
|
||||
made to files in the view.
|
||||
|
||||
When in a repository that has not been upgraded to annex.version 6,
|
||||
also handles injecting changes to unlocked files into the annex.
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
[[git-annex]](1)
|
||||
|
|
43
doc/git-annex-smudge.mdwn
Normal file
|
@ -0,0 +1,43 @@
|
|||
# NAME
|
||||
|
||||
git-annex smudge - git filter driver for git-annex
|
||||
|
||||
# SYNOPSIS
|
||||
|
||||
git annex smudge [--clean] file
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
This command lets git-annex be used as a git filter driver, which allows
|
||||
annexed files in the git repository to be unlocked at all times, instead
|
||||
of being symlinks.
|
||||
|
||||
When adding a file with `git add`, the annex.largefiles config is
|
||||
consulted to decide if a given file should be added to git as-is,
|
||||
or if its content is large enough to need to use git-annex.
|
||||
|
||||
The git configuration to use this command as a filter driver is as follows.
|
||||
This is normally set up for you by git-annex init, so you should
|
||||
not need to configure it manually.
|
||||
|
||||
[filter "annex"]
|
||||
smudge = git-annex smudge %f
|
||||
clean = git-annex smudge --clean %f
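If the filter ever has to be set up by hand, the stanza above corresponds to two `git config` calls. The following is a minimal sketch run in a throwaway repository (the temporary directory is only for illustration). It writes the settings and reads them back, and does not invoke git-annex itself:

```shell
# Throwaway repository so the sketch does not touch a real one.
repo=$(mktemp -d)
git init -q "$repo"

# Equivalent of the [filter "annex"] stanza; git replaces %f with the
# path of the file being filtered.
git -C "$repo" config filter.annex.smudge 'git-annex smudge %f'
git -C "$repo" config filter.annex.clean 'git-annex smudge --clean %f'

# Read the values back to confirm what git would run.
smudge_cmd=$(git -C "$repo" config --get filter.annex.smudge)
clean_cmd=$(git -C "$repo" config --get filter.annex.clean)
printf 'smudge: %s\nclean:  %s\n' "$smudge_cmd" "$clean_cmd"
```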
|
||||
|
||||
To make git use that filter driver, it needs to be configured in
|
||||
the .gitattributes file or in `.git/info/attributes`. The latter
|
||||
is normally configured when a repository is initialized, with the following
|
||||
contents:
|
||||
|
||||
* filter=annex
|
||||
.* !filter
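To see which files those two attribute lines route through the filter, `git check-attr` can be asked directly. This is a small sketch in a throwaway repository, using the per-repository attributes file that git reads from `.git/info/attributes`. The file names `video.mp4` and `.gitignore` are only examples and do not need to exist:

```shell
# Throwaway repository with the stock attributes quoted above.
repo=$(mktemp -d)
git init -q "$repo"
mkdir -p "$repo/.git/info"
printf '%s\n' '* filter=annex' '.* !filter' > "$repo/.git/info/attributes"

# Ordinary files get the annex filter; dotfiles have it unset again by
# the second line, so they bypass the filter.
big=$(git -C "$repo" check-attr filter -- video.mp4)
dot=$(git -C "$repo" check-attr filter -- .gitignore)
printf '%s\n%s\n' "$big" "$dot"
```

The second pattern comes later in the file, so it takes precedence for names starting with a dot, which is why files such as `.gitignore` stay plain git files.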
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
[[git-annex]](1)
|
||||
|
||||
# AUTHOR
|
||||
|
||||
Joey Hess <id@joeyh.name>
|
||||
|
||||
Warning: Automatically converted into a man page by mdwn2man. Edit with care.
|
|
@ -11,8 +11,16 @@ git annex unlock `[path ...]`
|
|||
Normally, the content of annexed files is protected from being changed.
|
||||
Unlocking an annexed file allows it to be modified. This replaces the
|
||||
symlink for each specified file with a copy of the file's content.
|
||||
You can then modify it and `git annex add` (or `git commit`) to inject
|
||||
it back into the annex.
|
||||
You can then modify it and `git annex add` (or `git commit`) to save your
|
||||
changes.
|
||||
|
||||
In repositories with annex.version 5 or earlier, unlocking a file is local
|
||||
to the repository, and is temporary. With version 6, unlocking a file
|
||||
changes how it is stored in the git repository (from a symlink to a pointer
|
||||
file), so you can commit it like any other change. Also in version 6, you
|
||||
can use `git add` to add a file to the annex in unlocked form. This allows
|
||||
workflows where a file starts out unlocked, is modified as necessary, and
|
||||
is locked once it reaches its final version.
|
||||
|
||||
# OPTIONS
|
||||
|
||||
|
|
|
@ -626,6 +626,14 @@ subdirectories).
|
|||
|
||||
See [[git-annex-diffdriver]](1) for details.
|
||||
|
||||
* `smudge`
|
||||
|
||||
This command lets git-annex be used as a git filter driver, allowing
|
||||
annexed files in the git repository to be unlocked at all times, instead
|
||||
of being symlinks.
|
||||
|
||||
See [[git-annex-smudge]](1) for details.
|
||||
|
||||
* `remotedaemon`
|
||||
|
||||
Detects when network remotes have received git pushes and fetches from them.
|
||||
|
|
|
@ -158,7 +158,8 @@ Using git-annex on a crippled filesystem that does not support symlinks.
|
|||
Data:
|
||||
|
||||
* An annex pointer file has as its first line the git-annex key
|
||||
that it's standing in for. Subsequent lines of the file might
|
||||
that it's standing in for (prefixed with "annex/objects/", similar to
|
||||
an annex symlink target). Subsequent lines of the file might
|
||||
be a message saying that the file's content is not currently available.
|
||||
An annex pointer file is checked into the git repository the same way
|
||||
that an annex symlink is checked in.
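As an illustration of the pointer file layout described above, the key can be recovered from the first line by stripping everything up to and including "annex/objects/". This is a sketch only: the helper name `pointer_key`, the key and the message text are made up for the example, and git-annex's own parser may be stricter.

```shell
# Print the key named by an annex pointer file, or nothing if the file
# does not look like a pointer.
pointer_key() {
    first=$(head -n 1 "$1")
    case $first in
        *annex/objects/*) printf '%s\n' "${first##*annex/objects/}" ;;
    esac
}

# Hypothetical pointer file; key and message are invented for illustration.
tmp=$(mktemp)
printf '%s\n' '/annex/objects/SHA256E-s12--0123abcd.txt' \
    'The content of this file is not currently available.' > "$tmp"
key=$(pointer_key "$tmp")
printf '%s\n' "$key"
```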
|
||||
|
@ -177,8 +178,8 @@ Configuration:
|
|||
the annex. Other files are passed through the smudge/clean as-is and
|
||||
have their contents stored in git.
|
||||
|
||||
* annex.direct is repurposed to configure how the assistant adds files.
|
||||
When set to true, they're added unlocked.
|
||||
* annex.direct is repurposed to configure how git-annex adds files.
|
||||
When set to false, it adds symlinks and when true it adds pointer files.
|
||||
|
||||
git-annex clean:
|
||||
|
||||
|
@ -232,15 +233,11 @@ git annex lock/unlock:
|
|||
transition repositories to using pointers, and a cleaner unlock/lock
|
||||
for repos using symlinks.
|
||||
|
||||
unlock will stage a pointer file, and will copy the content of the object
|
||||
out of .git/annex/objects to the work tree file. (Might want a --hardlink
|
||||
switch.)
|
||||
unlock will stage a pointer file, and will link the content of the object
|
||||
from .git/annex/objects to the work tree file.
|
||||
|
||||
lock will replace the current work tree file with the symlink, and stage it.
|
||||
Note that multiple work tree files could point to the same object.
|
||||
So, if the link count is > 1, replace the annex object with a copy of
|
||||
itself to break such a hard link. Always finish by locking down the
|
||||
permissions of the annex object.
|
||||
lock will replace the current work tree file with the symlink, and stage it,
|
||||
and lock down the permissions of the annex object.
|
||||
|
||||
#### file map
|
||||
|
||||
|
@ -248,7 +245,8 @@ The file map needs to map from `Key -> [File]`. `File -> Key`
|
|||
seems useful to have, but in practice is not worthwhile.
|
||||
|
||||
Drop and get operations need to know what files in the work tree use a
|
||||
given key in order to update the work tree.
|
||||
given key in order to update the work tree. And, we don't want to
|
||||
overwrite a work tree file if it's been modified when dropping or getting.
|
||||
|
||||
git-annex commands that look at annex symlinks to get keys to act on will
|
||||
need to fall back to either consulting the file map, or looking at the staged
|
||||
|
@ -275,13 +273,14 @@ In particular:
|
|||
* Is the smudge filter called at any other time? Seems unlikely but then
|
||||
there could be situations with a detached work tree or such.
|
||||
* Does git call any useful hooks when removing a file from the work tree,
|
||||
or converting it to not be annexed?
|
||||
or converting it to not be annexed, or for `git mv` of an annexed file?
|
||||
No!
|
||||
|
||||
From this analysis, any file map generated by the smudge/clean filters
|
||||
is necessarily potentially inaccurate. It may list deleted files.
|
||||
It may or may not reflect current unstaged changes from the work tree.
|
||||
|
||||
|
||||
Follows that any use of the file map needs to verify the info from it,
|
||||
and throw out bad cached info (updating the map to match reality).
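One way to picture that verification step: before trusting a cached `Key -> [File]` entry, re-check each remembered path against what is currently staged, and keep only the paths that still point at the key. The sketch below is an assumption about how this could be done and not how git-annex implements it. The helper names and the key are hypothetical, and it inspects staged content only, so unstaged work tree changes are not considered.

```shell
# Print the key that the staged version of a path points to (nothing if
# the path is not staged or is not a pointer file).
staged_key() {
    first=$(git cat-file -p ":$1" 2>/dev/null | head -n 1)
    case $first in
        *annex/objects/*) printf '%s\n' "${first##*annex/objects/}" ;;
    esac
}

# Keep only the cached paths that still point at the given key.
# $1 = key, remaining arguments = paths remembered in the cache.
verify_cached_paths() {
    key=$1
    shift
    for path in "$@"; do
        if [ "$(staged_key "$path")" = "$key" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Demonstration in a throwaway repository with a made-up key: one path is
# still staged as a pointer, the other was deleted long ago.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
printf '/annex/objects/%s\n' 'SHA256E-s12--0123abcd' > kept
git add kept
still_valid=$(verify_cached_paths 'SHA256E-s12--0123abcd' kept deleted-long-ago)
printf '%s\n' "$still_valid"
```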
|
||||
|
||||
|
@ -306,17 +305,71 @@ just look at the repo content in the first place..
|
|||
|
||||
annex.version changes to 6
|
||||
|
||||
Upgrade should be handled automatically.
|
||||
git config for filter.annex.smudge and filter.annex.clean is set up.
|
||||
|
||||
On upgrade, update .gitattributes with a stock configuration, unless
|
||||
it already mentions "filter=annex".
|
||||
.gitattributes is updated with a stock configuration,
|
||||
unless it already mentions "filter=annex".
|
||||
|
||||
Upgrading a direct mode repo needs to switch it out of bare mode, and
|
||||
needs to run `git annex unlock` on all files (or reach the same result).
|
||||
So will need to stage changes to all annexed files.
|
||||
|
||||
When a repo has some clones indirect and some direct, the upgraded repo
|
||||
will have all files unlocked, necessarily in all clones.
|
||||
will have all files unlocked, necessarily in all clones. This happens
|
||||
automatically, because when the direct repos are upgraded that causes the
|
||||
files to be unlocked, while the indirect upgrades don't touch the files.
|
||||
|
||||
#### implementation todo list
|
||||
|
||||
* Still a few test suite failures for v6 with locked files.
|
||||
* Test suite should be made to pass for v6 with unlocked files.
|
||||
* Reconcile staged changes into the associated files database, whenever
|
||||
the database is queried. This is needed to handle eg:
|
||||
git add largefile
|
||||
git mv largefile othername
|
||||
git annex move othername --to foo
|
||||
# fails to drop content from associated file othername,
|
||||
# because it doesn't know it has that name
|
||||
# git commit clears up this mess
|
||||
* Interaction with shared clones. Should avoid hard linking from/to an
|
||||
object in a shared clone if either repository has the object unlocked.
|
||||
(And should avoid unlocking an object if it's hard linked to a shared clone,
|
||||
but that's already accomplished because it avoids unlocking an object if
|
||||
it's hard linked at all)
|
||||
* Make automatic merge conflict resolution work for pointer files.
|
||||
- Should probably automatically handle merge conflicts between annex
|
||||
symlinks and pointer files too. Maybe by always resulting in a pointer
|
||||
    file, since the symlinks don't work everywhere.
|
||||
* Crippled filesystem should cause all files to be transparently unlocked.
|
||||
Note that this presents problems when dealing with merge conflicts and
|
||||
when pushing changes committed in such a repo. Ideally, should avoid
|
||||
committing implicit unlocks, or should prevent such commits leaking out
|
||||
in pushes.
|
||||
* Dropping a smudged file causes git status (and git annex status)
|
||||
to show it as modified, because the timestamp has changed.
|
||||
Getting a smudged file can also cause this.
|
||||
Upgrading a direct mode repo also leaves files in this state.
|
||||
User can use `git add` to clear it up, but better to avoid this,
|
||||
by updating stat info in the index.
|
||||
(May need to use libgit2 to do this, cannot find
|
||||
  any plumbing except git-update-index, which is very inefficient for
|
||||
smudged files.)
|
||||
* Audit code for all uses of isDirect. These places almost always need
|
||||
adjusting to support v6, if they haven't already.
|
||||
* Optimisation: See if the database schema can be improved to speed things
|
||||
up. Are there enough indexes? getAssociatedKey in particular does a
|
||||
reverse lookup and might benefit from an index.
|
||||
* Optimisation: Reads from the Keys database avoid doing anything if the
|
||||
database doesn't exist. This makes v5 repos, or v6 with all locked files
|
||||
faster. However, if a v6 repo unlocks and then re-locks a file, its
|
||||
database will exist, and so this optimisation will no longer apply.
|
||||
Could try to detect when the database is empty, and remove it or avoid reads.
|
||||
|
||||
* Eventually (but not yet), make v6 the default for new repositories.
|
||||
Note that the assistant forces repos into direct mode; that will need to
|
||||
be changed then.
|
||||
* Later still, remove support for direct mode, and enable automatic
|
||||
v5 to v6 upgrades.
|
||||
|
||||
----
|
||||
|
||||
|
|
|
@ -43,6 +43,46 @@ conflicts first before upgrading git-annex.
|
|||
|
||||
The upgrade events, so far:
|
||||
|
||||
## v5 -> v6 (git-annex version 6.x)
|
||||
|
||||
The upgrade from v5 to v6 is handled manually. Run `git-annex upgrade`
|
||||
to perform the upgrade.
|
||||
|
||||
Warning: All places that a direct mode repository is cloned to should be
|
||||
running git-annex version 6.x before you upgrade the repository.
|
||||
This is necessary because the contents of the repository are changed
|
||||
in the upgrade, and the old version of git-annex won't be able to
|
||||
access files after the repo is upgraded.
|
||||
|
||||
This upgrade does away with the direct mode/indirect mode distinction.
|
||||
A v6 git-annex repository can have some files locked and other files
|
||||
unlocked, and all git and git-annex commands can be used on both locked and
|
||||
unlocked files. (Although for locked files to work, the filesystem
|
||||
must support symbolic links.)
|
||||
|
||||
The behavior of some commands changes in an upgraded repository:
|
||||
|
||||
* `git add` will add files to the annex, in unlocked mode, rather than
|
||||
adding them directly to the git repository. To cause some files to be
|
||||
added directly to git, you can configure `annex.largefiles`. For
|
||||
example:
|
||||
|
||||
git config annex.largefiles "largerthan=100kb and not (include=*.c or include=*.h)"
|
||||
|
||||
* `git annex unlock` and `git annex lock` change how the pointer to
|
||||
the annexed content is stored in git.
|
||||
|
||||
If a repository is only used in indirect mode, you can use git-annex
|
||||
v5 and v6 in different clones of the same indirect mode repository without
|
||||
problems.
|
||||
|
||||
On upgrade, all files in a direct mode repository will be converted to
|
||||
unlocked files. The upgrade will stage changes to all annexed files in
|
||||
the git repository, which you can then commit.
|
||||
|
||||
If a repository has some clones using direct mode and some using indirect
|
||||
mode, all the files will end up unlocked in all clones after the upgrade.
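Because the v5 to v6 upgrade is manual, it can help to check a clone's `annex.version` setting before deciding whether to run it. This is a minimal sketch under the assumption that the version is stored as a plain integer in the repository's git config. The helper name `needs_v6_upgrade` is hypothetical, and the v5 repository is simulated by setting the value by hand:

```shell
# Succeed if the repository records an annex.version lower than 6.
# Fails both for v6+ repositories and when annex.version is not set at all.
needs_v6_upgrade() {
    version=$(git -C "$1" config --get annex.version) || return 1
    [ "$version" -lt 6 ]
}

# Throwaway repository standing in for an existing v5 clone.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config annex.version 5

if needs_v6_upgrade "$repo"; then
    status="upgrade needed"
else
    status="already v6 or newer"
fi
printf '%s\n' "$status"
```

When the check reports that an upgrade is needed, running `git annex upgrade` and then committing the staged changes completes the transition described above.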
|
||||
|
||||
## v4 -> v5 (git-annex version 5.x)
|
||||
|
||||
The upgrade from v4 to v5 is handled
|
||||
|
|