Merge branch 'v7'
commit
6fd37fb016
37 changed files with 528 additions and 280 deletions
|
@ -77,3 +77,5 @@ file_%subdir%
|
|||
|
||||
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
||||
Yep! I already use it to move files between my laptop's HDD and SSD, and to copy files between my many SD cards. I was trying this to see if I could not have to scroll as far on my 3D printer's menu.
|
||||
|
||||
> [[done]] see comments --[[Joey]]
|
||||
|
|
|
@ -0,0 +1,17 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 4"""
|
||||
date="2018-10-26T16:53:47Z"
|
||||
content="""
|
||||
`git annex adjust --hide-missing` is now available to do what you want
|
||||
re hiding missing files.
|
||||
|
||||
`git annex view` doesn't currently unlock files in a v6 repo, so it's not
|
||||
usable on a crippled filesystem. That's why the cat in the transcript above
|
||||
shows the symlink content which git writes to a regular file when in a
|
||||
crippled filesystem.
|
||||
|
||||
I would like to eventually unify adjust with view, so `git annex adjust
|
||||
--unlock` can be used with a view, which would support that.
|
||||
See [[todo/unify_adjust_with_view]].
|
||||
"""]]
|
|
@ -0,0 +1,11 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2018-10-26T17:04:09Z"
|
||||
content="""
|
||||
Have you ever seen this again or have any more information about how to
|
||||
reproduce it?
|
||||
|
||||
This seems similar to the problem fixed by [[!commit a13c0ce66c6dd5d8cf5b09ee2fc5a58f43db4b14]]
|
||||
but the version you were using already had that commit in it.
|
||||
"""]]
|
|
@ -377,3 +377,5 @@ total 12
|
|||
lil' positive end note mode on:
|
||||
|
||||
git-annex is the only thing to which I trust my archive of most valuable documents and memories!
|
||||
|
||||
> [[done]]; see comments --[[Joey]]
|
||||
|
|
|
@ -5,6 +5,7 @@
|
|||
subject="comment 4"
|
||||
date="2018-10-18T23:34:26Z"
|
||||
content="""
|
||||
|
||||
I am stupid talking about executable files hardlinking. I think I just chmod-ed already hardlinking files, that's how I got it. No surprise.
|
||||
|
||||
I am ok with this quirk (executable files are not thinned), but just curious: what exactly influenced such design decision?
|
||||
|
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 5"""
|
||||
date="2018-10-26T17:17:35Z"
|
||||
content="""
|
||||
[[!commit b7c8bf5274a64389ac87d6ce0388b8708c261971]] is where that was
|
||||
implemented. Interestingly, its commit message does say that the annex
|
||||
object file is made executable when using annex.thin.
|
||||
And indeed, git add of an executable file with annex.thin set does
|
||||
make the object executable and hard link to it.
|
||||
|
||||
But that commit contains this line that avoids hard linking:
|
||||
|
||||
| maybe False isExecutable destmode = copy =<< getstat
|
||||
|
||||
Which is what I based my earlier comment on. But without that line,
|
||||
AFAIK it will behave the way you want, with the annex object and
|
||||
executable worktree file being hard linked. The code also removes the
|
||||
execute bit if the annex object file later ends up getting hard linked
|
||||
instead to a non-executable file.
|
||||
|
||||
So, based on this analysis, I'm going to remove that line. And improve the
|
||||
annex.thin docs slightly, and I think that's sufficient to close this bug.
|
||||
"""]]
|
|
@ -11,9 +11,9 @@ understand how to update its working tree.
|
|||
|
||||
## deprecated
|
||||
|
||||
Direct mode is deprecated! Instead, git-annex v6 repositories can simply
|
||||
Direct mode is deprecated! Instead, git-annex v7 repositories can simply
|
||||
have files that are unlocked and thus can be directly accessed and
|
||||
modified. See [[upgrades]] for details about the transition to v6
|
||||
modified. See [[upgrades]] for details about the transition to v7
|
||||
repositories.
|
||||
|
||||
## enabling (and disabling) direct mode
|
||||
|
|
|
@ -6,11 +6,13 @@ git-annex smudge - git filter driver for git-annex
|
|||
|
||||
git annex smudge [--clean] file
|
||||
|
||||
git annex smudge --update
|
||||
|
||||
# DESCRIPTION
|
||||
|
||||
This command lets git-annex be used as a git filter driver which lets
|
||||
annexed files in the git repository be unlocked at all times, instead
|
||||
of being symlinks.
|
||||
annexed files in the git repository be unlocked, instead
|
||||
of being symlinks, and lets `git add` store files in the annex.
|
||||
|
||||
When adding a file with `git add`, the annex.largefiles config is
|
||||
consulted to decide if a given file should be added to git as-is,
|
||||
|
@ -32,6 +34,16 @@ contents:
|
|||
* filter=annex
|
||||
.* !filter
|
||||
|
||||
The smudge filter does not provide git with the content of annexed files,
|
||||
because that would be slow and triggers memory leaks in git. Instead,
|
||||
it records which worktree files need to be updated, and
|
||||
`git annex smudge --update` later updates the work tree to contain
|
||||
the content. That is run by several git hooks, including post-checkout
|
||||
and post-merge. However, a few git commands, notably `git stash` and
|
||||
`git cherry-pick`, do not run any hooks, so after using those commands
|
||||
you can manually run `git annex smudge --update` to update the working
|
||||
tree.
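
For commands that run no hooks, the manual step described above can be wrapped in a small shell function. This is only a sketch: `git_then_update` is a hypothetical helper name, not something git-annex provides. The only git-annex command it relies on is `git annex smudge --update`, as documented above.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-annex): run a git subcommand and, if it
# succeeds, have git-annex populate any unlocked files that were left as
# pointers because no hook ran.
git_then_update() {
    git "$@" && git annex smudge --update
}

# Example calls (commented out so that sourcing this file has no side effects):
#   git_then_update stash pop
#   git_then_update cherry-pick <commit>
```

Chaining with `&&` means the work tree update is skipped when the git command itself fails, for instance on a conflict, so it can be run by hand once the conflict is resolved.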
|
||||
|
||||
# SEE ALSO
|
||||
|
||||
[[git-annex]](1)
|
||||
|
|
|
@ -1024,18 +1024,16 @@ Here are all the supported configuration settings.
|
|||
* `annex.thin`
|
||||
|
||||
Set this to `true` to make unlocked files be a hard link to their content
|
||||
in the annex, rather than a second copy. (Only when supported by the file
|
||||
system, and only in repository version 6.) This can save considerable
|
||||
in the annex, rather than a second copy. This can save considerable
|
||||
disk space, but when a modification is made to a file, you will lose the
|
||||
local (and possibly only) copy of the old version. So, enable with care.
|
||||
|
||||
After setting (or unsetting) this, you should run `git annex fix` to
|
||||
fix up the annexed files in the work tree to be hard links (or copies).
|
||||
|
||||
Note that `annex.thin` is not honored when git updates an annexed file
|
||||
in the working tree. So when `git checkout` or `git merge` updates the
|
||||
working tree, a second copy of annexed files will result. You can run
|
||||
`git-annex fix` to fix up the hard links after running such git commands.
|
||||
|
||||
Note that this has no effect when the filesystem does not support hard links.
|
||||
And when multiple files in the work tree have the same content, only
|
||||
one of them gets hard linked to the annex.
|
||||
|
||||
* `annex.delayadd`
|
||||
|
||||
|
|
|
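
The `annex.thin` tradeoff described in this hunk (modifying a hard-linked file loses the old version) can be seen with plain files. The following is a toy sketch that does not involve git-annex at all: `object` and `worktree_file` are made-up names standing in for the annex object and the unlocked work tree file, and it assumes a filesystem that supports hard links.

```shell
#!/bin/sh
# Toy illustration only: plain files and `ln`, no git-annex involved.
set -eu
workdir=$(mktemp -d)

# Stand-in for the annex object, and a hard link standing in for the
# unlocked work tree file.
printf 'version 1\n' > "$workdir/object"
ln "$workdir/object" "$workdir/worktree_file"

# Overwrite the work tree file in place. The redirection truncates the
# existing inode, which both names share.
printf 'version 2\n' > "$workdir/worktree_file"

# The "object" has changed too; the old version is gone.
object_content=$(cat "$workdir/object")
echo "object now contains: $object_content"

rm -rf "$workdir"
```

With a second copy instead of a hard link, `object` would still hold the old content, at the cost of twice the disk space.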
@ -8,10 +8,10 @@ but it needs some different workflows of using git-annex.
|
|||
|
||||
## getting started
|
||||
|
||||
To get started, your repository needs to be upgraded to v6, since the
|
||||
To get started, your repository needs to be upgraded to v7, since the
|
||||
feature does not work in v5 repositories.
|
||||
|
||||
git annex upgrade --version=6
|
||||
git annex upgrade --version=7
|
||||
|
||||
The [[git-annex adjust|git-annex-adjust]] command sets up an adjusted form
|
||||
of a git branch, in this case we'll ask it to hide missing files.
|
||||
|
@ -124,7 +124,7 @@ I set up the repository like this:
|
|||
|
||||
git clone server:/path/to/podcasts
|
||||
cd podcasts
|
||||
git annex upgrade --version=6
|
||||
git annex upgrade --version=7
|
||||
git annex adjust --hide-missing
|
||||
git annex group here client
|
||||
git annex wanted here standard
|
||||
|
|
|
@ -15,7 +15,7 @@ by running `git annex unlock`.
|
|||
# git annex unlock some_file
|
||||
# echo "new content" > some_file
|
||||
|
||||
Back before git-annex version 6, and its v6 repository mode, unlocking a file
|
||||
Back before git-annex version 7, and its v7 repository mode, unlocking a file
|
||||
like this was a transient thing. You'd modify it and then `git annex add` the
|
||||
modified version to the annex, and finally `git commit`. The new version of
|
||||
the file was then back to being locked.
|
||||
|
@ -29,31 +29,28 @@ to edit files repeatedly, without manually having to unlock them every time.
|
|||
The [[direct_mode]] made all files be unlocked all the time, but it
|
||||
had many problems of its own.
|
||||
|
||||
## enter v6 mode
|
||||
## enter v7 mode
|
||||
|
||||
/!\ This is a new feature; see its [[todo_list|todo/smudge]]
|
||||
for known issues.
|
||||
|
||||
This led to the v6 repository mode, which makes unlocked files remain
|
||||
This led to the v7 repository mode, which makes unlocked files remain
|
||||
unlocked after they're committed, so you can keep changing them and
|
||||
committing the changes whenever you'd like. It also lets you use more
|
||||
normal git commands (or even interfaces on top of git) for handling
|
||||
annexed files.
|
||||
|
||||
To get a repository into v6 mode, you can [[upgrade|upgrades]] it.
|
||||
To get a repository into v7 mode, you can [[upgrade|upgrades]] it.
|
||||
This will eventually happen automatically, but for now it's a manual process
|
||||
(be sure to read [[upgrades]] before doing this):
|
||||
|
||||
# git annex upgrade
|
||||
|
||||
Or, you can init a new repository in v6 mode.
|
||||
Or, you can init a new repository in v7 mode.
|
||||
|
||||
# git init
|
||||
# git annex init --version=6
|
||||
# git annex init --version=7
|
||||
|
||||
## using it
|
||||
|
||||
Using a v6 repository is easy! Simply use regular git commands to add
|
||||
Using a v7 repository is easy! Simply use regular git commands to add
|
||||
and commit files. In a git-annex repository, git will use git-annex
|
||||
to store the file contents, and the files will be left unlocked.
|
||||
|
||||
|
@ -97,7 +94,7 @@ mode is used. To make them always use unlocked mode, run:
|
|||
|
||||
## mixing locked and unlocked files
|
||||
|
||||
A v6 repository can contain both locked and unlocked files. You can switch
|
||||
A v7 repository can contain both locked and unlocked files. You can switch
|
||||
a file back and forth using the `git annex lock` and `git annex unlock`
|
||||
commands. This changes what's stored in git between a git-annex symlink
|
||||
(locked) and a git-annex pointer file (unlocked). To add a file to
|
||||
|
@ -108,28 +105,34 @@ If you want to mostly keep files locked, but be able to locally switch
|
|||
to having them all unlocked, you can do so using `git annex adjust
|
||||
--unlock`. See [[git-annex-adjust]] for details. This is particularly
|
||||
useful when using filesystems like FAT, and OS's like Windows that don't
|
||||
support symlinks.
|
||||
support symlinks. Indeed, `git-annex init` detects such filesystems and
|
||||
automatically sets up a repository to use all unlocked files.
|
||||
|
||||
## index gotchas
|
||||
## imperfections
|
||||
|
||||
When git-annex gets or drops the content of an unlocked file, it updates
|
||||
the file in git's worktree accordingly. That makes `git status` show
|
||||
the file as modified, even though there are no changes to commit.
|
||||
So git-annex then updates the index file to reflect the change to the
|
||||
worktree, and prevent the file from appearing to be modified.
|
||||
Unlocked files in v7 repositories mostly work very well, but there are a
|
||||
few imperfections which you should be aware of when using them.
|
||||
|
||||
This means that when git-annex is running a command that gets or drops the
|
||||
content of an unlocked file, the index will sometimes be locked. This might
|
||||
prevent you from `git commit` at the same time. Or, if you have a git
|
||||
commit in progress, or are running multiple git-annex processes, git-annex
|
||||
may complain that the index is locked.
|
||||
1. `git stash`, `git cherry-pick` and `git reset --hard` don't update
|
||||
the working tree with the content of unlocked files. The files
|
||||
will contain pointers, the same as if the content was not in the
|
||||
repository. So after running these commands, you will need to manually
|
||||
run `git annex smudge --update`.
|
||||
|
||||
Also, interrupting git-annex (eg with ctrl-c) before it can update the
|
||||
index will leave `git status` showing modifications.
|
||||
2. When git-annex is running a command that gets or drops the content
|
||||
of an unlocked file, git's index will briefly be locked, which might
|
||||
prevent you from running a `git commit` at the same time.
|
||||
|
||||
To manually update the index when git-annex was not able to, you can run:
|
||||
3. Conversely, if you have a git commit in progress, running git-annex may
|
||||
complain that the index is locked, though this will not prevent it from
|
||||
working.
|
||||
|
||||
git update-index -q --refresh $file
|
||||
4. When an operation such as a checkout or merge needs to update a large
|
||||
number of unlocked files, it can become slow. So can running `git add` on
|
||||
a large number of files (`git annex add` is faster).
|
||||
|
||||
(The technical reasons behind these imperfections are explained in
|
||||
detail in [[todo/git_smudge_clean_interface_suboptiomal]].)
|
||||
|
||||
## using less disk space
|
||||
|
||||
|
@ -168,15 +171,6 @@ match the new setting:
|
|||
|
||||
git annex fix
|
||||
|
||||
Unfortunately, git's smudge interface does not let git-annex honor
|
||||
the annex.thin configuration when git is checking out a file.
|
||||
So, using `git checkout` to check out a different branch, or even
|
||||
`git merge` can result in some non-thin files making their way into the
|
||||
working tree, and using more disk space. A warning will be printed out in
|
||||
this situation. You can always run `git annex fix` to re-thin such files.
|
||||
|
||||
## annex.thin tradeoffs
|
||||
|
||||
[[!template id=note text="""
|
||||
When a [[direct_mode]] repository is upgraded, annex.thin is automatically
|
||||
set, because direct mode made the same single-copy tradeoff.
|
||||
|
|
|
@ -0,0 +1,21 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2018-10-26T16:21:28Z"
|
||||
content="""
|
||||
While `git add` would be a lot slower when using this interface to add
|
||||
large files, it would make `git checkout` and other commands that update
|
||||
the work tree a lot faster.
|
||||
|
||||
Since the smudge filter is not providing git with the file content any more,
|
||||
using filterdriver would avoid git running many git-annex smudge processes,
|
||||
greatly speeding up large checkouts.
|
||||
|
||||
Unfortunately, `git annex smudge --update` ends up running the smudge filter
|
||||
on all files that the clean filter earlier acted on, so even if filterdriver were
|
||||
used to speed up the clean filter, there would still be one process spawned per
|
||||
file for the smudge filter.
|
||||
|
||||
So some interface improvement is needed before git-annex can usefully use
|
||||
this.
|
||||
"""]]
|
|
@ -1,82 +1,13 @@
|
|||
git-annex should use smudge/clean filters. v6 mode
|
||||
git-annex should use smudge/clean filters. v7 mode
|
||||
|
||||
### problems keeping v6 experimental
|
||||
## warts
|
||||
|
||||
* Checking out a different branch causes git to smudge all changed files,
|
||||
and write their content. This does not honor annex.thin. A warning
|
||||
message is printed in this case.
|
||||
|
||||
This is particularly wasteful when checking out an adjusted unlocked
|
||||
branch, which causes 2x the space to be used.
|
||||
|
||||
"git annex proxy" could be used to handle this.
|
||||
Make it run the git command with smudge filter set to not output content
|
||||
but only pointers, and then at the end populate the pointer files, hard
|
||||
when appropriate. (As an optimization, the smudge filter could also be
|
||||
made to use the long-running filter interface when run this way.)
|
||||
|
||||
git-annex adjust and git-annex sync could both use that internally
|
||||
when checking out the adjusted branch, and merging a branch into HEAD.
|
||||
|
||||
Or: Make the smudge filter never provide the actual file content, but the
|
||||
pointer. Install post-checkout and post-merge hooks that populate
|
||||
the worktree files that were checked out. Of course, they will also
|
||||
need to update the index.
|
||||
|
||||
Problem: post-merge hook is not run when there's a merge conflict.
|
||||
Git does not actually run the smudge filter in this case;
|
||||
the conflicting file becomes a text file containing a merge conflict
|
||||
between the two annex pointers. When the user resolves the conflict
|
||||
and git add's the result, git runs the smudge filter.
|
||||
So, if the smudge filter then provides the pointer, the file would not be
|
||||
populated. The post-commit hook would then need to populate the file,
|
||||
once the merge got committed.
|
||||
|
||||
Problem: No hook seems to be run for git stash / git stash apply
|
||||
or for git reset --hard or git cherry-pick.
|
||||
Fatal or can we live with needing to run a
|
||||
git-annex command to populate the files after those commands?
|
||||
|
||||
> implemented on the `delaysmudge` branch now
|
||||
|
||||
(My enhanced smudge/clean patch set also fixed this problem, in a much
|
||||
nicer way...)
|
||||
|
||||
* Optionally: Use the filterdriver interface during checkout. Unfortunately that
|
||||
interface is slower for cleaning during git add (see
|
||||
[[todo/Long_Running_Filter_Process]]), but since the smudge filter is not
|
||||
providing git with the file content any more, using filterdriver would
|
||||
avoid git running many git-annex smudge processes, greatly speeding up large
|
||||
checkouts. git add could be left slow, with git-annex add being the fast path,
|
||||
until the filterdriver interface is improved. Or, make "git annex proxy"
|
||||
use the filterdriver interface for checkout.
|
||||
|
||||
* When git runs the smudge filter, it buffers all its output in ram before
|
||||
writing it to a file. So, checking out a branch with a large v6 unlocked files
|
||||
can cause git to use a lot of memory.
|
||||
|
||||
This needs to be fixed in git, but my proposed interface in
|
||||
<http://thread.gmane.org/gmane.comp.version-control.git/294425> would
|
||||
avoid the problem for git checkout, since it would use the new interface
|
||||
and not the smudge filter.
|
||||
|
||||
Last verified with git 2.18 in 2018.
|
||||
|
||||
Note that the long-running filter process interface has the same problem.
|
||||
|
||||
The annex.thin idea above could work around this problem.
|
||||
|
||||
> implemented on the `delaysmudge` branch now
|
||||
|
||||
## other warts
|
||||
|
||||
* There are several v6 bugs that are edge cases and
|
||||
* There are several bugs that are edge cases and
|
||||
need more info or analysis. None of these seem like blockers
|
||||
to keep v6 experimental or to replacing direct mode with v6.
|
||||
to keep v7 experimental or to replacing direct mode with v7.
|
||||
|
||||
- <http://git-annex.branchable.com/bugs/assistant_crashes_in_TransferScanner/>
|
||||
- <http://git-annex.branchable.com/bugs/v6_appears_to_not_thin/>
|
||||
- <http://git-annex.branchable.com/bugs/Metadata_views_in_v6_repo_upgraded_from_direct_mode_act_strangely/>
|
||||
- <http://git-annex.branchable.com/bugs/git-annex-sync_sometimes_fails_in_submodule_in_V6_adjusted_branch/>
|
||||
|
||||
### long term todos
|
||||
|
@ -86,14 +17,14 @@ git-annex should use smudge/clean filters. v6 mode
|
|||
multiple files, and so should be faster.
|
||||
|
||||
See [[todo/Long_Running_Filter_Process]] .. it's not currently actually a
|
||||
win but might be a good way to improve git to work better with v6.
|
||||
win but might be a good way to improve git to work better with v7.
|
||||
|
||||
* Eventually (but not yet), make v6 the default for new repositories.
|
||||
* Eventually (but not yet), make v7 the default for new repositories.
|
||||
Note that the assistant forces repos into direct mode; that will need to
|
||||
be changed then, and it should enable annex.thin instead.
|
||||
|
||||
* Later still, remove support for direct mode, and enable automatic
|
||||
v5 to v6 upgrades.
|
||||
v5 to v7 upgrades.
|
||||
|
||||
### historical notes
|
||||
|
||||
|
@ -395,7 +326,7 @@ just look at the repo content in the first place..
|
|||
|
||||
#### Upgrading
|
||||
|
||||
annex.version changes to 6
|
||||
annex.version changes to 7
|
||||
|
||||
git config for filter.annex.smudge and filter.annex.clean is set up.
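
To make the filter mechanism concrete, here is a toy sketch of a git clean/smudge filter in a throwaway repository. It is not git-annex's filter: the filter name `toy`, the `*.bin` pattern, and the use of `git hash-object --stdin` as a stand-in "pointer" are all assumptions chosen for illustration. It only shows that git stores what the clean filter outputs rather than the file's content, and that a smudge filter may hand git the stored data back unchanged.

```shell
#!/bin/sh
# Toy illustration of git's filter driver; "toy" is a made-up filter name.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Clean filter: replace the content with a short stand-in "pointer"
# (here simply a hash of the content). Smudge filter: pass it through.
git config filter.toy.clean 'git hash-object --stdin'
git config filter.toy.smudge cat
echo '*.bin filter=toy' > .gitattributes

printf 'large content\n' > big.bin
git add .gitattributes big.bin

# What git stored in the index versus what is in the work tree.
stored=$(git cat-file -p :big.bin)
expected=$(printf 'large content\n' | git hash-object --stdin)
worktree=$(cat big.bin)
echo "stored in git: $stored"
echo "work tree:     $worktree"

cd /
rm -rf "$repo"
```

The work tree file keeps its full content after `git add`, while the index holds only the filter's output, which is the general shape of the pointer-file arrangement described above.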
|
||||
|
||||
|
|
7
doc/todo/unify_adjust_with_view.mdwn
Normal file
|
@ -0,0 +1,7 @@
|
|||
`git annex adjust` and `git annex view` (et al.) both derive a branch from
|
||||
the main branch and enter it. They have different capabilities. It would be
|
||||
useful to be able to compose them. For example, to enter a view based on
|
||||
metadata that also has all files unlocked.
|
||||
|
||||
There's also probably a fair amount of overlap in their implementations.
|
||||
--[[Joey]]
|
|
@ -46,11 +46,18 @@ the upgrade would need to be run in a copy of the repository.
|
|||
|
||||
The upgrade events, so far:
|
||||
|
||||
## v5 -> v6 (git-annex version 6.x)
|
||||
## v6 -> v7 (git-annex version 7.x)
|
||||
|
||||
The upgrade from v5 to v6 is handled manually for now.
|
||||
The upgrade from v5 to v7 is handled manually for now.
|
||||
Run `git-annex upgrade` to perform the upgrade.
|
||||
|
||||
v6 repositories are automatically upgraded to v7.
|
||||
|
||||
The only difference between v6 and v7 is that some additional git hooks
|
||||
were added in v7.
|
||||
|
||||
## v5 -> v6 (git-annex version 6.x)
|
||||
|
||||
A v6 git-annex repository can have some files locked while other files are
|
||||
unlocked, and all git and git-annex commands can be used on both locked and
|
||||
unlocked files. (Although for locked files to be accessible, the filesystem
|
||||
|
|