git-annex/doc/todo/avoid_unnecessary_union_merges.mdwn
Joey Hess e9bfa8eaed avoid unnecessary auto-merge when only changing a file in the branch.
Avoids doing auto-merging in commands that don't need fully current
information from the git-annex branch. In particular, git annex add no
longer needs to auto-merge. Affected commands: Anything that doesn't
look up data from the branch, but does write a change to it.

It might seem counterintuitive that we can change a value without first
making sure we have the current value. This optimisation works because
these two sequences are equivalent:

1. pull from remote
2. union merge
3. read file from branch
4. modify file and write to branch

vs.

1. read file from branch
2. modify file and write to branch
3. pull from remote
4. union merge

After either sequence, the git-annex branch contains the same logical content
for the modified file. (Possibly with lines in a different order, or with
additional old lines, of course.)
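
The equivalence of the two sequences can be illustrated with a small model. This is only a sketch under stated assumptions: a log is modelled as a list of `(timestamp, uuid, present)` lines, the logical content is taken to be the newest status per uuid, and the new line is assumed to carry a newer timestamp than anything pulled from the remote. The names `union_merge`, `modify` and `logical` are hypothetical and are not git-annex's actual code or log format.

```python
# Simplified model: a log is a list of (timestamp, uuid, present) lines.
# All names here are hypothetical, not git-annex's real code or log format.

def union_merge(ours, theirs):
    """Union merge: keep every distinct line from either side."""
    return sorted(set(ours) | set(theirs))

def modify(log, timestamp, uuid, present):
    """Write a new line for uuid, dropping the lines for it that we can see."""
    kept = [line for line in log if line[1] != uuid]
    return sorted(kept + [(timestamp, uuid, present)])

def logical(log):
    """Logical content: the newest status recorded for each uuid."""
    latest = {}
    for timestamp, uuid, present in sorted(log):
        latest[uuid] = present
    return latest

local = [(1, "here", False)]
remote = [(1, "here", False), (2, "there", True)]

# Sequence 1: pull and union merge first, then read and modify.
merge_first = modify(union_merge(local, remote), 3, "here", True)

# Sequence 2: read and modify first, then pull and union merge.
merge_last = union_merge(modify(local, 3, "here", True), remote)

assert logical(merge_first) == logical(merge_last)
assert merge_first != merge_last  # merge_last keeps an additional old line
```

In this model the second sequence ends up with one extra, superseded line from the remote, which matches the caveat about additional old lines.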
2011-11-12 15:15:57 -04:00

Some commands cause a union merge unnecessarily. For example, `git annex add`
modifies the location log, which first requires reading the current log (if
any), which triggers a merge.

It would be good to avoid these unnecessary union merges: first because it's
faster, and second because it avoids a delay during which a user might
ctrl-c and leave the repo in an inconsistent state. In the case of an add,
the file will be in the annex, but no location log will exist for it (fsck
fixes that).

It may be that all that's needed is to modify `Annex.Branch.change`
to read the current value without merging. Then commands like `get` that
query the branch will still cause merges, and commands like `add` that
only modify it will not. Note that for a command like `get`, the merge
occurs before it has done anything, so ctrl-c should not be a problem
there.
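
The split between querying and changing might look roughly like the following. This is a hypothetical Python model, not the Haskell `Annex.Branch` module: the `Branch` class, its `get` and `change` methods, and the `merges` counter are invented for illustration, and files are modelled as sets of log lines.

```python
# Hypothetical model of a branch whose queries merge but whose changes do not.
# Files map to sets of log lines; none of this is git-annex's real API.

class Branch:
    def __init__(self, local, remote):
        self.local = {name: set(lines) for name, lines in local.items()}
        self.remote = {name: set(lines) for name, lines in remote.items()}
        self.merges = 0  # how many union merges have been performed

    def _union_merge(self):
        """Fold any pending remote lines into the local branch."""
        if not self.remote:
            return
        for name, lines in self.remote.items():
            self.local.setdefault(name, set()).update(lines)
        self.remote = {}
        self.merges += 1

    def get(self, name):
        """Query: needs current information, so merge first."""
        self._union_merge()
        return set(self.local.get(name, set()))

    def change(self, name, update):
        """Modify: read the current local value without merging."""
        current = set(self.local.get(name, set()))
        self.local[name] = set(update(current))

branch = Branch(local={}, remote={"key.log": {"1 there"}})

# An add-like command only writes, so no merge happens.
branch.change("key.log", lambda lines: lines | {"2 here"})
assert branch.merges == 0

# A get-like command queries, which triggers the merge before anything else.
assert branch.get("key.log") == {"1 there", "2 here"}
assert branch.merges == 1
```

The later merge still picks up the remote line, so in this model nothing is lost by deferring it.
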

This is a delicate change; I need to take care. --[[Joey]]

> [[done]] (assuming I didn't miss any cases where this is not safe!) --[[Joey]]