Fix bug that made changes to a special remote sometimes be missed when
importing a tree from it. The diff import would miss when a change was
exported, then manually undone on the special remote (eg deleting a newly
exported file). A full import is needed to catch such changes.
After upgrading, any such missed changes will be included in the next
tree imported from a special remote. This happens because the previously
recorded content identifier tree does not have export information included,
so it is treated as invalid, and a full import is done.
Fixes reversion introduced in version 10.20230626, commit
40017089f2
Unfortunately, this does mean that after each export, the next import will
be a full import. Which can take significantly longer than the diff import
does, when there are a lot of files in the tree.
It would be better if exporting also update the content identifier tree.
However, I don't know if that can be done inexpensively. It would be future
optimisation work, in any case.
(That could only be done for an export that is run in the same
repository as the import. When an export is run in a different repository,
the export.log gets updated, and that propagates to the repository where
import is later run. At that point, a full import is done.)
Sponsored-by: Luke T. Shumaker
Fix hang that could occur when using git-annex adjust on a branch with a
number of files greater than annex.queuesize. Or potentially other
commands.
When reconcileStaged is running, the database is being opened. But
restagePointerFiles closes the database, and later writes to it. So it will
deadlock if called by reconcileStaged.
The deadlock occurred when the git queue happened to be full, causing
adding a call to restagePointerFiles to it to flush the queue and
restagePointerFiles to run at the wrong time.
Fixed by making reconcileStaged, when it populates or depopulates a pointer
file, arrange for restagePointerFiles to be run as a cleanup action, rather
than from the git queue.
But, what if restagePointerFiles is already in the git queue before
reconcileStaged is run? If it adds anything else to the git queue, causing
the queue to flush, it would still deadlock. To avoid this hypothetical
situation, added a Annex.inreconcilestaged, and made restagePointerFiles
check it and not do anything.
Note that, I did consider the simpler approach of only running
restagePointerFiles as a cleanup action, rather than from the git queue.
But see commit 6a3bd283b8 for why it was made
to use the queue in the first place. I wanted to avoid tying this bug fix
to a behavior change.
Sponsored-by: mycroft
Changing the type out from under an existing special remote exposes the
existing config to something that may interpret it wildly differently. As
seen in the bug report, this can even result in behavior that makes
git-annex say it's buggy. So prevent the user from doing this. --sameas is
the better way.
Sponsored-by: Kevin Mueller
Works around this bug https://github.com/haskell/file-io/issues/45
The fix is in Utility.FileIO.CloseOnExec because all use of file-io is
already wrapped through that module. Although perhaps that ought to be
refactored at this point.
I'd hope that file-io will eventually fix this bug, and also provide
CloseOnExec variants of its functions. That would allow depending on the
fixed version, and removing this ugly code.
Note that, functions like readFile that don't care about the encoding
due to reading/writing a ByteString were kept optimally fast by not
setting the encoding. This avoids an IORef read and write per open.
Sponsored-by: Graham Spencer
git write-tree was being run once per file git-annex acts on when eg,
getting files, which is slow when the remote repository has a large
tree.
onLocal calls quiesce after each action, and quiesce closes the keys db
since [[!commit ba7ecbc6a9c]]. Which has a relevant comment about
performance. I have not addressed that, the keys db still gets closed and
reopened after each file.
Turns out that, since git write-tree was run by each call to
reconcileStaged, the .git/annex/keysdb.cache value was never the
same as the git index's inode. Because git write-tree updates the index's
mtime even when no changes have been made.
And so, when the database got closed and reopened, reconcileStaged would
see a changed index, and run git write-tree again. Over and over.
I considered writing the index's new inodecache after write-tree to the
keysdb.cache, but that would be vulnerable to a race, if the index was
changed just after write-tree.
The fix was to stop using keysb.cache at all. When the database is closed
and later reopened by the same process, avoid re-doing reconcileStaged.
Now that .git/annex/keysdb.cache is no longer used. It could be removed,
but the time overhead of removing it would be more than the space overhead
of keeping it. Defferred removal to the v11 upgrade.
Sponsored-by: unqueued