Implemented by giving Git.Queue a FlushAction, which can be accumulated
along with another action on files, and runs only once that other action
has run.
This lets git-annex unlock queue up git update-index actions, without
conflicting with the restagePointerFiles FlushActions.
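
Roughly, the shape of the idea is something like this (a simplified
sketch with made-up names, not the actual Git.Queue code):

    import Data.IORef

    -- Hypothetical, simplified queue: files for a main action, plus an
    -- optional FlushAction that must run after that action.
    data Queue = Queue
        { queuedFiles :: IORef [FilePath]
        , flushAction :: IORef (Maybe (IO ()))
        }

    newQueue :: IO Queue
    newQueue = Queue <$> newIORef [] <*> newIORef Nothing

    -- Accumulate files for the main action.
    addFiles :: Queue -> [FilePath] -> IO ()
    addFiles q fs = modifyIORef' (queuedFiles q) (fs ++)

    -- Accumulate a FlushAction; in this sketch a later one replaces an
    -- earlier one, so it still runs only once per flush.
    addFlushAction :: Queue -> IO () -> IO ()
    addFlushAction q a = writeIORef (flushAction q) (Just a)

    -- Run the main action over the queued files first, and only then run
    -- the FlushAction, so the two cannot conflict.
    flushQueue :: ([FilePath] -> IO ()) -> Queue -> IO ()
    flushQueue mainaction q = do
        fs <- readIORef (queuedFiles q)
        mainaction (reverse fs)
        writeIORef (queuedFiles q) []
        readIORef (flushAction q) >>= maybe (return ()) id
        writeIORef (flushAction q) Nothing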
In a repository with filter-process enabled, git-annex unlock will
often not take any more time than before, though it may when the files are
large. Either way, it should always slow down less than git-annex status
speeds up.
When filter-process is not enabled, git-annex unlock will slow down as much
as git status speeds up.
Sponsored-by: Jochen Bartl on Patreon
autoUpgradeableVersions used latestVersion (10), but it did not make
sense that asking for old version 6 got version 10, while asking for
version 8 got version 8. So use defaultVersion (8) instead.
Sponsored-by: Dartmouth College's Datalad project
The problem is that withContentLockFile, in a v8 repo, has to take a shared
lock of `.git/annex/content.lck`. But, in a readonly repository, if that
file does not yet exist, it cannot lock it. And while it will sometimes
work to `chmod +r .git/annex`, the repository might be readonly due to
being owned by another user, or due to being mounted readonly.
So, it seems that the only solution is to use some other file than
`.git/annex/content.lck` as the lock file. The inode sentinel file
was almost the only option that should always exist. (And if it somehow
does not exist, creating an empty one for locking will be ok.)
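
Something along these lines (illustrative only; a hypothetical helper,
not the real withContentLockFile):

    import Control.Exception (IOException, try)
    import System.Directory (doesFileExist)
    import System.IO (IOMode(WriteMode), hClose, openFile)

    -- Pick the file to take the shared content lock on in a v8 repository.
    -- The inode sentinel file should already exist even when .git/annex is
    -- not writable; if it is somehow missing, try to create an empty one,
    -- ignoring failure in a readonly repository.
    contentLockFileV8 :: FilePath -> IO FilePath
    contentLockFileV8 sentinelfile = do
        exists <- doesFileExist sentinelfile
        if exists
            then return sentinelfile
            else do
                _ <- try (openFile sentinelfile WriteMode >>= hClose)
                    :: IO (Either IOException ())
                return sentinelfile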
Wow, what a hack!
Sponsored-by: Dartmouth College's Datalad project
This has tradeoffs, but is generally a win, and users for whom it makes
git add slow down unacceptably can just disable it again.
It needed to happen in an upgrade, since there are git-annex versions
that do not support it, and using such an old version with a v8
repository with filter.annex.process set will cause bad behavior.
By enabling it in v9, it's guaranteed that any git-annex version that
can use the repository does support it. Although, this is not a perfect
protection against problems, since an old git-annex version, if it's
used with a v9 repository, will cause git add to try to run
git-annex filter-process, which will fail. But at least, the user is
unlikely to have an old git-annex in path if they are using a v9
repository, since it won't work in that repository.
Sponsored-by: Dartmouth College's Datalad project
Capstone of the v10 upgrade process.
Tested with a git-annex drop in a v8 repo that had a local v8 remote.
Upgrading the repo to v10 (with --force) immediately caused it to notice
and switch over to v10 locking. Upgrading the remote also caused it to
switch over when operating on the remote.
The InodeCache makes this fairly efficient: just an added stat call per
lock of an object file. After the v10 upgrade, there is no more
overhead.
Sponsored-by: Dartmouth College's Datalad project
Since it's easy to keep supporting v8, using it for a while (eg a few
months) will give users time to upgrade git-annex installations, before
it upgrades their repository to v9.
This commit should be reverted once ready to start upgrading
repositories by default.
Sponsored-by: Dartmouth College's Datalad project
The v10 upgrade should almost be safe now. What remains to be done is
notice when the v10 upgrade has occurred, while holding the shared lock,
and switch to using v10 lock files.
Sponsored-by: Dartmouth College's Datalad project
The upgrade from v9 uses this to avoid an automatic upgrade until 1 year
after the v9 update. It can also be used in future such situations.
Sponsored-by: Dartmouth College's Datalad project
The v10 upgrade will run 1 year after the upgrade to v9, to give time for any v8
processes to die. Until that point, the v10 upgrade will be tried by
every process but deferred, so added support for deferring upgrades.
The upgrade prevention lock file that will be used by v10 is not yet
implemented, so it does not yet defer.
Sponsored-by: Dartmouth College's Datalad project
Upgrade the shared lock to an exclusive lock, and then delete the
lock file. If there is another process still holding the shared lock,
the first process will fail taking the exclusive lock, and not delete
the lock file; then the other process will later delete it.
Note that, in the time period where the exclusive lock is held, other
attempts to lock the content in place would fail. This is unlikely to be
a problem since it's a short period.
Other attempts to lock the content for removal would also fail in that
time period, but that's no different than a removal failing because
content is locked to prevent removal.
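
The cleanup protocol is roughly this (a sketch using GHC's portable
Handle locks rather than git-annex's own lock modules; names are
illustrative):

    import Control.Exception (IOException, try)
    import GHC.IO.Handle.Lock (LockMode(ExclusiveLock), hTryLock)
    import System.Directory (removeFile)
    import System.IO (Handle, hClose)

    -- Called when done with a shared lock of the content lock file;
    -- h is a Handle open on lockfile, currently holding the shared lock.
    cleanupContentLockFile :: FilePath -> Handle -> IO ()
    cleanupContentLockFile lockfile h = do
        gotexclusive <- hTryLock h ExclusiveLock
        if gotexclusive
            then do
                -- No other process holds a shared lock, so deleting the
                -- file cannot invalidate anyone else's lock.
                _ <- try (removeFile lockfile) :: IO (Either IOException ())
                hClose h
            else
                -- Another process still holds a shared lock; leave the
                -- file for that process to delete later.
                hClose h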
Sponsored-by: Dartmouth College's Datalad project
When dropping content, this was already done after deleting the content
file, but the lock file prevents deleting the directories. So, try the
deletion again.
This does mean there's a small added overhead of a failed rmdir().
Sponsored-by: Dartmouth College's Datalad project
This seems to be the best that can be done to avoid forever accumulating
the new content lock files, while being fully safe.
This is fixing code paths that have lingered unused since direct mode!
And direct mode seems to have been buggy in this area, since the content
lock file was deleted on unlock. But with a shared lock, there could be
another process that also had the lock file locked, and deleting it
invalidates that lock.
So, the lock file cannot be deleted after a shared lock. At least, not
without taking an exclusive lock first.. which I have not pursued yet but may.
After an exclusive lock, the lock file can be deleted. But there is
still a potential race, where the exclusive lock is held, and another
process gets the file open, just as the exclusive lock is dropped and
the lock file is deleted. That other process would be left with a file
handle it can take a shared lock of, but with no effect since the file
is deleted. Annex.Transfer also deletes lock files, and deals with this
same problem by using checkSaneLock, which is how I've dealt with it
here.
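
The idea behind that check is roughly this (a sketch using the unix
package, not git-annex's actual checkSaneLock): after locking an
already-open file descriptor, compare it against what is currently at
the lock file's path.

    import Control.Exception (IOException, try)
    import System.Posix.Files (FileStatus, deviceID, fileID, getFdStatus, getFileStatus)
    import System.Posix.Types (Fd)

    -- Returns False when the lock file was deleted (or replaced) after it
    -- was opened, in which case the lock just taken excludes nobody.
    checkSaneLock :: FilePath -> Fd -> IO Bool
    checkSaneLock lockfile fd = do
        fdstat <- getFdStatus fd
        v <- try (getFileStatus lockfile) :: IO (Either IOException FileStatus)
        return $ case v of
            Left _ -> False
            Right pathstat ->
                fileID fdstat == fileID pathstat
                    && deviceID fdstat == deviceID pathstat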
Sponsored-by: Dartmouth College's Datalad project
Now the content lock files are used in v9. However, I am not yet certain
they are correct. In particular, lockContentUsing deletes
the content lock file on unlock. But what if there's a shared lock
by another process? That seems like it would discard that lock too!
(Windows seems like it would not have the same problem, because as the
comment in there says, "Can't delete a locked file on Windows".
So if another process has a shared lock, removing it presumably fails.)
Sponsored-by: Dartmouth College's Datalad project
Seems to work ok. Unsure yet about the actual locking changes being
correct.
This is not the end of the story with upgrades, because it is unsafe for
this upgrade as implemented to run in a repository where an old
git-annex process is already running. The old process would use the old
locking method, and not notice files locked by the new, and this could
result in data loss. This problem will need to be dealt with before this
branch is suitable for merging.
Sponsored-by: Dartmouth College's Datalad project
Windows has always used a separate lock file, but on unix, the content
file itself was locked, and in v9 that changes to also use a separate
lock file.
This needs to be tested more. Eg, what happens after dropping a file;
does the content lock file get deleted too, or linger around?
Sponsored-by: Dartmouth College's Datalad project
v9 will not need to write to annex content files in order to lock them,
so freezeContent removes the write bit in a shared repository, the same
as in any other repository.
checkContentWritePerm makes sure that the write perm is not set, which
will let git-annex fsck fix up the permissions. Upgrading to v9
will need to fix the permissions as well, but it seems likely there will
be situations where the user that git-annex is running the upgrade as
cannot fix them, so it will have to leave the write bit set. In such a
case, git-annex fsck can fix it later.
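
The permission handling amounts to something like this (a simplified
sketch with hypothetical helper names, not the real freezeContent and
checkContentWritePerm):

    import Data.Bits (complement)
    import System.Posix.Files
    import System.Posix.Types (FileMode)

    allWriteModes :: FileMode
    allWriteModes =
        ownerWriteMode `unionFileModes` groupWriteMode `unionFileModes` otherWriteMode

    -- Remove all write bits from an annex content file, now done the same
    -- way in a shared repository as in any other.
    freezeContentFile :: FilePath -> IO ()
    freezeContentFile f = do
        st <- getFileStatus f
        setFileMode f (fileMode st `intersectFileModes` complement allWriteModes)

    -- Check that no write bit is left set; when this returns False,
    -- git-annex fsck can fix the permissions later.
    contentWritePermOk :: FilePath -> IO Bool
    contentWritePermOk f = do
        st <- getFileStatus f
        return (fileMode st `intersectFileModes` allWriteModes == nullFileMode)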
Sponsored-by: Dartmouth College's Datalad project
This is the start of v9, but it's currently identical to v8, and v8 is
not automatically upgraded to it. Running git-annex upgrade will upgrade
to v9 with this change.
Sponsored-by: Dartmouth College's Datalad project
Recover from corrupted content being received from a git remote due eg to a
wire error, by deleting the temporary file when it fails to verify. This
prevents a retry from failing again.
Reversion introduced in version 8.20210903, when incremental verification
was added.
Only the git remote seems to be affected, although it is certainly
possible that other remotes could later have the same issue. This only
affects things passed to getViaTmp that return (False, UnVerified) due to
verification failing. As far as getViaTmp can tell, that could just as well
mean that the transfer failed in a way that would resume, so it cannot
delete the temp file itself. Remote.Git and P2P.Annex use getViaTmp internally,
while other remotes do not, which is why only the git remote seems affected.
A better fix perhaps would be to improve the types of the callback
passed to getViaTmp, so that some other value could be used to indicate
the state where the transfer succeeded but verification failed.
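
That improvement would look something like this (purely a sketch of the
suggested type change, not the current getViaTmp interface):

    import System.Directory (removeFile)

    -- Hypothetical callback result, distinguishing a failed transfer
    -- (possibly resumable) from one that completed but failed to verify.
    data TransferOutcome
        = TransferFailed        -- may be resumable; keep the temp file
        | VerificationFailed    -- content is corrupt; delete the temp file
        | TransferVerified      -- completed and verified

    finishViaTmp :: FilePath -> TransferOutcome -> IO Bool
    finishViaTmp tmpfile VerificationFailed = do
        -- Deleting the temp file prevents a retry from failing again by
        -- resuming from the corrupt data.
        removeFile tmpfile
        return False
    finishViaTmp _ TransferFailed = return False
    finishViaTmp _ TransferVerified = return True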
Sponsored-by: Boyd Stephen Smith Jr.
Before it would pick one at random, though preferring ones that were not
dead over dead ones.
Now, if one is dead and the other is not, it will use the non-dead one. But
if neither is dead, or both are dead, it will error out, suggesting the user
clarify which one they want to enable.
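
The selection logic amounts to something like this (hypothetical types,
just to illustrate the rule):

    data RemoteInfo = RemoteInfo
        { remoteUUID :: String
        , remoteDead :: Bool
        }

    chooseToEnable :: RemoteInfo -> RemoteInfo -> Either String RemoteInfo
    chooseToEnable a b = case (remoteDead a, remoteDead b) of
        (True, False) -> Right b   -- only b is not dead
        (False, True) -> Right a   -- only a is not dead
        _ -> Left "this name matches multiple remotes; specify which one to enable"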
Sponsored-by: Luke Shumaker on Patreon
Capstone to this feature. Any transitions that have been performed on an
unmerged remote ref but not on the local git-annex branch, or vice-versa,
have to be applied on the fly when reading files.
Sponsored-by: Dartmouth College's Datalad project
It would be difficult to make Annex.Branch.files query the unmerged
git-annex branches. It might be possible, similar to what was discussed
in 7f6b2ca49c, but again I decided to have it not do anything in that
situation to start with, before adding such a complicated thing.
git-annex info uses it when getting info about a repository. The choices
were to make that fail with an error, or display the info it can, and
change the output slightly for the bits of info it cannot access. While
that is a behavior change, and I want to avoid any behavior changes due
to unmerged git-annex branches in a read-only repo, displaying a message
that is not a number seems unlikely to break anything that was consuming
a number, any worse than throwing an exception would. Probably.
Also git-annex unused --from origin is made to throw an error, but
it would fail later anyway when trying to write to the unused log files.
Sponsored-by: Dartmouth College's Datalad project
This makes --all error out in that situation. Which is better than
ignoring information from the branches.
To really handle the branches right, overBranchFileContents would need
to both query all the branches and union merge file contents
(or perhaps not provide any file content), as well as diffing between
branches to find files that are only present in the unmerged branches.
And also, it would need to handle transitions..
Sponsored-by: Dartmouth College's Datalad project
The way precaching works, it can't merge in information from those
branches efficiently, so just disable it and fall back to
Annex.Branch.get in order to get the correct information.
Sponsored-by: Dartmouth College's Datalad project
Improved support for using git-annex in a read-only repository: git-annex
branch information from remotes that cannot be merged into the git-annex
branch will now not crash it, but will be merged in memory.
To avoid this making git-annex behave one way in a read-only repository,
and another way when it can write, it's important that Annex.Branch.get
return the same thing (modulo log file compaction) in both cases.
This manages that mostly. There are some exceptions:
- When there is a transition in one of the remote git-annex branches
that has not yet been applied to the local or other git-annex branches.
Transitions are not handled.
- `git-annex log` runs git log on the git-annex branch, and so
it will not be able to show information coming from the other, not yet
merged branches.
- Annex.Branch.files only looks at files in the git-annex branch and not
unmerged branches. This affects git-annex info output.
- Annex.Branch.overBranchFileContents ditto. Affects --all and
also importfeed (but importfeed cannot work in a read-only repo
anyway).
- CmdLine.Seek.seekFilteredKeys when precaching location logs.
Note use of Annex.Branch.fullname
- Database.ContentIdentifier.needsUpdateFromLog and updateFromLog
These warts make this not suitable to be merged yet.
This readonly code path is more expensive, since it has to query several
branches. The value does get cached, but still large queries will be
slower in a read-only repository when there are unmerged git-annex
branches.
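
The shape of that in-memory merge is roughly this (illustrative only;
the real code lives in Annex.Branch and also has to handle the journal,
caching, log compaction, and transitions):

    import Control.Exception (SomeException, try)
    import Data.List (nub)
    import System.Process (readProcess)

    -- Read one log file from a ref; an absent file contributes nothing.
    catFileFromRef :: String -> FilePath -> IO [String]
    catFileFromRef ref file = do
        v <- try (readProcess "git" ["cat-file", "blob", ref ++ ":" ++ file] "")
                :: IO (Either SomeException String)
        return $ either (const []) lines v

    -- Union the lines of a log file across the local git-annex branch and
    -- any unmerged remote git-annex branches.
    readLogMerged :: [String] -> FilePath -> IO [String]
    readLogMerged refs file = nub . concat <$> mapM (`catFileFromRef` file) refs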
When annex.merge-annex-branches=false, updateTo skips doing anything,
and so the read-only repository code does not get triggered. So a user who
is bothered by the extra work can set that.
Other writes to the repository can still result in permissions errors.
This includes the initial creation of the git-annex branch, and of course
any writes to the git-annex branch.
Sponsored-by: Dartmouth College's Datalad project
So that eg, addurl of several large files that take time to download will
update the index for each file, rather than deferring the index updates to
the end.
In cases like an add of many smallish files, where a new file is being
added every few seconds, the queue will still build up a lot of changes
which are flushed at once, for best performance. Since the default queue
size is 10240, often it only gets flushed once at the end, same as
before. (Notice that updateQueue updates _lastchanged when adding a new
item to the queue without flushing it; that is necessary to avoid it
flushing the queue every 5 minutes in this case.)
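
The flush heuristic amounts to something like this (hypothetical names,
not the real queue code):

    import Data.Time.Clock.POSIX (POSIXTime)

    -- Minimal queue state; the real queue also carries the accumulated
    -- git commands and FlushActions.
    data QueueState = QueueState
        { queueSize   :: Int
        , queueLimit  :: Int        -- eg 10240 by default
        , lastChanged :: POSIXTime
        }

    flushInterval :: POSIXTime
    flushInterval = 300  -- 5 minutes

    -- Flush when full, or when the last change was queued over 5 minutes ago.
    shouldFlush :: POSIXTime -> QueueState -> Bool
    shouldFlush now q =
        queueSize q >= queueLimit q
            || (queueSize q > 0 && now - lastChanged q >= flushInterval)

    -- Adding an item bumps lastChanged even when not flushing, so a steady
    -- stream of quick adds does not get flushed every 5 minutes.
    addItem :: POSIXTime -> QueueState -> QueueState
    addItem now q = q { queueSize = queueSize q + 1, lastChanged = now }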
But, when it takes more than 5 minutes to add a file, the overhead of
updating the index immediately is probably small, so do it after each
file. This avoids git-annex potentially taking a very very long time
indeed to stage newly added files, which can be annoying to the user who
would like to get on with doing something with the files it's already
added, eg using git mv to rename them to a better name.
This is only likely to cause a problem if it takes, say, 30 seconds to
update the index; doing an extra 30 seconds of work after every 5-minute
file add would be less optimal. Normally, updating the index takes
significantly less time than that. On a SSD with 100k files it takes
less than 1 second, and the index write time is bound by disk read and
write so is not too much worse on a hard drive. So I hope this will not
impact users, although if it does turn out to, the time limit could be
made configurable.
A perhaps better way to do it would be to have a background worker
thread that wakes up every 60 seconds or so and flushes the queue.
That is made somewhat difficult because the queue can contain Annex
actions and so this would add a new source of concurrency issues.
So I'm trying to avoid that approach if possible.
Sponsored-by: Erik Bjäreholt on Patreon
Was "failed to generate a key" when key generation did not fail
(it never does anymore) but the actual problem was it failed to stat
the source file, perhaps due to it being deleted while the key was being
generated.
A user reported this, in a comment I followed up on in
262400fe04, although I don't know
what they did to trigger the error message.
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.
The LockPool now has two distinct types of resources: one is
per-LockHandle and is used for file Handles, which get closed when the
associated LockHandle is closed. The other is per lock file, and gets
closed when no more LockHandles use that lock file, including other
shared locks of the same file.
That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.
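
The per-lock-file side is essentially reference counting, roughly like
this (illustrative types, not the real LockPool):

    import Control.Concurrent.STM
    import qualified Data.Map.Strict as M

    -- A per-lock-file resource (eg the open pid lock file) that is closed
    -- only when the last LockHandle using it goes away.
    newtype SharedResources = SharedResources
        (TVar (M.Map FilePath (Int, IO ())))  -- user count, close action

    -- Note another user of the lock file's shared resource; the close
    -- action registered by the first user is the one that is kept.
    retainResource :: SharedResources -> FilePath -> IO () -> IO ()
    retainResource (SharedResources tv) f close = atomically $
        modifyTVar' tv $ M.insertWith (\_ (n, c) -> (n + 1, c)) f (1, close)

    -- Drop a user; run the close action only when no users remain.
    releaseResource :: SharedResources -> FilePath -> IO ()
    releaseResource (SharedResources tv) f = do
        mclose <- atomically $ do
            m <- readTVar tv
            case M.lookup f m of
                Just (1, close) -> do
                    writeTVar tv (M.delete f m)
                    return (Just close)
                Just (n, close) -> do
                    writeTVar tv (M.insert f (n - 1, close) m)
                    return Nothing
                Nothing -> return Nothing
        maybe (return ()) id mclose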
In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file, while with -J5 it will open
the pidlock file, process a number of files until all the threads happen
to finish together (at which point the pidlock file gets closed), and
then that repeats. So in either case, another process still gets a chance
to take the pidlock.
registerPostRelease has a rather intricate dance: there are fine-grained
STM locks, an STM lock of the pidfile itself, and the actual pidlock file
on disk, all of which it resolves in stages.
Sponsored-by: Dartmouth College's Datalad project
This locking has been missing from the beginning of annex.pidlock.
It used to be possible, when two threads are doing conflicting things,
for both to run at the same time despite using locking. Seems likely
that nothing actually had a problem, but it was possible, and this
eliminates that possible source of failure.
Sponsored-by: Dartmouth College's Datalad project
This version of git -- or its new default "ort" resolver -- handles such
a conflict by staging two files, one with the original name and the other
named file~ref. Use unmergedSiblingFile when the latter is detected.
(It doesn't do that when the conflict is between a directory and a file
or symlink though, so see previous commit for how that case is handled.)
The sibling file has to be deleted separately, because cleanConflictCruft
may not delete it -- that only handles files that are annex links,
but the sibling file may be the non-annexed file side of the conflict.
The grafting code had assumed that, when the other side of a conflict
is a symlink, the file in the work tree will contain the non-annexed
content that we want it to contain. But that is not the case with the new
git; the file may be the annex link and needs to be replaced with the
content, while the annex link will be written as a -variant file.
(The weird doesDirectoryExist check in the grafting code turns out to still be
needed, test suite failed when I tried to remove it.)
Test suite passes with new git with ort resolver default. Have not tried it
with old git or other defaults.
Sponsored-by: Noam Kremen on Patreon
Bugfix: When -J was enabled, getting files leaked an ever-growing number of
git cat-file processes.
(Since commit dd39e9e255)
The leak happened when mergeState called stopNonConcurrentSafeCoProcesses.
While stopNonConcurrentSafeCoProcesses usually manages to stop everything,
there was a race condition where cat-file processes were leaked, because
catFileStop modifies Annex.catfilehandles in a non-concurrency-safe way,
and could clobber modifications made in between. That should have been ok,
since originally catFileStop was only used at shutdown.
Note the comment on catFileStop saying it should only be used when nothing
else is using the handles. It would be possible to make catFileStop
race-safe, but it should just not be used in a situation where a race is
possible. So I didn't bother.
Instead, the fix is just not to stop any processes in mergeState. Because
in order for mergeState to be called, dupState must have been run, and it
enables concurrency mode, stops any non-concurrent processes, and so all
processes that are running are concurrency safe. So there is no need to
stop them when merging state. Indeed, stopping them would be extra work,
even if there was not this bug.
Sponsored-by: Dartmouth College's Datalad project
When non-concurrent git coprocesses have been started, setConcurrency
used to not stop them, and so could leak processes when enabling
concurrency, eg when forkState is called.
I do not think that ever actually happened, given where setConcurrency
is called. And it probably would only leak one of each process, since it
never downgrades from concurrent to non-concurrent.
Based on my earlier benchmark, I have a rough cost model for how
expensive it is for git-annex smudge to be run on a file, vs
how expensive it is for a gigabyte of a file's content to be read and
piped through to filter-process.
So, using that cost model, it can decide if using filter-process will
be more or less expensive than running the smudge filter on the files to
be restaged.
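
The decision boils down to something like this (the constants here are
made up; the real numbers come from the benchmark):

    costOfSmudgePerFile :: Double
    costOfSmudgePerFile = 0.005      -- hypothetical seconds per smudge run

    costOfFilterProcessPerGB :: Double
    costOfFilterProcessPerGB = 1.0   -- hypothetical seconds per gigabyte piped

    -- True when restaging via filter-process is estimated to be cheaper
    -- than running the smudge filter once per file.
    useFilterProcess
        :: Int      -- number of files to restage
        -> Integer  -- total size of those files, in bytes
        -> Bool
    useFilterProcess numfiles totalsize =
        smudgecost > filterprocesscost
      where
        smudgecost = fromIntegral numfiles * costOfSmudgePerFile
        filterprocesscost =
            fromIntegral totalsize / 1e9 * costOfFilterProcessPerGB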
It turned out to be *really* annoying to temporarily disable
filter-process. I did find a way, but urk, this is horrible. Notice
that, if it's interrupted with it disabled, it will remain disabled
until the next time restagePointerFile runs. Which could be some time
later. If the user runs `git add` or `git checkout` on a lot of small
files before that, they will see slower than expected performance.
(This commit also deletes where I wrote down the benchmark results
earlier.)
Sponsored-by: Noam Kremen on Patreon
This reverts commit afe327ac49.
Unfortunately, disabling it by setting it to "" does not work, git
then ignores filter.annex.smudge/clean, and does not pass files through
git-annex at all.
I don't think there is a way to temporarily disable this git config
from the git command line. Which seems like a bug in git.
So, it may be more expensive than anticipated to enable
filter.annex.process, since git checkout etc will pipe all annexed files
being checked out through it.
This means git will run git-annex smudge --clean once per file that is
restaged, which can be slow. But probably *not* as slow as git feeding
all the content of annexed files you've gotten through a pipe to
git-annex filter-process.
The only time this is probably not ideal is after a drop of a bunch of
files, when filter-process would be faster.