complication

Joey Hess 2022-01-12 13:19:06 -04:00
parent d427afb347
commit 63851bfec4


@ -0,0 +1,41 @@
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2022-01-12T16:29:40Z"
content="""
While this is mostly implemented in the branch, the upgrade to it has a
serious danger point.

If a long-running git-annex process, eg a large drop, is running during the
upgrade, then it will keep on using the current locking method. Meanwhile,
other processes run after the upgrade will use the new locking method. So
this could cause data loss: Old git-annex locks a file to drop it, and at
the same time new git-annex-shell is used to lock the same file, to prevent
it from being dropped. If neither locking method notices the other's lock,
both succeed, and the file gets dropped despite the lock that was supposed
to protect it.
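
A minimal sketch of that failure mode, in Python rather than git-annex's Haskell. The file names, and the idea that the new method locks a separate `.lck` side file, are assumptions made up for the illustration and not a description of what the branch does. The point is only that two locks taken on different files never conflict:

```python
import fcntl
import os

def try_lock(path, mode):
    # Take a non-blocking flock() on path. Returns the open fd, or None
    # if someone else already holds an incompatible lock.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd

def race(content_file):
    # Hypothetical old method: exclusive lock on the content file itself,
    # taken in order to drop it.
    dropper = try_lock(content_file, fcntl.LOCK_EX)
    # Hypothetical new method: shared lock on a side file, taken in order
    # to prevent the drop.
    keeper = try_lock(content_file + ".lck", fcntl.LOCK_SH)
    both_succeeded = dropper is not None and keeper is not None
    for fd in (dropper, keeper):
        if fd is not None:
            os.close(fd)
    return both_succeeded
```

Here `race()` reports that both locks were granted, which is exactly the dangerous case. Had both sides locked the same file, the second `try_lock` would have returned `None`.
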

Avoiding that seems to require a way to make sure there are no running
git-annex processes when performing a repository upgrade. (Which would be
nice in general but has somehow not been necessary until now.)

One way to do it would be to have a lock file that all git-annex
processes hold a shared lock on while they are running. The upgrade would
then take an exclusive lock.

That locking would need to be implemented first, and it would somehow need
to be known that every running git-annex process is using it, before
performing the repository upgrade. Ideally without taking years in between
to wait for all git-annex binaries to be upgraded.
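
The shared/exclusive idea could look roughly like this. It is a sketch in Python using `flock()`, not git-annex code; the function names and the lock file path are hypothetical, and it assumes a local Unix filesystem where `flock()` behaves sanely:

```python
import fcntl
import os

def hold_shared(lock_path):
    # Every ordinary process would call this at startup and keep the
    # returned fd open until it exits. Any number of processes can hold
    # the shared lock at once.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_SH)
    return fd

def try_exclusive(lock_path):
    # The upgrade would call this. It only succeeds when nobody holds
    # the shared lock, ie when no other process is running.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd
```

While any fd from `hold_shared` is open, `try_exclusive` returns `None`. Once they are all closed it returns an fd, and the upgrade could go ahead while holding it. None of this helps with processes that predate the lock file, which is the problem described above.
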

I suppose that git-annex v9 could ship with that added lock file, and not
upgrade the repository to v9 immediately. Instead, stat the lock file, and
only when its ctime is sufficiently old that it seems safe to assume any
running git-annex process would be using it, do the repository upgrade. Eg
after 3 months or so, or perhaps when the ctime is older than the last
reboot. But this would not avoid problems if an older git-annex
version was also used in the same repository as the new version.
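
The ctime check could be sketched like this, again in Python and with made-up names. The 90 day threshold is just one reading of "3 months or so", and `linux_boot_time` assumes a Linux `/proc/uptime`:

```python
import os
import time

THREE_MONTHS = 90 * 24 * 60 * 60  # "3 months or so", in seconds

def safe_to_upgrade(lock_ctime, now, boot_time=None, min_age=THREE_MONTHS):
    # Old enough that any process still running presumably started
    # after the lock file was introduced.
    if now - lock_ctime >= min_age:
        return True
    # Or the machine has rebooted since the lock file was created, so
    # no process from before it can still be running on this host.
    if boot_time is not None and lock_ctime < boot_time:
        return True
    return False

def lock_file_ctime(path):
    # ctime of the lock file, in seconds since the epoch.
    return os.stat(path).st_ctime

def linux_boot_time():
    # Linux only: the first field of /proc/uptime is seconds since boot.
    with open("/proc/uptime") as f:
        return time.time() - float(f.read().split()[0])
```

An upgrade would then call something like `safe_to_upgrade(lock_file_ctime(path), time.time(), linux_boot_time())`. As noted above, a passing check says nothing about an older git-annex binary that is still being started in the same repository.
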

Or git-annex upgrade could hunt for other running git-annex processes that
are using the repository and refuse to perform the v9 upgrade when any are
found. But that is hard because a process's cwd is not necessarily inside
the repository it's using (eg a remote). It would have to look for git child
processes of git-annex processes that are using the repository, such as
git cat-file.
Also, network filesystems would be a problem, since processes on other
machines cannot be seen at all.
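
To show why the cwd approach falls short, here is a naive version of the hunt, sketched in Python. It is Linux-only (it reads `/proc`), all names are hypothetical, and it does not attempt the git child process tracking:

```python
import os

def is_inside(path, repo):
    # True when path is the repository directory or somewhere below it.
    repo = os.path.realpath(repo)
    path = os.path.realpath(path)
    return os.path.commonpath([path, repo]) == repo

def find_candidates(procs, repo):
    # procs is an iterable of (pid, argv, cwd) tuples. Returns the pids
    # of processes that look like git-annex and whose cwd is in the repo.
    return [pid for pid, argv, cwd in procs
            if argv
            and os.path.basename(argv[0]).startswith("git-annex")
            and is_inside(cwd, repo)]

def scan_proc():
    # Yield (pid, argv, cwd) for every process we are allowed to inspect.
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            with open(f"/proc/{name}/cmdline", "rb") as f:
                raw = f.read()
            cwd = os.readlink(f"/proc/{name}/cwd")
        except OSError:
            continue  # process exited, or belongs to another user
        argv = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        yield int(name), argv, cwd
```

`find_candidates(scan_proc(), repo)` finds a git-annex that was started inside the repository, but it misses one whose cwd is elsewhere, such as a git-annex-shell serving the repository as a remote. It also sees nothing running on another host.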
"""]]