f04d9574d6
While redundant concurrent transfers were already prevented in most cases, it failed to prevent the case where two different repositories were sending the same content to the same repository. By removing the uuid from the transfer lock file for Download transfers, one repository sending content will block the other one from also sending the same content. In order to interoperate with old git-annex, the old lock file is still locked, as well as locking the new one. That added a lot of extra code and work, and the plan is to eventually stop locking the old lock file, at some point in time when an old git-annex process is unlikely to be running at the same time. Note that in the case of 2 repositories both doing eg `git-annex copy foo --to origin` the output is not that great: copy b (to origin...) transfer already in progress, or unable to take transfer lock git-annex: transfer already in progress, or unable to take transfer lock 97% 966.81 MiB 534 GiB/s 0sp2pstdio: 1 failed Lost connection (fd:14: hPutBuf: resource vanished (Broken pipe)) Transfer failed Perhaps that output could be cleaned up? Anyway, it's a lot better than letting the redundant transfer happen and then failing with an obscure error about a temp file, which is what it did before. And it seems users don't often try to do this, since nobody ever reported this bug to me before. (The "97%" there is actually how far along the *other* transfer is.) Sponsored-by: Joshua Antonishen on Patreon
56 lines
2.3 KiB
Markdown
56 lines
2.3 KiB
Markdown
Copying the same file from 2 repositories into another repository at the
|
|
same time, the redundant transfer is not prevented.
|
|
|
|
This was surprising to me! I would have expected it to prevent starting an
|
|
upload when another transfer of that key is already running. The same as
|
|
it does when starting 2 copies of the same file to the same remote.
|
|
|
|
I'm easily able to reproduce this in a bench test with 2 clones of a repo.
|
|
|
|
git init a
|
|
cd a
|
|
git-annex init
|
|
dd if=/dev/urandom of=big bs=1M count=1000
|
|
git-annex add
|
|
git commit -m add
|
|
cd ..
|
|
git clone a b
|
|
git clone a c
|
|
cd b; git-annex get
|
|
cd ../c; git-annex move --from orgin
|
|
(cd b; git-annex copy --to origin) &
|
|
(cd c; git-annex copy --to origin) &
|
|
|
|
Also same happens when running `git-annex get --from` two different remotes
|
|
concurrently in the same repo.
|
|
|
|
Aha... Looking at the code, this seems like a fundamental oversight.
|
|
The `transferFile` depends on the uuid of the remote being transfered
|
|
to/from, so there are two different ones in this case. And the transfer
|
|
lock file that is checked derives from those files, so there are two
|
|
seperate lock files.
|
|
|
|
That is actually what we want when eg, uploading content from the same
|
|
repo into two different repos. It would not do for an upload to one repo
|
|
to block uploading to another repo. So a per-uuid lock file makes sense for
|
|
uploads. But not for downloads.
|
|
|
|
It seems that it does make sense to have different transfer information
|
|
files for downloads from different uuids. Because the filename is parsed
|
|
to determine the uuid.
|
|
|
|
Perhaps the fix is just for `transferLockFile` to take a Transfer parameter,
|
|
and return the same lock file for all Downloads of a key, no matter the
|
|
uuid.
|
|
|
|
There is the issue that renaming the lock file would break interoperability
|
|
with old git-annex. In this case, since this bug prevents noticing multiple
|
|
downloads from different uuids, interoperability would only prevent
|
|
noticing multiple downloads from the same uuid. Which is not a great
|
|
behavior to break either, even if it would usually only break transiently.
|
|
|
|
Of course that could be avoided by keeping the current lock file, and
|
|
adding a second lock file. [[done]] this, with plans in
|
|
[[todo/v11_changes]] to transition to use only the new lock file in
|
|
the future.
|
|
--[[Joey]]
|