Commit graph

74 commits

Author SHA1 Message Date
Joey Hess
51c696679f
avoid using temp file size when deciding whether to retry failed transfer
When stall detection is enabled, and a transfer is in progress,
it would display a doubled message:

(transfer already in progress, or unable to take transfer lock) (transfer already in progress, or unable to take transfer lock)

That happened because the forward retry decider had a start size of 0,
and an end size of whatever amount of the object the other process had
downloaded. So it incorrectly thought that the transferrer process had
made progress, when it had in fact immediately given up with that
message.

Instead, use the reported value from the progress meter. If a remote
does not report progress, this will mean it doesn't forward retry, in a
situation where it used to. But most remotes do report progress, and any
remote that does not can be fixed to, by using watchFileSize when
downloading. Also, some remotes might preallocate the temp file (eg
bittorrent), so relying on statting its size at this level to get
progress is dubious.

The same change was made to Annex/Transfer.hs, although only
Annex/TransferrerPool.hs needed to be changed to avoid the duplicate
message.

(An alternate fix would have been to start the retry decider with the
size of the object file before downloading begins, rather than 0.)

Sponsored-by: Brett Eisenberg on Patreon
2021-06-25 12:04:23 -04:00
Joey Hess
c2f612292a
start splitting out readonly values from AnnexState
Values in AnnexRead can be read more efficiently, without MVar overhead.
Only a few things have been moved into there, and the performance
increase so far is not likely to be noticable.

This is groundwork for putting more stuff in there, particularly a value
that indicates if debugging is enabled.

The obvious next step is to change option parsing to not run in the
Annex monad to set values in AnnexState, and instead return a pure value
that gets stored in AnnexRead.
2021-04-02 15:51:44 -04:00
Joey Hess
dd39e9e255
suggest when user may want annex.stalldetection
When annex.stalldetection is not enabled, and a likely stall is detected,
display a suggestion to enable it.

Note that the progress meter display is not taken down when displaying
the message, so it will display like this:

	0%    8 B                 0 B/s
	  Transfer seems to have stalled. To handle stalling transfers, configure annex.stalldetection
	0%    10 B                0 B/s

Although of course if it's really stalled, it will never update
again after the message. Taking down the progress meter and starting
a new one doesn't seem too necessary given how unusual this is,
also this does help show the state it was at when it stalled.

Use of uninterruptibleCancel here is ok, the thread it's canceling
only does STM transactions and sleeps. The annex thread that gets
forked off is separate to avoid it being canceled, so that it
can be joined back at the end.

A module cycle required moving from dupState the precaching of the
remote list. Doing it at startConcurrency should cover all the cases
where the remote list is used in concurrent actions.

This commit was sponsored by Kevin Mueller on Patreon.
2021-02-03 15:57:19 -04:00
Joey Hess
a422a056f2
make getViaTmpFrom no longer update location log
All callers adjusted to update it themselves.

In Command.ReKey, and Command.SetKey, the cleanup action already did,
so it was updating the log twice before.

This fixes a bug when annex.stalldetection is set, as now
Command.Transferrer can skip updating the location log, and let it be
updated by the calling process.
2020-12-11 11:50:13 -04:00
Joey Hess
04c12aa6df
custom protocol for transferrer
Rather than using Read/Show, which would force me to preserve data types
into the future.

I considered just deriving json and sending that, but I don't much like
deriving json with data types that have named constructors (like Key
does) because again it locks in data type details.

So instead, used SimpleProtocol, with a fairly complex and unreadable
protocol. But it is as efficient as the p2p protocol at least, and as
future proof.

(Writing my own custom json instances would have worked but I thought
of it too late and don't want to do all the work twice. The only real
benefit might be that aeson could be faster.)

Note that, when a new protocol request type is added later, git-annex
trying to use it will cause the git-annex transferrer to display a
protocol error message. That seems ok; it would only happen if a new
git-annex found an old version of itself in PATH or the program
file. So it's unlikely, and all it can do anyway is display an error.
(The error message could perhaps be improved..)

This commit was sponsored by Jack Hill on Patreon.
2020-12-09 16:13:59 -04:00
Joey Hess
004a4f5fb1
factor out Types.Transferrer 2020-12-09 13:28:49 -04:00
Joey Hess
41f2c308ff
stall detection is working
New config annex.stalldetection, remote.name.annex-stalldetection, which
can be used to deal with remotes that stall during transfers, or are
sometimes too slow to want to use.

This commit was sponsored by Luke Shumaker on Patreon.
2020-12-08 15:22:18 -04:00
Joey Hess
fcc9e01556
finally using transferkeys
Seems to work! Even progress bars. Have not tested prompting or various
error message displays yet.

transferkeys had to be made to operate in different modes for the
Assistant and Annex monads. A bit ugly, but it did relegate that
really ugly Database.Keys.closeDb in transferkeys to only the assistant
code path.

This commit was sponsored by Noam Kremen.
2020-12-07 16:18:26 -04:00
Joey Hess
4c47568876
refactoring
This is groundwork for using git-annex transferkeys to run transfers,
in order to allow stalled transfers to be interrupted and retried.

The new upload and download are closer to what git-annex transferkeys
does, so the plan is to make them use it.

Then things that were left using upload' and download' won't recover
from stalls. Notably, that includes import and export. But
at least get/move/copy will be able to. (Also the assistant hopefully,
but not yet.)

This commit was sponsored by Jake Vosloo on Patreon.
2020-12-07 14:49:17 -04:00
Joey Hess
9b0dde834e
convert getFileSize to RawFilePath
Lots of nice wins from this in avoiding unncessary work, and I think
nothing got slower.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-11-05 11:32:57 -04:00
Joey Hess
681b44236a
more RawFilePath conversion
at 377/645

This commit was sponsored by Svenne Krap on Patreon.
2020-10-29 14:20:57 -04:00
Joey Hess
4c32499e82
Parse youtube-dl progress output
Which lets progress be displayed when doing concurrent downloads.
Amoung other things, like --json-progress etc.

The youtube-dl output is no longer displayed, except for any errors.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-09-29 17:53:48 -04:00
Joey Hess
77c42782d0
differentiate between concurrency enabled at command line and by git config
The latter should not affect --batch mode.
2020-09-16 11:47:12 -04:00
Joey Hess
e36bae74da
Exposed annex.forward-retry git config
One reason is, 5 is an arbitrary number so ought to be configurable.

The real reason though, is I wanted to make the man page explain when
forward retry can override annex.retry, and having a config made the
man page easier to write.
2020-09-04 15:16:40 -04:00
Joey Hess
1a42b2c5a3
combine retry deciders in better way
This fixes the problem that, if forwardRetry was checked for the first 5
and decided to retry, the 6th would go to configuredRetry which would
see the counter was 6 and so wait retry-delay*2^5 seconds (default 32).

Now, it waits for retry-delay before each retry, even when forwardRetry
initiated the retry.
2020-09-04 12:48:30 -04:00
Joey Hess
1d244bafbd
Limit retrying of failed transfers when forward progress is being made to 5
To avoid some unusual edge cases where too much retrying could result in
far more data transfer than makes sense.
2020-09-04 12:46:37 -04:00
Joey Hess
f75be32166
external backends wip
It's able to start them up, the only thing not implemented is generating
and verifying keys. And, the key translation for HasExt.
2020-07-29 15:23:18 -04:00
Joey Hess
172743728e
move cryptographicallySecure into Backend type
This is groundwork for external backends, but also makes sense to keep
this information with the rest of a Backend's implementation.

Also, removed isVerifiable. I noticed that the same information is
encoded by whether a Backend implements verifyKeyContent or not.
2020-07-20 12:17:42 -04:00
Joey Hess
fe9cf1256e
move remoteList into dupState
This does mean that RemoteDaemon.Transport.Tor's call runs it, otherwise
no change, but this is groundwork for doing more such expensive actions
in dupState.
2020-04-17 14:36:45 -04:00
Joey Hess
81d402216d cache the serialization of a Key
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.

Previously attempted in 4536c93bb2
and reverted in 96aba8eff7.
The problems mentioned in the latter commit are addressed now:

Read/Show of KeyData is backwards-compatible with Read/Show of Key from before
this change, so Types.Distribution will keep working.

The Eq instance is fixed.

Also, Key has smart constructors, avoiding needing to remember to update
the cached serialization.

Used git-annex benchmark:
  find is 7% faster
  whereis is 3% faster
  get when all files are already present is 5% faster
Generally, the benchmarks are running 0.1 seconds faster per 2000 files,
on a ram disk in my laptop.
2019-11-22 17:49:16 -04:00
Joey Hess
9d36c826c0
use fine-grained WorkerStages when transferring and verifying
This means that Command.Move and Command.Get don't need to
manually set the stage, and is a lot cleaner conceptually.

Also, this makes Command.Sync.syncFile use the worker pool better.
In the scenario where it first downloads content and then uploads it to
some other remotes, it will start in TransferStage, then enter VerifyStage
and then go back to TransferStage for each transfer to the remotes.
Before, it entered CleanupStage after the download, and stayed in it for
the upload, so too many transfer jobs could run at the same time.

Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent
inside onLocal. That has a Annex state for the remote, with no worker pool.
So the resulting calls to enteringStage won't block in there.

While Remote.Git.copyToRemote does do checksum verification, I
realized that should not use a verification slot in the WorkerPool
to do it. Because, it's reading back from eg, a removable disk to checksum.
That will contend with other writes to that disk. It's best to treat
that checksum verification as just part of the transer. So, removed the todo
item about that, as there's nothing needing to be done.
2019-06-19 13:24:20 -04:00
Joey Hess
82186ca58f
annex.jobs=cpus etc
Added the ability to run one job per CPU (core), by setting annex.jobs=cpus,
or using option --jobs=cpus or -Jcpus.

Built with future expansion in mind, including not defaulting matching on
Concurrency so more constructors can later be added, and using "cpu"
instead of "0".
2019-05-10 13:27:08 -04:00
Joey Hess
40ecf58d4b
update licenses from GPL to AGPL
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.

Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.

(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
2019-03-13 15:48:14 -04:00
Joey Hess
727767e1e2
make everything build again after ByteString Key changes 2019-01-11 16:39:46 -04:00
Joey Hess
9127fe4821
add DebugLocks build flag
Using the method described in
https://www.fpcomplete.com/blog/2018/05/pinpointing-deadlocks-in-haskell
but my own code to implement it, and with callstacks added.

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
2018-11-19 15:02:43 -04:00
Joey Hess
983c9d5a53
git-annex-shell: fix transfer hang
Fix hang when transferring the same objects to two different clients at the
same time. (Or when annex.pidlock is used, two different objects to the
same or different clients.)

Could also potentially occur if a client was downloading an object and
somehow lost connection but that git-annex-shell was still running and
holding the transfer lock.

This does not guarantee that, if `transfer` fails for some other reason,
a DATA response will be made.

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
2018-11-06 13:00:37 -04:00
Joey Hess
1a02fc1159
Fix wrong sorting of remotes when using -J
It was sorting by uuid, rather than cost!

Avoid future bugs of this kind by changing the Ord to primarily compare
by cost, with uuid only used when the cost is the same.

This commit was supported by the NSF-funded DataLad project.
2018-08-03 13:10:50 -04:00
Joey Hess
db720f6a9c
Display error message when http download fails.
* Display error message when http download fails.

  There's nothing in the http-client library to nicely format a http
  exception, so in some cases it has to fall back to using show on it.
  Seems better than just saying "it failed" or only showing the http
  status code.

* Avoid forward retry when 0 bytes were received.

  forwardRetry was comparing Nothing to Just 0, and so thought there had
  been progress made when 0 bytes were received.

This commit was supported by the NSF-funded DataLad project.
2018-05-08 16:11:45 -04:00
Joey Hess
9ec1d6b077
add units 2018-03-29 13:31:53 -04:00
Joey Hess
961fa377d9
Also do forward retrying in cases where no exception is thrown, but the transfer failed.
I think this used to be the case, but it was accidentially lost way back in
commit 3887432c54. Normally, transfers do not
throw exceptions, so probably forward retrying was rarely done due to that
oversight.

This also affects the new annex.retry etc configuration. If a transfer
fails, without making any progress, eg because the file is not present on
the remote or the remote is not accessible, it will now retry when
configuration calls for it. In some cases such a retry is not desirable,
for example the remote could be accessible and not have a copy of the file
that the local repo thinks it has. I see no way to distinguish such cases
from cases where a retry should really be done. So, it'll be up to the user
to configure it to work for them.
2018-03-29 13:22:49 -04:00
Joey Hess
46d4316954
implement annex.retry et al
Added annex.retry, annex.retry-delay, and per-remote versions to configure
transfer retries.

This commit was supported by the NSF-funded DataLad project.
2018-03-29 13:04:07 -04:00
Joey Hess
ed81762c86
avoid compiler warning
add type sig so it's clear createtfile returns unit
2018-03-15 13:21:32 -04:00
Joey Hess
10d3b7fc62
Fix reversion introduced in 6.20171214 that caused concurrent transfers to incorrectly fail with "transfer already in progress".
Avoid creating transfer info file before transfer lock is created and
locked.

The wrong order for one thing caused transfer info to be overwritten
when a transfer was already in progress.

But worse, it caused checkTransfer to see the transfer info,
and so lock the transfer lock in order to verify the transfer was not in
progress. Which in a concurrent situation, prevented the transferrer
from locking the transfer lock, so it failed with "transfer already in
progress".

Note that the transferinfo command does not lock the transfer lock
before creating the transfer info. But, that's only run after
recvkey is running, and recvkey does lock the transfer lock, so that
seems more or less ok. (Other than being a super complicated legacy mess
that the P2P code has mostly obsoleted now.)

This commit was supported by the NSF-funded DataLad project.
2018-03-14 18:55:34 -04:00
Joey Hess
24df95f0f6
Fix several places where files in .git/annex/ were written with modes that did not take the core.sharedRepository config into account.
git grep writeFile finds some more that might also be problems, but
for now I've concentrated on .git/annex/ log files. There are certianly
cases where writeFile is not a problem too.

This commit was sponsored by mo on Patreon.
2018-01-02 17:25:25 -04:00
Joey Hess
fc845e6530
more lambda-case conversion 2017-12-05 15:00:50 -04:00
Joey Hess
99bebdface
youtube-dl working
Including resuming and cleanup of incomplete downloads.

Still todo: --fast, --relaxed, importfeed, disk reserve checking,
quvi code cleanup.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-11-29 16:40:32 -04:00
Joey Hess
e1ac299ad0
better dup key with -J fix
This avoids all the complication about redundant work discussed in
the previous try at fixing this. At the expense of needing each command
that could have the problem to be patched to simply wrap the action in
onlyActionOn once the key is known. But there do not seem to be many
such commands.

onlyActionOn' should not be used with a CommandStart (or CommandPerform),
although the types do allow it. onlyActionOn handles running the whole
CommandStart chain. I couldn't immediately see a way to avoid mistken
use of onlyActionOn'.

This commit was supported by the NSF-funded DataLad project.
2017-10-17 18:48:53 -04:00
Joey Hess
68a49adcda
Improve behavior when -J transfers multiple files that point to the same key
After a false start, I found a fairly non-intrusive way to deal with it.
Although it only handles transfers -- there may be issues with eg
concurrent dropping of the same key, or other operations.

There is no added overhead when -J is not used, other than an added
inAnnex check. When -J is used, it has to maintain and check a small
Set, which should be negligible overhead.

It could output some message saying that the transfer is being done by
another thread. Or it could even display the same progress info for both
files that are being downloaded since they have the same content. But I
opted to keep it simple, since this is rather an edge case, so it just
doesn't say anything about the transfer of the file until the other
thread finishes.

Since the deferred transfer action still runs, actions that do more than
transfer content will still get a chance to do their other work. (An
example of something that needs to do such other work is P2P.Annex,
where the download always needs to receive the content from the peer.)
And, if the first thread fails to complete a transfer, the second thread
can resume it.

But, this unfortunately means that there's a risk of redundant work
being done to transfer a key that just got transferred.
That's not ideal, but should never cause breakage; the same
thing can occur when running two separate git-annex processes.

The get/move/copy/mirror --from commands had extra inAnnex checks added,
inside the download actions. Without those checks, the first thread
downloaded the content, and then the second thread woke up and
downloaded the same content redundantly.

move/copy/mirror --to is left doing redundant uploads for now. It
would need a second checkPresent of the remote inside the upload
to avoid them, which would be expensive. A better way to avoid
redundant work needs to be found..

This commit was supported by the NSF-funded DataLad project.
2017-10-17 17:10:50 -04:00
Joey Hess
7db37ddde0
Fix transfer log file locking problem when running concurrent transfers.
orElse is great, but was not the right thing to use here because
waitTakeLock could retry for other reasons than the lock being held,
which made tryTakeLock fail when it shouldn't.

Instead, move the code to tryTakeLock and implement waitTakeLock using
tryTakeLock and retry.

(Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a
lock handle.)

This commit was supported by the NSF-funded DataLad project.
2017-05-25 17:40:23 -04:00
Joey Hess
c8e1e3dada
AssociatedFile newtype
To prevent any further mistakes like 301aff34c4

This commit was sponsored by Francois Marier on Patreon.
2017-03-10 13:35:31 -04:00
Joey Hess
0534152685
get -J: Improve distribution of jobs amoung remotes when there are more jobs than remotes.
It was distributing jobs to remotes that were not being used by any other
job. But, suppose that there are only 2 remotes, and -J10. In such a case,
the first 2 downloads would be distributed amoung the 2 remotes, but
the other 8 would all go to remote #1. Improved by keeping a counter
of how many jobs are assigned to a remote, and prefer remotes with fewer
jobs.

Note use of Data.Map.Strict to avoid blowing up space. I kept the
bang-patterns as-is, although probably not needed with Data.Map.Strict.

This commit was sponsored by Jack Hill on Patreon.
2017-03-08 14:49:30 -04:00
Joey Hess
c33363dfa7
early cancelation of transfer that annex.securehashesonly prohibits
This avoids sending all the data to a remote, only to have it reject it
because it has annex.securehashesonly set. It assumes that local and
remote will have the same annex.securehashesonly setting in most cases.
If a remote does not have that set, and local does, the remote won't get
some content it would otherwise accept.

Also avoids downloading data that will not be added to the local object
store due to annex.securehashesonly.

Note that, while encrypted special remotes use a GPGHMAC key variety,
which is not collisiton resistent, Transfers are not used for such
keys, so this check is avoided. Which is what we want, so encrypted
special remotes still work.

This commit was sponsored by Ewen McNeill.
2017-02-27 15:21:24 -04:00
Joey Hess
38516b2fca
update progress logs in remotedaemon send/receive 2016-12-08 19:56:02 -04:00
Joey Hess
8ef494a833
disentangle concurrency and message type
This makes -Jn work with --json and --quiet, where before
setting -Jn disabled those options.

Concurrent json output is currently a mess though since threads output
chunks over top of one-another.
2016-09-09 12:57:42 -04:00
Joey Hess
31289da691
get -J: Download different files from different remotes when the remotes have the same costs.
Only done in -J mode because only if there's concurrency can downloading
from two remotes be faster. Without concurrency, it's likely the case that
sequential downloads from the same remote are faster than switching back
and forth between two remotes.

There is some hairy MVar code here, but basically it just keeps
the activeremotes MVar full except when deciding which remote to assign
to a thread.

Also affects gets by sync --content -J

This commit was sponsored by Jochen Bartl.
2016-09-06 12:45:21 -04:00
Joey Hess
10ddf2c3bd
remove TransferObserver
unused after last commit
2016-08-03 13:46:20 -04:00
Joey Hess
f461bcae4b
Re-enable accumulating transfer failure log files for command-line actions
This was disabled in commit 61ccf95004,
because only the assistant used them, and they were clutter. But, now
--failed also uses them.

Remove the failure log files after successful transfers. Should avoid
most of the clutter problems.

Commit 61ccf95004 mentions a subtle behavior
change, which has now been reverted:

    There is one behavior change from this. If glacier is being used, and a
    manual git annex get --from glacier fails because the file isn't available
    yet, the assistant will no longer later see that failed transfer file and
    retry the get.
2016-08-03 13:41:07 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00
Joey Hess
fbf5045d4f
sync --content: Fix bug that caused transfers of files to be made to a git remote that does not have a UUID. This particularly impacted clones from gcrypt repositories.
Added guard in Annex.Transfer to prevent this problem at a deeper level.

I'm unhappy ith NoUUID, but having Maybe UUID instead wouldn't help either
if nothing checked that there was a UUID. Since there legitimately need to
be Remotes that do not have a UUID, I can't see a way to fix it at the type
level, short making there be two separate types of Remotes.
2016-06-02 13:50:43 -04:00
Joey Hess
540a0343ba
more windows build fix 2016-02-15 15:03:44 -04:00