Commit graph

107 commits

Author SHA1 Message Date
Joey Hess
c31e1be781
convert KeySource to RawFilePath 2020-02-21 10:04:44 -04:00
Joey Hess
40ecf58d4b
update licenses from GPL to AGPL
This does not change the overall license of the git-annex program, which
was already AGPL due to a number of sources files being AGPL already.

Legally speaking, I'm adding a new license under which these files are
now available; I already released their current contents under the GPL
license. Now they're dual licensed GPL and AGPL. However, I intend
for all my future changes to these files to only be released under the
AGPL license, and I won't be tracking the dual licensing status, so I'm
simply changing the license statement to say it's AGPL.

(In some cases, others wrote parts of the code of a file and released it
under the GPL; but in all cases I have contributed a significant portion
of the code in each file and it's that code that is getting the AGPL
license; the GPL license of other contributors allows combining with
AGPL code.)
2019-03-13 15:48:14 -04:00
Joey Hess
e1ac299ad0
better dup key with -J fix
This avoids all the complication about redundant work discussed in
the previous try at fixing this. At the expense of needing each command
that could have the problem to be patched to simply wrap the action in
onlyActionOn once the key is known. But there do not seem to be many
such commands.

onlyActionOn' should not be used with a CommandStart (or CommandPerform),
although the types do allow it. onlyActionOn handles running the whole
CommandStart chain. I couldn't immediately see a way to avoid mistken
use of onlyActionOn'.

This commit was supported by the NSF-funded DataLad project.
2017-10-17 18:48:53 -04:00
Joey Hess
68a49adcda
Improve behavior when -J transfers multiple files that point to the same key
After a false start, I found a fairly non-intrusive way to deal with it.
Although it only handles transfers -- there may be issues with eg
concurrent dropping of the same key, or other operations.

There is no added overhead when -J is not used, other than an added
inAnnex check. When -J is used, it has to maintain and check a small
Set, which should be negligible overhead.

It could output some message saying that the transfer is being done by
another thread. Or it could even display the same progress info for both
files that are being downloaded since they have the same content. But I
opted to keep it simple, since this is rather an edge case, so it just
doesn't say anything about the transfer of the file until the other
thread finishes.

Since the deferred transfer action still runs, actions that do more than
transfer content will still get a chance to do their other work. (An
example of something that needs to do such other work is P2P.Annex,
where the download always needs to receive the content from the peer.)
And, if the first thread fails to complete a transfer, the second thread
can resume it.

But, this unfortunately means that there's a risk of redundant work
being done to transfer a key that just got transferred.
That's not ideal, but should never cause breakage; the same
thing can occur when running two separate git-annex processes.

The get/move/copy/mirror --from commands had extra inAnnex checks added,
inside the download actions. Without those checks, the first thread
downloaded the content, and then the second thread woke up and
downloaded the same content redundantly.

move/copy/mirror --to is left doing redundant uploads for now. It
would need a second checkPresent of the remote inside the upload
to avoid them, which would be expensive. A better way to avoid
redundant work needs to be found..

This commit was supported by the NSF-funded DataLad project.
2017-10-17 17:10:50 -04:00
Joey Hess
46d19648ee
first pass at assistant knowing about export remotes
Split exportRemotes out from syncDataRemotes; the parts of the assistant
that upload keys and drop keys from remotes don't apply to exports,
because those operations are not supported.

Some parts of the assistant and webapp do operate on both
syncDataRemotes and exportRemotes. Particularly when downloading from
either of them. Added a downloadRemotes that combines both.

With this, the assistant should download from exports, but it won't yet
upload changes to them.

This commit was sponsored by Fernando Jimenez on Patreon.
2017-09-20 13:58:27 -04:00
Joey Hess
d58148031b
remove xmpp support
I've long considered the XMPP support in git-annex a wart.
It's nice to remove it.

(This also removes the NetMessager, which was only used for XMPP, and the
daemonstatus's desynced list (likewise).)

Existing XMPP remotes should be ignored by git-annex.

This commit was sponsored by Brock Spratlen on Patreon.
2016-11-14 14:53:08 -04:00
Joey Hess
166d70db77
convert TMVars that are never left empty into TVars
This is probably more efficient, and it avoids mistakenly leaving them
empty.
2016-09-30 19:51:16 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00
Joey Hess
737e45156e
remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Joey Hess
c4152654d2
combine PendingAddChanges for the same file into one
In v6 unlocked mode, this fixes a problem that was making eg,
echo > file cause the assistant to copy the file to the annex object,
instead of hard linking it. That because 2 change events were seen
(one for opening the file and one for closing) and processed together
the file was then locked down twice. Which meant it had mutiple hard links,
and so prevented linkAnnex from hard linking it.

There might be scenarios where multiple events come in, but staggered such
that a file gets locked down repeatedly, and it would still be copied to
the annex object in that case.
2015-12-22 17:52:39 -04:00
Joey Hess
4f60234690
finish v6 support for assistant
Seems to basically work now!
2015-12-22 15:23:27 -04:00
Joey Hess
03667a162a couple of AMP warnings I missed before 2015-05-10 16:51:03 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
7b50b3c057 fix some mixed space+tab indentation
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.

Done as a background task while listening to episode 2 of the Type Theory
podcast.
2014-10-09 15:09:11 -04:00
Joey Hess
b197ec8917 get rid of (completely safe) uses of Char8
Char8 often indicates an encoding bug. It didn't here, but I can avoid it
and not worry about it.
2014-05-27 20:26:10 -04:00
Joey Hess
ac98853f05 add CredPair cache
Note that this does not yet use SecureMem. It would probably make sense for
the Password part of a CredPair to use SecureMem, and making that change
is better than passing in a String and having it converted to SecureMem in
this code.
2014-04-29 18:08:02 -04:00
Joey Hess
db38678595 webapp: Rework xmpp nudge to prompt for either xmpp or a ssh remote be set up.
This commit was sponsored by Nathan Howell.
2014-04-09 16:27:24 -04:00
Joey Hess
33b8cff433 webapp: Show a network signal icon next to ssh remotes that it's currently connected with. 2014-04-09 15:26:41 -04:00
Joey Hess
fac7bca05b assistant: Now detects immediately when other repositories push changes to
a ssh remote, and pulls.

XMPP is no longer needed in this configuration!

Requires the remote server have git-annex-shell with notifychanges support.

(untested)

This commit was sponsored by Geog Wechslberger.
2014-04-08 15:23:50 -04:00
Joey Hess
4e0be2792b remove Read instance for Ref
Removed instance, got it all to build using fromRef. (With a few things
that really need to show something using a ref for debugging stubbed out.)

Then added back Read instance, and made Logs.View use it for serialization.
This changes the view log format.
2014-02-19 01:19:57 -04:00
Joey Hess
3da0064657 assistant unused file handling
Make sanity checker run git annex unused daily, and queue up transfers
of unused files to any remotes that will have them. The transfer retrying
code works for us here, so eg when a backup disk remote is plugged in,
any transfers to it are done. Once the unused files reach a remote,
they'll be removed locally as unwanted.

If the setup does not cause unused files to go to a remote, they'll pile
up, and the sanity checker detects this using some heuristics that are
pretty good -- 1000 unused files, or 10% of disk used by unused files,
or more disk wasted by unused files than is left free. Once it detects
this, it pops up an alert in the webapp, with a button to take action.

TODO: Webapp UI to configure this, and also the ability to launch an
immediate cleanup of all unused files.

This commit was sponsored by Simon Michael.
2014-01-22 22:53:18 -04:00
Joey Hess
5279c4d1df tested transferkeys restarting; fix some bugs 2014-01-06 17:07:08 -04:00
Joey Hess
e5b4d447b6 assistant: Start a new git-annex transferkeys process after a network connection change
So that remotes that use a persistent network connection are restarted.

A remote might keep open a long duration network connection, and could
fail to deal well with losing the connection. This is particularly a
concern now that we have external special reotes. An external
special remote that is implemented naively might open the connection only
when PREPARE is sent, and if it loses connection, throw errors on each
request that is made.

(Note that the ssh connection caching should not have this problem; if the
long-duration ssh process loses connection, the named pipe is disconnected
and the next ssh attempt will reconnect. Also, XMPP already deals with
disconnection robustly in its own way.)

There's no way for git-annex to know if a lost network connection actually
affects a given remote, which might have a transfer in process. It does not
make sense to force kill the transferkeys process every time the NetWatcher
detects a change. (Especially because the NetWatcher sometimes polls 1
change per hour.)

In any case, the NetWatcher only detects connection to a network, not
disconnection. So if a transfer is in progress over the network, and the
network goes down, that will need to time out on its own.

An alternate approch that was considered is to use a separate transferkeys
process for each remote, and detect when a request fails, and assume that
means that process is in a failing state and restart it. The problem with
that approach is that if a resource is not available and a remote fails
every time, it degrades to starting a new transferkeys process for every
file transfer, which is too expensive.

Instead, this commit only handles the network reconnection case, and restarts
transferkeys only once the network has reconnected and another transfer needs
to be made. So, a transferkeys process will be reused for 1 hour, or until the
next network connection.

----

The NotificationBroadcaster was rewritten to use TMVars rather than MSampleVars,
to allow checking without blocking if a notification has been received.

----

This commit was sponsored by Tobias Brunner.
2014-01-06 16:03:39 -04:00
Joey Hess
32acf908bb queue and start download of git-annex from web, using git-annex, when upgrade is started 2013-11-23 17:21:04 -04:00
Joey Hess
183f7355cd global webapp redirects, to finish upgrades
When an automatic upgrade completes, or when the user clicks on the upgrade
button in one webapp, but also has it open in another browser window/tab,
we have a problem: The current web server is going to stop running in
minutes, but there is no way to send a redirect to the web browser to the
new url.

To solve this, used long polling, so the webapp is always listening for
urls it should redirect to. This allows globally redirecting every open
webapp. Works great! Tested with 2 web browsers with 2 tabs each.
May be useful for other purposes later too, dunno.

The overhead is 2 http requests per page load in the webapp. Due to yesod's
speed, this does not seem to noticibly delay it. Only 1 of the requests
could possibly block the page load, the other is async.
2013-11-23 14:47:38 -04:00
Joey Hess
d24f7f94fe better UI flow through upgrade process
Move button to enable automatic upgrades to an alert displayed after
successful upgrade. Unclutters the UI and makes psychological sense.
2013-11-23 13:27:52 -04:00
Joey Hess
6abaf19c41 restart on upgrade is working, including automatic restart
Made alerts be able to have multiple buttons, so the alerts about upgrading
can have a button that enables automatic upgrades.

Implemented automatic upgrading when the program file has changed.

Note that when an automatic upgrade happens, the webapp displays an alert
about it for a few minutes, and then closes. This still needs work.
2013-11-23 00:54:08 -04:00
Joey Hess
b9cdb55e0c assistant restart on upgrade 2013-11-22 23:12:06 -04:00
Joey Hess
e2f17e9da3 upgrade alerts
The webapp will check twice a day, when the network is connected, to see if
it can download a distributon upgrade file. If a newer version is found,
display an upgrade alert.

This will need the autobuilders to set UPGRADE_LOCATION to the url
it can be downloaded from when building git-annex. Only builds with that
set need automatic upgrade alerts.

Currently, the upgrade page just requests the user manually download
and upgrade it. But, all the info is provided to do automated upgrades
in the future.

Note that urls used will need to all be https.

This commit was sponsored by Dirk Kraft.
2013-11-21 17:49:56 -04:00
Joey Hess
13108b7196 assistant: Notice on startup when the index file is corrupt, and auto-repair. 2013-11-13 14:27:17 -04:00
Joey Hess
8820091b4c webapp: remind user when using repositories that lack consistency checks
When starting up the assistant, it'll remind about the current
repository, if it doesn't have checks. And when a removable drive
is plugged in, it will remind if a repository on it lacks checks.

Since that might be annoying, the reminders can be turned off.

This commit was sponsored by Nedialko Andreev.
2013-10-29 16:50:38 -04:00
Joey Hess
496c8b7abb add post-repair actions 2013-10-29 14:25:20 -04:00
Joey Hess
fabb0c50b7 move code around and rename thread; no functional changes 2013-10-29 13:41:44 -04:00
Joey Hess
a7821c0581 automatically launch git repository repair
Added a RemoteChecker thread, that waits for problems to be reported with
remotes, and checks if their git repository is in need of repair.

Currently, only failures to sync with the remote cause a problem to be
reported. This seems enough, but we'll see.

Plugging in a removable drive with a repository on it that is corrupted
does automatically repair the repository, as long as the corruption causes
git push or git pull to fail. Some types of corruption do not, eg
missing/corrupt objects for blobs that git push doesn't need to look at.

So, this is not really a replacement for scheduled git repository fscking.
But it does make the assistant more robust.

This commit is sponsored by Fernando Jimenez.
2013-10-27 16:42:13 -04:00
Joey Hess
4722dcc92b cleanup 2013-10-18 11:24:41 -04:00
Joey Hess
25462f125d cronner: run jobs triggered by remotes becoming connected (untested) 2013-10-13 17:14:56 -04:00
Joey Hess
af5e1d0494 half way complete cronner thread to run scheduled activities 2013-10-08 11:48:28 -04:00
Joey Hess
635c9a1549 assistant: Detect stale git lock files at startup time, and remove them.
Extends the index.lock handling to other git lock files. I surveyed
all lock files used by git, and found more than I expected. All are
handled the same in git; it leaves them open while doing the operation,
possibly writing the new file content to the lock file, and then closes
them when done.

The gc.pid file is excluded because it won't affect the normal operation
of the assistant, and waiting for a gc to finish on startup wouldn't be
good.

All threads except the webapp thread wait on the new startup sanity checker
thread to complete, so they won't try to do things with git that fail
due to stale lock files. The webapp thread mostly avoids doing that kind of
thing itself. A few configurators might fail on lock files, but only if the
user is explicitly trying to run them. The webapp needs to start
immediately when the user has opened it, even if there are stale lock
files.

Arranging for the threads to wait on the startup sanity checker was a bit
of a bear. Have to get all the NotificationHandles set up before the
startup sanity checker runs, or they won't see its signal. Perhaps
the NotificationBroadcaster is not the best interface to have used for
this. Oh well, it works.

This commit was sponsored by Michael Jakl
2013-10-05 17:04:21 -04:00
Joey Hess
dba1e29949 webapp: Better display of added files. 2013-07-10 15:37:40 -04:00
Joey Hess
7a7e426352 moved AssociatedFile definition 2013-07-04 02:36:02 -04:00
Joey Hess
4effef3176 fix minor memory leak caused by recent CanPush change
Putting the UUID in meant that equivilant CanPush messages no longer are ==
2013-05-22 15:47:06 -04:00
Joey Hess
e2b67e0bc4 add two long-running XMPP push threads, no more inversion of control
I hope this will be easier to reason about, and less buggy. It was
certianly easier to write!

An immediate benefit is that with a traversable queue of push requests to
select from, the threads can be a lot fairer about choosing which client to
service next.
2013-05-22 15:13:31 -04:00
Joey Hess
cc68b340ff avoid building up a lot of threads all waiting for their chance to run a push
Only 2 threads are needed, one running, and one waiting to push something
new. Any more is redundant and wasteful.
2013-05-22 00:27:12 -04:00
Joey Hess
08c03b2af3 XMPP: Avoid redundant and unncessary pushes. Note that this breaks compatibility with previous versions of git-annex, which will refuse to accept any XMPP pushes from this version. 2013-05-21 18:24:29 -04:00
Joey Hess
9efde46cdd per-client inboxes for push messages
This will avoid losing any messages received from 1 client when a push
involving another client is running.

Additionally, the handling of push initiation is improved,
it's no longer allowed to run multiples of the same type of push to
the same client.

Still stalls sometimes :(
2013-05-21 11:08:08 -04:00
Joey Hess
14d96b8e06 XMPP: Be better at responding to CanPush messages when busy with something else.
Observed: With 2 xmpp clients, one would sometimes stop responding
to CanPush messages. Often it was in the middle of a receive-pack
of its own (or was waiting for a failed one to time out).

Now these are always immediately responded to, which is fine; the point
of CanPush is to find out if there's another client out there that's
interested in our push.

Also, in queueNetPushMessage, queue push initiation messages when
we're already running the side of the push they would initiate.
Before, these messages were sent into the netMessagesPush channel,
which was wrong. The xmpp send-pack and receive-pack code discarded
such messages.

This still doesn't make XMPP push 100% robust. In testing, I am seeing
it sometimes try to run two send-packs, or two receive-packs at once
to the same client (probably because the client sent two requests).

Also, I'm seeing rather a lot of cases where it stalls out until it
runs into the 120 second timeout and cancels a push.

And finally, there seems to be a bug in runPush. I have logs that
show it running its setup action, but never its cleanup action.
How is this possible given its use of E.bracket? Either some exception
is finding its way through, or the action somehow stalls forever.
When this happens, one of the 2 clients stops syncing.
2013-05-21 00:59:38 -04:00
Joey Hess
df88c51334 add uuid to all xmpp messages
(Except for the actual streaming of receive-pack through XMPP, which
can only run once we've gotten an appropriate uuid in a push initiation
message.)

Pushes are now only initiated when the initiation message comes from a
known uuid. This allows multiple distinct repositories to use the same xmpp
address.

Note: This probably breaks initial push after xmpp pairing, because at that
point we may not know about the paired uuid, and so reject the push from
it. It won't break in simple cases, because the annex-uuid of the remote
is checked. However, when there are multiple clients behind a single xmpp
address, only uuid of the first is recorded in annex-uuid, and so any
pushes from the others will be rejected (unless the first remote pushes their
uuids to us beforehand.
2013-04-30 13:22:55 -04:00
Joey Hess
362ed9f0e3 use DList for the transfer queue
Some nice efficiency gains here for list appending, although mostly
the small size of the transfer queue makes them irrelivant.
2013-04-25 01:33:44 -04:00
Joey Hess
c6da464051 use a DList for the deferred downloads queue 2013-04-25 01:26:23 -04:00
Joey Hess
46529c0129 assistant: Sanitize XMPP presence information logged for debugging. 2013-04-24 21:13:10 -04:00