This will be used in youtube-dl integration, to tell when a html page has
been downloaded by addurl, in which case it is worth running youtube-dl
to see if it can extract media from it.
tagsoup is an almost free dependency, because yesod depends on it.
So, this only really adds a dep when git-annex is built without the
webapp.
I'd like this to as closely as possible match how browsers decide if a
page is html or not. Unfortunately, that is fairly heuristic, in order
to support malformed html. And, we don't want to falsely detect
something as html just because it has something that looks like a html
tag embedded somewhere in it. Probably any major video hosting site is
going to be serving html documents that at least start with a <html>
tag, so requiring that or a DOCTYPE should be good enough.
This commit was sponsored by Jeff Goeke-Smith on Patreon.
As long as the class of remotes supports exporting, it's tested whether
or not the remote is configured with exporttree=yes.
Also, made testremote of a remote configured with exporttree=yes
disable that configuration for testing non-export storage.
This commit was supported by the NSF-funded DataLad project.
Remove closed bugs and todos that were last edited or commented before 2017.
Command line used:
for f in $(grep -l '|done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2017 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2017 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; done
for f in $(grep -l '\[\[done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2017 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2017 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; done
This is similar to the pusher thread, but a separate thread because git
pushes can be done in parallel with exports, and updating a big export
should not prevent other git pushes going out in the meantime.
The exportThread only runs at most every 30 seconds, since updating an
export is more expensive than pushing. This may need to be tuned.
Added a separate channel for export commits; the committer records a
commit in that channel.
Also, reconnectRemotes records a dummy commit, to make the exporter
thread wake up and make sure all exports are up-to-date. So,
connecting a drive with a directory special remote export will
immediately update it, and getting online will automatically
update S3 and WebDAV exports.
The transfer queue is not involved in exports. Instead, failed
exports are retried much like failed pushes.
This commit was sponsored by Ewen McNeill.
The bug occurred when closeDb was not called, and garbage collection of
the DbHandle didn't give the workerThread time to shut down. Fixed by
exiting the runSqlite action when a commit is made.
(MultiWriter mode already forked off a runSqlite action, so avoided the
problem.)
This commit was sponsored by Brock Spratlen on Patreon.
Now when one repository has exported a tree, another repository can get
files from the export, after syncing.
There's a bug: While the database update works, somehow the database on
disk does not get updated, and so the database update is run the next
time, etc. Wasn't able to figure out why yet.
This commit was sponsored by Ole-Morten Duesund on Patreon.
New table needed to look up what filenames are used in the currently
exported tree, for reasons explained in export.mdwn.
Also, added smart constructors for ExportLocation and ExportDirectory to
make sure they contain filepaths with the right direction slashes.
And some code refactoring.
This commit was sponsored by Francois Marier on Patreon.
There does not seem to be a use case for supporting that, and it would
need a lot of complication to support it in a way that allows eventual
consistency when two repositories are updating the same export.
This commit was sponsored by Henrik Riomar on Patreon.
Apparently box.com renaming is just buggy. I tried a couple of fixes:
* In case the http Manager was opening multiple connections and reaching
different backend servers, I tried limiting the number of connections
to 1. Didn't help.
* To make sure it was not a http connection reuse problem, I tried
rewriting how exportAction works, so that the same http connection
is clearly open. Didn't help.
So, disable renaming of exports for box.com. It would be good to test it
with some other webdav server.
This commit was sponsored by John Peloquin on Patreon.
It was not getting old lines removed, because the tree graft confused
the updater, so it union merged from the previous git-annex branch,
which still contained the old lines. Fixed by carefully using setIndexSha.
This commit was supported by the NSF-funded DataLad project.
This breaks backwards compatibility, but only with unreleased versions of
git-annex, which I think is acceptable.
This commit was supported by the NSF-funded DataLad project.