Commit graph

2414 commits

Author SHA1 Message Date
Joey Hess
9d78a4387f
update 2018-08-31 12:23:04 -04:00
Joey Hess
3a5d0402ba
update 2018-08-30 15:49:21 -04:00
Joey Hess
19dcff2b71
use S3 version ID for retrieval
Have to store the S3 object along with the version ID, so retrieval can
use the same object.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 15:37:08 -04:00
Joey Hess
794e9a7a44
store S3 version IDs
Only done when versioning=yes is configured. It could always do it when
S3 sends back a version id, but there may be buckets that have
versioning enabled by accident, so it seemed better to honor the
configuration.

S3's docs say version IDs are "randomly generated", so presumably
storing the same content twice gets two different ones not the same one.
So I considered storing a list of version IDs for a key. That would
allow removing the key completely. But.. The way Logs.RemoteState works,
when there are multiple writers, the last writer wins. So storing a list
would need a different log format that merges, which seemed overkill to support
removing a key from an append-only remote.

Note that Logs.RemoteState for S3 is now dedicated to version IDs.
If something else needs to be stored, a new log will be needed to do it.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 14:30:56 -04:00
Joey Hess
0ff5a41311
S3 versioning=yes config
Not yet used.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 13:45:28 -04:00
Joey Hess
358178fbfb
don't untrust appendonly exports
Make exporttree=yes remotes that are appendonly not be untrusted, and not force
verification of content, since the usual concerns about losing data when an
export is updated by someone else don't apply.

Note that all the remote operations on keys are left as usual for
appendonly export remotes, except for storing content.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 11:48:04 -04:00
Joey Hess
8b39db20b5
export appendonly support
Make `git annex export` check appendonly when removing a file from an
export, and not update the location log, since the remote still contains
the content.

This commit was supported by the NSF-funded DataLad project.
2018-08-30 11:18:20 -04:00
Joey Hess
dad627fa9e
remove false starts, simplify 2018-08-29 14:12:18 -04:00
Joey Hess
5b78952f78
misunderstood some code; simplify 2018-08-29 14:09:18 -04:00
Joey Hess
e216c18318
new much improved plan 2018-08-29 13:59:52 -04:00
Joey Hess
3874c5c88d
further thoughts 2018-08-29 10:56:02 -04:00
Joey Hess
b1280eb252
new todo (requested by yoh) 2018-08-28 12:14:06 -04:00
Joey Hess
6adc0d2b3f
bug triage 2018-08-27 15:10:05 -04:00
Joey Hess
2c9f21e987
todo 2018-08-26 20:59:20 -04:00
anarcat
af727108b0 update status to mention tor 2018-08-24 21:35:11 +00:00
anarcat
bfab1da5a7 mention that dat thing 2018-08-24 21:30:20 +00:00
Joey Hess
98fd7ec6c9
recover from race between git mv+commit and git-annex get
Last of the known v6 races.

This also makes git add of a pointer file populate it when its content
is present in the annex. Which makes sense to do, I think.

This commit was supported by the NSF-funded DataLad project.
2018-08-22 16:01:50 -04:00
Joey Hess
50fa17aee6
v6: recover from race between git mv and git-annex get/drop
Update pointer file next time reconcileStaged is run to recover from the
race.

Note that restagePointerFile causes git to run the clean filter,
and that will run reconcileStaged. So, normally by the time the git
annex get/drop command finishes, the race has already been dealt with.
It may be that, in some case, that won't happen and the race will be
dealt with at a later point. git-annex could run reconcileStaged at
shutdown if that becomes a problem.

This does not handle the situation where the git mv is committed before
git-annex gets a chance to run again. git commit does run the clean
filter, and that happens to re-inject the content if it was supposed to
be dropped but is still populated. But, the case where the file was
supposed to be gotten but is not populated is not handled yet.

This commit was supported by the NSF-funded DataLad project.
2018-08-22 15:56:43 -04:00
Joey Hess
e9b2674281
plan 2018-08-22 13:58:32 -04:00
Joey Hess
38a934cf07
correction 2018-08-22 13:34:15 -04:00
Joey Hess
18ecf41917
avoid running reconcileStaged when the index has not changed
This commit was supported by the NSF-funded DataLad project.
2018-08-22 13:04:12 -04:00
Joey Hess
5e56d9b620
v6: Update associated files database when git has staged changes to pointer files
This commit was supported by the NSF-funded DataLad project.
2018-08-21 17:02:20 -04:00
Joey Hess
b8cd5fde17
idea 2018-08-20 16:13:46 -04:00
Joey Hess
54d49eeac8
avoid update-index race
This commit was supported by the NSF-funded DataLad project.
2018-08-17 16:03:40 -04:00
Joey Hess
ec91b6e4b2
plan to fix race 2018-08-17 11:18:53 -04:00
Joey Hess
5799d325f0
update todo categories 2018-08-16 16:36:47 -04:00
Joey Hess
82a239675f
narrow the race where a file gets modified before update-index
Check just before running update-index if the worktree file's content is
still the same, don't update it when it's been modified. This narrows
the race window a lot, from possibly minutes or hours, to seconds or
less.

(Use replaceFile so that the worktree update happens atomically,
allowing the InodeCache of the new worktree file to itself be gathered
w/o any other race.)

This doesn't eliminate the race; it can still occur in the window before
update-index runs. When annex.queue is large, a lot of files will be
statted by the checks, and so the window may still be large enough to be a
problem.

When only a few files are being processed, the window is as small as it
is in the race where a modification gets overwritten by git-annex when
it updates the worktree. Or maybe as small as whatever race git
checkout/pull/merge may have when the worktree gets modified during it.
Still, I've kept a todo about this race.

This commit was supported by the NSF-funded DataLad project.
2018-08-16 15:56:43 -04:00
Joey Hess
82cfcfc838
better index file refresh method
Use git update-index --refresh, since it's a little bit more
efficient and the user can be told to run it if a locked index prevents
git-annex from running it.

This also fixes the problem where an annexed file was deleted in the index
and a get of another file that uses the same key caused the index update to
add back the deleted file. update-index will not add back the deleted file.

Documented in tips/unlocked_files.mdwn the gotcha that the index update
may conflict with other operations. I can't see any way to possibly avoid
that conflict.

One new todo about a race that causes a modification to be accidentially
staged.

Note that the assistant only flushes the git command queue when it
commits a modification. I have not tested the assistant with v6 unlocked
files, but assume most users of the assistant won't care if the index
shows a file as modified for a while.

This commit was supported by the NSF-funded DataLad project.
2018-08-16 14:16:24 -04:00
Joey Hess
4c5a9965c1
remove invalid todo item
I tested it, and it's ok. I think I was adding it under a filename that
produced a different key.
2018-08-15 13:34:48 -04:00
Joey Hess
48e9e12961
finally fixed v6 get/drop git status
After updating the worktree for an add/drop, update git's index, so git
status will not show the files as modified.

What actually happens is that the index update removes the inode
information from the index. The next git status (or similar) run
then has to do some work. It runs the clean filter.

So, this depends on the clean filter being reasonably fast and on git
not leaking memory when running it. Both problems were fixed in
a96972015d, but only for git 2.5. Anyone
using an older git will see very expensive git status after an add/drop.

This uses the same git update-index queue as other parts of git-annex, so
the actual index update is fairly efficient. Of course, updating the index
does still have some overhead. The annex.queuesize config will control how
often the index gets updated when working on a lot of files.

This is an imperfect workaround... Added several todos about new
problems this workaround causes. Still, this seems a lot better than the
old behavior.

This commit was supported by the NSF-funded DataLad project.
2018-08-14 16:23:58 -04:00
Joey Hess
66a4483dfa
response 2018-08-14 11:02:55 -04:00
Joey Hess
d8a8f2df70
full plan 2018-08-13 17:51:02 -04:00
Joey Hess
86df0d6e1b
even better idea 2018-08-13 17:43:16 -04:00
Joey Hess
df5823cea0
update 2018-08-13 17:29:33 -04:00
Joey Hess
bc7d431a6a
status 2018-08-13 16:37:23 -04:00
Joey Hess
147a793f4b
one way to use this 2018-08-09 18:22:21 -04:00
Joey Hess
a96972015d
massive v6 add speed/memory improvement
v6 add: Take advantage of improved SIGPIPE handler in git 2.5 to speed up
the clean filter by not reading the file content from the pipe. This also
avoids git buffering the whole file content in memory.

When built with an older git, still consumes stdin. If built with a newer
git and used with an older one, it breaks, but that's acceptable --
checking the git version every time would make repeated smudge runs slow.

This commit was supported by the NSF-funded DataLad project.
2018-08-09 18:17:46 -04:00
Joey Hess
38ddd6072d
addurl: Include filename in --json-progress output when known. 2018-08-06 12:53:44 -04:00
Joey Hess
5c5259db7c
followup 2018-08-06 11:56:55 -04:00
Joey Hess
634aefebd4
comment 2018-08-06 11:54:03 -04:00
Joey Hess
df72b2584a
already implmeneted 2018-08-06 11:29:22 -04:00
yarikoptic
c3f366448a initial expression of the desire 2018-08-04 03:20:48 +00:00
Joey Hess
ae11394efa
added annex.commitmessage
Added annex.commitmessage config that can specify a commit message for the
git-annex branch instead of the usual "update".

This commit was supported by the NSF-funded DataLad project.
2018-08-02 14:06:06 -04:00
Joey Hess
50620efe85
thought 2018-08-02 13:47:50 -04:00
Joey Hess
18aa931a44
followup 2018-08-02 13:43:44 -04:00
Joey Hess
35dbf231d8
response 2018-08-02 13:31:22 -04:00
Joey Hess
2884637cab
S3: Support credential-less download from remotes configured with public=yes exporttree=yes.
This commit was supported by the NSF-funded DataLad project.
2018-07-31 16:32:43 -04:00
Joey Hess
903b10e2b2
add todo 2018-07-31 13:05:04 -04:00
yarikoptic
a206f933fe Added a comment 2018-07-31 14:27:17 +00:00
yarikoptic
c70e757f2b Added a comment: size 2018-07-31 14:19:16 +00:00
RonnyPfannschmidt
5b711ac4f1 Added a comment 2018-07-29 21:41:56 +00:00
RonnyPfannschmidt
ef64e71f76 2018-07-29 20:29:23 +00:00
CandyAngel
6bf0c3ee14 2018-07-18 17:06:19 +00:00
Joey Hess
cc2cb46857
unused --from: Allow specifiying a repository by uuid or description.
This commit was sponsored by Jake Vosloo on Patreon.
2018-07-11 16:01:35 -04:00
uli@8484a70fbfd489faef5f72c230d340b01e2676ca
32df7fca23 2018-07-11 14:07:24 +00:00
Joey Hess
66cb41b0b3
thought 2018-07-09 14:38:34 -04:00
Joey Hess
13c853bda1
dealing with race conditions in import tree design
I seem to be down to a race no worse than one in git, which seems good
enough.

This commit was sponsored by Trenton Cronholm on Patreon.
2018-07-09 14:05:34 -04:00
anarcat
a93a8f254e Added a comment 2018-07-06 16:48:29 +00:00
Joey Hess
87507722cb
comment 2018-07-06 12:38:41 -04:00
anarcat
801154149a Added a comment: some docs 2018-07-06 01:44:08 +00:00
anarcat
445cc79fc8 Added a comment: apologies 2018-07-05 15:56:27 +00:00
Joey Hess
49cc94f61f
add docs about p2p --pair being broken in old versions 2018-07-05 11:52:52 -04:00
Joey Hess
749d5115fe
response 2018-07-04 12:24:09 -04:00
anarcat
5b2bbaaa18 Added a comment: some further considerations 2018-07-04 02:17:50 +00:00
Joey Hess
8a201c5cc4
close 2018-07-03 12:29:57 -04:00
Joey Hess
a63bbd868b
make addurl of media url fail when youtube-dl is disabled
addurl: When security configuration prevents downloads with youtube-dl,
still check if the url is one that it supports, and fail downloading it,
instead of downloading the raw web page.
2018-06-28 13:01:18 -04:00
Joey Hess
b091dac130
note for later 2018-06-26 12:10:09 -04:00
Joey Hess
3160cadba3 git-annex version 6.20180626
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKKUAw1IH6rcvbA8l2xLbD/BfjzgFAlstCaQACgkQ2xLbD/Bf
 jzh5nxAAn7D9soTI0ex6AVDDo2CjOyTTDVrIcl2h5XizfuUD3ev5P0TR3BZmzpAb
 MI6uaZ8kxqZ/eGAsBTyH9PsV7QVYIdht9t89ytP4xWyTQiOgjyJeA6PnJl4zVK9z
 Y8Of3mlylaz+97+sndljpsvy/KHENrHI7HHd+qxAu7wKysJxG6fJB7CjremkjaCI
 zAwg3mIy72ZKyuR/8hL9puJN9fdfw1ulkzQR+he007e/HkurPCwgRAOYW/Aa2tpY
 Oigdb9a6/0nl/VnOS8ZyHrSPRrhLH9c4IBmsdC1Xt5NDVmID/sWgD9uPF9dsHSMF
 OM25QdSlJ5cSNg+/XCpmmhC9MjgKkuVNpZ/fWBaHFs6KYgGhtZcAayQdz5AmMS2N
 HTPWB1IxZiV5TQHQpLbdH/q3RfNtRq1G1tc24zpd/zdhzijeTM6D8n4No6LXNq8X
 7U0qcrp9TdLOpBCTf6Jrg/7qFaXddHoEW1e3KrsOmB0hlYHuNxfY4bs0+ROeXGOT
 00koezcbF8kEI0ekoDvJjtVqaUq+608YjJZ5v7dE0vbtTj0KGbl5EHwC9atUluCX
 MHyTDY89uq68g4HIDytL001ZLvE3EUGJc4jh3+OMDzuZSKB5uwJIIky+qIaQu34K
 QJrZuyAIY0sVFV6LUX9nwqTW6Nnx/bB+kZ6k0+gx+Lpf7pUpE+o=
 =kex4
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKKUAw1IH6rcvbA8l2xLbD/BfjzgFAlsxnX4ACgkQ2xLbD/Bf
 jzjK1xAAnJ58ZxLyTYlCZRcKiR81UHS/Mk6+SDAjRIRbT0SsY+6gSP55XKjrcuOb
 Jatp+6cNNSgk2lBpn37mq+rYIqboFh9moDRK7JSh1mDHCVtIwdARGblFRfuwaWPi
 xHnu+Pj43+SP7OF+8qP8/kDM+js3iMS+0gvBBz8pQN/yJDROXii6u0eONOd7vbER
 iRY9QpJdj5lp3hjaWfXt5iJC0re0eOAY4eUSHPsFIASysShnn33dFPOZ2hbhRKjR
 unQHUVIUE+ehmW3w9qIqn+9v2kca7laGK11cvzYRpmu/9rrvpf+RF1h42S8822dP
 CKHvxDkBGbyqTA+F9/6zpU1i9/ARgHFDpScRcdq7ZJi9FbWabKDklHCsgxwrkdXb
 +FXgb7N5Sa4+eVDNUf4rxldtLPX53nrtZ3IqrGiCWApCvbysNyP5kE0nix02l9z2
 xzY2vlpicx7TOMoO9mZesSFNgRzuFAbbya/zDJrz+xfgSRYXRYg58yTpmhpTFvSI
 h3Fw6+MYvehvRdAweLtoQt2p/UV2MAWrTpNzFoqgf2OCQOiH97ACDHn8Yki9rnQi
 NuMsqv9WOYQs4SaygDZMKemgAxftf3uaXiBW0RzHHwwWnDjHhqsEioOvOhNNyZbz
 U3OjKrH1JZlkNHlIBQD4BsWGLlIct66ZTU3k2OxPEp+mpEG/Xi4=
 =p+cW
 -----END PGP SIGNATURE-----

Merge tag '6.20180626' - previously embargoed security release
2018-06-25 21:56:43 -04:00
Joey Hess
28720c795f
limit url downloads to whitelisted schemes
Security fix! Allowing any schemes, particularly file: and
possibly others like scp: allowed file exfiltration by anyone who had
write access to the git repository, since they could add an annexed file
using such an url, or using an url that redirected to such an url,
and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults
  to only allowing http and https URLs. Note especially that file:/
  is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow
  supporting annex.security.allowed-url-schemes across redirects.
  If you used this setting, you may want to instead use annex.web-options
  to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in
git-annex are made via Utility.Url via http-client or curl. http-client
only supports http and https, so no problem there.
(Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.

Note that this will cause git annex fsck --from web to mark files using
a disallowed url scheme as not being present in the web. That seems
acceptable; fsck --from web also does that when a web server is not available.

youtube-dl already disabled file: itself (probably for similar
reasons). The scheme check was also added to youtube-dl urls for
completeness, although that check won't catch any redirects it might
follow. But youtube-dl goes off and does its own thing with other
protocols anyway, so that's fine.

Special remotes that support other domain-specific url schemes are not
affected by this change. In the bittorrent remote, aria2c can still
download magnet: links. The download of the .torrent file is
otherwise now limited by annex.security.allowed-url-schemes.

This does not address any external special remotes that might download
an url themselves. Current thinking is all external special remotes will
need to be audited for this problem, although many of them will use
http libraries that only support http and not curl's menagarie.

The related problem of accessing private localhost and LAN urls is not
addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.
2018-06-16 11:57:50 -04:00
andrew@2e5aa03dfdc624af77a5957dd345d28430342a9c
785cb276f0 posted issue 2018-06-15 22:23:58 +00:00
Joey Hess
e592635fe6
improve wording 2018-06-14 17:14:13 -04:00
Joey Hess
690bb303f9
more thoughts 2018-06-14 14:00:49 -04:00
Joey Hess
3f80aaea3d
some open questions 2018-06-14 13:42:25 -04:00
Joey Hess
466d3fbaab
more thoughts 2018-06-14 13:30:34 -04:00
Joey Hess
8b734da876
thoughts 2018-06-14 12:32:18 -04:00
Joey Hess
0f566ed242
removal of the rest of remoteGitConfig
In keyUrls, the GitConfig is used only by annexLocations
to support configured Differences. Since such configurations affect all
clones of a repository, the local repo's GitConfig must have the same
information as the remote's GitConfig would have. So, used getGitConfig
to get the local GitConfig, which is cached and so available cheaply.

That actually fixed a bug noone had ever noticed: keyUrls is
used for remotes accessed over http. The full git config of such a
remote is normally not available, so the remoteGitConfig that keyUrls
used would not have the necessary information in it.

In copyFromRemoteCheap', it uses gitAnnexLocation,
which does need the GitConfig of the remote repo itself in order to
check if it's crippled, supports symlinks, etc. So, made the
State include that GitConfig, cached. The use of gitAnnexLocation is
within a (not $ Git.repoIsUrl repo) guard, so it's local, and so
its git config will always be read and available.

(Note that gitAnnexLocation in turn calls annexLocations, so the
Differences config it uses in this case comes from the remote repo's
GitConfig and not from the local repo's GitConfig. As explained above
this is ok since they must have the same value.)

Not very happy with this mess of different GitConfigs not type-safe and
some read only sometimes etc. Very hairy. Think I got it this change
right. Test suite passes..

This commit was sponsored by Ethan Aubin.
2018-06-05 14:48:37 -04:00
Joey Hess
a5f598a6aa
remove use of remoteGitConfig
Unfortunately one more use remains..

This should be just as fast as the other method. The remote's Git.Repo
has already had its config read, so Annex.new's call to Git.Config.read
is a noop.

Thid commit was sponsored by andrea rota.
2018-06-05 13:15:04 -04:00
Joey Hess
fc5888300f
fix annex-checkuuid
Fixed annex-checkuuid implementation, so that remotes configured that way
can be used. This was 100% broken from the first commit of it, oops.

This commit was sponsored by Øyvind Andersen Holm.
2018-06-04 16:52:22 -04:00
RonnyPfannschmidt
c197077e89 Added a comment: the remote im working on 2018-06-04 07:51:57 +00:00
Joey Hess
0c803eee71
list all (non-archived) done bugs, not only most recent 10 2018-05-31 11:48:53 -04:00
Joey Hess
2c8da1432f
comment 2018-05-29 13:01:24 -04:00
unqueued
5300386c2b Added a comment 2018-05-28 14:55:34 +00:00
https://christian.amsuess.com/chrysn
6620c1704a Added a comment: append-only and gitolite 2018-05-28 11:47:14 +00:00
Joey Hess
940444994e
idea 2018-05-25 16:13:13 -04:00
Joey Hess
85f9360d9b
GIT_ANNEX_SHELL_APPENDONLY
Makes it allow writes, but not deletion of annexed content. Note that
securing pushes to the git repository is left up to the user.

This commit was sponsored by Jack Hill on Patreon.
2018-05-25 13:17:56 -04:00
Joey Hess
15129bac9b
2018 update 2018-05-23 15:44:29 -04:00
Joey Hess
41cf6f3d17
followup 2018-05-22 15:57:59 -04:00
yarikoptic
834d3dfff0 just rewording the desire of the master to have a discussion 2018-05-22 17:26:12 +00:00
sorsasampo@35b3d76c4c73ffc3f2c89e965c47a3f6a2721228
38caaee8fc 2018-05-20 03:45:06 +00:00
CandyAngel
4156c13221 Added a comment 2018-05-17 20:15:14 +00:00
anarcat
8f226fb7bd cross-ref with append-only 2018-05-17 18:15:14 +00:00
anarcat
990bb3085e another untrusted client idea 2018-05-17 18:14:17 +00:00
anarcat
e753c7de4f update: git repo now available. the previous paste expired, sorry about that. 2018-05-17 18:06:40 +00:00
anarcat
fce32e6cd4 /dev/random is not necessary in git-annex 2018-05-17 17:38:41 +00:00
Joey Hess
d135705b32
close 2018-05-15 12:03:43 -04:00
Joey Hess
60780a8605
close since anarcat thinks inprogress is good enough 2018-05-15 12:01:30 -04:00
Joey Hess
fbfb2b85ec
close 2018-05-15 12:00:50 -04:00
Joey Hess
c0ffd02ac5
close almost all old Android app bug reports
The old git-annex Android app is now deprecated in favor of running
git-annex in termux. I suspect all or nearly all of these no longer apply.

This commit was sponsored by Jochen Bartl on Patreon.
2018-05-08 15:00:46 -04:00
Joey Hess
d1961e4498
back out incorrect IO interleaving change
Fix regression in last release that crashes when using --all or running
git-annex in a bare repository. May have also affected git-annex unused and
git-annex info.

Reversed the order of the (++) in Annex.Branch.files so --all will stream
lazily still when there are not a bunch of uncommitted journal files.
Added a todo to maybe improve this later.

This commit was sponsored by Trenton Cronholm on Patreon.
2018-05-08 13:54:42 -04:00
Joey Hess
393fc79d58
comment 2018-04-30 16:12:33 -04:00
anarcat
488e127e97 this is basically what inprogress does, except it doesn't depend on xdg-open. at least basic functionality to do what i want is there, thanks! 2018-04-29 12:32:43 +00:00
yarikoptic
5c977d0cf6 initial whining about no support for ~/.netrc 2018-04-28 01:20:42 +00:00
Joey Hess
f5df6244f3
deal with getMounts crashing on android 2018-04-25 17:42:27 -04:00
Joey Hess
f22a8c3485
update 2018-04-25 16:58:12 -04:00
Joey Hess
9807e5bead
fix webapp opening in termux
Open real url not html shim since android and file:// urls is a nasty
kettle of fish.

This commit was sponsored by John Pellman on Patreon.
2018-04-25 14:38:42 -04:00
Joey Hess
b0df331b4a
adjust webapp paths when run in termux on Android
And display the special case Android UI
2018-04-25 14:17:52 -04:00
Joey Hess
118ed8f92b
runshell: hacks for termux; add tip
Added some tweaks to make git-annex work in termux on Android. The regular
arm standalone tarball now works in termux.

I guess the test for "$base/bin/git" is not really necessary, since it
tests for git-annex. Since that gets deleted on android, removed that test.

These are pretty hackish hacks, especially adding it to PATH. The goal is
to make it work well enough out of the box on Android.

This commit was sponsored by Eric Drechsel on Patreon.
2018-04-25 13:48:37 -04:00
Joey Hess
3753c07204
update 2018-04-25 11:19:59 -04:00
Joey Hess
5a01ebe036
update 2018-04-24 21:28:08 -04:00
Joey Hess
43b4e80bf5
update 2018-04-24 21:22:20 -04:00
Joey Hess
ec7262bb87
notes 2018-04-24 19:53:24 -04:00
Joey Hess
6b68813988
OMG 2018-04-24 19:04:07 -04:00
Joey Hess
c34152777b
Use http-conduit for url downloads by default, annex.web-options enables curl
* For url downloads, git-annex now defaults to using a http library,
  rather than wget or curl. But, if annex.web-options is set, it will
  use curl. To use the .netrc file, run:
    git config annex.web-options --netrc
* git-annex no longer uses wget (and wget is no longer shipped with
  git-annex builds).

Note that curl is always run in silent mode, since the new API for
download has a MeterUpdate and doesn't make way for curl progress
output. It might be worth writing a parser for curl's progress output
to update the meter when using it, but I didn't bother with this edge
case for now.

This commit was supported by the NSF-funded DataLad project.
2018-04-06 17:36:20 -04:00
Joey Hess
0f6775f1ff
refactor sinkResponseFile and add downloadC
Remote.S3 and Remote.Helper.Http both had similar code to sink a
http-conduit Response to a file; refactor out sinkResponseFile.

downloadC downloads an url to a file using http-conduit, and supports
resuming. Falls back to curl to handle urls that http-conduit does not
support. This is not used yet, but the goal is to replace download with
it.

git-annex.cabal: conduit-extra was not actually used for a long time,
remove the dep. conduit moves into the main dependency list, but since
http-conduit was already in there, and it depends on conduit, that's not
really adding a new build dep.

This commit was supported by the NSF-funded DataLad project.
2018-04-06 16:07:08 -04:00
Joey Hess
9b98d3f630
better HTTP connection reuse
Enable HTTP connection reuse across multiple files, when git-annex
uses http-conduit. Before, a new Manager was created each time
Utility.Url used it. Now, a single Manager gets created the first time,
so connections are reused.

Doesn't help when external programs are used for url download,
but does speed up addurl --fast, fsck --from web, etc.

Testing fsck --fast --from web with 3 files, over high-latency
satellite internet, it sped up from 19.37s to 14.96s.

This commit was supported by the NSF-funded DataLad project.
2018-04-04 15:39:40 -04:00
Joey Hess
0783352fae
todo 2018-04-04 14:32:32 -04:00
Joey Hess
ae75eb06bc
exporttree support for adb special remote
This commit was sponsored by Michael Magin.
2018-03-27 16:28:41 -04:00
Joey Hess
2927618d35
Added adb special remote which allows exporting files to Android devices.
git annex testremote passes.

exportree not implemented yet, although the documentation talks about it,
since it will be the main way this remote will be used.

The adb push/pull progress is displayed for now; it would be better
to consume it and use it to update the git-annex progress bar.

This commit was sponsored by andrea rota.
2018-03-27 14:54:41 -04:00
Joey Hess
abffea5fcb
update 2018-03-21 09:19:06 -04:00
Joey Hess
1cf705d416
idea 2018-03-21 03:19:47 -04:00
Joey Hess
0ae2662f71
ideas 2018-03-21 02:25:53 -04:00
vrs+annex@ea5fa24dbb279be61a8e50adb638bf8366300717
ae462950a9 Added a comment: related bugs 2018-03-17 19:06:46 +00:00
Joey Hess
050ada746f
Added backends for the BLAKE2 family of hashes.
There are a lot of different variants and sizes, I suppose we might as well
export all the common ones.

Bump dep to cryptonite to 0.16, earlier versions lacked BLAKE2 support.
Even android has 0.16 or newer.

On Debian, Blake2bp_512 is buggy, so I have omitted it for now.
http://bugs.debian.org/892855

This commit was sponsored by andrea rota.
2018-03-13 16:23:42 -04:00
Joey Hess
0ceb6da0fa
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-13 15:08:28 -04:00
Joey Hess
4015c5679a
force verification when resuming download
When resuming a download and not using a rolling checksummer like rsync,
the partial file we start with might contain garbage, in the case where a
file changed as it was being downloaded. So, disabling verification on
resumes risked a bad object being put into the annex.

Even downloads with rsync are currently affected. It didn't seem worth the
added complexity to special case those to prevent verification, especially
since git-annex is using rsync less often now.

This commit was sponsored by Brock Spratlen on Patreon.
2018-03-13 14:50:49 -04:00
Joey Hess
31e1adc005
deal with unlocked files
P2P protocol version 1 adds VALID|INVALID after DATA; INVALID means the
file was detected to change content while it was being sent and so we
may not have received the valid content of the file.

Added new MustVerify constructor for Verification, which forces
verification even when annex.verify=false etc. This is used when INVALID
and in protocol version 0.

As well as changing git-annex-shell p2psdio, this makes git-annex tor
remotes always force verification, since they don't yet use protocol
version 1. Previously, annex.verify=false could skip verification when
using tor remotes, and let bad data into the repository.

This commit was sponsored by Jack Hill on Patreon.
2018-03-13 14:27:14 -04:00
https://openid.stackexchange.com/user/3ee5cf54-f022-4a71-8666-3c2b5ee231dd
7ea75e1132 Added a comment: How expensive is verification anyway? 2018-03-13 17:58:19 +00:00
Joey Hess
9930b1f140
a plan 2018-03-13 12:17:24 -04:00
Joey Hess
abe8346dca
response 2018-03-12 17:49:06 -04:00
Joey Hess
ad13b56c86
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-12 17:33:37 -04:00
Joey Hess
59e7f3cbb2
done for the day 2018-03-12 17:32:57 -04:00
Joey Hess
24f35f6acc
remove todo about assistant and network change
When the assistant detects a network change, it
stops using old git-annex transferkeys processes.
So, no problem that old git-annex-shell p2pstdio
connections are cached; they won't be reused after
network change.
2018-03-12 17:24:40 -04:00
Joey Hess
b654597c2f
urk, this is hard 2018-03-12 17:24:18 -04:00
Joey Hess
c3df5d1f10
avoid double-connect to unreachable ssh remote
When git-annex-shell p2pstdio fails with 255, it's because the ssh
server is not reachable. Avoid running the fallback action in this case,
since it would just try a second time to connect, and presumably fail.

Note that the closed P2PSshConnection will not be stored in the pool,
so the next request tries again to connect. This is just the right
behavior; when the remote becomes reachable again, the same git-annex
process will start using it.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-03-12 16:50:21 -04:00
Joey Hess
e36ceb7448
open todo for p2p protocol flag days 2018-03-12 16:28:54 -04:00
davicastro
56bab023b9 Added a comment: What are the drawbacks of repo v6? 2018-03-12 19:00:36 +00:00
Joey Hess
c81768d425
version the P2P protocol
Unfortunately ReceiveMessage didn't handle unknown messages the way it
was documented to; client sending VERSION would cause the server to
return an ERROR and hang up. Fixed that, but old releases of git-annex
use the P2P protocol for tor and will still have that behavior.

So, version is not negotiated for Remote.P2P connections, only for
Remote.Git connections, which will support VERSION from their first
release. There will need to be a later flag day to change Remote.P2P;
left a commented out line that is the only thing that will need to be
changed then.

Version 1 of the P2P protocol is not implemented yet, but updated
the docs for the DATA change that will be allowed by that version.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2018-03-12 14:36:35 -04:00
Joey Hess
f74c970c7e
updates 2018-03-09 13:55:25 -04:00
Joey Hess
08814327ff
use P2P protocol for checkpresent, retrieve, and store
Note that, due to not using rsync to transfer files to ssh remotes
any longer, permissions and other file metadata of annexed files
will no longer be preserved when copying them to ssh remotes.
Other remotes never supported preserving that information, so
this is not considered a regression. Added NEWS item about this.

Another significant side effect of this is that, even when rsync is run to
retrieve a file, its progress display will no longer be shown, and
instead the native git-annex progress display will appear. It would be
possible to use the rsync process display when rsync is used (old
git-annex-shell and also retrieval from a local repository), but it
would have complicated the code unncessarily, and been inconsistent
behavior.

(I'd been thinking for a while about eliminating the rsync progress
display, since it's got some annoying verbosities, including display of
the key and the "(xfr#1, to-chk=0/1)" bit and was already somewhat
inconsistent.)

retrieveKeyFileCheap still uses rsync, since that ensures that it gets
the actual file content from the remote. Using the P2P protocol would
use the local content, as long as the local and remote size are the
same.

This commit was sponsored by John Pellman on Patreon.
2018-03-09 13:25:16 -04:00
Joey Hess
26febb4e58
benchmarks 2018-03-09 12:57:23 -04:00
Joey Hess
6a59bc4845
use P2P protocol for drop
Not yet used for everything else, but this is enough to
verify that it works, and do some benchmarking.

Some bugfixes included, which got it working. Also fallback to old
actions has been verified to work correctly.

Benchmarked dropping one thousand files from a ssh remote on localhost.
Using the old git-annex	40.867 seconds.
With the P2P protocol	9.905 seconds!

This commit was sponsored by Jochen Bartl on Patreon.
2018-03-08 16:56:17 -04:00
Joey Hess
0a294900b5
Merge branch 'master' of ssh://git-annex.branchable.com 2018-03-07 15:41:33 -04:00
Joey Hess
6ddfa9807b
implemented git-annex-shell p2pstdio
Not yet used by git-annex, but this will allow faster transfers etc than
using individual ssh connections and rsync.

Not called git-annex-shell p2p, because git-annex p2p does something
else and I don't want two subcommands with the same name between the two
for sanity reasons.

This commit was sponsored by Øyvind Andersen Holm.
2018-03-07 15:38:01 -04:00
yarikoptic
6023d82a43 Added a comment 2018-03-06 20:33:44 +00:00
Joey Hess
9b8f6c9cb9
followup 2018-03-06 14:51:27 -04:00
Joey Hess
f42baedd4c
designing new git-annex-shell multi
This commit was supported by the NSF-funded DataLad project.
2018-03-06 14:48:44 -04:00
Joey Hess
bf98afb91a
comment 2018-03-06 13:38:55 -04:00
Joey Hess
a4bd508643
todo 2018-02-28 16:52:13 -04:00
Joey Hess
bed6773346
Support exporttree=yes for rsync special remotes.
Renaming is not supported; it might be possible to use --fuzzy to get rsync
to notice the file is being renamed, but that is a bit ..fuzzy.

On the other hand, interrupted transfers of an exported file are resumed,
since rsync is great at that. Had to adjust the exporttree docs, which
said interrupted transfers would restart.

Note that remove no longer makes the empty directory dummy, instead
sending the top-level empty directory. This works just as well and I
noticed the dummy was unncessary when refactoring it into removeGeneric.
Verified that behavior of remove is not changed, and git annex
testremote does pass.

This commit was sponsored by Brock Spratlen on Patreon.
2018-02-28 13:36:20 -04:00
roger.herikstad@ca3b99b0263344ccfd8ec134a12261be25ef3504
7a4fe7e80c Added a comment: Not working with rsync? 2018-02-28 02:41:38 +00:00
Joey Hess
8104f2f758
done 2018-02-20 13:00:51 -04:00
Joey Hess
572754c925
update 2018-02-19 15:54:56 -04:00
yarikoptic
fb2c7d180a Added a comment 2018-02-16 18:42:28 +00:00
Joey Hess
b0633931d6
comment 2018-02-16 14:05:26 -04:00
Joey Hess
56711c310c
update 2018-02-16 13:56:05 -04:00
Joey Hess
846208b265
comment 2018-02-16 13:47:10 -04:00
Joey Hess
1758982760
comment 2018-02-16 13:28:13 -04:00
yarikoptic
36375ca060 Added a comment 2018-02-08 18:06:55 +00:00
Joey Hess
5109d406c0
comment 2018-02-08 13:31:36 -04:00
Joey Hess
68321be178
comment 2018-02-08 13:12:39 -04:00
yarikoptic
8cd0f8b28e 2018-02-06 22:35:41 +00:00
yarikoptic
638032f3aa Added a comment 2018-02-06 20:15:35 +00:00
Joey Hess
7d9f0e0fbe
Added INFO to external special remote protocol.
It's left up to the special remote to detect when git-annex is new enough
to support the message; an old git-annex will blow up.

This commit was supported by the NSF-funded DataLad project.
2018-02-06 13:03:55 -04:00
Joey Hess
5c8150c6eb
Merge branch 'master' of ssh://git-annex.branchable.com 2018-02-05 13:25:11 -04:00
Joey Hess
4cd300547d
followup 2018-02-05 13:24:44 -04:00
yarikoptic
0578ed2aaa initial report 2018-02-05 15:53:06 +00:00
lhunath@3b4ff15f4600f3276d1776a490b734fca0f5c245
60fc538790 Added a comment: Simultaneous transfers 2018-02-02 17:37:27 +00:00
yarikoptic
f3d5d80efb Added a comment: any chance to avoid necessity to config the hook(s)? 2018-02-01 18:54:18 +00:00
Joey Hess
e84d9219c4
add todo 2018-02-01 13:52:15 -04:00
yarikoptic
1ca00e61e6 initial whishlist 2018-01-25 03:44:53 +00:00
mbekkema97@66b135681014f005a3a14c4011d148fcb6655f81
4c6edbc8e6 Added a comment 2018-01-19 03:46:29 +00:00
Joey Hess
4eb70965a7
response 2018-01-15 13:31:21 -04:00
Joey Hess
97d84da875
response 2018-01-15 13:26:25 -04:00
mbekkema97@66b135681014f005a3a14c4011d148fcb6655f81
dcc4b7e954 2018-01-13 03:36:34 +00:00
yarikoptic
369b6f416f initial question about ssh urls support 2018-01-12 14:56:43 +00:00
Joey Hess
d180ae039c
fix now-dead gmane links
gmane's disk crashed, I found one thread in another archive, but could
not find my whole patch set in any archive (perhaps some of the messages
were too long), so pulled it out of my personal mail archives.

This commit was supported by the NSF-funded DataLad project.
2017-12-26 12:21:44 -04:00
Joey Hess
72469af8a9
Merge branch 'master' of ssh://git-annex.branchable.com 2017-12-13 14:36:49 -04:00
Joey Hess
3cc94c1667
.noannex file
A top-level .noannex file will prevent git-annex init from being used in a
repository. This is useful for repositories that have a policy reason not
to use git-annex. The content of the file will be displayed to the user who
tries to run git-annex init.

This also affects git annex reinit and initialization via the webapp.
It does not affect automatic inits, when there's a sibling git-annex branch
already.

This commit was supported by the NSF-funded DataLad project.
2017-12-13 14:34:32 -04:00
yarikoptic
74dedb3e43 removed 2017-12-13 18:15:10 +00:00
yarikoptic
3643fd3fec Added a comment 2017-12-13 18:12:14 +00:00
yarikoptic
55bd55fa55 Added a comment 2017-12-13 18:12:05 +00:00
Joey Hess
95ec530324
thought 2017-12-13 13:46:36 -04:00
Joey Hess
d25a9a1c64
comment 2017-12-13 12:54:54 -04:00
Joey Hess
52ec7c0204
comment 2017-12-13 12:48:28 -04:00
yarikoptic
ba511f4de9 Added a comment 2017-12-11 19:41:35 +00:00
Joey Hess
6c8c4cb7d5
todo 2017-12-11 13:49:40 -04:00
yarikoptic
cf05ebbeba Added a comment 2017-12-05 18:51:49 +00:00
Joey Hess
4e38c4f57f
Allow exporttree remotes to be marked as dead.
Union with max so that DeadTrusted wins over UnTrusted.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-12-05 13:46:55 -04:00
Joey Hess
fd24034e56
fix format of comment 2017-12-05 13:01:26 -04:00
Joey Hess
15d5d0fafb
devblog 2017-11-30 17:08:18 -04:00
Joey Hess
3febb79c8f
wip 2017-11-28 17:17:40 -04:00
Joey Hess
d1d04a3c65
size checking 2017-11-28 13:37:50 -04:00
Joey Hess
8f41a1b7ce
update youtube playlist docs 2017-11-28 13:30:05 -04:00
Joey Hess
57b4c5bdff
add Utility.HtmlDetect
This will be used in youtube-dl integration, to tell when a html page has
been downloaded by addurl, in which case it is worth running youtube-dl
to see if it can extract media from it.

tagsoup is an almost free dependency, because yesod depends on it.
So, this only really adds a dep when git-annex is built without the
webapp.

I'd like this to as closely as possible match how browsers decide if a
page is html or not. Unfortunately, that is fairly heuristic, in order
to support malformed html. And, we don't want to falsely detect
something as html just because it has something that looks like a html
tag embedded somewhere in it. Probably any major video hosting site is
going to be serving html documents that at least start with a <html>
tag, so requiring that or a DOCTYPE should be good enough.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2017-11-28 13:03:11 -04:00
Joey Hess
c37838d1b7
update; filed youtube-dl bug 2017-11-27 16:29:29 -04:00
Joey Hess
005d9a897d
idea 2017-11-23 10:49:09 -04:00
Joey Hess
af2f3e8f9c
investigate using youtube-dl. not pretty.. 2017-11-22 13:15:13 -04:00
Lykos153
5a4733c5a6 2017-11-16 00:57:29 +00:00
Joey Hess
1d0bf44173
testremote: Test exporttree.
As long as the class of remotes supports exporting, it's tested whether
or not the remote is configured with exporttree=yes.

Also, made testremote of a remote configured with exporttree=yes
disable that configuration for testing non-export storage.

This commit was supported by the NSF-funded DataLad project.
2017-11-08 14:22:11 -04:00
Joey Hess
c394b3ccdc
confirm 2017-11-07 17:08:04 -04:00
Lykos153
9782cb9c14 rename todo/Test_cases_for_exportree_special_remotes.mdwn to todo/Test_cases_for_exporttree_special_remotes.mdwn 2017-11-01 14:15:50 +00:00
Lykos153
96e41466ba 2017-11-01 13:52:49 +00:00
Joey Hess
4ecc852ca0
Merge branch 'master' of ssh://git-annex.branchable.com 2017-10-31 13:14:18 -04:00
Joey Hess
fc8f1a7a1a
close bug; copy benchmarking info to new page 2017-10-31 13:13:40 -04:00
yarikoptic
d780958ee0 Added a comment: thanks 2017-10-30 20:04:48 +00:00
Joey Hess
d4429bcf00
close; export 2017-10-30 15:16:09 -04:00
Joey Hess
194c413727
followup 2017-10-30 15:12:23 -04:00
Joey Hess
1e20b0e782
followup 2017-10-30 14:53:15 -04:00
Joey Hess
1adcad345a
close 2017-10-30 14:47:19 -04:00
Joey Hess
75ec0227f8
unlock, lock: Support --json. 2017-10-30 14:44:11 -04:00
Joey Hess
c778b37e77
you requested his old closed bugs not be deleted yet 2017-10-02 11:58:31 -04:00
Joey Hess
65a76bf381
remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were last edited or commented before 2017.

Command line used:

for f in $(grep -l '|done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2017 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2017 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; done
for f in $(grep -l '\[\[done\]\]' -- *.mdwn); do d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2017 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2017 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; done
2017-09-29 12:55:42 -04:00
Joey Hess
c363a774b8
limit rss/atom feeds to 10 pages to avoid enormous files
Since bug reports are often long, the rss and atom feeds were 5+ mb in
size.
2017-09-29 12:42:09 -04:00
Joey Hess
ac897a94bb
comment 2017-09-29 12:27:19 -04:00
Joey Hess
abdff127f2
split out todo for webapp export config UI; close export todo! 2017-09-20 15:32:05 -04:00
Joey Hess
d71c65ca0a
add exporter thread to assistant
This is similar to the pusher thread, but a separate thread because git
pushes can be done in parallel with exports, and updating a big export
should not prevent other git pushes going out in the meantime.

The exportThread only runs at most every 30 seconds, since updating an
export is more expensive than pushing. This may need to be tuned.

Added a separate channel for export commits; the committer records a
commit in that channel.

Also, reconnectRemotes records a dummy commit, to make the exporter
thread wake up and make sure all exports are up-to-date. So,
connecting a drive with a directory special remote export will
immediately update it, and getting online will automatically
update S3 and WebDAV exports.

The transfer queue is not involved in exports. Instead, failed
exports are retried much like failed pushes.

This commit was sponsored by Ewen McNeill.
2017-09-20 15:29:13 -04:00
Joey Hess
2e69efea8d
git annex sync --content to exports
Assistant still todo.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon
2017-09-19 14:20:47 -04:00
Joey Hess
a6268b79b2
break out separate todo for later 2017-09-19 12:38:07 -04:00
Joey Hess
5f9eff3f32
fix bug that prevented db being written to disk in SingleWriter mode
The bug occurred when closeDb was not called, and garbage collection of
the DbHandle didn't give the workerThread time to shut down. Fixed by
exiting the runSqlite action when a commit is made.

(MultiWriter mode already forked off a runSqlite action, so avoided the
problem.)

This commit was sponsored by Brock Spratlen on Patreon.
2017-09-18 19:42:20 -04:00
Joey Hess
f4be3c3f89
merge changes made on other repos into ExportTree
Now when one repository has exported a tree, another repository can get
files from the export, after syncing.

There's a bug: While the database update works, somehow the database on
disk does not get updated, and so the database update is run the next
time, etc. Wasn't able to figure out why yet.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2017-09-18 19:21:41 -04:00
Joey Hess
b03d77c211
add ExportTree table to export db
New table needed to look up what filenames are used in the currently
exported tree, for reasons explained in export.mdwn.

Also, added smart constructors for ExportLocation and ExportDirectory to
make sure they contain filepaths with the right direction slashes.

And some code refactoring.

This commit was sponsored by Francois Marier on Patreon.
2017-09-18 13:59:59 -04:00
Joey Hess
486902389d
lock to avoid more than one export to a remote at a time
This commit was sponsored by Jack Hill on Patreon.
2017-09-18 12:38:07 -04:00
Joey Hess
af0958dd70
move tracking exports to design 2017-09-18 12:06:01 -04:00
Joey Hess
4a45f34fe1
don't support removing content from export with removeKey
There does not seem to be a use case for supporting that, and it would
need a lot of complication to support it in a way that allows eventual
consistency when two repositories are updating the same export.

This commit was sponsored by Henrik Riomar on Patreon.
2017-09-17 17:56:33 -04:00
Joey Hess
494b4066db
clarification 2017-09-16 16:44:27 -04:00
Joey Hess
18ba1be26f
design for next steps on exports 2017-09-16 16:41:04 -04:00
Joey Hess
1223960294
empty directory removal working 2017-09-15 15:24:45 -04:00
Joey Hess
5fe803e14e
update 2017-09-15 12:22:11 -04:00
Joey Hess
268a0cc664
update 2017-09-13 15:52:19 -04:00
Joey Hess
bf48ba4ef7
work around box.com webdav rename bug
Apparently box.com renaming is just buggy. I tried a couple of fixes:

* In case the http Manager was opening multiple connections and reaching
  different backend servers, I tried limiting the number of connections
  to 1. Didn't help.
* To make sure it was not a http connection reuse problem, I tried
  rewriting how exportAction works, so that the same http connection
  is clearly open. Didn't help.

So, disable renaming of exports for box.com. It would be good to test it
with some other webdav server.

This commit was sponsored by John Peloquin on Patreon.
2017-09-13 15:26:56 -04:00
Joey Hess
f8fd66d3f8
fix compaction of export.log
It was not getting old lines removed, because the tree graft confused
the updater, so it union merged from the previous git-annex branch,
which still contained the old lines. Fixed by carefully using setIndexSha.

This commit was supported by the NSF-funded DataLad project.
2017-09-12 18:30:36 -04:00
Joey Hess
c8ed941a26
change export.log format to support multiple export remotes
This breaks backwards compatibility, but only with unreleased versions of
git-annex, which I think is acceptable.

This commit was supported by the NSF-funded DataLad project.
2017-09-12 17:45:52 -04:00
Joey Hess
63ba764923
bug 2017-09-12 17:00:15 -04:00
Joey Hess
9c3622882b
export: cache connections for S3 and webdav 2017-09-12 16:59:04 -04:00
Joey Hess
7ad8e8b889
more box.com strangeness 2017-09-12 15:45:43 -04:00
Joey Hess
7f8892f2d2
document box.com rename problem 2017-09-12 15:16:17 -04:00
Joey Hess
267f47c473
S3: Allow removing files from IA, but warn about derived versions potentially still existing there.
Removal works, only derives are a potential issue, so allow removing
with a warning. This way, unexporting a file works, and behavior is
consistent with IA remotes whether or not exporttree=yes.

Also tested exporting filenames containing unicode, spaces, underscores.
All worked, despite the IA's faq saying it doesn't.

This commit was sponsored by Trenton Cronholm on Patreon.
2017-09-12 12:35:58 -04:00
Joey Hess
650d0955a0
S3 export finalization
Fixed ACL issue, and updated some documentation.
2017-09-08 16:28:28 -04:00
Joey Hess
44cd5ae313
S3 export (untested)
It opens a http connection per file exported, but then so does git
annex copy --to s3.

Decided not to munge exported filenames for IA. Too large a chance of
the munging having confusing results. Instead, export of files not
supported by IA, eg with spaces in their name, will fail.

This commit was supported by the NSF-funded DataLad project.
2017-09-08 15:46:24 -04:00
Joey Hess
34ad1c15e8
mention git-annex export 2017-09-07 16:17:46 -04:00
Joey Hess
165725b9df
update 2017-09-07 16:07:28 -04:00
Joey Hess
a48b52c056
avoid renaming to temp files before deleting
Only rename when actually ncessary.

The diff gets buffered in memory. Probably git has to buffer a diff in
memory when generating it as well, so this memory usage should not be a
problem, even when the diff is very large. I hope.

This commit was supported by the NSF-funded DataLad project.
2017-09-07 14:32:47 -04:00
Joey Hess
16eb2f976c
prevent exporttree=yes on remotes that don't support exports
Don't allow "exporttree=yes" to be set when the special remote
does not support exports. That would be confusing since the user would
set up a special remote for exports, but `git annex export` to it would
later fail.

This commit was supported by the NSF-funded DataLad project.
2017-09-07 13:48:44 -04:00
Joey Hess
6ab14710fc
fix consistency bug reading from export database
The export database has writes made to it and then expects to read back
the same data immediately. But, the way that Database.Handle does
writes, in order to support multiple writers, makes that not work, due
to caching issues. This resulted in export re-uploading files it had
already successfully renamed into place.

Fixed by allowing databases to be opened in MultiWriter or SingleWriter
mode. The export database only needs to support a single writer; it does
not make sense for multiple exports to run at the same time to the same
special remote.

All other databases still use MultiWriter mode. And by inspection,
nothing else in git-annex seems to be relying on being able to
immediately query for changes that were just written to the database.

This commit was supported by the NSF-funded DataLad project.
2017-09-06 17:19:07 -04:00
Joey Hess
35cd329bd8
Merge branch 'master' into export 2017-09-06 15:49:30 -04:00
Joey Hess
3ccf661d7c
todo 2017-09-06 15:46:35 -04:00
Joey Hess
1ec3a9eb05
thoughts on handling renames efficiently
This gets complicated, but I think this design will work!

This commit was supported by the NSF-funded DataLad project.
2017-09-06 13:04:09 -04:00
Edward Betts
c1b9f718bc
move line break to fix broken link 2017-09-06 11:25:06 -04:00
Joey Hess
662f2a5ee7
git annex get from exports
Straightforward enough, except for the needed belt-and-suspenders sanity
checks to avoid foot shooting due to exports not being key/value stores.

* Even when annex.verify=false, always verify from exports.
* Only get files from exports that use a backend that supports
  checksum verification.
* Never trust exports, even if the user says to, because then
  `git annex drop` would drop content if the export seemed to contain
  a copy.

This commit was supported by the NSF-funded DataLad project.
2017-09-04 16:39:56 -04:00
Joey Hess
4da763439b
use export db to correctly handle duplicate files
Removed uncorrect UniqueKey key in db schema; a key can appear multiple
times with different files.

The database has to be flushed after each removal. But when adding files
to the export, lots of changes are able to be queued up w/o flushing.
So it's still fairly efficient.

If large removals of files from exports are too slow, an alternative
would be to make two passes over the diff, one pass queueing deletions
from the database, then a flush and the a second pass updating the
location log. But that would use more memory, and need to look up
exportKey twice per removed file, so I've avoided such optimisation yet.

This commit was supported by the NSF-funded DataLad project.
2017-09-04 14:39:32 -04:00
Joey Hess
28e2cad849
implement exporttree=yes configuration
* Only export to remotes that were initialized to support it.
* Prevent storing key/value on export remotes.
* Prevent enabling exporttree=yes and encryption in the same remote.

SetupStage Enable was changed to take the old RemoteConfig.
This allowed only setting exporttree when initially setting up a
remote, and not configuring it later after stuff might already be stored
in the remote.

Went with =yes rather than =true for consistency with other parts of
git-annex. Changed docs accordingly.

This commit was supported by the NSF-funded DataLad project.
2017-09-04 13:09:38 -04:00
Joey Hess
978885247e
implement export.log and resolve export conflicts
Incremental export updates work now too.

This commit was sponsored by Anthony DeRobertis on Patreon.
2017-08-31 15:47:23 -04:00
Joey Hess
7c7af82578
resuming exports
Make a pass over the whole exported tree, and upload anything that has
not yet reached the export. Update location log when exporting.

Note that the synthesized keys for non-annexed files are stored in the
location log too.

Some cases involving files in the tree with the same content are not
handled correctly yet.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2017-08-31 13:33:50 -04:00
Joey Hess
943de657b8
Merge branch 'master' into export 2017-08-31 12:16:22 -04:00
Joey Hess
9f3630f4e0
initial export command
Very basic operation works, but of course this is only the beginning.

This commit was sponsored by Nick Daly on Patreon.
2017-08-29 15:10:01 -04:00
supernaught
15601f2b66 Added a comment 2017-08-28 22:01:23 +00:00
Joey Hess
5c99131b7b
comment 2017-08-28 13:49:16 -04:00
Joey Hess
35fc273218
comment 2017-08-15 13:26:23 -04:00
Joey Hess
2eb6309d3e
move, copy: Support --batch. 2017-08-15 12:39:10 -04:00
https://openid.stackexchange.com/user/26f3c692-0460-4cbc-8712-e9bfb5889fb2
4df21b511c Added a comment: "WSL adds inotify & filesystem change notification support" 2017-08-07 06:22:07 +00:00
yarikoptic
e7e665b1fa 2017-08-02 14:58:28 +00:00
https://launchpad.net/~stephane-gourichon-lpad
48c98fbe93 Added a comment: Experiment to run git-annex-repair as fast as possible. 2017-07-28 04:27:06 +00:00
https://launchpad.net/~stephane-gourichon-lpad
4eab06bf75 2017-07-28 03:39:56 +00:00
yarikoptic
6adbb5a727 initial whining 2017-07-21 16:37:03 +00:00
https://launchpad.net/~stephane-gourichon-lpad
830106de2a 2017-07-21 13:59:10 +00:00
supernaught
13e18b37a5 Feature request: invert remote selection. 2017-07-18 17:00:14 +00:00
Joey Hess
adbd0ff068
add design 2017-07-11 11:32:35 -04:00
Joey Hess
b3a53f7f04
note 2017-07-10 14:20:37 -04:00
Joey Hess
6dd3fbdb41
thoughts 2017-07-10 14:06:49 -04:00
glasserc
5b2dc78aa6 Added a comment 2017-07-02 02:56:20 +00:00
Joey Hess
e8464f106b
followup 2017-06-26 13:13:49 -04:00
glasserc
5991d5c1ab A wishlist item I would like 2017-06-14 21:26:39 +00:00
Joey Hess
99a1e6efe2
close as dup 2017-06-09 13:43:53 -04:00
Joey Hess
aff0ce86da
merge and followup window path length bugs 2017-06-06 15:57:35 -04:00
Joey Hess
3193d8258b
link to msdn article on enabling long paths 2017-06-06 15:24:37 -04:00
Joey Hess
fa50906a80
comment 2017-06-06 13:49:13 -04:00
Joey Hess
3cc30dd128
thoughts 2017-05-24 15:14:49 -04:00
Joey Hess
6dd806f1ad
stop using MissingH for MD5
Cryptonite is faster and allocates less, and I want to get rid of
MissingH use.

Note that the new dependency on memory is free; it's a dependency of
cryptonite.

This commit was supported by the NSF-funded DataLad project.
2017-05-15 21:36:03 -04:00
Joey Hess
9ddd75b99a
comment 2017-05-11 13:29:14 -04:00
Joey Hess
f5a3b14e49
comment 2017-05-09 13:47:09 -04:00
Joey Hess
6ad0b00849
comment 2017-05-09 13:38:45 -04:00
CandyAngel
aa60e9924d Added a comment 2017-05-09 14:25:50 +00:00
CandyAngel
6a6fe43da3 2017-05-09 14:20:21 +00:00
CandyAngel
2befd8acc9 Added a comment 2017-05-09 10:05:14 +00:00
Tafnzart
74e28f9b99 Added a comment: Maybe provide an option to force without name change 2017-05-07 14:05:04 +00:00
anarcat
bf47e518dd try to structure a guide 2017-04-24 14:18:21 +00:00
anarcat
804f06baa4 Added a comment: sounds like the dumb backend, except not dumb 2017-04-08 20:21:41 +00:00
Joey Hess
e3184e54c9
version: Added "dependency versions" line.
This commit was sponsored by Anthony DeRobertis on Patreon.
2017-04-07 18:16:11 -04:00
lee@7614f42c1a6cc84dbc813df25d2f75ed54948e17
2ad7a3e1ff add todo for lib versions 2017-04-07 21:35:48 +00:00
Joey Hess
6896ac06e8
git annex add -u now supported, analagous to git add -u
Unlike git add -u, git annex add -u does not update the index for files
removed from the working tree. But then, "git add ." stages removals,
and "git annex add ." does not, so that's an existing divergence.

Seems that --update --batch would need to run git ls-files once per line of
batch input, which would surely be too slow, so just throw an error for
that.

This commit was supported by the NSF-funded DataLad project.
2017-04-07 15:55:45 -04:00
Joey Hess
c3970f6c1a
multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting.
This commit was supported by the NSF-funded DataLad project.
2017-03-30 19:35:30 -04:00
Joey Hess
39e8433d46
fix format 2017-03-30 14:16:13 -04:00
Joey Hess
43d7862b44
design 2017-03-30 14:15:12 -04:00
yarikoptic
3a03665b4f 2017-03-30 16:08:07 +00:00
yarikoptic
f6d34d228e initial idea 2017-03-30 16:07:43 +00:00
Joey Hess
d6afd70e20 WSL can now run git-annex 2017-03-27 21:40:45 -04:00
Joey Hess
a2017e944f expand 2017-03-27 18:12:46 -04:00
Joey Hess
9fcd3987f2 idea 2017-03-27 18:10:36 -04:00
Cyberthal
9713258e9b 2017-03-24 15:10:10 +00:00
yarikoptic
a6430481c4 initial report 2017-03-23 04:09:44 +00:00