Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2020-03-03 20:03:42 -04:00
commit 8e0fc4998d
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
14 changed files with 278 additions and 0 deletions

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="http://id.pvgoran.name/"
nickname="pvgoran"
avatar="http://cdn.libravatar.org/avatar/e32a61d9c49989ae31d7a30d6af27f5c73d8d46ba03c840a612791fe5c820b87"
subject="comment 2"
date="2020-03-03T02:11:30Z"
content="""
I ran the tests on the new `master`, and all `blake2` tests (`Tests.QuickCheck.blake2s_160` and so on) fail, whereas all other hash-related QuickCheck tests succeed.
"""]]

@ -0,0 +1,12 @@
[[!comment format=mdwn
username="http://id.pvgoran.name/"
nickname="pvgoran"
avatar="http://cdn.libravatar.org/avatar/e32a61d9c49989ae31d7a30d6af27f5c73d8d46ba03c840a612791fe5c820b87"
subject="comment 3"
date="2020-03-03T03:16:25Z"
content="""
After rebuilding `cryptonite` without AES-NI support as suggested in https://github.com/haskell-crypto/cryptonite/issues/260#issuecomment-484185981 (by passing `--constraint=\"cryptonite -support_aesni\"` to `cabal install --dependencies-only` and to `cabal build`), all of the QuickCheck tests pass on the machine in question. (Many other tests fail, but that is probably because I ran the tests from a not-yet-installed `git-annex` binary.)
So `cryptonite` apparently uses these AES-NI instructions (if configured to support them) without runtime checks (or at least without working runtime checks). This is a problem for any binary distribution that includes `cryptonite` code: either it won't work on older CPUs, or it won't run as fast as it could on newer CPUs.
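A minimal sketch of how one could check, on Linux/x86, whether a CPU advertises AES-NI before picking a build. It assumes the usual `flags` line of `/proc/cpuinfo`; the helper names `cpu_flags` and `has_aesni` are made up for illustration, and this is not how `cryptonite` or git-annex detects CPU features:

```python
def cpu_flags(cpuinfo_text):
    # Collect the feature flags from /proc/cpuinfo-style text.
    # Assumption: x86 Linux, where each processor entry has a 'flags' line.
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(':')
        if sep and key.strip() == 'flags':
            flags.update(value.split())
    return flags


def has_aesni(cpuinfo_path='/proc/cpuinfo'):
    # The kernel reports AES-NI support as the 'aes' flag.
    with open(cpuinfo_path) as f:
        return 'aes' in cpu_flags(f.read())
```

On a machine where `has_aesni()` returns false, a binary built with `support_aesni` would be the one to avoid.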
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="http://id.pvgoran.name/"
nickname="pvgoran"
avatar="http://cdn.libravatar.org/avatar/e32a61d9c49989ae31d7a30d6af27f5c73d8d46ba03c840a612791fe5c820b87"
subject="comment 4"
date="2020-03-03T03:19:01Z"
content="""
joey, can you tell me what problems I will face if I run a git-annex binary with a broken blake2 implementation? What functionality of git-annex will fail?
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="http://id.pvgoran.name/"
nickname="pvgoran"
avatar="http://cdn.libravatar.org/avatar/e32a61d9c49989ae31d7a30d6af27f5c73d8d46ba03c840a612791fe5c820b87"
subject="comment 5"
date="2020-03-03T04:16:35Z"
content="""
I also posted an issue for `cryptonite`: https://github.com/haskell-crypto/cryptonite/issues/314
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 10"
date="2020-03-02T20:54:12Z"
content="""
I am sorry! I guess I rushed the pasting and didn't spot that I had pasted the wrong one. Instead of `8.20200226+git16-ge156a2b74-1~ndall+1` I meant `7.20190819+git2-g908476a9b-1~ndall+1`, which [we have in neurodebian ATM](http://neuro.debian.net/pkgs/git-annex-standalone.html?highlight=standalone). `8.20200226+git16-ge156a2b74-1~ndall+1` is a bad one!
"""]]

@ -0,0 +1,71 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 11"
date="2020-03-03T17:48:16Z"
content="""
> It would be worth checking on the server if ssh has run the process.
I added a bunch of `ps auxw -H` calls to the code, plus monitoring of the process for when it calls enableremote on target2.
<details>
<summary>
Here is the diff of the `ps auxw -H` output right before running `enableremote` and right after the outside process notes that it was run (and probably hung):
</summary>
```
@@ -91,22 +91,28 @@
root 1745 0.0 0.2 82604 22000 ? Ss 16:00 0:00 /usr/bin/python3 /usr/bin/google_accounts_daemon
root 1764 0.0 0.0 65512 6312 ? Ss 16:00 0:00 /usr/sbin/sshd -D
root 1998 0.0 0.0 96936 7004 ? Ss 16:00 0:00 sshd: travis [priv]
-travis 2014 0.0 0.0 96936 3488 ? S 16:00 0:00 sshd: travis@pts/0
+travis 2014 0.1 0.0 96936 3488 ? S 16:00 0:00 sshd: travis@pts/0
travis 2015 0.1 0.1 31864 12732 pts/0 Ss+ 16:00 0:00 bash /home/travis/build.sh
travis 2637 0.0 0.1 429472 16088 pts/0 Sl+ 16:00 0:00 ruby /home/travis/.travis/agent
travis 9632 0.0 0.0 22448 2996 pts/0 S+ 16:05 0:00 /bin/bash ../tools/ci/stalling-enable-remote.sh
-travis 9634 20.7 0.5 165156 47104 pts/0 S+ 16:05 0:03 python -m nose -s -v ../datalad/distribution/tests/test_publish.py:test_publish_depends
-travis 17483 0.0 0.0 4508 788 pts/0 S+ 16:05 0:00 sh -c echo '==== right before enableremote target2'; date; ps auxw -H
-travis 17485 0.0 0.0 47388 3472 pts/0 R+ 16:05 0:00 ps auxw -H
-travis 17344 0.0 0.0 17168 644 pts/0 S+ 16:05 0:00 sleep 1
+travis 9634 20.8 0.5 165156 47108 pts/0 S+ 16:05 0:03 python -m nose -s -v ../datalad/distribution/tests/test
+travis 17486 0.0 0.5 1074090736 42496 pts/0 Sl+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git-annex --library-path
+travis 17510 0.0 0.0 21156 4264 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17511 0.0 0.0 21156 2228 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17512 0.0 0.0 21204 4160 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17520 0.0 0.0 0 0 pts/0 Z+ 16:05 0:00 [ssh] <defunct>
+travis 17552 0.0 0.0 47388 3440 pts/0 R+ 16:05 0:00 ps auxw -H
root 16839 0.5 0.0 96936 6872 ? Ss 16:05 0:00 sshd: travis [priv]
travis 16845 0.5 0.0 96936 4168 ? S 16:05 0:00 sshd: travis@notty
root 17310 0.0 0.0 96936 6964 ? Ss 16:05 0:00 sshd: travis [priv]
travis 17316 0.0 0.0 96936 4104 ? S 16:05 0:00 sshd: travis@notty
+root 17521 0.0 0.0 96936 6924 ? Ss 16:05 0:00 sshd: travis [priv]
+travis 17528 0.0 0.0 96936 3668 ? S 16:05 0:00 sshd: travis@notty
travis 2008 0.0 0.0 45276 4664 ? Ss 16:00 0:00 /lib/systemd/systemd --user
travis 2009 0.0 0.0 63256 1908 ? S 16:00 0:00 (sd-pam)
root 3439 0.1 0.8 335396 68348 ? Ssl 16:01 0:00 /usr/bin/dockerd -H fd://
-root 3462 0.2 0.4 268004 34968 ? Ssl 16:01 0:00 docker-containerd --config /var/run/docker/containerd/containerd.toml
+root 3462 0.2 0.4 268004 34968 ? Ssl 16:01 0:00 docker-containerd --config /var/run/docker/containerd/container
travis 5658 0.0 0.0 11148 316 ? Ss 16:03 0:00 ssh-agent
-travis 16841 0.0 0.0 44924 704 ? Ss 16:05 0:00 ssh -fN -o ControlMaster=auto -o ControlPersist=15m -o ControlPath=/home/travis/.cache/datalad/sockets/a1cd7d63 localhost
-travis 17312 0.0 0.0 44924 708 ? Ss 16:05 0:00 ssh -fN -o ControlMaster=auto -o ControlPersist=15m -o ControlPath=/home/travis/.cache/datalad/sockets/31d0eb5a datalad-test
+travis 16841 0.0 0.0 44924 704 ? Ss 16:05 0:00 ssh -fN -o ControlMaster=auto -o ControlPersist=15m -o ControlPat
+travis 17312 0.0 0.0 44924 708 ? Ss 16:05 0:00 ssh -fN -o ControlMaster=auto -o ControlPersist=15m -o ControlPat
+travis 17525 0.0 0.0 44928 712 ? Ss 16:05 0:00 ssh: .git/annex/ssh/datalad-test [mux]
```
</details>
[Full travis log](https://api.travis-ci.org/v3/job/657811641/log.txt). Unfortunately, the \"after\" call has truncated command lines :-/ But you can see the new processes:
```
+travis 17486 0.0 0.5 1074090736 42496 pts/0 Sl+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git-annex --library-path
+travis 17510 0.0 0.0 21156 4264 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17511 0.0 0.0 21156 2228 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17512 0.0 0.0 21204 4160 pts/0 S+ 16:05 0:00 /usr/lib/git-annex.linux/exe/git --library-path /us
+travis 17520 0.0 0.0 0 0 pts/0 Z+ 16:05 0:00 [ssh] <defunct>
...
+travis 17525 0.0 0.0 44928 712 ? Ss 16:05 0:00 ssh: .git/annex/ssh/datalad-test [mux]
```
and no `configlist`, so it does seem to be ssh stalling for some reason.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="git-annex standalone and long filenames in locale cache"
date="2020-03-03T22:36:42Z"
content="""
I'm finding it would be useful to have a conda package variant based on the standalone git-annex version (the normal variant has dependencies that sometimes conflict with other packages). So it would still be useful to fix this issue (another example of it is [here](https://dev.azure.com/conda-forge/84710dde-1620-425b-80d0-4cf5baca359d/_apis/build/builds/127349/logs/10)). If `md5sum` is not available, maybe `cksum` could be used?
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="id for locale cache"
date="2020-03-03T23:31:55Z"
content="""
Or the inode and ctime of the installed script file could be used instead of the full pathname.
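A rough sketch of what such an id could look like. The function name `locale_cache_id` and the hex formatting are hypothetical; the standalone bundle's script is shell, so this only illustrates the idea:

```python
import os


def locale_cache_id(script_path):
    # Hypothetical cache id: inode number plus ctime (in nanoseconds) of the
    # installed script, rendered in hex. Its length does not depend on how
    # long the install path is.
    st = os.stat(script_path)
    return '{:x}-{:x}'.format(st.st_ino, st.st_ctime_ns)
```

The id changes whenever the script is reinstalled (new inode or ctime), which is when the locale cache would need regenerating anyway.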
"""]]

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="lhunath@3b4ff15f4600f3276d1776a490b734fca0f5c245"
nickname="lhunath"
avatar="http://cdn.libravatar.org/avatar/6388e539b56b3875cc9aceb9f404b3ad"
subject="comment 5"
date="2020-03-03T20:33:47Z"
content="""
Seems like git-annex should report on that. Simply saying \"git-annex: upgrade: 1 failed\" because git is 2.21 is extremely unhelpful. If git-annex requires a certain version of git, it should test for that and be explicit about that requirement in its error output. I needed this comment in order to resolve the issue.
For me, 2.21 is just what macOS ships with.
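A sketch of the kind of check I mean. The minimum of 2.22 and the names `MIN_GIT`, `parse_git_version` and `check_git` are assumptions for illustration, not git-annex's actual code (which is Haskell):

```python
import re
import subprocess

MIN_GIT = (2, 22)  # assumed minimum; not confirmed as git-annex's real requirement


def parse_git_version(output):
    # Extract (major, minor) from the output of git --version.
    m = re.search(r'git version (\d+)\.(\d+)', output)
    if m is None:
        raise ValueError('unrecognized git --version output: ' + repr(output))
    return (int(m.group(1)), int(m.group(2)))


def check_git(min_version=MIN_GIT):
    # Return an explicit error message if the installed git is too old, else None.
    out = subprocess.run(['git', '--version'], capture_output=True,
                         text=True, check=True).stdout
    found = parse_git_version(out)
    if found < min_version:
        return 'git {}.{} found, but at least {}.{} is required'.format(
            *found, *min_version)
    return None
```

With git 2.21 installed, `check_git()` would return a message naming both versions instead of a bare failure count.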
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="minimum git version"
date="2020-03-03T21:15:07Z"
content="""
\"The minimum git version is 2.22\" (6 months ago) -- is that still the correct minimum git version?
"""]]

@ -0,0 +1,20 @@
I still don't like the repeated name inside each symlink.
I enabled every tunable I found to make it sane for me: .git/annex/objects/xxx/SHA256-.../SHA256-...
I expect to have no more than 1,000,000 files that I care about, even if I live to 100, which fits nicely into the filesystem's performance window of 4096 folders with 5000 files each.
I don't care about read-only files: I have BTRFS snapshots and weekly backups in case of an unintentional disaster or corruption of some files to the point that their SHA no longer matches.
I don't care about the nice distributed lock-free mechanics of git-annex: I always commit each new annex file as a separate commit and don't script git write operations in my repo.
What I do care about is symlink length (so that symlinks fit my screen width in a terminal file manager), repo performance, and the size of the underlying git repo, all of which depend on the length of the path to the real file.
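To put a number on the saving, here is a small sketch that builds the object path with and without the nested key directory. It assumes the `BACKEND-sSIZE--DIGEST` key format and the single-level hash directory shown above; the helper names are made up, the non-nested layout is hypothetical (git-annex offers no such tunable), and the relative `../` prefix of the symlink target is ignored:

```python
def key_name(backend, size, digest):
    # Assumed key format: BACKEND-sSIZE--DIGEST
    return '{}-s{}--{}'.format(backend, size, digest)


def object_path(key, hash_dir, nested=True):
    # nested=True mirrors the current layout (.../KEY/KEY);
    # nested=False is the hypothetical layout without the repeated name.
    parts = ['.git', 'annex', 'objects', hash_dir]
    if nested:
        parts.append(key)
    parts.append(key)
    return '/'.join(parts)
```

Dropping the nested directory shortens every symlink target by the length of the key plus one separator, which for a SHA256 key is more than 70 characters.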
What I found after reading the git-annex forum/todo/bugs and datalad issues for two months is old discussions, ideas, and possibly obsolete plans.
I even found, somewhere, a proposal for a single-lock model of operation for git-annex.
* https://git-annex.branchable.com/design/new_repo_versions/
* https://github.com/datalad/datalad/issues/32
What I did not find is the status of the idea of adding such a tunable to git-annex.
A deadly unsafe tunable that drops the nested SHA-SHA folder would totally satisfy me, and I would immediately start writing scripts to replay my whole git history onto a freshly initialized git-annex repo with the new symlinks.
Especially since waiting for a "safer" single-lock git-annex without SHA-SHA may take years.
So, what is the status of these plans?
Do you have any idea how many years (or months?) are left to wait until the idea has matured enough and the git-annex codebase is in a more suitable state to implement such a tunable?

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="re: shorter symlinks"
date="2020-03-03T19:04:03Z"
content="""
Related: [[todo/shorter_keys_through_better_encoding]]
"""]]

@ -0,0 +1,13 @@
Hello! <br>
&nbsp;&nbsp;&nbsp;&nbsp; I am running `Termux` on a `Huawei P20` running `Android 9.x`; when syncing content, I seem to keep getting this on most / all annexed objects:
```bash
/data/data/com.termux/files/home/git-annex.linux/shimmed/cp/cp: preserving times for '.git/annex/othertmp/*': Operation not permitted
```
where `*` is replaced by whatever object is in the folder. I don't know if this is a permissions issue or not, but it is kind of annoying, and takes time from the sync process itself. <br>
&nbsp;&nbsp;&nbsp;&nbsp; Is there any way to change the shimmed `cp` to ignore times? <br>
&nbsp;&nbsp;&nbsp;&nbsp; Thank you kindly for the help!

@ -0,0 +1,82 @@
I set up synchronization between two new git-annex repositories via a webdav export remote for the file content and tor p2p for the git commits.
The following notes apply to a Debian testing system with git-annex at around version 8.20200227. (I compile from source.)
I wanted to understand what the individual setup steps are doing in detail. I hope I'll have time to contribute this to the documentation (man pages) or maybe motivate Joey to make some changes in the code.
## git-annex enable-tor
This is what the **enable-tor** command does:
Let
hiddenServiceSocketFile=/var/lib/tor-annex/$(id -u)_$(git config --get annex.uuid)/s
- prepHiddenServiceSocketDir effectively does
mkdir -p $(dirname $hiddenServiceSocketFile)
- adds two lines to /etc/tor/torrc
HiddenServiceDir /var/lib/tor/tor-annex_$(id -u)_$(git config --get annex.uuid)
HiddenServicePort $newport unix:$hiddenServiceSocketFile
- restarts the tor service and waits for it to come back
- parses the OnionAddress from the $HiddenServiceDir/hostname that tor should have written after restart
- stores the OnionAddress and $newport into .git/annex/creds/p2paddrs
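The path and torrc construction described above can be sketched as follows. This only mirrors the steps as I read them; the function names are made up for illustration and the real implementation is Haskell inside git-annex:

```python
def hidden_service_socket(uid, annex_uuid):
    # Socket path as described above: /var/lib/tor-annex/<uid>_<uuid>/s
    return '/var/lib/tor-annex/{}_{}/s'.format(uid, annex_uuid)


def torrc_lines(uid, annex_uuid, port):
    # The two lines that enable-tor appends to /etc/tor/torrc.
    socket_file = hidden_service_socket(uid, annex_uuid)
    return [
        'HiddenServiceDir /var/lib/tor/tor-annex_{}_{}'.format(uid, annex_uuid),
        'HiddenServicePort {} unix:{}'.format(port, socket_file),
    ]
```

Calling `torrc_lines(uid, uuid, port)` with the output of `id -u`, `git config --get annex.uuid` and the chosen `$newport` gives the two lines to compare against what ends up in torrc.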
### Comments to enable-tor
- Why can't $newport be a fixed port? There will always only be one
HiddenServicePort per annex HiddenServiceDir.
Confirmed in comment in Auth.hs:
-- We can omit the port and just use the onion address for the creds file,
-- because any given tor hidden service runs on a single port and has a
-- unique onion address.
- Wouldn't it be easier if git-annex-remotedaemon just ran a child tor
process? This way git-annex would fully control the config file and there would be
no permission issues with the socket.
- The path to the tor socket file is hard coded and git-annex-remotedaemon cannot be
instructed to use a different file. Thus it is not possible to explore
alternative setups, e.g. systemd user services.
## git-annex-p2p --pair
Man page: https://git-annex.branchable.com/git-annex-p2p
I did not use the --pair option since it was unclear to me exactly which Wormhole version was needed. Also, it was too magic for me.
So far I have done the pairing in only one direction, and the synchronization still seems to work in at least one direction. I don't remember ATM whether I also tested the other direction.
### --gen-addresses
- generates an auth token
- stores the auth token in .git/annex/creds/p2pauth
- prints some string to be passed to --link in another annex repo
### --link
- runs git remote add $remotename (formatP2PAddress addr)
- storeUUIDIn (remoteAnnexConfig remotename "uuid") theiruuid
does effectively: git config remote.$remotename.annex-uuid $theiruuid
- storeP2PRemoteAuthToken addr authtoken
stores the auth token in .git/annex/creds/$onionaddr
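As a sketch, the git side of --link boils down to two commands. The function name `link_commands` is made up, the address is treated as an opaque string (whatever formatP2PAddress produces), and the auth token storage is left out:

```python
def link_commands(remote_name, p2p_address, their_uuid):
    # Returns the git invocations that --link is effectively equivalent to,
    # as argv lists suitable for subprocess.run.
    return [
        ['git', 'remote', 'add', remote_name, p2p_address],
        ['git', 'config', 'remote.{}.annex-uuid'.format(remote_name), their_uuid],
    ]
```

Printing these for a given remote name makes it easy to check the result against `git remote -v` and `git config --get-regexp '^remote\.'` afterwards.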
## git-annex remotedaemon, git-annex assistant
Now I can start git-annex remotedaemon and the synchronization works.
git-annex assistant also works. However, after killing the assistant, I sometimes needed to restart the remotedaemon; otherwise there was an error about some socket problem.
## webdav export remote
I needed some time to find out that I need to configure "annex-tracking-branch" for an export remote in order for the assistant to automatically sync file content.
## Links
https://git-annex.branchable.com/special_remotes/tor/
https://git-annex.branchable.com/tips/peer_to_peer_network_with_tor/
https://2019.www.torproject.org/docs/onion-services
https://riseup.net/en/security/network-security/tor/onionservices-best-practices