Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2020-06-11 15:44:28 -04:00
commit 0017d9a347
GPG key ID: DB12DB0FF05F8F38
7 changed files with 191 additions and 0 deletions

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="branchable@bafd175a4b99afd6ed72501042e364ebd3e0c45e"
nickname="branchable"
avatar="http://cdn.libravatar.org/avatar/ae41dba34ee6000056f00793c695be75"
subject="I wonder if this is related to the use of tor"
date="2020-06-11T09:56:08Z"
content="""
I'll try to test with a simple ssh remote.
"""]]

@ -0,0 +1,68 @@
### Please describe the problem.
git-annex-remote-googledrive uses SETCREDS and GETCREDS to let git-annex handle the credentials. According to the [documentation](https://git-annex.branchable.com/design/external_special_remote_protocol/), the credentials should be stored inside the git-annex branch when using hybrid or pubkey encryption. However, this does not happen. Even setting embedcreds to yes does not change anything.
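
For reference, the credential messages involved can be sketched as follows. This is an illustrative helper, not code from git-annex-remote-googledrive: the function names are hypothetical, and the message shapes (`SETCREDS Setting User Password`, `GETCREDS Setting`, answered by `CREDS User Password`) are taken from the protocol documentation linked above. The sketch assumes that only the final parameter may contain spaces.

```python
# Hypothetical helpers illustrating the credential messages of the
# external special remote protocol; the names are made up for this sketch.

def format_setcreds(setting, user, password):
    # Line a remote sends to ask git-annex to store credentials.
    for part in (setting, user):
        if ' ' in part:
            raise ValueError('setting and user must not contain spaces')
    return f'SETCREDS {setting} {user} {password}'


def format_getcreds(setting):
    # Line a remote sends to ask git-annex for stored credentials.
    return f'GETCREDS {setting}'


def parse_creds_reply(line):
    # Split a 'CREDS User Password' reply into (user, password).
    # Missing values come back as empty strings.
    parts = line.rstrip('\n').split(' ', 2)
    if parts[0] != 'CREDS':
        raise ValueError(f'unexpected reply: {line!r}')
    parts += [''] * (3 - len(parts))
    return parts[1], parts[2]
```

In the debug log further down, the remote sends `SETCREDS` with the setting name `credentials`, which corresponds to `format_setcreds('credentials', user, password)` in this sketch.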
### What steps will reproduce the problem?
git annex initremote testremote type=external externaltype=googledrive prefix=test keyid=<key> encryption=pubkey
or
git annex initremote testremote type=external externaltype=googledrive embedcreds=yes prefix=test keyid=<key> encryption=pubkey
### What version of git-annex are you using? On what operating system?
git-annex version: 8.20200309-g14a4a9f4c
build flags: Assistant Webapp Pairing S3 WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.0 ghc-8.8.3 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0.1
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
### Please provide any additional information below.
[[!format sh """
% git annex initremote g4 type=external externaltype=googledrive embedcreds=yes prefix=test keyid=***redacted*** encryption=pubkey --debug
[...]
[2020-06-11 12:14:11.622651546] git-annex-remote-googledrive[1] --> SETCREDS credentials ***redacted***
[2020-06-11 12:14:11.623213097] chat: gpg ["--quiet","--trust-model","always","--decrypt"]
[2020-06-11 12:14:11.761559468] process done ExitSuccess
[2020-06-11 12:14:11.761663537] chat: gpg ["--quiet","--trust-model","always","--batch","--recipient","***redacted***","--encrypt","--no-encrypt-to","--no-default-recipient","--force-mdc","--no-textmode"]
[2020-06-11 12:14:11.765603345] process done ExitSuccess
[2020-06-11 12:14:11.765697179] git-annex-remote-googledrive[1] --> INITREMOTE-SUCCESS
[...]
% git show git-annex
commit abb4cf685439115dffc393bb73cd3bb499f6aaec (git-annex)
Author: Silvio Ankermann <silvio@localhost>
Date: Thu Jun 11 12:14:11 2020 +0200
update
diff --git a/remote.log b/remote.log
index 5c72883..2d29c9c 100644
--- a/remote.log
+++ b/remote.log
@@ -1,3 +1,4 @@
+1da660c2-fe07-4dcb-aca6-12f2cfdfff52 cipher=***redacted*** cipherkeys==***redacted*** embedcreds=yes encryption=pubkey externaltype=googledrive name=g4 prefix=test root_id==***redacted*** token= type=external timestamp=1591870451.772978233s
***redacted***
diff --git a/uuid.log b/uuid.log
index b9196be..4a9fe4d 100644
--- a/uuid.log
+++ b/uuid.log
@@ -1,3 +1,4 @@
+1da660c2-fe07-4dcb-aca6-12f2cfdfff52 g4 timestamp=1591870451.772033937s
***redacted***
"""]]
I can see that gpg is called after SETCREDS, so I would have expected an additional `credentials` setting in remote.log:
[[!format sh """
+1da660c2-fe07-4dcb-aca6-12f2cfdfff52 cipher=***redacted*** cipherkeys==***redacted*** credentials=*encrypted_creds* embedcreds=yes encryption=pubkey externaltype=googledrive name=g4 prefix=test root_id==***redacted*** token= type=external timestamp=1591870451.772978233s
"""]]
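
A quick way to check whether a given remote.log line carries embedded credentials might look like the sketch below. The line layout (a UUID followed by space-separated `key=value` settings, with no whitespace inside values) is inferred from the diff above rather than from a format specification, and both function names are hypothetical.

```python
# Sketch: split one remote.log line into its UUID and key=value settings.
# The layout is inferred from the diff above, not from a specification.

def parse_remote_log_line(line):
    uuid, *fields = line.split()
    settings = {}
    for field in fields:
        # Split on the first '=' only, so values that themselves start
        # with or contain '=' are kept intact.
        key, _, value = field.partition('=')
        settings[key] = value
    return uuid, settings


def has_embedded_creds(line, key='credentials'):
    # 'credentials' is the setting name used in the SETCREDS call above.
    _, settings = parse_remote_log_line(line)
    return bool(settings.get(key))
```

Applied to the line actually written by `initremote` above, `has_embedded_creds` returns `False`, because no `credentials` setting is present.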
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Of course ;) In fact, I've never actually used embedcreds myself. I just had an [issue](https://github.com/Lykos153/git-annex-remote-googledrive/issues/48) raised about it on GitHub.

@ -2,6 +2,8 @@
I am trying to copy files from my computer to an external drive, but git annex get in one annex repo on the drive, keeps failing with "Unable to access these remotes: origin" message. It seems to pull the data, compute a checksum and then give me the error message. The same file passes fsck on the computer, and I was able to "git annex get" it from a different computer without issues.
Comparing the debug log with that of a different annex (on the same external drive), the part that's confusing is that git-annex spins up two rsync processes to copy the one file for some reason, and logs "failed" right before the checksum step?
### What steps will reproduce the problem?
- cd into externaldrive/Annex

@ -0,0 +1,18 @@
[[!comment format=mdwn
username="kanakkshetri@9ea0e7639162bddc7bf9f3bb94cc32e93c793b89"
nickname="kanakkshetri"
avatar="http://cdn.libravatar.org/avatar/3cf305b2854041f8441c8fd8f02e8a2a"
subject="git-annex-repair and git annex fsck: no errors found"
date="2020-06-10T14:39:44Z"
content="""
I ran git annex repair on both the external drive and the desktop:
git annex repair
repair Running git fsck ...
No problems found.
ok
I also fsck'd the files in the repository (on the desktop) and they all passed checksum checks.
I'm kind of at a loss for what else to try. I also tried copy from desktop to external drive and it fails in the same way (including the two calls to rsync, both of which have ProcessExit Success, but then git annex just writes \"failed\").
"""]]

@ -0,0 +1,58 @@
### Please describe the problem.
See the "test-annex (crippled-home)" run on [datalad-extensions github actions](https://github.com/datalad/datalad-extensions/pull/15/checks?check_run_id=758524896). Yes, I really hate all those scrollbars etc., but you should also be able to click on "..." -> "View raw logs", which leads to one [large log](https://pipelines.actions.githubusercontent.com/2UPlDxaVvvbkeFX4btxWorCjpJvj40zvWY5ogH2yZibhOMcU7O/_apis/pipelines/1/runs/745/signedlogcontent/14?urlExpires=2020-06-11T14%3A42%3A25.6501110Z&urlSigningMethod=HMACV1&urlSignature=Kmm%2BTBYZt5jzojQgrDTOgSrVjYq8VgUHLd3sUtFJd0c%3D)
in which you can find
[[!format sh """
2020-06-10T16:02:40.4507392Z Detected a filesystem without fifo support.
2020-06-10T16:02:40.4508127Z Disabling ssh connection caching.
2020-06-10T16:02:40.4579131Z Detected a crippled filesystem.
2020-06-10T16:02:40.6831961Z export_import: [adjusted/master(unlocked) 4c3ac42] empty
2020-06-10T16:02:40.7515700Z adjust ok
2020-06-10T16:02:40.8085540Z initremote foo ok
2020-06-10T16:02:40.8152161Z (recording state in git...)
2020-06-10T16:02:40.8811878Z get foo (from origin...)
2020-06-10T16:02:40.9178237Z
2020-06-10T16:02:40.9190085Z 100% 20 B 548 B/s 0s
2020-06-10T16:02:40.9201375Z
2020-06-10T16:02:40.9240907Z (checksum...) ok
2020-06-10T16:02:40.9261455Z get sha1foo (from origin...)
2020-06-10T16:02:40.9325841Z
2020-06-10T16:02:40.9334418Z 100% 25 B 4 KiB/s 0s
2020-06-10T16:02:40.9336072Z
2020-06-10T16:02:40.9405494Z (checksum...) ok
2020-06-10T16:02:40.9406396Z (recording state in git...)
2020-06-10T16:02:41.1056415Z commit
2020-06-10T16:02:41.2164981Z On branch adjusted/master(unlocked)
2020-06-10T16:02:41.2165985Z Your branch is ahead of 'origin/adjusted/master(unlocked)' by 1 commit.
2020-06-10T16:02:41.2166158Z (use "git push" to publish your local commits)
2020-06-10T16:02:41.2166244Z
2020-06-10T16:02:41.2166353Z nothing to commit, working tree clean
2020-06-10T16:02:41.2166474Z ok
2020-06-10T16:02:41.3970703Z export foo bar.c ok
2020-06-10T16:02:41.4080765Z export foo foo ok
2020-06-10T16:02:41.4123344Z export foo sha1foo ok
2020-06-10T16:02:41.4301472Z (recording state in git...)
2020-06-10T16:02:41.4804732Z list foo ok
2020-06-10T16:02:41.4913408Z import foo import
2020-06-10T16:02:41.4915106Z
2020-06-10T16:02:41.4916599Z .git/annex/tmp/CID-s16--24892 16 1591804960 : openBinaryFile: invalid argument (Invalid argument)
2020-06-10T16:02:41.4916699Z
2020-06-10T16:02:41.4917067Z
2020-06-10T16:02:41.4917254Z ok
2020-06-10T16:02:41.4924772Z
2020-06-10T16:02:41.4925784Z Failed to import some files from foo. Re-run command to resume import.
2020-06-10T16:02:41.5156287Z merge foo/master ok
2020-06-10T16:02:41.5276712Z FAIL
"""]]
Overall, this seems to be the only test failing: 1 out of 702 tests failed (198.43s)
The git-annex version is 8.20200522+git142-g9102d3172-1~ndall, and the .deb is available inside the .zip under the "Artifacts" drop-down at the top.
Details of the setup are [in that PR](https://github.com/datalad/datalad-extensions/pull/15/files#diff-8364c688b76bfaf5df947cfd4d74eef7R76)
PS: determining the boundaries and names of the tests git annex ran is a tricky business on its own. I wonder whether the formatting and annotation of the test output could be improved as well. E.g. there is probably no point in printing all the output of a test that passes. With `nose` in Python / datalad we get a summary of all failed tests (and what was output when they were run) at the end of the full sweep. That avoids having to search through the entire long list.

@ -0,0 +1,21 @@
[[!comment format=mdwn
username="branchable@bafd175a4b99afd6ed72501042e364ebd3e0c45e"
nickname="branchable"
avatar="http://cdn.libravatar.org/avatar/ae41dba34ee6000056f00793c695be75"
subject="I've hacked an ugly daemon together for this"
date="2020-06-11T09:56:57Z"
content="""
I've now written [a custom solution for myself which uses inotifywait to trigger `annex sync` when `master` or `synced/master` is updated](https://github.com/aspiers/pim/blob/master/bin/auto-sync-daemon).
I'm running this on a mesh of remotes which all have the daemon running, so that a manual commit on any remote can be spotted and distributed throughout the network.
However from a synchronisation PoV, this is pretty ugly:
- The remote where the manual commit is made spots the change to `master` and initiates an `annex sync` which updates at least `synced/master` on the other remotes, even if not `master` due to `receive.denyCurrentBranch` being set to `refuse`.
- The daemons on the other remotes spot the changes to their `synced/master` and each initiate `annex sync`.
- At this point a whole bunch of remotes are running `annex sync` at roughly the same time, and the window for weird race conditions is large.
However even though the design is ugly, so far I haven't spotted any issues, which is presumably testament to the quality of the locking / synchronisation code within both `git` and `git-annex`. Kudos!
Of course I would prefer to ditch my custom hack and use the assistant, but that would require this feature to be added (as well as a solution to [[bugs/assistant_sometimes_removes_and_re-adds_whole_file]]). I'd happily sponsor development of that but IIUC you aren't accepting sponsorship of individual features, so I'll live with my ugly hack for now until I get round to learning Haskell ;-)
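
The trigger logic described above can be sketched roughly as follows. This is not the linked `auto-sync-daemon`: it swaps inotifywait for plain polling of loose ref files so that it needs only the Python standard library, it does not handle packed refs, and all names in it are hypothetical.

```python
# Sketch only: polls ref files instead of using inotifywait.
import os
import subprocess
import time

WATCHED_REFS = ('refs/heads/master', 'refs/heads/synced/master')


def read_refs(git_dir, refs=WATCHED_REFS):
    # Map each watched ref to its current commit id, or None if the
    # loose ref file does not exist (packed refs are not handled here).
    state = {}
    for ref in refs:
        try:
            with open(os.path.join(git_dir, ref)) as f:
                state[ref] = f.read().strip()
        except FileNotFoundError:
            state[ref] = None
    return state


def changed_refs(before, after):
    # Refs whose value differs between two snapshots.
    return sorted(ref for ref in after if before.get(ref) != after[ref])


def watch(repo, interval=5):
    # Run `git annex sync` whenever a watched ref moves.
    git_dir = os.path.join(repo, '.git')
    seen = read_refs(git_dir)
    while True:
        time.sleep(interval)
        now = read_refs(git_dir)
        if changed_refs(seen, now):
            subprocess.run(['git', 'annex', 'sync'], cwd=repo, check=False)
            # Re-read after syncing so our own ref updates do not retrigger.
            now = read_refs(git_dir)
        seen = now
```

Re-reading the refs after each sync keeps a node from reacting to its own updates, but it does nothing about several remotes syncing at the same moment, so the race window described above remains.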
"""]]

@ -0,0 +1,15 @@
[[!comment format=mdwn
username="branchable@bafd175a4b99afd6ed72501042e364ebd3e0c45e"
nickname="branchable"
avatar="http://cdn.libravatar.org/avatar/ae41dba34ee6000056f00793c695be75"
subject="I've hacked up a Python script for policy-based automatic commits"
date="2020-06-11T10:10:52Z"
content="""
Since I haven't learnt Haskell yet (and even if I was able to hack a patch to git-annex I'm not sure whether it would be accepted), I've hacked up [a simple `git-auto-commit` Python script for automatically committing based on a simple policy](https://github.com/aspiers/pim/blob/master/bin/git-auto-commit), together with [a shell script which invokes it every `$n` seconds](https://github.com/aspiers/pim/blob/master/bin/auto-commit-daemon).
Currently the policy is hardcoded to only stage and commit files ending in `.org` which have unstaged changes and an `mtime` over a minimum threshold, in order to throttle the rate at which automatic commits are made. Maybe at some point I'll change it to honour `.gitattributes` such as
*.org annex.autocommit=(mtimebefore=5mins)
but of course it would be far nicer if the assistant could do this natively, since that would also solve [forum/Can the assistant sync files if committed manually (autocommit=false)?](https://git-annex.branchable.com/forum/Can_the_assistant_sync_files_if_committed_manually___40__autocommit__61__false__41____63__/).
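
The selection policy itself is small enough to sketch here. This is an illustration of the rule described above, not the actual `git-auto-commit` script: the function name is hypothetical, the caller is assumed to supply the files with unstaged changes together with their `mtime`, and the 300 second default simply mirrors the `5mins` in the example attribute.

```python
# Sketch of the policy only; not the real git-auto-commit script.
import time


def files_to_commit(unstaged, now=None, min_age=300, suffix='.org'):
    # unstaged: mapping of path -> mtime (seconds since the epoch) for
    # files with unstaged changes. Returns the paths that match the
    # suffix and were last modified at least min_age seconds ago.
    if now is None:
        now = time.time()
    return sorted(
        path for path, mtime in unstaged.items()
        if path.endswith(suffix) and now - mtime >= min_age
    )
```

Requiring a minimum age is what throttles the commit rate: a file that is still being edited keeps getting a fresh `mtime` and is skipped until it has been left alone for long enough.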
"""]]