Merge branch 'master' of ssh://git-annex.branchable.com
commit
6eb52cf5f1
13 changed files with 311 additions and 0 deletions
@@ -0,0 +1,66 @@
I've been thinking a bit more about `annex.dotfiles` in the context of
[this forum post][0]. It seems to me that annexed dotfiles can jump
to git in a way that's surprising and worth raising as a possible bug.

Say that I have a repo with `annex.dotfiles=true` in its .git/config,
and I add some dotfiles to the annex. Then, someone clones that repo
and goes into an adjusted state (either by running `git annex adjust
--unlock` or by being on a crippled file system). In that clone,
calling `git annex get` on any of the annexed dotfiles will lead to those
files being added to the index as regular files. (Demo included
below.)

The above issue could be resolved by the user storing
`annex.dotfiles=true` in `git-annex:config.log`, but perhaps it'd be
feasible for git-annex to guard against already annexed dotfiles
migrating to git?

Thanks in advance.

[[!format sh """
git annex version | head -1

cd "$(mktemp -d --tmpdir gx-XXXXXXX)"
git init a
(
    cd a
    git annex init a
    git config annex.dotfiles true
    mkdir .reallybig
    echo "a" >.reallybig/foo
    git annex add .reallybig/foo
    git commit -m"add foo"
)

git clone a b
(
    cd b
    git annex init b
    git annex adjust --unlock
    git annex get .reallybig
    git status
    git diff --cached
)
"""]]

```
git-annex version: 8.20200226
[...]
On branch adjusted/master(unlocked)
Changes to be committed:
modified: .reallybig/foo

diff --git a/.reallybig/foo b/.reallybig/foo
index 3de500c..7898192 100644
--- a/.reallybig/foo
+++ b/.reallybig/foo
@@ -1 +1 @@
-/annex/objects/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7
+a
```

[0]: https://git-annex.branchable.com/forum/Get_annex.dotfiles__61__true_behavior_without_persistent_configuration__63__/


[[!meta author=kyle]]
[[!tag projects/datalad]]
@@ -0,0 +1,20 @@
### Please describe the problem.

A special remote implementation that needs to look up further configuration based on the remote name no longer works, because a recent change prevents `GETCONFIG name` from returning the remote name while `git annex initremote` is driving the special remote implementation.

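A minimal sketch of the exchange that is affected, assuming the external special remote protocol as documented (the remote announces `VERSION 1`, git-annex sends `INITREMOTE`, the remote may ask `GETCONFIG name`, and git-annex answers with `VALUE`). The function name `special_remote_loop` and the failure message are made up for illustration, and a real remote would have to handle many more requests.

```sh
# Hypothetical, heavily reduced external special remote.
# It only answers INITREMOTE and rejects everything else.
special_remote_loop() {
    # The remote speaks first and announces the protocol version.
    echo "VERSION 1"
    while read -r cmd rest; do
        case "$cmd" in
            INITREMOTE)
                # Ask git-annex for the name of the remote being set up.
                echo "GETCONFIG name"
                # git-annex replies "VALUE <value>"; the value is empty
                # when the setting is not available.
                read -r reply name
                if [ "$reply" = "VALUE" ] && [ -n "$name" ]; then
                    echo "INITREMOTE-SUCCESS"
                else
                    echo "INITREMOTE-FAILURE remote name not available"
                fi
                ;;
            *)
                echo "UNSUPPORTED-REQUEST"
                ;;
        esac
    done
}
```

Fed `INITREMOTE` followed by an empty `VALUE` reply on stdin, the sketch takes the failure branch, which is the situation described above.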
### What version of git-annex are you using? On what operating system?

It used to work with 7.20191230 and no longer does with 8.20200226, tested on Debian and Ubuntu.

### Please provide any additional information below.

Originally reported at https://github.com/datalad/datalad/issues/4259

### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

There is no day that ends without me being grateful for git-annex ;-)

[[!meta author=mih]]
[[!tag projects/datalad]]

@@ -0,0 +1,67 @@
### Please describe the problem.
If `git annex init` has not been run on a repo, running git-annex commands in its linked worktrees should not change them, but it seems to: in the log below, the worktree's `.git` file is replaced by a symlink.

### What steps will reproduce the problem?

See log below
### What version of git-annex are you using? On what operating system?
8.20200226 on Amazon Linux 2

### Please provide any additional information below.

[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
(master_env_v178_py36) 17:07 [a] $ ls -alt
total 4
drwxrwxr-x 8 ilya ilya 166 Mar 6 17:07 .git
drwxrwxr-x 3 ilya ilya 32 Mar 6 17:07 .
-rw-rw-r-- 1 ilya ilya 0 Mar 6 17:07 myfile
drwxrwxrwt 15 root root 4096 Mar 6 17:07 ..
(master_env_v178_py36) 17:07 [a] $ git worktree add ../b
Preparing worktree (new branch 'b')
HEAD is now at 9a5d353 first commit
(master_env_v178_py36) 17:08 [a] $ pushd ../b
/tmp/b /tmp/a /data/branches/is-add-asm-improvability-metrics
(master_env_v178_py36) 17:08 [b] $ ls -alt
total 8
drwxrwxr-x 2 ilya ilya 32 Mar 6 17:08 .
-rw-rw-r-- 1 ilya ilya 0 Mar 6 17:08 myfile
drwxrwxrwt 16 root root 4096 Mar 6 17:08 ..
-rw-rw-r-- 1 ilya ilya 32 Mar 6 17:08 .git
(master_env_v178_py36) 17:08 [b] $ git annex get
git-annex: First run: git-annex init
(master_env_v178_py36) 17:08 [b] $ ls -alt
total 4
drwxrwxr-x 2 ilya ilya 32 Mar 6 17:08 .
lrwxrwxrwx 1 ilya ilya 21 Mar 6 17:08 .git -> ../a/.git/worktrees/b
-rw-rw-r-- 1 ilya ilya 0 Mar 6 17:08 myfile
drwxrwxrwt 16 root root 4096 Mar 6 17:08 ..
(master_env_v178_py36) 17:08 [b] $ ls -alt /tmp/b/../a/.git/worktrees/b/annex
lrwxrwxrwx 1 ilya ilya 11 Mar 6 17:08 /tmp/b/../a/.git/worktrees/b/annex -> ../../annex


(master_env_v178_py36) 17:12 [b] $ git annex version
git-annex version: 8.20200226-g2d3ef2c07
build flags: Assistant Webapp Pairing S3 WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.1.0 ghc-8.6.5 http-client-0.5.14 persistent-sqlit\
e-2.9.3 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_2\
24 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE\
2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224\
BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7

(master_env_v178_py36) 17:14 [b] $ uname -a
Linux ip-172-31-80-211.ec2.internal 4.14.171-136.231.amzn2.x86_64 #1 SMP Thu Feb 27 20:22:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux


# End of transcript or log.
"""]]

### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

@ -0,0 +1,69 @@
|
|||
### Please describe the problem.
|
||||
|
||||
I have a special directory remote with exporttree=yes (encryption=none) on an USB hard drive. Both `git annex sync --content` and `git annex export` only write around 400 KiB/s. Thus an export of a 9GB DVD iso takes a whole night.
|
||||
|
||||
The drive is not blazing fast, but:
|
||||
|
||||
- `sync; dd if=/dev/zero of=tempfile bs=1M count=10; sync` gives something around 10MB/s (don't recall the exact number)
|
||||
- rsync (with --progress turned on) copies files with 2.35MB/s
|
||||
|
||||
`mount` for this drive shows:
|
||||
|
||||
> /dev/sdc1 on /media/thk/thk-sg1 type ext4 (rw,nosuid,nodev,relatime,sync,stripe=8191,uhelper=udisks2)
|
||||
|
||||
I tried to mount the drive without sync but failed. Even with the usdisks2 service stopped I could not manually mount the drive without sync (or with async). It always ended up being mounted with sync.
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
TODO(thk): try other drive and other laptop once the current transfer finishes...
|
||||
|
||||
Update 2020-03-07:
|
||||
|
||||
- export to a different USB drive (both seagate, same size, similar age) from the same machine with the exact same setup (but NTFS filesystem) runs with ~80 MiB/s. So this is perfect. This time there is also no problem with a lost exporttree=yes config.
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
- git-annex version: 8.20200227-gf56dfe791
|
||||
- Debian testing with Kernel 5.2.17
|
||||
|
||||
### Please provide any additional information below.
|
||||
|
||||
I now learned that there is no Linux kernel primitive to copy a file but that this is actually a high art:
|
||||
|
||||
<http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/copy.c>
|
||||
|
||||
I was surprised to see the implementation of `meteredWrite` in *Utility/Metered.hs*. I hoped that there would be some haskell standard library for efficient file copying? I wonder how rsync implements its progress meter? And whether the progress meter is the reason why rsync had slower write speed than dd.
|
||||
|
||||
Maybe it would make sense to call out to the *cp* command and just issue a *stat()* every few seconds for the progress meter? This is what I do to monitor cp progress manually.
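The `cp` plus `stat()` idea can be sketched in a few lines of shell. This is only an illustration of the manual approach, not a proposal for how git-annex should do it: `copy_with_progress` is a made-up helper name, and `stat -c %s` is the GNU coreutils spelling (BSD and macOS use different options).

```sh
# Hypothetical helper: copy SRC to DST with cp and report progress by
# polling the destination size. Assumes GNU stat ("stat -c %s").
copy_with_progress() {
    src=$1
    dst=$2
    total=$(stat -c %s "$src")

    # Let cp do the actual copying in the background.
    cp "$src" "$dst" &
    cp_pid=$!

    # Poll the destination size until cp has exited.
    while kill -0 "$cp_pid" 2>/dev/null; do
        copied=$(stat -c %s "$dst" 2>/dev/null || echo 0)
        echo "copied $copied of $total bytes" >&2
        sleep 1
    done

    # Propagate cp's exit status to the caller.
    wait "$cp_pid"
}
```

It would be called as `copy_with_progress big.iso /media/thk/thk-sg1/big.iso`. Note that the reported size is what the filesystem has accepted so far, which on a mount without `sync` can run ahead of what has actually reached the drive.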
I have no clue, but maybe these could help for fast file copying in Haskell?

- <https://github.com/snoyberg/conduit>
- <https://wiki.haskell.org/Pipes>
- reddit: [What is your take on conduits, pipes, and streams?](https://www.reddit.com/r/haskell/comments/7w79q1/what_is_your_take_on_conduits_pipes_and_streams/)

### Have you had any luck using git-annex before?

Well, I'm coming back to git-annex after several years. So far it is better than I remembered:

- tor support is great and solves the need for a central server
- I hope that the sqlite integration will now make large collections of files manageable
- Finally we have exporttree, yeah!


## 2020-03-07 update

Turns out, the problem is more complex. I wanted to be clever. When I set up the two synced annex repos I made the mistake of not specifying exporttree=yes at the beginning. But I wanted to re-use the initial name. So I tried hard to remove all evidence of the previous existence of a special remote with that name from git-annex.

I checked out the git-annex branch in a separate worktree (see **man git worktree**) in both repositories, deleted the lines for that remote from remote.log and pushed to the other repo (not git annex sync). I even made the changes in parallel in both repos before pushing in both directions so that the special merge does not bring the lines back. I actually was sure there was nothing left of the old remotes. Of course I also deleted them from .git/config.

Somehow, there is again a line in remote.log for that remote without exporttree=yes. So now, after the last git annex sync --content, I have a mixture of an exported tree and an exported annex object store in the same special remote dir.

I also noticed that the repo that was so slow did not have the `remote.$REMOTE.annex-tracking-branch` config. But I could still run `sync --content` somehow. After adding this config, the last sync actually ran at 2 MB/s but it still wrote in object store format, not as an exported tree.

Some questions:

- Is there any place other than remote.log where git-annex stores information about remotes?
- The object store files in the remote were stored in the format AAA/BBB/$HASH with three-character directory names, while in .git/annex/objects the folders have two characters. What are those characters? I believe the three-character format is for remotes that potentially do not distinguish letter case?
- Is there a command to get the full path of a file in the object store (two or three letters) from the hash?
- Maybe there is still a bug. Is there a possibility that git-annex could forget that a remote is configured with exporttree=yes? Especially if I export to the same directory on the same USB drive from two synced repos?
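
On the directory-name question, a rough sketch: as far as I understand the git-annex internals documentation, the three-character layout (called `hashdirlower` there) takes the MD5 of the key and uses the first three hex digits for the first directory and the next three for the second. The function below only illustrates that assumption; the two-character layout (`hashdirmixed`) encodes the same MD5 differently and is not reproduced here.

```sh
# Hypothetical helper mirroring what git-annex calls "hashdirlower".
# Assumption: the directories are the first six hex digits of the MD5
# of the key, split three and three.
hashdirlower() {
    # md5sum prints "<32 hex digits>  -"; keep the first six digits.
    digest=$(printf '%s' "$1" | md5sum | cut -c1-6)
    first=$(printf '%s' "$digest" | cut -c1-3)
    second=$(printf '%s' "$digest" | cut -c4-6)
    printf '%s/%s\n' "$first" "$second"
}

# Example with a SHA256E key of a two-byte file.
hashdirlower "SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7"
```

If I read the man pages right, `git annex contentlocation KEY` prints the path of a key's content in the local object store, and `git annex examinekey` can print both directory prefixes via its `--format` option; those commands are the authoritative answer, the sketch above is not.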
@@ -0,0 +1,9 @@
[[!comment format=mdwn
username="sam.nastase@2b4a9b3e5094dab41e0a4de0b808a2697a3e9860"
nickname="sam.nastase"
avatar="http://cdn.libravatar.org/avatar/55c74b521bcb7322069f35bf655f81e0"
subject="Invalid option `--include-dotfiles'"
date="2020-03-06T22:17:32Z"
content="""
I just reinstalled DataLad (v0.12.2) via conda today, tried to do a `datalad save` on a preexisting DataLad dataset, and got a previously unseen error: \"Invalid option `--include-dotfiles'\". Is this related to ongoing development? Or is there some easy fix? Thanks! (Apologies if this is a poor place to post.)
"""]]
@@ -0,0 +1,14 @@
[[!comment format=mdwn
username="kyle"
avatar="http://cdn.libravatar.org/avatar/7d6e85cde1422ad60607c87fa87c63f3"
subject="re: Invalid option `--include-dotfiles'"
date="2020-03-06T22:56:35Z"
content="""
DataLad has been updated for the removal of `--include-dotfiles` in
the latest git-annex release (8.20200226), but there hasn't been a
DataLad release yet that includes that fix. So I'd say the easiest
fix for now would be installing a developmental version of DataLad
(both `master` and `maint` have the fix). I think downgrading your
git-annex version would be problematic because your repo has probably
already been auto-upgraded to v8.
"""]]
@@ -0,0 +1,9 @@
[[!comment format=mdwn
username="sam.nastase@2b4a9b3e5094dab41e0a4de0b808a2697a3e9860"
nickname="sam.nastase"
avatar="http://cdn.libravatar.org/avatar/55c74b521bcb7322069f35bf655f81e0"
subject="comment 5"
date="2020-03-06T23:34:12Z"
content="""
Thanks! This brings me to a new error due to our old version of git (v1.8.3), which apparently doesn't have the `--no-patch` flag for `git show`, but that's a separate issue.
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 6"
date="2020-03-07T01:05:30Z"
content="""
Strange that a newer git didn’t get installed by conda as a runtime dependency of git-annex... Can you post the output of `conda list`?
"""]]
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="erewhon"
avatar="http://cdn.libravatar.org/avatar/b9bd5ad7176ebe149d0f051dcfe0a63e"
subject="comment 2"
date="2020-03-07T23:14:56Z"
content="""
Thanks, I missed that in the man page.

Is there a rationale for not having the option for preferred contents?
"""]]
@@ -0,0 +1 @@
Currently [[`git-annex-fsck`|git-annex-fsck]] warns, for every file of mine stored with an MD5 key, that it can be upgraded to the more secure SHA256: `Can be upgraded to an improved key format. You can do so by running: git annex migrate`. In my case the key choice is deliberate, so it would be good if this warning could be disabled, to prevent it from drowning out more serious ones.
@@ -0,0 +1,11 @@
[[!comment format=mdwn
username="thk"
avatar="http://cdn.libravatar.org/avatar/bfef10a428769701aeee1db978951461"
subject="Also no clear error for permission problems"
date="2020-03-06T17:51:20Z"
content="""
I was exporting (with exporttree) to a directory remote on an external ext4-formatted USB drive.
As is usually the case, there was a permission problem. My current user did not have write permission for one directory I was exporting to.
git-annex just printed \"failed\" after it had actually completed the file transfer with 100%.
Even with --verbose and --debug I could not figure out what was wrong until I discovered the permission problem myself.
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="thk"
avatar="http://cdn.libravatar.org/avatar/bfef10a428769701aeee1db978951461"
subject="No clear error message for failing names on NTFS"
date="2020-03-07T12:50:19Z"
content="""
I tried to export a tree to NTFS with a filename that contained spaces, single quotes and dots. Only after removing all of them did the export succeed. The error message was just \"failed\".
"""]]
19
doc/users/thk.mdwn
Normal file
@@ -0,0 +1,19 @@
I'm thk at debian org


My TODO items

- write a tip on using git worktree to inspect the git-annex branch
  - Is there a way to filter out the directories?
- write a tip on how to deal with permission issues on ext formatted USB drives
  - works of course only on Debian and derivatives
  - use a common group defined in /usr/share/base-passwd/group.master, e.g. "floppy"
  - use setgid bit: https://en.wikipedia.org/wiki/Setuid#SGID
  - make sure all users on all machines are part of the common group
- Collect problems with NTFS tree exports, e.g.
  - Spaces at the end in filenames
  - single quotes in filenames
  - dots?
- more experiences with ext4 encryption feature:

<https://www.techort.com/encryption-in-ext4-how-it-works-habrahabr/>
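
A first sketch for the permission tip in the TODO list above, assuming the drive's export directory should be writable by every member of one common group. `share_dir` is a made-up helper name and "floppy" is only the example group from the list; the commands need enough privileges to change group ownership of the tree.

```sh
# Hypothetical helper: make DIR and everything below it writable by GROUP
# and let newly created entries inherit that group.
share_dir() {
    group=$1
    dir=$2
    # Hand the whole tree to the shared group.
    chgrp -R "$group" "$dir"
    # Group members may read and write; X adds execute only on
    # directories (and on files that are already executable).
    chmod -R g+rwX "$dir"
    # The setgid bit on directories makes new entries inherit the group.
    find "$dir" -type d -exec chmod g+s {} +
}
```

It would be called as `share_dir floppy /media/thk/thk-sg1`. The setgid bit only propagates the group, not the write bit: whether new files end up group-writable still depends on each user's umask, and every user still has to be a member of the group on every machine.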