Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2021-05-07 11:09:21 -04:00
commit 5332cf8a80
GPG key ID: DB12DB0FF05F8F38
23 changed files with 548 additions and 0 deletions

@ -0,0 +1,69 @@
## I'm considering this a "false alarm", but leaving it around for others who may run into it
Adding the files took a long time (50 minutes). Afterwards, `git status` showed the files that had failed with "permission denied" simply as not added. I added them again, and it worked fine. I have no reason to believe that my folder has been corrupted.
So I don't personally think this needs fixing. But if anyone else out there runs into this issue, at least this page is here.
### Please describe the problem.
When adding 400k files to a new annex, I get an error "rename: permission denied". It doesn't seem to be about file permissions (I have `chown`ed them), and it's inconsistent from run to run. So each time I try the import, different files may show the permission denied error.
One thing I'm concerned about is how to confirm whether these files have made it into annex, or if I now have a corrupted folder structure.
I do intend to do smaller imports, or try using `-J1`.
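One way to check which files did not make it in is to ask git which paths are still untracked once the add has finished, and then retry only those without concurrency. A minimal sketch, assuming the failed files simply remain untracked (as turned out to be the case here); both function names are made up for illustration:

```shell
# Print paths under repo $1 (default: .) that git does not track yet,
# honouring .gitignore. Files that `git annex add` skipped show up here.
list_unadded() {
    git -C "${1:-.}" ls-files --others --exclude-standard
}

# Re-run the add with a single job (-J1), i.e. without concurrency.
retry_add_serially() {
    ( cd "${1:-.}" && git annex add -J1 . )
}
```

`list_unadded | wc -l` gives a quick count; an empty result after `retry_add_serially` would suggest that nothing was left behind.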
### What steps will reproduce the problem?
1. `git config annex.jobs cpus`
2. `git annex add .`
### What version of git-annex are you using? On what operating system?
macOS 10.15.7
```
git-annex version: 8.20210310
build flags: Assistant Webapp Pairing FsEvents TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.28 DAV-1.3.4 feed-1.3.0.1 ghc-8.10.4 http-client-0.7.6 persistent-sqlite-2.11.1.0 torrent-10000.1.1 uuid-1.3.14 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: darwin x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
```
### Please provide any additional information below.
iMac 10-core i9 (maybe 20 threads?)
```
git-annex: .git/annex/othertmp/ingest-A23998-2216: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-Ad23998-21291: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-P23998-30359: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-Audio23998-182890: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-wasd_clap_sys100_cra23998-206554: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-wasd_clap_sys100_f23998-206560: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-wasd_clap_sys100_f23998-206561: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-Fairligh23998-248968: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-ly23998-268165: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-123998-269213: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-46223998-278087.wav: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ing23998-290478: rename: permission denied (Permission denied)
git-annex: .git/annex/othertmp/ingest-H23998-292758: rename: permission denied (Permission denied)
```
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Absolutely! :)

@ -0,0 +1,117 @@
424bef6b6 (smudge: check for known annexed inodes before checking
annex.largefiles, 2021-05-03) fixed a case where an unlocked annexed
file that annex.largefiles does not match could get its unchanged
content checked into git. This was in response to
<https://git-annex.branchable.com/forum/one-off_unlocked_annex_files_that_go_against_large/>.
In a comment there, Joey said:
> I've made a change that seems to work, and will probably not break
> other cases, although this is a complex and subtle area.
I'm following up with a change in behavior flagged by a DataLad test.
As with most things in this area, I have a hard time reasoning about
what the expected behavior should be and whether it should be
considered a bug. Here's the reproducer:
[[!format sh """
set -eu
cd "$(mktemp -d "${TMPDIR:-/tmp}"/ga-XXXXXXX)"
git version
git annex version | head -1
git init -q
git annex init
echo a >foo
git annex add foo
git commit --quiet -m 'add foo'
git annex unlock foo
printf '* annex.largefiles=nothing\n' >.gitattributes
sleep 1
git annex add foo
git commit -q -m 'commit unlocked' -- foo
set -x
export PS4='> '
git diff HEAD^- -- foo
git diff --cached
"""]]
Here's the output with 8.20210428:
```
git version 2.31.1.659.g12c5fe8677
git-annex version: 8.20210428
[...]
> git diff HEAD^- -- foo
diff --git a/foo b/foo
deleted file mode 120000
index 8a2a0c9..0000000
--- a/foo
+++ /dev/null
@@ -1 +0,0 @@
-.git/annex/objects/3z/F8/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7
\ No newline at end of file
diff --git a/foo b/foo
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/foo
@@ -0,0 +1 @@
+a
> git diff --cached
```
And here's the output with a recent commit on master following
424bef6b6:
```
git version 2.31.1.659.g12c5fe8677
git-annex version: 8.20210429-ge811a50e2
[...]
> git diff HEAD^- -- foo
diff --git a/foo b/foo
deleted file mode 120000
index 8a2a0c9..0000000
--- a/foo
+++ /dev/null
@@ -1 +0,0 @@
-.git/annex/objects/3z/F8/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7
\ No newline at end of file
diff --git a/foo b/foo
new file mode 100644
index 0000000..3de500c
--- /dev/null
+++ b/foo
@@ -0,0 +1 @@
+/annex/objects/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7
> git diff --cached
diff --git a/foo b/foo
index 3de500c..7898192 100644
--- a/foo
+++ b/foo
@@ -1 +1 @@
-/annex/objects/SHA256E-s2--87428fc522803d31065e7bce3cf03fe475096631e5e07bbd7a0fde60c4cf25c7
+a
```
Before 424bef6b6, `git annex add foo + git commit ... foo` results in
a commit that has foo's content tracked in git. After 424bef6b6, the
unlocked file is still recorded, and the switch to being tracked by
git ends up staged in the index.
The new behavior isn't seen if the pathspec is dropped from `git
commit`. Also, without the sleep, it isn't triggered reliably
(presumably because the index and foo have the same mtime, bypassing
the clean filter).
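To tell the two outcomes apart from a script, it may help to test whether the blob recorded for `foo` is an annex pointer or plain content. A rough sketch, assuming only that pointer files start with `/annex/objects/` as in the diffs above; `is_annex_pointer` is a made-up helper name:

```shell
# Succeed if the first line on stdin looks like a git-annex pointer,
# i.e. starts with /annex/objects/. Empty input counts as "not a pointer".
is_annex_pointer() {
    # read fails on a final line without a newline, so also accept
    # whatever it managed to store in first_line.
    IFS= read -r first_line || [ -n "$first_line" ] || return 1
    case $first_line in
        /annex/objects/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `git cat-file -p HEAD:foo | is_annex_pointer` inspects what was committed, and `git cat-file -p :foo | is_annex_pointer` inspects what is staged in the index.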
Thanks for taking a look.
[[!meta author=kyle]]
[[!tag projects/datalad]]

@ -0,0 +1,93 @@
Thanks for extending `fromkey` to support unlocked files. When
updating some DataLad code to make use of this, a test flagged a
difference between how links and pointer files are handled: the
necessary leading directories will be created for links but not
pointer files.
[[!format sh """
cd "$(mktemp -d "${TMPDIR:-/tmp}"/ga-XXXXXXX)" || exit 1
git version
git annex version | head -1
git init -q
git annex init
set -x
git annex fromkey --force \
SHA256E-s4--b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c \
foo/a
git cat-file -p :foo/a
git config annex.addunlocked true
git annex fromkey --force \
SHA256E-s4--7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730 \
bar/a
"""]]
```
git version 2.31.1.705.g1ce651569c
git-annex version: 8.20210429-gdab203070
init (scanning for unlocked files...)
ok
(recording state in git...)
+ git annex fromkey --force SHA256E-s4--b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c foo/a
fromkey foo/a ok
(recording state in git...)
+ git cat-file -p :foo/a
../.git/annex/objects/91/9x/SHA256E-s4--b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c/SHA256E-s4--b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c+ git config annex.addunlocked true
+ git annex fromkey --force SHA256E-s4--7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730 bar/a
fromkey bar/a
git-annex: bar/a: openBinaryFile: does not exist (No such file or directory)
failed
(recording state in git...)
git-annex: fromkey: 1 failed
```
The caller can of course make sure that leading directories exist, but
I think it makes sense for the locked and unlocked variants to behave
the same here. What do you think about the patch below?
[[!format patch """
From f6c97b8d01c7e9b8069638e9827062aa2462d429 Mon Sep 17 00:00:00 2001
From: Kyle Meyer <kyle@kyleam.com>
Date: Thu, 6 May 2021 11:11:14 -0400
Subject: [PATCH] fromkey: create directory for pointer files too
fromkey creates leading directories for symbolic links. Do the same
for pointer files.
---
Command/FromKey.hs | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Command/FromKey.hs b/Command/FromKey.hs
index eadb89fd1..16ff1693f 100644
--- a/Command/FromKey.hs
+++ b/Command/FromKey.hs
@@ -106,6 +106,7 @@ perform matcher key file = lookupKeyNotHidden file >>= \case
, matchKey = Just key
}
else keyMatchInfoWithoutContent key file
+ createWorkTreeDirectory (parentDir file)
ifM (addUnlocked matcher mi contentpresent)
( do
stagePointerFile file Nothing =<< hashPointerFile key
@@ -115,7 +116,6 @@ perform matcher key file = lookupKeyNotHidden file >>= \case
else writepointer
, do
link <- calcRepo $ gitAnnexLink file key
- createWorkTreeDirectory (parentDir file)
addAnnexLink link file
)
next $ return True
base-commit: dab2030702200bc9abea4bff9ce83ba63aeca41c
--
2.31.1.705.g1ce651569c
"""]]
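For callers on a git-annex without such a change, the workaround mentioned above (making sure the leading directories exist) could look roughly like this. It is only a sketch; `ensure_parent_dir` and `fromkey_with_parents` are hypothetical helper names, not part of git-annex:

```shell
# Create the directory that will contain path $1, if it is missing.
ensure_parent_dir() {
    mkdir -p "$(dirname "$1")"
}

# Hypothetical wrapper: $1 = key, $2 = file. Creates the leading
# directories first so locked and unlocked modes behave the same.
fromkey_with_parents() {
    ensure_parent_dir "$2" || return 1
    git annex fromkey --force "$1" "$2"
}
```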
[[!meta author=kyle]]
[[!tag projects/datalad]]

@ -0,0 +1,41 @@
When using `reinject <src> <dest>` and `dest` is an absolute path to a
pointer file, the operation silently fails to reinject the content.
[[!format sh """
cd "$(mktemp -d "${TMPDIR:-/tmp}"/ga-XXXXXXX)" || exit 1
git version
git annex version | head -1
git init -q
git annex init
git config annex.addunlocked true
git annex fromkey --force \
SHA256E-s3--2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae \
foo
printf foo >.git/tmp-to-copy
git annex reinject .git/tmp-to-copy "$PWD"/foo
echo $?
cat foo
"""]]
```
git version 2.31.1.705.g1ce651569c
git-annex version: 8.20210429-g06e996efa
init (scanning for unlocked files...)
ok
(recording state in git...)
fromkey foo ok
(recording state in git...)
0
/annex/objects/SHA256E-s3--2c26b46b68ffc68ff99b453c1d30413413422d706483bfa0f98a5e886266e7ae
```
If a link destination is used (i.e. drop the `addunlocked`
configuration in the script above) or a relative path is used
(i.e. drop the `"$PWD"/`), the content is injected.
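Since the relative form works, a caller that only has an absolute path could convert it before calling `reinject`. A small sketch of that workaround; `relative_to_dir` is a made-up helper and it only handles paths located underneath the given directory:

```shell
# Print path $1 relative to directory $2 (default: $PWD) when $1 lies
# underneath it; otherwise print $1 unchanged.
relative_to_dir() {
    base=${2:-$PWD}
    case $1 in
        "$base"/*)
            rel=${1#"$base"/}
            printf '%s\n' "$rel"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}
```

In the script above this would be used as `git annex reinject .git/tmp-to-copy "$(relative_to_dir "$PWD"/foo)"`.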
[[!meta author=kyle]]
[[!tag projects/datalad]]

@ -0,0 +1,15 @@
[[!comment format=mdwn
username="cecile.madjar@d95f9e618c3dff4829e7fedba1a71e1499542f3f"
nickname="cecile.madjar"
avatar="http://cdn.libravatar.org/avatar/a32ab97180285c0e5095bad4616a4d87"
subject="comment 3"
date="2021-05-04T20:24:50Z"
content="""
Hello,
Are there any updates on this? I am using an Apple M1 Silicon and I am blocked in all my projects because I cannot install git-annex on my computer. Do you have an approximate idea of when this would be available for Apple M1 Silicon users?
Thank you,
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 4"
date="2021-05-04T20:38:01Z"
content="""
Hmm, shouldn't it work just fine with rosetta?
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="cecile.madjar@d95f9e618c3dff4829e7fedba1a71e1499542f3f"
nickname="cecile.madjar"
avatar="http://cdn.libravatar.org/avatar/a32ab97180285c0e5095bad4616a4d87"
subject="comment 5"
date="2021-05-04T21:12:52Z"
content="""
Thank you Lukey. Indeed, after installing Rosetta2 it worked. Thank you!
"""]]

@ -0,0 +1,23 @@
[[!comment format=mdwn
username="fortran"
avatar="http://cdn.libravatar.org/avatar/ee27e12e945c0af698d58f0d8dde2457"
subject="comment 2"
date="2021-05-04T19:10:35Z"
content="""
Oh. Wow. That's a big man page...
Okay. So if I run `git config annex.sshcaching false`, then things are happier. Well:
```
git annex get file1.nc4
get file1.nc4 (from origin...)
You have enabled concurrency, but git-annex is not able to use ssh connection caching. This may result in multiple ssh processes prompting for passwords at the same time.
annex.sshcaching is not set to true
ok
(recording state in git...)
```
Now, reading the man pages, I see that the default concurrency is 1, so I think I'm safe? Or is there perhaps something I should use to tell it \"nope\" for that?
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="fortran"
avatar="http://cdn.libravatar.org/avatar/ee27e12e945c0af698d58f0d8dde2457"
subject="comment 3"
date="2021-05-04T19:25:47Z"
content="""
Essentially, we are hoping to deploy git-annex, so the fewer messages from git-annex, the better for our end users. (Or, I guess, my *supporting* the end users :) )
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 4"
date="2021-05-04T20:31:45Z"
content="""
Even with concurrency enabled, this should not be a problem in your case, as you're manually doing the controlmaster setup.
@joey I guess there needs to be a way to hide such messages, like git with the `advice.*` configuration options.
"""]]

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
subject="comment 3"
date="2021-05-04T17:19:57Z"
content="""
I used the exact same settings for the second special remote as the first one: `type=directory chunk=50MiB encryption=hybrid mac=HMACSHA256`.
GA was 8.20200810 though because my server machine is built from the stable Nixpkgs channel; I will test that again with the most recent version tomorrow.
`--sameas` won't help here; the special remotes are accessible via the same FS (the second is just a btrfs snapshot of the first) and they'd still only count as one copy. That's the same situation I have right now.
Counting it as two copies would work but there is a large delay between having moved the files to the special remote and them actually being mirrored (residential internet upload) which means the numcopies of somewhat newly added files wouldn't be correct. It'd be a step up though.
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
subject="comment 4"
date="2021-05-05T07:33:07Z"
content="""
It worked! Thank you so much you two!
The cipher was indeed different for some reason. What could cause that?
"""]]

@ -0,0 +1,13 @@
I first mentioned this issue in a thread about 4 years ago (https://git-annex.branchable.com/forum/git-annex_across_two_filesystems/), and at the time was encouraged to instead open a new thread. Priorities changed, and I'm only now returning to the issue.
The situation we have is as follows: We have a large collection of boundary condition data used in our weather/climate model. Individual "experiments" are run against specific versions of this data and we would like to minimize the total storage footprint as well as time spent copying data at the beginning of an experiment. The clones for the experiments would always be used in a read-only manner. New files would never be added through these repos.
At first glance, using git-annex with `git clone --shared` would seem to be a good solution. Unfortunately, these experiments span a large number (~10) of separate cross-mounted filesystems, which would result in ~90% of the experiments still duplicating data rather than sharing it through hardlinks.
A couple of partial solutions suggest themselves. (1) Put all of the clones on the same filesystem as the primary repo, and then create a symlink within each experiment back to the corresponding clone. (2) Maintain a secondary (fully populated) clone on each filesystem and ensure that the experiment setup script clones from the proper secondary.
Option (1) is viable, but would require some negotiations with the computing center to ensure that there is a single filesystem that gives appropriate privileges to all of our users. Tedious, but probably not a showstopper.
Option (2) sounds like an improvement over having 90% of the experiments duplicating data locally, except ... because the secondary clones would need to support any recent model configuration, the 10x duplication of "all" data could be much larger than the hundreds of copies of the smaller subsets needed by individual experiments.
Perhaps the ideal solution would be some sort of special "clone" that uses symlinks back to the primary repository. These special clones would be read only, and could even disable "dangerous" git actions that would allow adding/modifying files. `git-new-workdir` hints that something like this might be possible, but it does not appear to play nicely with git-annex in any event.

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 1"
date="2021-05-05T18:08:49Z"
content="""
Have I understood you correctly: you have a \"primary\" repository (with all data/keys present), accessible by the clients via NFS/cifs/whatever, and the clients (the \"experiments\") want to check out a specific version/branch from that repo?
I think you have two alternatives to cloning it everywhere including all keys:
a) Every client clones the git repo (and removes the \"origin\" remote to ensure that nothing flows back), creates a symlink from `.git/annex/objects` to `/path/to/primary/.git/annex/objects` and checks out whatever version/branch it wants. Easy.
b) Every client uses the primary repo, but via its own worktree (See `git-worktree`). git-annex supports external worktrees, but I'm not sure what problems could arise in this particular setup.
"""]]
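Alternative (a) from the comment above might be scripted roughly as follows. This is only a sketch under several assumptions: the primary repository is non-bare and given as an absolute path, the annexed files are locked (symlinks), and the clone is treated as read-only. `make_borrowing_clone` is a hypothetical helper name.

```shell
# Hypothetical helper sketching alternative (a).
# $1 = absolute path to the primary (non-bare) repo
# $2 = path for the new clone
# $3 = branch or tag to check out
make_borrowing_clone() {
    primary=$1 clone=$2 rev=$3
    git clone -q --no-checkout "$primary" "$clone" || return 1
    # Share the primary's annexed content instead of copying it.
    mkdir -p "$clone/.git/annex" || return 1
    ln -s "$primary/.git/annex/objects" "$clone/.git/annex/objects" || return 1
    # Check out before dropping the remote, so that a branch name that
    # only exists as origin/<name> can still be resolved.
    git -C "$clone" checkout -q "$rev" || return 1
    # Remove the remote so that nothing flows back to the primary.
    git -C "$clone" remote remove origin
}
```

The checkout happens before the remote is removed, which differs slightly from the order given in the comment; removing the remote first would delete the remote-tracking branches that a non-default branch name is resolved from.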

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="pat"
avatar="http://cdn.libravatar.org/avatar/6b552550673a6a6df3b33364076f8ea8"
subject="comment 8"
date="2021-05-05T19:39:11Z"
content="""
Do you have any information on actual times for working with big repos?
As an example, I created one with 400k files. After following the steps here, `git status` takes 8 seconds to complete. I have plenty of resources. So, it's just slow. I am curious what sort of times you're getting with your big repos.
I will have to see if submodules help with this at all. This material is all reference information, and isn't going to be changed very much. So it's possible I'd be better off with an \"active\" repo, and a \"reference\" repo (maybe connected by submodule, maybe not).
Joey did make the suggestion of storing those sorts of files in a separate branch. I just did a test, and it appears that the limiting factor is in fact the number of files in the working tree. Deleting a lot of the files brought git back up to speed. So from a simplicity standpoint, I may want to have a `reference` branch with those files in it. And perhaps have two local clones of the repo - one `main` and one `reference` so I can explore and copy files from `reference` to `main` as needed.
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="pat"
avatar="http://cdn.libravatar.org/avatar/6b552550673a6a6df3b33364076f8ea8"
subject="comment 9"
date="2021-05-05T22:51:28Z"
content="""
Separate branch is a no-go. `git annex info` takes 3 minutes 30 seconds to report 320k annex keys.
So for my purposes I think I will keep one slow reference repo, and one fast working repo.
"""]]

@ -0,0 +1,13 @@
Currently, SHA256E creates duplicate files for different extensions, e.g.:
```
$ l * && l -Li * && sha256sum *
lrwxrwxrwx 1 atemu users 198 2021-05-04 03:47 random.1 -> .git/annex/objects/F9/Kk/SHA256E-s104857600--2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a.1/SHA256E-s104857600--2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a.1
lrwxrwxrwx 1 atemu users 198 2021-05-05 10:01 random.2 -> .git/annex/objects/Pm/J1/SHA256E-s104857600--2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a.2/SHA256E-s104857600--2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a.2
3720 -r--r--r-- 1 atemu users 100M 2021-05-04 03:47 random.1
49696 -r--r--r-- 1 atemu users 100M 2021-05-05 10:01 random.2
2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a random.1
2fdbdc9c3b23d1986a743aede593765e57ade9f173f9fd9766057f0efd63197a random.2
```
These have the exact same content though; they could be hardlinks of one another instead and nothing would change.
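To get a feel for how much content is affected, one could list the keys that differ only in their extension. A minimal sketch, assuming the key layout shown above, where the first `.` in a key starts the extension; the function name is made up:

```shell
# Read annex keys on stdin (one per line), strip the extension that the
# *E backends append, and print each extension-less key that was seen
# with more than one extension.
keys_differing_only_by_extension() {
    # sort -u first, so the same key listed twice is not reported.
    sort -u | sed 's/\..*$//' | sort | uniq -d
}
```

The input could come from something like `git annex find --format='${key}\n'`, assuming that format variable behaves as documented; note that `find` only lists files whose content is present.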

@ -0,0 +1,3 @@
https://editorconfig.org/ is a cross-editor standard for setting formatting rules like indent etc.
A blurb of elisp probably isn't too useful to vim users and I had some really strange memory leak with it in my Emacs.

@ -0,0 +1,20 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 2"
date="2021-05-04T16:53:18Z"
content="""
FWIW, verified that `git annex --debug initremote --sameas web datalad externaltype=datalad type=external encryption=none autoenable=true` makes `git-annex` have the `datalad` special remote handle those urls. And since we do not have any prioritization handling in datalad, we also grab the first one (the api. one) returned by git-annex and proceed with it.
So, indeed, if you do not like (or even just feel lukewarm about) the idea of adding costs within the built-in `web` remote, feel welcome to close, and we will still have a way forward by providing such handling within the datalad external special remote. It would be a bit sub-optimal since it would require people to install datalad, but at least it would enable the desired prioritization in some use cases (e.g. for a QA `annex fsck --fast` run).
And indeed, with a single cost (not even a range of costs) assigned/returned by a remote, and no provision for e.g. a cost to be returned by CLAIMURL, I guess there is no (easy) way to mix URL-based costs into the overall decision making that orders the remotes.
NB with the `--sameas` trick above, `git-annex` doesn't even ask `datalad` with CLAIMURL and immediately passes `TRANSFER` of the key to the `datalad` external remote. Without `--sameas`, `git-annex` (8.20210330-g0b03b3d) doesn't even bother asking datalad (within `whereis` at least) whether it could CLAIMURL those... even if I assign `annex-cost = 1.0` for the datalad remote. Not sure yet if that is \"by design\".
> When it gets down to the web remote, it tries the urls in whatever order it happens to have them.
FWIW - I think I have tried to add them in different orders but it always went for the `api` one so I concluded that the order it has them is sorted and there is no way to \"tune it up\".
P.S. I still wonder why I have some memory of git-annex supporting some (external) way to prioritize URLs... maybe it was indeed \"craft a special remote to do that\"...
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 2"
date="2021-05-04T17:50:45Z"
content="""
> It also seems to me that, if you're splitting a repo, you would also want to include things like trust.log and remote.log, or at least parts of them for some remotes?
Yes. Even when not splitting but just copying a key (or multiple keys), since that might need special remote configuration etc.
"""]]

@ -0,0 +1,16 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
subject="comment 3"
date="2021-05-04T17:37:04Z"
content="""
Oh, I've already got all of that implemented; it's just the flag for disabling that behaviour at build time that's missing.
What I did is to conditionally set the executable to `/bin/cp` and the reflink param to `-c`.
The problem with using it without a fallback is that when you use it on a FS that doesn't support CoW, `/bin/cp` will hard-fail and make unlocking impossible. GNU coreutils actually fall back automatically by themselves; GA couldn't handle reflink cp failing before, AFAICT. I refactored the copy functions a bit to make it fall back properly.
The reason I want it to be a configure flag is that some users might use GA exclusively on non-APFS FSs (trying to reflink copy here would be a waste of time) and some might prefer to use their $PATH's uutils-coreutils whose `cp` can handle `--reflink` just like the GNU ones.
~~I originally wanted to add it as a cabal configure flag but apparently you can't reference those anywhere?~~ Found this: https://stackoverflow.com/questions/48157516/conditional-compilation-in-haskell-submodule, that's probably what I'll end up doing. Will default to true on macOS.
"""]]
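The fallback idea from the comment above can be illustrated in shell. This is not the code from the branch, just a sketch of the same pattern; it assumes that macOS `cp` understands `-c` and GNU `cp` understands `--reflink=always`, and `copy_cow_or_plain` is a made-up name:

```shell
# Try a copy-on-write clone first and fall back to an ordinary copy.
# $1 = source file, $2 = destination file
copy_cow_or_plain() {
    case $(uname -s) in
        Darwin) cp -c "$1" "$2" 2>/dev/null && return 0 ;;
        *)      cp --reflink=always "$1" "$2" 2>/dev/null && return 0 ;;
    esac
    # The clone attempt failed (no CoW support, or a cp without the
    # option), so do a regular copy instead.
    cp "$1" "$2"
}
```

Trying the clone unconditionally costs one failed `cp` per file on filesystems without CoW, which is the overhead a build-time flag would avoid.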

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
subject="comment 4"
date="2021-05-05T05:43:56Z"
content="""
https://github.com/Atemu/git-annex/tree/feature/macOS-reflinks
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Atemu"
avatar="http://cdn.libravatar.org/avatar/d1f0f4275931c552403f4c6707bead7a"
subject="comment 5"
date="2021-05-05T06:04:49Z"
content="""
I've also got some small fixes for things that came up during development:
https://github.com/Atemu/git-annex/tree/misc-fixes
"""]]