Fix a reversion that made dead keys not be skipped when operating on all
keys via --all or in a bare repo. (Introduced in version 8.20200720)
Also, improved the documentation of git-annex-dead, it does not only apply
to fsck --all.
Also, made git-annex fsck, when run on a file whose key is dead, display
that. Before, it displayed that only when run with --all, but with this
fix, it skips dead keys with --all. But it can still be run on a file that
uses a dead key, and displaying "This key is dead" explains to the user
why it does not consider missing content for it to be a problem.
Sponsored-by: k0ld on Patreon
Use curl for downloads from git remotes when annex.url-options and other
git configs are set.
If the url needs a password, curl will fail, and git credential will not be
used to prompt for it. But the user can set --netrc in url-options and
put the password in the netrc file.
This also means that url-options settings like -4 will take effect.
That was the case before commit 1883f7ef8f
forced conduit to be used.
When accessing a git remote over http needs a git credential prompt for a
password, cache it for the lifetime of the git-annex process, rather than
repeatedly prompting.
The git-lfs special remote already caches the credential when discovering
the endpoint. And presumably commands like git pull do as well, since they
may download multiple urls from a remote.
The TMVar CredentialCache is read, so two concurrent calls to
getBasicAuthFromCredential will both prompt for a credential.
There would already be two concurrent password prompts in such a case,
and existing uses of `prompt` probably avoid it. Anyway, it's no worse
than before.
Fix crash importing from a directory special remote that contains a broken
symlink.
The crash was in listImportableContentsM but some other places in
Remote.Directory also seemed like they could have the same problem.
Also audited for other places that have such a problem. Not all calls
to getFileStatus are bad, in some cases it's better to crash on something
unexpected. For example, `git-annex import path` when the path is a broken
symlink should crash, the same as when it does not exist. Many of the
getFileStatus calls are like that, particularly when they involve
.git/annex/objects which should never have a broken symlink in it.
Fixed a few other possible cases of the problem.
Sponsored-by: Lawrence Brogan on Patreon
Trick the linker into not doing unncessary work searching for optimised
libraries that are not present, by symlinking the directories where
optimised libs would be to the main lib dir.
This reduces the ENOENT of git-annex init by about 1/2. The linker always
finds the files where it looks first time now. I have not looked at what
the wall clock speedup might be, it's probably rather small.
If a x86-64-v5 comes to be, the list will need to be extended. And there
may be other directories used on some machines that I have missed. Not done
for arm64 yet, or any uncommon architectures.
Sponsored-by: Dartmouth College's Datalad project
For some reason, cabal 3.4.1.0 builds w/o the assistant and webapp,
even when the flag is explicitly turned on. Moving the build-depends from
inside the if flag section to the main build-depends somehow fixes this.
Since the webapp build deps are thus always available, there is no reason
not to build the webapp when building the assistant. So, got rid of the
webapp build flag. Kept the assistant build flag for now, since building
without it does at least still speed up the build.
Sponsored-by: Brock Spratlen on Patreon
Too big a footgun.
This does not prevent attackers who can write to the directory being
imported from racing the check. But they can cause anything to be imported
anyway, so would be limited to getting the legacy import to follow into a
directory they do not write to, and move files out of it into the annex.
(The directory special remote does not have that problem since it does not
move files.)
Sponsored-by: Jack Hill on Patreon
Fix a regression in 10.20220624 that caused git-annex add to crash when
there was an unstaged deletion.
Sponsored-by: Dartmouth College's Datalad project
Use curl when annex.security.allowed-url-schemes includes an url scheme not
supported by git-annex internally, as long as
annex.security.allowed-ip-addresses is configured to allow using curl.
Sponsored-by: Luke Shumaker on Patreon
WIP: This is mostly complete, but there is a problem: createDirectoryUnder
throws an error when annex.dbdir is set to outside the git repo.
annex.dbdir is a workaround for filesystems where sqlite does not work,
due to eg, the filesystem not properly supporting locking.
It's intended to be set before initializing the repository. Changing it
in an existing repository can be done, but would be the same as making a
new repository and moving all the annexed objects into it. While the
databases get recreated from the git-annex branch in that situation, any
information that is in the databases but not stored in the branch gets
lost. It may be that no information ever gets stored in the databases
that cannot be reconstructed from the branch, but I have not verified
that.
Sponsored-by: Dartmouth College's Datalad project
Since bup split is not concurrency safe.
Used a lock file so that 2 git-annex processes only run one bup split
between them (per bup repo).
(Concurrent writes from different git-annex repository clones to the same
bup repo could still have concurrency problems.)
Sponsored-by: Noam Kremen on Patreon
It seems worth noting here that I emailed bup's author about bup split
being noisy on stderr even with -q in approximately 2011. That never got
fixed. Its current repo on github only accepts pull requests, not bug
reports. Needing to add such complexity to deal with such a longstanding
unfixed issue is not fun.
Sponsored-by: Kevin Mueller on Patreon
bup split outputs to stderr even with -q. This was discarded when using -J,
but it was still outputting when not using -J, and so was git-annex.
Sponsored-by: Nicholas Golder-Manning on Patreon
Work around bug in git 2.37 that causes a segfault when when
core.untrackedCache is set, and broke git-annex init.
Depending on when git gets fixed and how widely the buggy versions are
used, this could be reverted quite soon, or need to linger for a long time.
It only makes git-annex init a tiny bit slower in a new repo.
Sponsored-by: Max Thoursie on Patreon
This is intended for users who want to see what it would output in order to
eg, check if a file would be added to git or the annex. It is not intended
as a way for scripts to get information.
Sponsored-by: Dartmouth College's Datalad project
Last try at this broke on windows with a problem installing ghc, but I
wanted to try again.
Also this has a version of aws that allows using aeson 2.0, which has a
potential security fix.
(And v9 later on to v10.)
When v9/v10 were added, making v8 automatically upgrade was deferred
"for a few months" to prevent interoperability problems if users also
have an old version of git-annex. Of course that could still be the
case, but there has been a good amount of time and this can't be put off
forever.
Allow setting annex.autoupgraderepository to false to avoid this upgrade.
Previously, that only prevented upgrades from no longer supported git-annex
versions, but v8 is still supported, and users may want to keep on v8 to
interoperate with an old git-annex version.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
I would like for a new repo version to enable appends, but to do so
safely would need a v11 followed by a 1 year delay followed by a v12
that does it. Since a similar v9 and v10 transition is currently
happening, and is less than 6 months along in most repos, it does not
feel wise to stack up another year-long transition behind that. What if
I need to hurry up a new repo version for some other change?
Added todo so I remember to make this change at some time when a v11
and probably v12 repo version do make sense.
Sponsored-by: Dartmouth College's DANDI project
Added annex.alwayscompact setting which can be unset to speed up writes to
the git-annex branch in some cases.
Sponsored-by: Dartmouth College's DANDI project
Fix a reversion that prevented --batch commands (and the assistant)
from noticing data written to the journal by other commands.
I have not identified which commit broke this for sure,
but probably it was aeca7c2207
--batch commands that wrote to the journal avoided the problem since
journalIgnorable sets unset on write. It's a little bit surprising that
nobody noticed that query --batch commands did not see data written by
other commands.
Sponsored-by: Dartmouth College's DANDI project
It does not make sense for either; importing from an existing bucket should
not write to it. And the user may not have write access at all. And exporting to
a bucket should not write other files.
Also this prevents the uuid file being imported after being written.
Sponsored-by: Dartmouth College's DANDI project
To avoid using find -printf, which was first supported in Android around
2019-2020.
Probing seems too fragile, and execing stat once per file is too slow to do
when there's a faster way available, which brought me to an option...
Sponsored-by: Brett Eisenberg on Patreon
This caused git to complain that filter-process failed and kill it with
signal 15. Because it wrote an extra flushPkt for an empty file, which
git did not expect, and so git saw an unexpected response to the next
request.
Luckily, filter-process is only used by default in v9 and up, and v8 is
still the default. Also, git had to be updating an empty file, followed
by another file, which is a fairly unlikely situation. And git restarts
filter-process after this happens and uses it to filter the rest of the
files. So this isn't a crippling bug.
Sponsored-by: Luke Shumaker on Patreon
This reverts commit d2bc268317.
That seemed to break building on windows, before it starts building
git-annex at all, it tried to install ghc and something blew up:
Processing archive: C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2.tar.xz
Extracting ghc-9.0.2.tar
...
Extracted total of 11790 files from ghc-9.0.2.tar
C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\ghc-9.0.2\: renameDirectory:pathIsDirectory:CreateFile "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\Programs\\stack\\x86_64-windows\\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\\ghc-9.0.2\\": does not exist (The system cannot find the file specified.)
Hopefully a newer ghc version or updated stackage version will fix this
at some point, in the meantime revert it.
The webapp modules cannot build with the assistant disabled, so make the
webapp be under the assistant build flag.
Sponsored-by: Jarkko Kniivilä on Patreon
--backend is no longer a global option, and is only accepted by commands
that actually need it.
Three commands that used to support backend but don't any longer are
watch, webapp, and assistant. It would be possible to make them support it,
but I doubt anyone used the option with these. And in the case of webapp
and assistant, the option was handled inconsistently, only taking affect
when the command is run with an existing git-annex repo, not when it
creates a new one.
Also, renamed GlobalOption etc to AnnexOption. Because there are many
options of this type that are not actually global (any more) and get
added to commands that need them.
Sponsored-by: Kevin Mueller on Patreon
Improve handling of parallelization with -J when copying content from/to a
git remote that is a local path.
Sponsored-by: Nicholas Golder-Manning on Patreon
When adding a small file, it does not get locked down, so can be modified
after git-annex checks that it's small. The use of queued git add made the
race window nice and wide too.
Fixed by checking if the file has changed, and by not using git add.
Instead, have to recapitulate git add's handling of things like symlinks
and executable files.
Sponsored-by: Jochen Bartl on Patreon
The remaining callers all did not rely on it checking gitignore, so were
easy to convert.
They were susceptable to the same overwrite race as add and fix,
although less likely to have it and a narrower window than add's race.
Command.Rekey in passing got an unncessary call to removeFile deleted.
addSymlink handles deleting any existing worktree file.
Similar to git-annex add, git-annex fix queued git add, so if a file
got modified before git add ran, the wrong content would be staged,
perhaps a large file content.
Sponsored-by: Brock Spratlen on Patreon
This is not a complete fix for all such races, only the one where a
large file gets changed while adding and gets added to git rather than
to the annex.
addLink needs to go away, any caller of it is probably subject to the
same kind of race. (Also, addLink itself fails to check gitignore when
symlinks are not supported.)
ingestAdd no longer checks gitignore. (It didn't check it consistently
before either, since there were cases where it did not run git add!)
When git-annex import calls it, it's already checked gitignore itself
earlier. When git-annex add calls it, it's usually on files found
by withFilesNotInGit, which handles checking ignores.
There was one other case, when git-annex add --batch calls it. In that
case, old git-annex behaved rather badly, it would seem to add the file,
but git add would later fail, leaving the file as an unstaged annex symlink.
That behavior has also been fixed.
Sponsored-by: Brett Eisenberg on Patreon
move: Improve resuming a move that succeeded in transferring the content,
but where dropping failed due to eg a network problem, in cases where
numcopies checks prevented the resumed move from dropping the object from
the source repository.
This was earlier done for moves that got interrupted during the drop stage.
Sponsored-by: Svenne Krap on Patreon
Fix retrival of an empty file that is stored in a special remote with
chunking enabled.
The speculative chunk stuff caused a reversion by adding an empty list for
the empty file. Which is just wrong; the empty file is still stored on the
remote, and should be retrieved like any other file. It uses 1 chunk, so
`max 1` is the simple fix.
Sponsored-by: Noam Kremen on Patreon
rather than matching path of an existing remote to find the uuid.
The main benefit of this is that locations not using ssh:// will work
now, including both paths and host:/path
The other benefit is that it's a simpler interface, no need to have an
existing remote with the same url and some other name. Although that
will still work of course.
This does rely on tryGitConfigRead working when given a Git.Repo that is
not a remote. Luckily, it works fine that way.
Also, tryGitConfigRead will auto-init a local repo that has a git-annex
branch. I did not enable auto-init of ssh repos though.
The uuid discovery actually happens twice; initremote discovers it,
and uses it to store the special remote config, but does not set it in the
git remote it creates. So the next run of git-annex does uuid discovery
again, and caches it that time. This could be improved for a tiny
speedup, but I didn't want to complicate things for that in this
commit.
Sponsored-by: Dartmouth College's DANDI project
Use cases include using git-annex init --no-autoenable and then going back
and enabling the special remotes that have autoenable configured. As well
as just querying to remember which ones have it enabled.
It lists all special remotes that have autoenable=yes whether currently
enabled or not. And it can be used with --json.
I pondered making this "git-annex info autoenable", but that seemed wrong
because then if the use has a directory named "autoenable", it's unclear
what they are asking for. (Although "git-annex info remote" may be
similarly unclear.) Making it an option does mean that it can't be provided
via --batch though.
Sponsored-by: Dartmouth College's Datalad project
Someone may disagree with what repositories are set to autoenable and
it's good to have local overrides.
See https://github.com/datalad/datalad/issues/6634
Sponsored-by: Dartmouth College's Datalad project