This makes annexFileMode be just an application of setAnnexPerm',
which avoids having 2 functions that do different versions of the same
thing.
Fixes some buggy behavior for some combinations of core.sharedRepository
and umask.
Sponsored-by: Jack Hill on Patreon
These two missed setting it.
It rarely matters that the journal gets the right perm. But, when using
annex.alwayscommit=false, someone else may come along later and want
to append to the journal file.
It probably never matters what the sentinal perms are, but for
completeness..
Sponsored-by: Luke Shumaker on Patreon
init: Bug fix: Create .git/annex/ and .git/annex/fsckdb/ directories with
permissions configured by core.sharedRepository.
The fsckfb being created happens to create .git/annex/ and it was not using
createAnnexDirectory. Probably a reversion partly, but maybe the database
directory was always created not honoring core.sharedRepository?
Sponsored-by: Noam Kremen on Patreon
This spams the user with a lot of messages, but it seems like busywork to
avoid that and only warn once, since this warning will go away when it gets
implemented.
Also fix parsing of the octal value.
Sponsored-by: Kevin Mueller on Patreon
That's too much quoting, the user expects the filename to be copy and
pasteable. It would be ok to slash-escape space ('\ ')
which is what gnu find does, but it doesn't seem necessary either.
${escaped_file} has always quoted spaces though, so keep on doing it
there.
Sponsored-by: Nicholas Golder-Manning on Patreon
When a nonexistant file is passed to a command and --json-error-messages
is enabled, output a JSON object indicating the problem.
(But git ls-files --error-unmatch still displays errors about such files in
some situations.)
I don't like the duplication of the name of the command introduced by this,
but I can't see a great way around it. One way would be to pass the Command
instead.
When json is not enabled, the stderr is unchanged. This is necessary
because some commands like find have custom output. So dislaying
"find foo not found" would be wrong. So had to complicate things with
toplevelFileProblem having different output with and without json.
When not using --json-error-messages but still using --json, it displays
the error to stderr, but does display a json object without the error. It
does have an errorid though. Unsure how useful that behavior is.
Sponsored-by: Dartmouth College's Datalad project
This reverts commit a325524454.
Turns out this was predicated on an incorrect belief that json output
didn't already sometimes lack the "key" field. Since json output already
can when `giveup` was used, it seems unncessary to add a whole new
option for this.
Added a --json-exceptions option, which makes some exceptions be output in json.
The distinction is that --json-error-messages is for messages relating
to a particular ActionItem, while --json-exceptions is for messages that
are not, eg ones for a file that does not exist.
It's unfortunate that we need two switches with such a fine distinction
between them, but I'm worried about maintaining backwards compatability
in the json output, to avoid breaking anything that parses it, and this was
the way to make sure I didn't.
toplevelWarning is generally used for the latter kind of message. And
the other calls to toplevelWarning could be converted to showException. The
only possible gotcha is that if toplevelWarning is ever called after
starting acting on a file, it will add to the --json-error-messages of the
json displayed for that file and converting to showException would be a
behavior change. That seems unlikely, but I didn't convery everything to
avoid needing to satisfy myself it was not a concern.
Sponsored-by: Dartmouth College's Datalad project
Propagate Annex.force into the remote's Annex state.
Fixes this problem:
joey@darkstar:~/tmp/xxxx>git-annex copy mmm --to origin --force
copy mmm (to origin...)
not enough free space, need 908.72 MB more (use --force to override this check or adjust annex.diskreserve)
failed to send content to remote
failed
Does beg the question if anything else should be propagated.
Some things like Annex.forcenumcopies certianly not; using --numcopies
overrides the number of copies the current repo wants, not all of them.
Sponsored-by: Graham Spencer on Patreon
New command, currently limited to changing autoenable= setting of a special remote.
It will probably never be used for more than that given the limitations on
it.
Sponsored-by: Brock Spratlen on Patreon
enableremote: Support enableremote of a git remote (that was previously set
up with initremote) when additional parameters such as autoenable= are
passed.
The enableremote special case for regular git repos is intended to handle
ones that don't have a UUID probed, and the user wants git-annex to
re-probe. So, that special case is still needed. But, in that special
case, the user is not passing any extra parameters. So, when there are
parameters, instead run the special remote setup code. That requires there
to be a uuid known already, and it allows changing things like autoenable=
Remote.Git.enableRemote changed to be a no-op if a git remote with the name
already exists. Which it generally will in this case.
Sponsored-by: Jack Hill on Patreon
These are quite low-level, but still there is no point in displaying
escape sequences that have been embedded in a key to the terminal.
I think these are the only remaining commands that didn't use safe
output, except for cases where git-annex is speaking a protocol to
itself.
Sponsored-by: Kevin Mueller on Patreon
I'm on the fence about this. Notice that pulling from a git remote can
pull branches that have escape sequences in their names. Git will
display those as-is. Arguably git should try harder to avoid that.
But, names of remotes are usually up to the local user, and autoenable
changes that, and so it makes sense that git chooses to display control
characters in names of remotes, and so autoenable needs to guard against
it.
Sponsored-by: Graham Spencer on Patreon
Searched for uses of putStr and hPutStr and changed appropriate ones to filter
out control characters and quote filenames.
This notably does not make find and findkeys quote filenames in their default
output. Because they should only do that when stdout is non a pipe.
A few commands like calckey and lookupkey seem too low-level to make sense to filter
output, so skipped those.
Also when relaying output from other commands that is not progress output,
have git-annex filter out control characters.
Sponsored-by: k0ld on Patreon
As well as escape sequences, control characters seem unlikely to be desired when
doing addurl, and likely to trip someone up. So disallow them as well.
I did consider going the other way and allowing filenames with control characters
and escape sequences, since git-annex is in the process of escaping display
of all filenames. Might still be a better idea?
Also display the illegal filename git quoted when it rejects it.
Sponsored-by: Nicholas Golder-Manning on Patreon
This is by no means complete, but escaping filenames in actionItemDesc does
cover most commands.
Note that for ActionItemBranchFilePath, the value is branch:file, and I
choose to only quote the file part (if necessary). I considered quoting the
whole thing. But, branch names cannot contain control characters, and while
they can contain unicode, git coes not quote unicode when displaying branch
names. So, it would be surprising for git-annex to quote unicode in a
branch name.
The find command is the most obvious command that still needs to be
dealt with. There are probably other places that filenames also get
displayed, eg embedded in error messages.
Some other commands use ActionItemOther with a filename, I think that
ActionItemOther should either be pre-sanitized, or should explicitly not
be used for filenames, so that needs more work.
When --json is used, unicode does not get escaped, but control
characters were already escaped in json.
(Key escaping may turn out to be needed, but I'm ignoring that for now.)
Sponsored-by: unqueued on Patreon
registerurl: When an url is claimed by a special remote other than the web,
update location tracking for that special remote.
registerurl's behavior was changed in commit
451171b7c1, apparently accidentially to not
update location tracking except for the web.
This makes registerurl followed by unregisterurl not be a no-op, when the
url happens to be claimed by a remote other than the web. It is a noop when
the url is unclaimed except by the web. I don't like the inconsistency,
and wish that registerurl and unregisterurl never updated location
tracking, which would be more in keeping with them being plumbing.
But there is the fact that it used to behave this way, and also it was
inconsistent that it updated location tracking for the web but not for
other remotes, unlike addurl. And there's an argument that the user might
not know what remote to expect to claim an url, so would be considerably in
the dark when using registerurl. (Although they have to know what content
gets downloaded, since they specify a key..)
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
This serves two purposes. --remote=web bypasses other special remotes that
claim the url, same as addurl --raw. And, specifying some other remote
allows making sure that an url is claimed by the remote you expect,
which makes then using setpresentkey not be fragile.
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
This reverts commit 66eb63dd82.
git-annex init is the only thing that uses ensureCommit. So overriding
there will make later commits to the git-annex branch or by git-annex sync
fail.
It's ugly that git-annex init sets user.name and user.email, but it only
does it on systems that are badly configured.
When it's set and git cannot determine user.name or user.email, this will
result in git-annex init failing when committing to create the git-annex
branch. Other git-annex commands that commit can also fail.
Sponsored-by: Jack Hill on Patreon
Avoid setting user.name and user.email in the git config when git is unable
to detect them.
git-annex has good reason to want to ensure git commit succeeds when eg
committing to the git-annex branch. But it's not playing nice to set these
values where other commands can see them.
Sponsored-by: Brett Eisenberg on Patreon
Fix laziness bug introduced in last release that breaks use of
--unlock-present and --hide-missing adjusted branches.
Since there is a writeFile of the same file immediately after readFile, it
may still have the file open for read (or may have happened to read it
already and closed it).
I was not able to reproduce the problem in brief testing, but this seems
obvious.
Sponsored-by: Luke Shumaker on Patreona
Support VERSION 2 in the external special remote protocol, which is
identical to VERSION 1, but avoids external remote programs neededing to
work around the above bug. External remote program that support
exporttree=yes are recommended to be updated to send VERSION 2.
Sponsored-by: Kevin Mueller on Patreon
Fix bug that caused broken protocol to be used with external remotes that
use exporttree=yes. In some cases this could result in the wrong content
being exported to, or retrieved from the remote.
Sponsored-by: Nicholas Golder-Manning on Patreon
Remote.Directory makes a temp file, then calls this, and since the temp
file exists, it prevented probing if CoW works.
Note that deleting the empty file does mean there's a small window for a
race. If another process is also exporting to the remote, that could let it
make the same temp file. However, the temp filename actually has the
processes's pid in it, which avoids that being a problem.
This may have been a reversion caused by commits around
63d508e885, but I haven't gone back and
tested to be sure. The directory special remote had supposedly supported
CoW for this going back to about half a year before that.
Sponsored-by: Graham Spencer on Patreon
The temporary URL key used for the download, before the real key is
generated, was blocked by annex.securehashesonly.
Fixed by passing the Backend that will be used for the final key into
runTransfer. When a Backend is provided, have preCheckSecureHashes
check that, rather than the key being transferred.
Sponsored-by: unqueued on Patreon
That is a legal url, but parseUrl parses it to "/c:/path"
which is not a valid path on Windows. So as a workaround, use
parseURIPortable everywhere, which removes the leading slash when
run on windows.
Note that if an url is parsed like this and then serialized back
to a string, it will be different from the input. Which could
potentially be a problem, but is probably not in practice.
An alternative way to do it would be to have an uriPathPortable
that fixes up the path after parsing. But it would be harder to
make sure that is used everywhere, since uriPath is also used
when constructing an URI.
It's also worth noting that System.FilePath.normalize "/c:/path"
yields "c:/path". The reason I didn't use it is that it also
may change "/" to "\" in the path and I wanted to keep the url
changes minimal. Also noticed that convertToWindowsNativeNamespace
handles "/c:/path" the same as "c:/path".
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
view: Support annex.maxextensionlength when generating filenames for the
view branch.
Note that refining an existing view will reuse the extension length that was
configured when initially constructing the view. This is necessarily the case
because it reuses the filenames.
Also view files used to have all extensions at the end, no matter how
many there were. Since annex.maxextensionlength's documentation includes
that it's limited to 2 extensions, I made it consistent with that.
Sponsored-by: k0ld on Patreon
I don't know of scenarios where that can happen (besides the bug
fixed by the parent commit), but there probably are some.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
Avoid failure to update adjusted branch --unlock-present after git-annex
drop when annex.adjustedbranchrefresh=1
At higher values, it did flush the queue, which ran restagePointerFiles.
But at 1, adjustedBranchRefreshFull gets added to the queue, and while
restagePointerFiles is also in the queue, it runs after that.
Sponsored-by: Brock Spratlen on Patreon
Such an url is not valid; parseURI will fail on it. But git-annex doesn't
actually need to parse the url, because all it needs to do to support
syncing with it is know that it's not a local path, and use git pull and
push.
(Note that there is no good reason for the user to use such an url. An
absolute url is valid and I patched git-remote-gcrypt to support them
years ago. Still, users gonna do anything that tools allow, and
git-remote-gcrypt still supports them.)
Sponsored-by: Jack Hill on Patreon
copy: When --from and --to are combined and the content is already present
on the destination remote, update location tracking as necessary.
Sponsored-by: Dartmouth College's DANDI project
A repository can have a newline in its description due to being in a
directory containing a newline, or due to git-annex describe being
passed a string with a newline in it for some reason. Putting that
newline in uuid.log breaks its format.
So, escape the newline when it enters uuid.log, to \n
This is a one-way escaping, it is not converted back to a newline
when reading the log. If it were, commands like git-annex info and
whereis would display a multi-line description, which could be confusing
to read.
And, implementing roundtripping would necessarily cause problems if an
old version of git-annex were used to set a description that contained
whatever special character is used to escape the \n. Eg, a \ or if
it used the ! prefix before base64 data that is used in some other logs,
the ! character. Then the description set by the old git-annex would not
roundtrip.
There just doesn't seem to be any benefit of roundtripping newlines through,
so why bother? And, git often displays \n for newline when a filename
contains a newline, so git-annex doing it in this case seems sorta ok
by analogy to git.
(Some other git-annex logs can also have newlines put into them if the
user really wants to break git-annex. For example:
git-annex config annex.largefiles "foo
bar"
The full list is probably config.log, remote.log, group.log,
preferred-content.log, required-content.log,
group-preferred-content.log, schedule.log. Probably there is no
good reason to use a newline in any of these, and the breakage is
probably limited to the bad data the user put in not coming back out.
And users can write any garbage to log files themselves manually in any
case. So, I am not going to address all of those at this time. If a
problem such as this one with the newline in the repository path comes
up, it can be dealt with on a case by case basis.)
Sponsored-by: Dartmouth College's Datalad project
When importing a bunch of feeds, this makes it more clear what it's working
on. Also, I sometimes want to delete a particular feed from a list of feeds
but don't know which url belongs to the feed, and this solves that.
Control characters are filtered out just to protect against some feed
putting escape character stuff in the feed, which could be a
security problem. (Control characters also get filtered out of
importfeed filenames.)
Sponsored-by: Luke Shumaker on Patreon
Added arm64 build for ancient kernels, needed to support Android phones
whose kernels are too old to support kernels used by the current arm64
build.
Updated Android/git-annex-install to use it. (Also made it use i386-ancient
because that seems like a good idea.)
Sponsored-by: Noam Kremen on Patreon
sync: Fix a reversion that prevented sending files to exporttree=yes
remotes when annex-tracking-branch was configured to branch:subdir
(Introduced in version 10.20230214)
Sponsored-by: Kevin Mueller on Patreon
Works around this bug in unix-compat:
https://github.com/jacobstanley/unix-compat/issues/56
getFileStatus and other FilePath using functions in unix-compat do not do
UNC conversion on Windows.
Made Utility.RawFilePath use convertToWindowsNativeNamespace to do the
necessary conversion on windows to support long filenames.
Audited all imports of System.PosixCompat.Files to make sure that no
functions that operate on FilePath were imported from it. Instead, use
the equvilants from Utility.RawFilePath. In particular the
re-export of that module in Common had to be removed, which led to lots
of other changes throughout the code.
The changes to Build.Configure, Build.DesktopFile, and Build.TestConfig
make Utility.Directory not be needed to build setup. And so let it use
Utility.RawFilePath, which depends on unix, which cannot be in
setup-depends.
Sponsored-by: Dartmouth College's Datalad project
As far as I can see, git-annex status was added to support direct mode, and
like other things added for that, it ought to be deprecated.
Behavior is similar to git status --short, though not identical in a few
cases eg renamed files.
I think datalad does not use this command, although it might have in the
past. Could not find any use of it in the current datalad code.
A deprecation warning at runtime would be the next step, probably will wait
and do that for all the deprecated commands together (except findref).
Well, perhaps it could be documented better, but it's a compositional
feature so users who need it will probably try it and be happy to find
that it works.
When generating the view, check if the key is present.
When syncing in a view branch with an adjustment, run adjustedBranchRefreshFull
the same as is done when syncing in other adjusted branches. This is
needed because the docs for git-annex adjust --unlock-present suggest
using git-annex sync to update the branch when annex.adjustedbranchrefresh
is not set.
Note that, with annex.adjustedbranchrefresh set, it just works! The
adjusted branch gets updated in the usual way and it doesn't matter that
there's a view branch underneath.
And of course, re-running git-annex adjut --unlock-present also works,
as suggested in the docs.
Sponsored-by: Erik Bjäreholt on Patreon
Old message was:
sqlite query crashed: thread blocked indefinitely in an MVar operation
New message is eg:
sqlite worker thread crashed: SQLite3 returned ErrorCan'tOpen while attempting to perform open ".git/annex/keysdb/db".
The worker thread used to throw an exception. But before that
exception was seen by anything waiting on the worker thread to
finish, the takeMVar in queryDb would have crashed with
BlockedIndefinitelyOnMVar.
Sponsored-by: k0ld on Patreon
git-annex.cabal: Move webapp build deps under the Assistant build flag so
git-annex can be built again without yesod etc installed.
Commit 78440ca37d got rid of the webapp build
flag to work around what was apparently a cabal bug. It moved the webapp
build deps to the main build-depends list. But that prevents building
git-annex when yesod etc are not installed.
Putting them under the Assistant build flag seems to not tickle that cabal
bug, and lets git-annex build automatically with the assistant disabled
when the webapp build deps are not installed.
I hypotehesize that the problem may have involved build-depends nested
behind two build flags. Also, cabal clean may need to be run in order
for cabal to find the right solution after this change, when building in
a directory where cabal configure had been run before.
Also moved 3 modules that are needed to build git-annex w/o the assistant
out from under the Assistant build flag.
Sponsored-by: Brock Spratlen on Patreon
This reverts commit 648e59cac2.
Failed to build on windows, because
In the dependencies for haskeline-0.8.2:
Win32-2.11.1.0 from Stack configuration does not match >=2.1 && <2.10 || >=2.12 (latest
matching version is 2.13.4.0)
jkniiv did find a solution that builds:
-- Win32-2.11.1.0
+- Win32-2.9.0.0
+- Cabal-3.6.3.0
+- directory-1.3.7.1
+- process-1.6.17.0
+- time-1.11.1.2
But that is a quite old version of Win32 and risks bugs from it, and bumping
Cabal and directory to newer than lts-19.33 has seems also likely to be risky.
So, I've given up. aws-0.24 won't be able to be in the stack build until
there's a stackage lts (or nightly) that has filepath (>=1.4.100.0),
which will not happen until sometime after the next ghc release.
info: Fix reversion in last release involving handling of unsupported input
by continuing to handle any other inputs, before exiting nonzero at the
end.
Sponsored-by: Dartmouth College's Datalad project
view: Fix a reversion in 10.20230214 that omitted a file from a view when
the file had no metadata set, but the view only used path fields.
Sponsored-by: Jack Hill on Patreon
This enables some new features that need the new aws.
Win32 downgraded from the version in lts-19.33 because git-annex does not build
with that version, and newer versions of Win32 need a newer filepath version,
which can't be upgraded while using lts-19.33.
Sponsored-By: Brett Eisenberg on Patreon
path to a bare repo when git config is not allowed to list the configs
due to the CVE-2022-24765 fix.
That resulted in a confusing error message, and prevented the nice
message that explains how to mark the repo as safe to use.
Made isBare a tristate so that the case where core.bare is not returned can
be handled.
The handling in updateLocation is to check if the directory
contains config and objects and if so assume it's bare.
Note that if that heuristic is somehow wrong, it would construct a repo
that thinks it's bare but is not. That could cause follow-on problems,
but since git-annex then checks checkRepoConfigInaccessible, and skips
using the repo anyway, a wrong guess should not be a problem.
Sponsored-by: Luke Shumaker on Patreon
Used to fail with a bad error message, indicating there was no
repository with the specified name, or something like that. Now, suggest
they use the uuid to disambiguate.
* info, enableremotemote, renameremote: Avoid a confusing message when more
than one repository matches the user provided name.
* info: Exit nonzero when the input is not supported.
Sponsored-by: Kevin Mueller on Patreon
A benchmark in my sound repository with `git-annex view feedtitle=*`
took 2:52 wall clock time before and 1:58 after. Though it still only used
130% of CPU.
This is the same kind of optimisation that is in seekFilteredKeys, though
that precaches location logs while this streams the metadata logs direct
to parsing them.
seekFilteredKeys contains more streaming, to find the annexed files, and
this could be further sped up with similar streaming.
Sponsored-by: Nicholas Golder-Manning on Patreon
sync: Fix a bug that caused files to be removed from an importtree=yes
exporttree=yes special remote when the remote's annex-tracking-branch was
not the currently checked out branch.
Sponsored-by: Max Thoursie on Patreon
* sync: When run in a view branch, refresh the view branch to reflect any
changes that have been made to the parent branch or metadata.
This is basically working, but probably needs some more work to deal with
all the edge cases of things sync does.
Sponsored-by: Lawrence Brogan on Patreon
* sync: Avoid pushing view branches to remotes.
* Changed the name of view branches to include the parent branch.
Existing view branches checked out using an old name will still work.
It does not seem useful for sync to push view branches around, because
the information in a view branch can entirely be derived from other
information in git. And sync doesn't push adjusted branches around either.
The better view branch names make it more in line with adjusted branch
names, but were also needed to make fromViewBranch be able to return the
original branch name.
Kept the old view branch names still working. But, when those branches
exist in a repo, sync will still try to push them as before. Avoiding
that would need more complicated and/or expensive changes to sync.
Sponsored-By: Boyd Stephen Smith Jr. on Patreon
* view: New field?=glob and ?tag syntax that includes a directory "_"
in the view for files that do not have the specified metadata set.
* Added annex.viewunsetdirectory git config to change the name of the
"_" directory in a view.
When in a view using the new syntax, old git-annex will fail to parse the
view log. It errors with "Not in a view.", which is not ideal. But that
only affects view commands.
annex.viewunsetdirectory is included in the View for a couple of reasons.
One is to avoid needing to warn the user that it should not be changed when
in a view, since that would confuse git-annex. Another reason is that it
helped with plumbing the value through to some pure functions.
annex.viewunsetdirectory is actually mangled the same as any other view
directory. So if it's configured to something like "N/A", there won't be
multiple levels of directories, which would also confuse git-annex.
Sponsored-By: Jack Hill on Patreon
S3: Support a region= configuration useful for some non-Amazon S3
implementations. This feature needs git-annex to be built with aws-0.24.
datacenter= sets both the AWS hostname and region in one setting, which is
easy when using AWS, but not useful for other hosts. So kept datacenter
as-is, but added this additional config.
Sponsored-By: Brett Eisenberg on Patreon
Allowing --from and --to as an alternative to --from or --to
is hard to do with optparse-applicative!
The obvious approach of (pfrom <|> pto <|> pfromandto) does not work
when pfromandto uses the same option names as pfrom and pto do.
It compiles but the generated parser does not work for all desired
combinations.
Instead, have to parse optionally from and optionally to. When neither
is provided, the parser succeeds, but it's a result that can't be
handled. So, have to giveup after option parsing. There does not seem to
be a way to make an optparse-applicative Parser give up internally
either.
Also, need seek' because I first tried making fto be a where binding,
but that resulted in a hang when git-annex move was run without --from
or --to. I think because startConcurrency was not expecting the stages
value to contain an exception and so ended up blocking.
Sponsored-by: Dartmouth College's DANDI project
I've long been asked for `git-annex find --all` or something like that,
but pushed back on it because I feel that the command is analagous to
find(1) and so it would be surprising for it to list keys rather than
files. So instead, add a new findkeys subcommand.
Note that the use of withKeyOptions is rather strange because usually
that is used to fall back to --all rather than listing files, but here
it's made to default to --all like behavior and never list files.
A performance thing that could be improved is that withKeyOptions
always reads and caches location logs. But findkeys with no options does
not need them, so it could be made faster. That caching does speed up
options like --in though. This is really just a subset of a more general
performance thing that --all reads location logs sometimes unncessarily.
Anyway, it needs to read the location log in order to checkDead,
and it seems good that findkeys does skip dead keys.
Also, cleaned up comments on git-annex-find man page asking for --all
option.
Sponsored-by: Dartmouth College's DANDI project
Note that when this is specified and an older git-annex is used to
enableremote such a special remote, it will simply ignore the cost= field
and use whatever the default cost is.
In passing, fixed adb to support the remote.name.cost and
remote.name.cost-command configs.
Sponsored-by: Dartmouth College's DANDI project
Allow initremote of additional special remotes with type=web, in addition
to the default web special remote.
When --sameas=web is used, these provide additional names for the web
special remote, and may also have their own additional configuration
(once there is any for the web special remote) and cost.
Sponsored-by: Dartmouth College's DANDI project
AFAICS all git-annex builds are using the git-lfs library not the vendored
copy.
Debian stable does have a too old haskell-git-lfs package to be able to
build git-annex from source, but there is not currently a backport of a
recent git-annex to Debian stable. And if they update the backport at some
point, they should be able to backport the library too.
Sponsored-by: Svenne Krap on Patreon
In Makefile, listed additional deps of Build/Standalone. Without that,
it does not get updated for the change to Utility/LinuxMkLibs.hs when
compiling incrementally.
Sponsored-by: Dartmouth College's DANDI project
Speed up git-annex upgrade (from v5) and init in a repository that has
submodules. Setting the config does not affect the submodules, so avoid
the work of getting status in them, which may involve using the smudge
filter etc.
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
Added --anything (and --nothing). Eg, git-annex find --anything will list
all annexed files whether or not the content is present. This is slightly
faster and clearer than --include=* or --exclude=*
While I can't imagine how --nothing will be used, preferred content
expressions already had anything and nothing, so might as well support both
as matching options as well.
Sponsored-by: Dartmouth College's Datalad project
Improve handling of some .git/annex/ subdirectories being on other
filesystems, in the bittorrent special remote, and youtube-dl integration,
and git-annex addurl.
The only one of these that I've confirmed to be a problem is in the
bittorrent special remote when .git/annex/tmp and .git/annex/othertmp are
on different filesystems.
As well as auditing for renameFile, also audited for createLink, all of
those are ok as are the other remaining renameFile calls. Also audited all
code paths that use .git/annex/othertmp, and did not find any other
cross-device problems. So, removing mention of othertmp needing to be on
the same device.
Sponsored-by: Dartmouth College's Datalad project
Change --metadata comparisons < > <= and >= to fall back to lexicographical
comparisons when one or both values being compared are not numbers.
Sponsored-by: Erik Bjäreholt on Patreon
Fix a hang that occasionally occurred during commands such as move.
(A bug introduced in 10.20220927, in
commit 6a3bd283b8)
The restage.log was kept locked while running a complex index refresh
action. In an unusual situation, that action could need to write to the
restage log, which caused a deadlock.
The solution is a two-stage process. First the restage.log is moved to a
work file, which is done with the lock held. Then the content of the work
file is read and processed, which happens without the lock being held.
This is all done in a crash-safe manner.
Note that streamRestageLog may not be fully safe to run concurrently
with itself. That's ok, because restagePointerFiles uses it with the
index lock held, so only one can be run at a time.
streamRestageLog does delete the restage.old file at the end without
locking. If a calcRestageLog is run concurrently, it will either see the
file content before it was deleted, or will see it's missing. Either is
ok, because at most this will cause calcRestageLog to report more
work remains to be done than there is.
Sponsored-by: Dartmouth College's Datalad project
Debian is going to drop youtube-dl which is not active upstream, and yt-dlp
is the replacement. This will make it be used if youtube-dl gets removed.
If an old version of youtube-dl remains installed, git-annex will still use
it. That might not be desirable, but changing git-annex to use yt-dlp in
preference to youtube-dl when both are installed risks breaking when
the user has annex.youtube-dl-options set to something that is supported
by youtube-dl, but not by yt-dlp.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
init: Avoid scanning for annexed files, which can be lengthy in a
large repository. Instead that scan is done on demand. This lets git-annex
init be run and some query commands be used in a repository without
waiting.
Note that autoinit already behaved this way, so while this will mean some
commands like git-annex get/unlock/add will do the scan the first time run,
that is not really a significant behavior change.
And, it's really better to have a consistent behavior. The reason for
the inconsistency was a strange bug discussed in
b3c4579c79. Avoiding reconcileStaged in
init will keep avoiding whatever that was.
Sponsored-by: Dartmouth College's DANDI project
Increasing the size of the queue 10x makes git-annex init 7% faster in a
repository with 86000 annexed files.
The memory use goes up, from 70876 kb to 85376 kb.
Avoids database querying overhead when the database is newly created.
In the large repository where git-annex init took 24 seconds, this sped it
up to 20.47 seconds, a speedup of around 15%.
Sponsored-by: Dartmouth College's DANDI project
This reverts commit 15dd7fe84b.
aws 0.23 is not used any longer, so read-only S3 import won't be
supported yet when building with stack.
That commit broke the build on windows, because the new version of Win32 that
was included (because the old one does not work with this lts version)
needs a version of filepath that is newer than the one bundled with the
ghc in that lts version. It is not possible to override that to a newer
filepath.
Seems that the only solution to get aws 0.23 will be to wait for a ghc that
contains filepath 1.4.100.0. No ghc yet contains it. (Backporting the Win32
fix to a point release version that does not include this bleeding edge
filepath would also resolve it, but seems unlikely to happen.)
Sponsored-by: Jarkko Kniivilä on Patreon