Increasing the size of the queue 10x makes git-annex init 7% faster in a
repository with 86000 annexed files.
The memory use goes up, from 70876 kb to 85376 kb.
Avoids database querying overhead when the database is newly created.
In the large repository where git-annex init took 24 seconds, this sped it
up to 20.47 seconds, a speedup of around 15%.
Sponsored-by: Dartmouth College's DANDI project
export: Fix a bug that left a file on a special remote when two files with
the same content were both deleted in the exported tree.
Case of the wrong data structure leading to the wrong result.
The DiffMap now contains all the old filenames, and all the new filenames.
Note that, when 2 files with the same content are both renamed,
it only renames the first, but deletes and re-exports the second.
Improving that is possible, but it would need to use a different temporary
filename. Anyway, that is an unusual case, and there are known to be other
unusual cases where export does not rename with maximum efficiency, IIRC.
(Or maybe this is the case that I remember?)
Sponsored-by: Dartmouth College's OpenNeuro project
The initTests have to be run once per part, and a point of diminishing
returns can be reached where more work is being done to set up for 1 or
2 tests than to run them.
This is better than a hard cap of -J8 or so, because it lets other
things than these particular tests still be parallelized at -J16.
Sponsored-by: Dartmouth College's Datalad project
Clean the standalone environment before running the su command
to run "sh". Otherwise, PATH leaked through, causing it to run
git-annex.linux/bin/sh, but GIT_ANNEX_DIR was not set,
which caused that script to not work:
[2022-10-26 15:07:02.145466106] (Utility.Process) process [938146] call: pkexec ["sh","-c","cd '/home/joey/tmp/git-annex.linux/r' && '/home/joey/tmp/git-annex.linux/git-annex' 'enable-tor' '1000'"]
/home/joey/tmp/git-annex.linux/bin/sh: 4: exec: /exe/sh: not found
Changed programPath to not use GIT_ANNEX_PROGRAMPATH,
but instead run the scripts at the top of GIT_ANNEX_DIR.
That works both when the standalone environment is set up, and when it's
not.
Sponsored-by: Kevin Mueller on Patreon
Make --batch mode handle unstaged annexed files consistently whether the
file is unlocked or not. Before this, a unstaged locked file
would have the symlink on disk examined and operated on in --batch mode,
while an unstaged unlocked file would be skipped.
Note that, when not in batch mode, unstaged files are skipped over too.
That is actually somewhat new behavior; as late as 7.20191114 a
command like `git-annex whereis .` would operate on unstaged locked
files and skip over unstaged unlocked files. That changed during
optimisation of CmdLine.Seek with apparently little fanfare or notice.
Turns out that rmurl still behaved that way when given an unstaged file
on the command line. It was changed to use lookupKeyStaged to
handle its --batch mode. That also affected its non-batch mode, but
since that's just catching up to the change earlier made to most
other commands, I have not mentioed that in the changelog.
It may be that other uses of lookupKey should also change to
lookupKeyStaged. But it may also be that would slow down some things,
or lead to unwanted behavior changes, so I've kept the changes minimal
for now.
An example of a place where the use of lookupKey is better than
lookupKeyStaged is in Command.AddUrl, where it looks to see if the file
already exists, and adds the url to the file when so. It does not matter
there whether the file is staged or not (when it's locked). The use of
lookupKey in Command.Unused likewise seems good (and faster).
Sponsored-by: Nicholas Golder-Manning on Patreon
While ErrorBusy and other exceptions were caught and the write retried for
up to 10 seconds, it was still possible for git-annex to eventually
give up and error out without writing to the database. Now it will retry
as long as necessary.
This does mean that, if one git-annex process is suspended just as sqlite
has locked the database for writing, another git-annex that tries to write
it it might get stuck retrying forever. But, that could already happen when
opening the sqlite database, which retries forever on ErrorBusy. This is an
area where git-annex is known to not behave well, there's a todo about the
general case of it.
Sponsored-by: Dartmouth College's Datalad project
When importing from versioned remotes, fix tracking of the content of
deleted files.
Only S3 supports versioning so far, so only it was affected.
But, the draft import/export interface for external remotes also seemed to
need a change, so that versionedExport could be set.
move: Fix openFile crash with -J
This does make them a bit slower, although usually the log file is not
very big, so even when it's being rewritten, they will not block for
long taking the lock. Still, little slowdowns may add up when moving a lot
file files.
A less expensive fix would be to use something lower level than openFile
that does not check if the file is already open for write by another
thread. But GHC does not seem to provide anything convenient; even mkFD
checks for a writing thread.
fullLines is no longer necessary since these functions no longer will
read the file while it's being written.
Sponsored-by: Dartmouth College's DANDI project
* trust, untrust, semitrust, dead: Fix behavior when provided with
multiple repositories to operate on.
* trust, untrust, semitrust, dead: When provided with no parameters,
do not operate on a repository that has an empty name.
The man page and usage already indicated that multiple repos could be
provided to these commands, but they actually used unwords to combine
everything into string, and found a repo matching that string. This was
especially bad when no parameters resulted in the empty string and some
repo happened to have an empty description.
This does change the behavior, and it's possible someone relied on the
current behavior to eg, trust a repo by name with the name not quoted into
a single parameter. But fixing the empty string bug and matching the
documentation are worth breaking that usage.
Note that git-annex init/reinit do still unwords multiple parameters when
provided to them. That is inconsistent behavior, but it certianly seems
possible that something does run git-annex init with an unquoted
description, and I don't think it's worth breaking that just to make it more
consistent with these other commands.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
Avoids displaying warning about git-annex restage needing to be run in
situations where it does not.
Closing a handle flushes it anyway, so no need for an explict flush. The
handle does get closed twice, but that's fine, the second one does nothing.
Sponsored-by: Dartmouth College's DANDI project
p2p: Pass wormhole the --appid option before the receive/send command, as
it does not accept that option after the command
I'm left wondering, did I get this wrong from the beginning, or did
wormhole change its option parser? I'm reminded of the change in 0.8.2
where it silently changed what FD the pairing code was output to.
But, looking at the wormhole source, it was at least putting --appid before
send in its test suite from the introduction of the option.
So I think probably this has always been broken. On 2021-12-31 the --appid
option was enabled, and it took until now for someone to try
git-annex p2p --pair and notice that flag day broke it..
Sponsored-by: Svenne Krap on Patreon
As was attempted earlier in the buggy commit 0d2e3058ee
Avoided the bug that had by making the upgrade log be updated after each
upgrade step. So, after upgrade from v8 to v9, the log is updated, and
so Upgrade.V9's timeOfUpgrade check will find that it was upgraded
recently and so won't let it skip ahead to v10.
Sponsored-by: k0ld on Patreon
This is much easier and less failure-prone than having the user run
git update-index --refresh themselves.
Sponsored-by: Dartmouth College's DANDI project
Fixes updating git index file after getting an unlocked file when
annex.stalldetection is set.
The transferrer may want to send additional protocol messages when it's
shut down. Closing the read handle prevented it from doing that, and caused
it to crash rather than cleanly shutting down.
Draining the handle without processing the protocol seemed ok to do,
because anything it outputs is going to be some side message displayed
at shutdown. Displaying those once per transferrer process that is running
seems unncessary.
Sponsored-by: Dartmouth College's DANDI project
Fix a reversion that prevented git-annex from working in a repository when
--git-dir or GIT_DIR is specified to relocate the git directory to
somewhere else. (Introduced in version 10.20220525)
checkRepoConfigInaccessible could still run git config --list, just passing
--git-dir. It seems not necessary, because I know that passing --git-dir
bypasses git's check for repo ownership. I suppose it might be that git
eventually changes to check something about the ownership of the working
tree, so passing --git-dir without --work-tree would still be worth doing.
But for now this is the simple fix.
Sponsored-by: Nicholas Golder-Manning on Patreon
Improve handling of directory special remotes with importtree=yes whose
ignoreinode setting has been changed. (By either enableremote or by
upgrading to commit 3e2f1f73cbc5fc10475745b3c3133267bd1850a7.)
When getting a file from such a remote, accept the content that would have
been accepted with the previous ignoreinode setting.
After a change to ignoreinode, importing a tree from the remote will
re-import and generate new content identifiers using the new config. So
when ignoreinode has changed to no, the inodes will be learned, and after
that point, a change in an inode will be detected as a change. Before
re-importing, a change in an inode will be ignored, as it was before the
ignoreinode change. This seems acceptble, because the user can re-import
immediately if they urgently need to add inodes. And if not, they'll
do it sometime, presumably, and the change will take effect then.
Sponsored-by: Erik Bjäreholt on Patreon
Use curl for downloads from git remotes when annex.url-options and other
git configs are set.
If the url needs a password, curl will fail, and git credential will not be
used to prompt for it. But the user can set --netrc in url-options and
put the password in the netrc file.
This also means that url-options settings like -4 will take effect.
That was the case before commit 1883f7ef8f
forced conduit to be used.
Fix crash importing from a directory special remote that contains a broken
symlink.
The crash was in listImportableContentsM but some other places in
Remote.Directory also seemed like they could have the same problem.
Also audited for other places that have such a problem. Not all calls
to getFileStatus are bad, in some cases it's better to crash on something
unexpected. For example, `git-annex import path` when the path is a broken
symlink should crash, the same as when it does not exist. Many of the
getFileStatus calls are like that, particularly when they involve
.git/annex/objects which should never have a broken symlink in it.
Fixed a few other possible cases of the problem.
Sponsored-by: Lawrence Brogan on Patreon
For some reason, cabal 3.4.1.0 builds w/o the assistant and webapp,
even when the flag is explicitly turned on. Moving the build-depends from
inside the if flag section to the main build-depends somehow fixes this.
Since the webapp build deps are thus always available, there is no reason
not to build the webapp when building the assistant. So, got rid of the
webapp build flag. Kept the assistant build flag for now, since building
without it does at least still speed up the build.
Sponsored-by: Brock Spratlen on Patreon
Remove closed bugs and todos that were last edited or commented before 2021.
Except for ones tagged projects/* since projects like datalad want to keep
around records of old deleted bugs longer.
Command line used:
for f in $(grep -l '|done\]\]' -- ./*.mdwn); do if ! grep -q "projects/" "$f"; then d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2021 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2021 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; fi; done
for f in $(grep -l '\[\[done\]\]' -- ./*.mdwn); do if ! grep -q "projects/" "$f"; then d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2021 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2021 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; fi; done
Too big a footgun.
This does not prevent attackers who can write to the directory being
imported from racing the check. But they can cause anything to be imported
anyway, so would be limited to getting the legacy import to follow into a
directory they do not write to, and move files out of it into the annex.
(The directory special remote does not have that problem since it does not
move files.)
Sponsored-by: Jack Hill on Patreon
Fix a regression in 10.20220624 that caused git-annex add to crash when
there was an unstaged deletion.
Sponsored-by: Dartmouth College's Datalad project
Help the user get annex.dbdir configured when their filesystem is not
one that sqlite works on.
The change in Database.Handle makes an error from sqlite not be ignored
besides being displayed, which it was before. I can't see any reason
git-annex would want to ignore these errors.
I chose to use the fsck database rather than the keys database because
opening the keys database populates it, and see commit
b3c4579c79.
The placement of the call to checkSqliteWorks inside checkInitializeAllowed
avoids annex.uuid getting set before it's called.
Sponsored-by: Dartmouth College's Datalad project
This allows annex.dbdir to be set globally or always set to the same
value when needed. Each repository uses a subdirectory of it.
Sponsored-by: Dartmouth College's Datalad project
3a513cfe73 caused a reversion in addurl.
The type of addSmall changed, but the void prevented the type checker
from helping notice this. Since it now returns a CommandPerform, the
cleanup action has to be run.
Sponsored-by: Dartmouth College's Datalad project
Since bup split is not concurrency safe.
Used a lock file so that 2 git-annex processes only run one bup split
between them (per bup repo).
(Concurrent writes from different git-annex repository clones to the same
bup repo could still have concurrency problems.)
Sponsored-by: Noam Kremen on Patreon
git-annex copy --to a http remote will of course fail, as that's not
supported. But git-annex copy first checks if the content is already
present in the remote, and that threw a "not found".
Looks to me like other remotes that use Url.checkBoth in their checkPresent
do just return false when it fails. And Url.checkBoth does display
errors when unusual errors occur. So I'm pretty sure removing this error
message is ok.
Sponsored-by: Jarkko Kniivilä on Patreon
It seems worth noting here that I emailed bup's author about bup split
being noisy on stderr even with -q in approximately 2011. That never got
fixed. Its current repo on github only accepts pull requests, not bug
reports. Needing to add such complexity to deal with such a longstanding
unfixed issue is not fun.
Sponsored-by: Kevin Mueller on Patreon
Work around bug in git 2.37 that causes a segfault when when
core.untrackedCache is set, and broke git-annex init.
Depending on when git gets fixed and how widely the buggy versions are
used, this could be reverted quite soon, or need to linger for a long time.
It only makes git-annex init a tiny bit slower in a new repo.
Sponsored-by: Max Thoursie on Patreon
This reverts commit b5dc04099e.
Broke windows build, because the new lts updates Win32 to a version that
lacks a function that git-annex needs. git-annex.cabal depends on an
older Win32, and so stack build fails.
Will need to wait to update stack.yaml until this is fixed
https://github.com/haskell/win32/issues/208
and is in a new LTS release.
Fix a reversion that prevented --batch commands (and the assistant)
from noticing data written to the journal by other commands.
I have not identified which commit broke this for sure,
but probably it was aeca7c2207
--batch commands that wrote to the journal avoided the problem since
journalIgnorable sets unset on write. It's a little bit surprising that
nobody noticed that query --batch commands did not see data written by
other commands.
Sponsored-by: Dartmouth College's DANDI project
To avoid using find -printf, which was first supported in Android around
2019-2020.
Probing seems too fragile, and execing stat once per file is too slow to do
when there's a faster way available, which brought me to an option...
Sponsored-by: Brett Eisenberg on Patreon
This caused git to complain that filter-process failed and kill it with
signal 15. Because it wrote an extra flushPkt for an empty file, which
git did not expect, and so git saw an unexpected response to the next
request.
Luckily, filter-process is only used by default in v9 and up, and v8 is
still the default. Also, git had to be updating an empty file, followed
by another file, which is a fairly unlikely situation. And git restarts
filter-process after this happens and uses it to filter the rest of the
files. So this isn't a crippling bug.
Sponsored-by: Luke Shumaker on Patreon
On Windows, that does not support long paths
https://github.com/jacobstanley/unix-compat/issues/56
Instead, use System.Directory.renamePath, which does support long paths.
Sponsored-by: Dartmouth College's Datalad project
The webapp modules cannot build with the assistant disabled, so make the
webapp be under the assistant build flag.
Sponsored-by: Jarkko Kniivilä on Patreon
--backend is no longer a global option, and is only accepted by commands
that actually need it.
Three commands that used to support backend but don't any longer are
watch, webapp, and assistant. It would be possible to make them support it,
but I doubt anyone used the option with these. And in the case of webapp
and assistant, the option was handled inconsistently, only taking affect
when the command is run with an existing git-annex repo, not when it
creates a new one.
Also, renamed GlobalOption etc to AnnexOption. Because there are many
options of this type that are not actually global (any more) and get
added to commands that need them.
Sponsored-by: Kevin Mueller on Patreon
Improve handling of parallelization with -J when copying content from/to a
git remote that is a local path.
Sponsored-by: Nicholas Golder-Manning on Patreon
When adding a small file, it does not get locked down, so can be modified
after git-annex checks that it's small. The use of queued git add made the
race window nice and wide too.
Fixed by checking if the file has changed, and by not using git add.
Instead, have to recapitulate git add's handling of things like symlinks
and executable files.
Sponsored-by: Jochen Bartl on Patreon
The remaining callers all did not rely on it checking gitignore, so were
easy to convert.
They were susceptable to the same overwrite race as add and fix,
although less likely to have it and a narrower window than add's race.
Command.Rekey in passing got an unncessary call to removeFile deleted.
addSymlink handles deleting any existing worktree file.
This is not a complete fix for all such races, only the one where a
large file gets changed while adding and gets added to git rather than
to the annex.
addLink needs to go away, any caller of it is probably subject to the
same kind of race. (Also, addLink itself fails to check gitignore when
symlinks are not supported.)
ingestAdd no longer checks gitignore. (It didn't check it consistently
before either, since there were cases where it did not run git add!)
When git-annex import calls it, it's already checked gitignore itself
earlier. When git-annex add calls it, it's usually on files found
by withFilesNotInGit, which handles checking ignores.
There was one other case, when git-annex add --batch calls it. In that
case, old git-annex behaved rather badly, it would seem to add the file,
but git add would later fail, leaving the file as an unstaged annex symlink.
That behavior has also been fixed.
Sponsored-by: Brett Eisenberg on Patreon
move: Improve resuming a move that succeeded in transferring the content,
but where dropping failed due to eg a network problem, in cases where
numcopies checks prevented the resumed move from dropping the object from
the source repository.
This was earlier done for moves that got interrupted during the drop stage.
Sponsored-by: Svenne Krap on Patreon
Fix retrival of an empty file that is stored in a special remote with
chunking enabled.
The speculative chunk stuff caused a reversion by adding an empty list for
the empty file. Which is just wrong; the empty file is still stored on the
remote, and should be retrieved like any other file. It uses 1 chunk, so
`max 1` is the simple fix.
Sponsored-by: Noam Kremen on Patreon
rather than matching path of an existing remote to find the uuid.
The main benefit of this is that locations not using ssh:// will work
now, including both paths and host:/path
The other benefit is that it's a simpler interface, no need to have an
existing remote with the same url and some other name. Although that
will still work of course.
This does rely on tryGitConfigRead working when given a Git.Repo that is
not a remote. Luckily, it works fine that way.
Also, tryGitConfigRead will auto-init a local repo that has a git-annex
branch. I did not enable auto-init of ssh repos though.
The uuid discovery actually happens twice; initremote discovers it,
and uses it to store the special remote config, but does not set it in the
git remote it creates. So the next run of git-annex does uuid discovery
again, and caches it that time. This could be improved for a tiny
speedup, but I didn't want to complicate things for that in this
commit.
Sponsored-by: Dartmouth College's DANDI project
to track down a blocked indefinitely on MVar that seems to occur after
sqlite throws ErrorBusy but that I have not been able to reproduce when
I made commits synthetically throw ErrorBusy.
Sponsored-by: Dartmouth College's Datalad project
test: When limiting tests to run with -p, work around tasty limitation by
automatically including dependent tests.
This fixes a reversion because it didn't used to use dependencies and
forced tasty to run the init tests first. That changed when parallelizing
the test suite.
It will sometimes do a little more work than strictly required,
because it adds init tests deps when limited to eg quickcheck tests,
which don't depend on them. But this only adds a few seconds work.
Sponsored-by: Dartmouth College's Datalad project
This reverts windows-specific parts of 5a98f2d509
There were no code paths in common between windows and unix, so this
will return Windows to the old behavior.
The problem that the commit talks about has to do with multiple different
locations where git-annex can store annex object files, but that is not
too relevant to Windows anyway, because on windows the filesystem is always
treated as criplled and/or symlinks are not supported, so it will only
use one object location. It would need to be using a repo populated
in another OS to have the other object location in use probably.
Then a drop and get could possibly lead to a dangling lock file.
And, I was not able to actually reproduce that situation happening
before making that commit, even when I forced a race. So making these
changes on windows was just begging trouble..
I suspect that the change that caused the reversion is in
Annex/Content/Presence.hs. It checks if the content file exists,
and then called modifyContentDirWhenExists, which seems like it would
not fail, but if something deleted the content file at that point,
that call would fail. Which would result in an exception being thrown,
which should not normally happen from a call to inAnnexSafe. That was a
windows-specific change; the unix side did not have an equivilant
change.
Sponsored-by: Dartmouth College's Datalad project
Avoid treating refs/annex/last-index or other refs that are not commit
objects as evidence of repository corruption.
The repair code checks to find bad refs by trying to run `git log` on
them, and assumes that no output means something is broken. But git log
on a tree object is empty.
This was worth fixing generally, not as a special case, since it's certainly
possible that other things store tree or other objects in refs.
Sponsored-by: Max Thoursie on Patreon
This avoids displaying the unexpected exit codes message when
the list is eg [ExitSuccess, ExitFailure 1].
Sponsored-by: Dartmouth College's Datalad project
If the temp directory can somehow contain a hard link, it changes the
mode, which affects all other hard linked files. So, it's too unsafe
to use everywhere in git-annex, since hard links are possible in
multiple ways and it would be very hard to prove that every place that
uses a temp directory cannot possibly put a hard link in it.
Added a call to removeDirectoryForCleanup to test_crypto, which will
fix the problem that commit 17b20a2450
was intending to fix, with a much smaller hammer.
Sponsored-by: Dartmouth College's Datalad project
Using removePathForcibly avoids concurrent removal problems.
The i386ancient build still uses an old version of ghc and directory that
do not include removePathForcibly though.
Sponsored-by: Dartmouth College's Datalad project
prop_relPathDirToFileAbs_basics (TestableFilePath ":/") failed on
windows. The colon was filtered out after trying to make
the path relative, which only removed leading path separators.
So, ":/" changed to "/" which is not relative. Filtering out the colon
before hand avoids this problem.
Sponsored-by: Luke Shumaker on Patreon
add: Avoid unncessarily converting a newly unlocked file to be stored
in git when it is not modified, even when annex.largefiles does not
match it.
This fixes a reversion in version 10.20220222, where git-annex unlock
followed by git-annex add, followed by git commit file could result in
git thinking the file was modified after the commit.
I do have half a mind to remove the withUnmodifiedUnlockedPointers part
of git-annex add. It seems weird, despite that old bug report arguing
a case of consistency that it ought to behave that way. When git-annex
add surpises me, it seems likely it's wrong.. But for now, this is the
smallest possible fix.
Sponsored-by: Dartmouth College's Datalad project
Directory special remotes with importtree=yes have changed to once more
take inodes into account. This will cause extra work when importing from a
directory on a FAT filesystem that changes inodes on every mount.
To avoid that extra work, set ignoreinodes=yes when initializing a new
directory special remote, or change the configuration of your existing
remote: git-annex enableremote foo ignoreinodes=yes
This will mean a one-time re-import of all contents from every directory
special remote due to the changed setting.
73df633a62 thought
it was too unlikely that there would be modifications that the inode number
was needed to notice. That was probably right; it's very unlikely that a
file will get modified and end up with the same size and mtime as before.
But, what was not considered is that a program like NextCloud might write
two files with different content so closely together that they share the
mtime. The inode is necessary to detect that situation.
Sponsored-by: Max Thoursie on Patreon
Propagate nonzero exit status from git ls-files when a specified file does
not exist, or a specified directory does not contain any files checked into
git.
The recent completion of the annex.skipunknown transition exposed this
bug, that has unfortunately been lurking all along.
It is also possible that git ls-files errors out for some other reason
-- perhaps a permission problem -- and this will also fix error propagation
in such situations.
Sponsored-by: Dartmouth College's Datalad project
It did nothing, since at this point the link is dangling. But when there
is a thaw hook, it would probably not be happy to be asked to run on a
symlink, or might do something unexpected.
Sponsored-by: Dartmouth College's Datalad project
When annex.freezecontent-command is set, and the filesystem does not
support removing write bits, avoid treating it as a crippled filesystem.
The hook may be enough to prevent writing on its own, and some filesystems
ignore attempts to remove write bits.
Sponsored-by: Dartmouth College's Datalad project
It will then proceed to add the file the same as if it were any other
file containing possibly annexable content. Usually the file is one that
was annexed before, so the new, probably corrupt content will also be added
to the annex. If the file was not annexed before, the content will be added
to git.
It's not possible for the smudge filter to throw an error here, because
git then just adds the file to git anyway.
Sponsored-by: Dartmouth College's Datalad project
This format is designed to detect accidental appends, while having some
room for future expansion.
Detect when an unlocked file whose content is not present has gotten some
other content appended to it, and avoid treating it as a pointer file, so
that appended content will not be checked into git, but will be annexed
like any other file.
Dropped the max size of a pointer file down to 32kb, it was around 80 kb,
but without any good reason and certianly there are no valid pointer files
anywhere that are larger than 8kb, because it's just been specified what it
means for a pointer file with additional data even looks like.
I assume 32kb will be good enough for anyone. ;-) Really though, it needs
to be some smallish number, because that much of a file in git gets read
into memory when eg, catting pointer files. And since we have no use cases
for the extra lines of a pointer file yet, except possibly to add
some human-visible explanation that it is a git-annex pointer file, 32k
seems as reasonable an arbitrary number as anything. Increasing it would be
possible, eg to 64k, as long as users of such jumbo pointer files didn't
mind upgrading all their git-annex installations to one that supports the
new larger size.
Sponsored-by: Dartmouth College's Datalad project
takeByteString can only be used at the end of a parser, not before other
input. This was a dumb enough mistake that I audited the rest of the
code base for similar mistakes. Pity that attoparsec cannot avoid it at
the type level.
Fixes git-annex forget propagation between repositories. (reversion
introduced in version 7.20190122)
Sponsored-by: Brock Spratlen on Patreon
Seems that --no-ext-diff and -c diff.external= are not enough to disable
external diff command when gitattributes textconv specifies it.
I'm pretty sure that --no-ext-diff and -c diff.external= are not both
needed, but not 100%. Something about -G may need the latter to fully
disable diffs in some cases. So kept that part as it was.
Sponsored-by: Dartmouth College's Datalad project
The "+" argument only runs the command once, so is not safe to use. Using
";" instead would have been the simplest fix, but also the slowest.
Since my phone has an xargs that supports -0, I piped find to xargs
instead. Unsure how portable this will be, perhaps some android's don't
have xargs -0 or find -printf to send null terminated output.
The business with pipefail is necessary to make a failure of find cause the
import to fail. Probably this works on all androids, but if not, it will
probably just result in a failure of find being ignored. It would be
possible to make ignorefinderror just disable setting pipefail, but then
if some android has a shell that has pipefail enabled by default, ignorefinderror
would not work, so I kept the || true approach for that.
Sponsored-by: Max Thoursie on Patreon
Reject combinations of --batch (or --batch-keys) with options like --all or
--key or with filenames.
Most commands ignored the non-batch items when batch mode was enabled.
For some reason, addurl and dropkey both processed first the specified
non-batch items, followed by entering batch mode. Changed them to also
error out, for consistency.
Sponsored-by: Dartmouth College's Datalad project
autoUpgradeableVersions had latestVersion (10), but it did not make
sense for asking for old version 6 to get version 10, while asking for
version 8 got version 8. So use defaultVersion (8) instead.
Sponsored-by: Dartmouth College's Datalad project
The v10 upgrade should almost be safe now. What remains to be done is
notice when the v10 upgrade has occurred, while holding the shared lock,
and switch to using v10 lock files.
Sponsored-by: Dartmouth College's Datalad project
Do not populate the keys database with associated files,
because a bare repo has no working tree, and so it does not make sense to
populate it.
Queries of associated files in the keys database always return empty lists
in a bare repo, even if it's somehow populated. One way it could be
populated is if a user converts a non-bare repo to a bare repo.
Note that Git.Config.isBare does a string comparison, so this is not free!
But, that string comparison is very small compared to a sqlite query.
Sponsored-by: Erik Bjäreholt on Patreon
Recover from corrupted content being received from a git remote due eg to a
wire error, by deleting the temporary file when it fails to verify. This
prevents a retry from failing again.
Reversion introduced in version 8.20210903, when incremental verification
was added.
Only the git remote seems to be affected, although it is certianly
possible that other remotes could later have the same issue. This only
affects things passed to getViaTmp that return (False, UnVerified) due to
verification failing. As far as getViaTmp can tell, that could just as well
mean that the transfer failed in a way that would resume, so it cannot
delete the temp file itself. Remote.Git and P2P.Annex use getViaTmp internally,
while other remotes do not, which is why only it seems affected.
A better fix perhaps would be to improve the types of the callback
passed to getViaTmp, so that some other value could be used to indicate
the state where the transfer succeeded but verification failed.
Sponsored-by: Boyd Stephen Smith Jr.
So that importing does not replace them with plain files.
This works similarly to how the previous handling of submodules and
matchers did, except that annexed symlinks still get exported as plain
files of course, it's only non-annexed symlinks that it does not make sense
to export.
When symlinks have previously been exported, updating the export will
unexport them after upgrading to this commit.
Sponsored-by: Kevin Mueller on Patreon