prop_relPathDirToFileAbs_basics (TestableFilePath ":/") failed on
windows. The colon was filtered out after trying to make
the path relative, which only removed leading path separators.
So, ":/" changed to "/" which is not relative. Filtering out the colon
before hand avoids this problem.
Sponsored-by: Luke Shumaker on Patreon
add: Avoid unncessarily converting a newly unlocked file to be stored
in git when it is not modified, even when annex.largefiles does not
match it.
This fixes a reversion in version 10.20220222, where git-annex unlock
followed by git-annex add, followed by git commit file could result in
git thinking the file was modified after the commit.
I do have half a mind to remove the withUnmodifiedUnlockedPointers part
of git-annex add. It seems weird, despite that old bug report arguing
a case of consistency that it ought to behave that way. When git-annex
add surpises me, it seems likely it's wrong.. But for now, this is the
smallest possible fix.
Sponsored-by: Dartmouth College's Datalad project
Directory special remotes with importtree=yes have changed to once more
take inodes into account. This will cause extra work when importing from a
directory on a FAT filesystem that changes inodes on every mount.
To avoid that extra work, set ignoreinodes=yes when initializing a new
directory special remote, or change the configuration of your existing
remote: git-annex enableremote foo ignoreinodes=yes
This will mean a one-time re-import of all contents from every directory
special remote due to the changed setting.
73df633a62 thought
it was too unlikely that there would be modifications that the inode number
was needed to notice. That was probably right; it's very unlikely that a
file will get modified and end up with the same size and mtime as before.
But, what was not considered is that a program like NextCloud might write
two files with different content so closely together that they share the
mtime. The inode is necessary to detect that situation.
Sponsored-by: Max Thoursie on Patreon
Default to the number of CPU cores, which seems about optimal
on my laptop. Using one more saves me 2 seconds actually.
Better packing of workers improves speed significantly.
In 2 tests runs, I saw segfaulting workers despite my attempt
to work around that issue. So detect when a worker does, and re-run it.
Removed installSignalHandlers again, because I was seeing an
error "lost signal due to full pipe", which I guess was somehow caused
by using it.
Sponsored-by: Dartmouth College's Datalad project
Note the very weird workaround for what appears to be some kind of tasty
bug, which causes a segfault. This is not new to this modification,
I was seeing a segfault before at least intermittently when limiting
git-annex test -p to only run a single test group.
Also, the path from one test repo to a remote test repo used to be
"../../foo", which somehow broke when moving the test repos from .t to
.t/N. I don't actually quite understand how it used to work, but
"../foo" seems correct and works in the new situation.
Test output from the concurrent processes is not yet serialized.
Should be easy to do using concurrent-output.
More test groups will probably make the speedup larger. It would
probably be best to have a larger number of test groups and divvy them
amoung subprocesses numbered based on the number of CPU cores, perhaps
times 2 or 3.
Sponsored-by: Dartmouth College's Datalad project
Avoid git-annex test being very slow when run from within the standalone
linux tarball or OSX app.
It may not really be necessary to add to PATH the directory where the
git-annex binary resides, but it can't hurt. Most places where the test
suite or git-annex run git-annex, they use programPath, so won't need
a modified PATH. But I'm not sure if that's always the case.
Sponsored-by: Dartmouth College's Datalad project
I cannot find any indication that git-annex uses these exit statuses,
and I see I didn't write this doc so I don't know where those numbers
came from. Odd.
Propagate nonzero exit status from git ls-files when a specified file does
not exist, or a specified directory does not contain any files checked into
git.
The recent completion of the annex.skipunknown transition exposed this
bug, that has unfortunately been lurking all along.
It is also possible that git ls-files errors out for some other reason
-- perhaps a permission problem -- and this will also fix error propagation
in such situations.
Sponsored-by: Dartmouth College's Datalad project
It did nothing, since at this point the link is dangling. But when there
is a thaw hook, it would probably not be happy to be asked to run on a
symlink, or might do something unexpected.
Sponsored-by: Dartmouth College's Datalad project
When annex.freezecontent-command is set, and the filesystem does not
support removing write bits, avoid treating it as a crippled filesystem.
The hook may be enough to prevent writing on its own, and some filesystems
ignore attempts to remove write bits.
Sponsored-by: Dartmouth College's Datalad project
It will then proceed to add the file the same as if it were any other
file containing possibly annexable content. Usually the file is one that
was annexed before, so the new, probably corrupt content will also be added
to the annex. If the file was not annexed before, the content will be added
to git.
It's not possible for the smudge filter to throw an error here, because
git then just adds the file to git anyway.
Sponsored-by: Dartmouth College's Datalad project
This format is designed to detect accidental appends, while having some
room for future expansion.
Detect when an unlocked file whose content is not present has gotten some
other content appended to it, and avoid treating it as a pointer file, so
that appended content will not be checked into git, but will be annexed
like any other file.
Dropped the max size of a pointer file down to 32kb, it was around 80 kb,
but without any good reason and certianly there are no valid pointer files
anywhere that are larger than 8kb, because it's just been specified what it
means for a pointer file with additional data even looks like.
I assume 32kb will be good enough for anyone. ;-) Really though, it needs
to be some smallish number, because that much of a file in git gets read
into memory when eg, catting pointer files. And since we have no use cases
for the extra lines of a pointer file yet, except possibly to add
some human-visible explanation that it is a git-annex pointer file, 32k
seems as reasonable an arbitrary number as anything. Increasing it would be
possible, eg to 64k, as long as users of such jumbo pointer files didn't
mind upgrading all their git-annex installations to one that supports the
new larger size.
Sponsored-by: Dartmouth College's Datalad project
File matching options like --include will be rejected in situations where
there is no filename to match against. (Or where there is a filename but
it's not relative to the cwd, or otherwise seemed too bothersome to match
against.)
The addition of listKeys' was necessary to avoid using more memory in the
common case of "git-annex info". Adding a filterM would have caused the
list to buffer in memory and not stream. This is an ugly hack, but listKeys
had previously run Annex operations inside unafeInterleaveIO (for direct
mode). And matching against a matcher should hopefully not change any Annex
state.
This does allow for eg `git-annex info somefile --include=*.ext`
although why someone would want to do that I don't really know. But it
seems to make sense to allow it.
But, consider: `git-annex info ./somefile --include=somefile`
This does not match, so will not display info about somefile.
If the user really wants to, they can `--include=./somefile`.
Using matching options like --copies or --in=remote seems likely to be
slower than git-annex find with those options, because unlike such
commands, info does not have optimised streaming through the matcher.
Note that `git-annex info remote` is not the same as
`git-annex info --in remote`. The former shows info about all files in
the remote. The latter shows local keys that are also in that remote.
The output should make that clear, but this still seems like a point
where users could get confused.
Sponsored-by: Jochen Bartl on Patreon
Implemented by making Git.Queue have a FlushAction, which can accumulate
along with another action on files, and runs only once the other action has
run.
This lets git-annex unlock queue up git update-index actions, without
conflicting with the restagePointerFiles FlushActions.
In a repository with filter-process enabled, git-annex unlock will
often not take any more time than before, though it may when the files are
large. Either way, it should always slow down less than git-annex status
speeds up.
When filter-process is not enabled, git-annex unlock will slow down as much
as git status speeds up.
Sponsored-by: Jochen Bartl on Patreon
This avoids a later git status or similar taking a long time to run
as it runs git-annex smudge once per file. While v9 repositories do
avoid that taking long when the files are small, large files can still
make git status take a very long time.
This does make unlock slower, because now git-annex smudge is being run
once per file unlocked. However, the next commit should speed that up in
many cases.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
annex.skipunknown now defaults to false, so commands like `git annex get foo*`
will not silently skip over files/dirs that are not checked into git.
Sponsored-by: Brock Spratlen on Patreon
* registerurl, unregisterurl: Improved output when reading from stdin
to be more like other batch commands.
* registerurl, unregisterurl: Added --json and --json-error-messages options.
Note that this did change the --batch output in a way that could possibly
break something that expected the old output to never change. I think it's
acceptable to break that because there has never been a guarantee of
unchanging output format except with --batch for most commands. The old
output was just really weird too!
One possible wart is that "git-annex registerurl" with no options now
seems to just hang, since it's waiting for stdin input. Before, it said
"registerurl (stdin)" which was clearer about what's happenening. But this
is a deprecated mode anyway, --batch makes clear what's happening. If
anything, this problem would be a reason to eventually remove the support
for reading from stdin w/o --batch.
Sponsored-by: Dartmouth College's Datalad project
takeByteString can only be used at the end of a parser, not before other
input. This was a dumb enough mistake that I audited the rest of the
code base for similar mistakes. Pity that attoparsec cannot avoid it at
the type level.
Fixes git-annex forget propagation between repositories. (reversion
introduced in version 7.20190122)
Sponsored-by: Brock Spratlen on Patreon
Seems that --no-ext-diff and -c diff.external= are not enough to disable
external diff command when gitattributes textconv specifies it.
I'm pretty sure that --no-ext-diff and -c diff.external= are not both
needed, but not 100%. Something about -G may need the latter to fully
disable diffs in some cases. So kept that part as it was.
Sponsored-by: Dartmouth College's Datalad project
The "+" argument only runs the command once, so is not safe to use. Using
";" instead would have been the simplest fix, but also the slowest.
Since my phone has an xargs that supports -0, I piped find to xargs
instead. Unsure how portable this will be, perhaps some android's don't
have xargs -0 or find -printf to send null terminated output.
The business with pipefail is necessary to make a failure of find cause the
import to fail. Probably this works on all androids, but if not, it will
probably just result in a failure of find being ignored. It would be
possible to make ignorefinderror just disable setting pipefail, but then
if some android has a shell that has pipefail enabled by default, ignorefinderror
would not work, so I kept the || true approach for that.
Sponsored-by: Max Thoursie on Patreon