This adds the overhead of a copy when serializing and deserializing keys.
I have not benchmarked much, but runtimes seem barely changed at all by that.
When a lot of keys are in memory, it improves memory use.
And, it prevents keys sometimes getting PINNED in memory and failing to GC,
which is a problem ByteString has sometimes. In particular, git-annex sync
from a borg special remote had that problem and this improved its memory
use by a large amount.
Sponsored-by: Shae Erisson on Patreon
This adds the overhead of a copy whenever converting to/from ExportLocation and
ImportLocation.
borg: Some improvements to memory use when importing a lot of archives.
(It's still pretty bad.)
Sponsored-by: Mark Reidenbach on Patreon
sync --content: Avoid a redundant checksum of a file that was
incrementally verified, when used on NTFS and perhaps other filesystems.
When sync has just gotten the content, it does not need to check inAnnex a
second time. On NTFS, for some reason the write of the inode cache after
it gets the content is not immediately able to be read, and with an
empty/non-matching inode cache due to that stale data, inAnnex falls back
to hashing the whole object to determine if it's present.
Sponsored-by: Brock Spratlen on Patreon
Disabling git-annex branch update for this command is
ok, because it does not use any information from the branch,
but only logs the location when it adds a key.
Sponsored-by: Dartmouth College's Datalad project
Probably this fixes a reversion, but I don't know what version broke it.
This does use withOtherTmp for a temp file that could be quite large.
Though albeit a reflink copy that will not actually take up any space
as long as the file it was copied from still exists. So if the copy cow
succeeds but git-annex is interrupted just before that temp file gets
renamed into the usual .git/annex/tmp/ location, there is a risk that
the other temp directory ends up cluttered with a larger temp file than
later. It will eventually be cleaned up, and the changes of this being
a problem are small, so this seems like an acceptable thing to do.
Sponsored-by: Shae Erisson on Patreon
This should complete the fix started in
6329997ac4, fixing the actual cause of the
test suite failure this time.
Sponsored-by: Dartmouth College's Datalad project
This was maybe a real bug too, although I don't know what circumstances
it would be a problem. See comment for analysis of this windows drive
letter wackyness issue.
Sponsored-by: Brock Spratlen on Patreon
This fixes a reversion caused by a99a84f342,
when git-annex init is run as root on a FAT filesystem mounted with
hdiutil on OSX. Such a mount point has file mode 777 for everything and
it cannot be changed. The existing crippled filesystem test tried to
write to a file after removing write bit, but that test does not run as
root (since root can write to unwritable files). So added a check of the
write permissions of the file, after attempting to remove them.
Sponsored-by: Dartmouth College's Datalad project
This is to track down what file in .git/annex/ is being written to via a
temp file when the repository is read-only.
Sponsored-by: Dartmouth College's Datalad project
And fail with an informative message.
I don't think ACLs can prevent removing the write bit, but I'm not sure,
so kept it mentioning them as a possibility.
Should git-annex lock also check if the write bits are able to be removed?
Maybe, but the case I know about with xattrs involves cp -a copying NFS
xattrs, and it's the copy of the file that is the problem. So when locking
a file, I guess it will not be the copy.
Sponsored-by: Dartmouth College's Datalad project
It would be better if the Arbitrary instance avoided generating impossible
filenames like "foo/c:bar", but proably this is the only place that splits
the file from the directory and then uses the file without the directory..
At least on the quickcheck properties.
Sponsored-by: Svenne Krap on Patreon
git-annex get when run as the first git-annex command in a new repo did not
populate unlocked files. (Reversion in version 8.20210621)
I am not entirely happy with this, because I don't understand how
428c91606b caused the problem in the first
place, and I don't fully understand how skipping calling scanAnnexedFiles
during autoinit avoids the problem.
Kept the explicit call to scanAnnexedFiles during git-annex init,
so that when reconcileStaged is expensive, it can be made to run then,
rather than at some later point when the information is needed.
Sponsored-by: Brock Spratlen on Patreon
I'm now reasonably sure I've identified both cases where this can
happen. v8 upgrades and certian filesystems eg NFS. Both are handled as
well as can be, though it may involve some extra checksumming work.
Eg, showImprecise 1 1.99 returned "1.1" rather than "2". The 9 rounded
upward to 10, and that was wrongly used as the decimal, rather than
carrying the 1.
Sponsored-by: Jack Hill on Patreon
An easy way to see this in action is to have an unlocked file, and touch the
object file.
While all code that compares inode caches for object files needs to be
prepared for this kind of problem and fall back to verification, having
fsck notice it and correct it is cheap (as long as fsck is being run
anyway) and ensures that if it happens for some unusual reason, there's a
way for the user to notice that it's happening.
Not that, when annex.thin is in use, the earlier call to isUnmodified
(and also potentially earlier calls to inAnnex in eg, verifyLocationLog)
will fix up the same problem silently. That might prevent the warning
being displayed, although probably it still will be, because the
Database.Keys write of the InodeCache will be queued but will not have
happened yet. I can't see a way to improve this, but it's not great.
Sponsored-by: Dartmouth College's Datalad project
Fix bug that caused some transfers to incorrectly fail with "content
changed while it was being sent", when the content was not changed.
While I don't know how to reproduce the problem that several people
reported, it is presumably due to the inode cache somehow being stale.
So check isUnmodified', and if it's not modified, include the file's
current inode cache in the set to accept, when checking for modification
after the transfer.
That seems like the right thing to do for another reason: The failure
says the file changed while it was being sent, but if the object file was
changed before the transfer started, that's wrong. So it needs to check
before allowing the transfer at all if the file is modified.
(Other calls to sameInodeCache or elemInodeCaches, when operating on inode
caches from the database, could also be problimatic if the inode cache is
somehow getting stale. This does not address such problems.)
Sponsored-by: Dartmouth College's Datalad project
It was making the borgrepo path absolute.. even when it was a ssh
repository.
Made BorgRepo a newtype, to guard against accidentially treating it like a
FilePath.
Sponsored-by: Graham Spencer on Patreon
Handles keys that are substrings of other keys, as well as pointer files
that contain a newline after the key.
Note that -S does not match regexp, while -G does by default. Docs are
not clear, determined experimentally. The only other difference in
changing to -G is that if a file used to contain the key and changed
in some way, while still containing the key, -G will match and -S would
not. So eg, annex links that git annex fix rewrites will match, and
files that change lock status will match. Which is an improvement anyway.
Sponsored-by: Jochen Bartl on Patreon
assistant: When adding non-large files to git, honor annex.delayadd
configuration.
Also, don't add non-large files to git when they are still
being written to. This came for free, since the changes to non-large
files get queued up with the ones to large files, and run through the lsof
check.
Sponsored-by: Luke Shumaker on Patreon