In particular, when two files had the same content, and one was unlocked
and modified, with annex.thin that can corrupt the content of the
annex object, and so fsck on the other file should detect that.
getKeyStatus was relying on Database.Keys.getAssociatedFiles to tell
when a file is unlocked, but that can false positive because the
database can list old associated files.
Instead, separate out the case of unlocked object which has multiple
hardlinks when annex.thin is in use.
To support filenames starting with dashes.
To update the config of existing repositories, you can re-run git-annex init.
Perhaps it should check every time for the old config and update it, but
that has several problems:
- read-only repos
- unexpected commands like `git annex find` changing git configs
might be surprising behavior
Since filenames starting with dashes are not super common and the user can
re-init easily enough if their repo needs fixed, I went for the simplest
fix.
Users may want sync to only export, or only import and this is broadly
analagous to push and pull, so it makes sense to use the same
configuration for it.
The branch is only updated once the export is 100% complete. This way,
if an export is started but interrupted and so the remote does not yet
contain some of the files, an import will make a commit on the old
branch, and so won't delete the missing files.
This log, unlike all other current top-level logs, is a new format log.
I have not checked what throwing it at the old log parser did, but it seems
likely it ignored unparsable lines, and so perhaps deleted all lines from
the log.
This fixes a reversion in the ByteString conversion. The old code used
isSpace to decide when the metadata value needs to be base64 encoded,
and that incorrectly changed to only checking if it contained ' '.
Note that only '\n' and '\r' were added and not other sorts of
whitespace that isSpace matches, like '\t' and '\v'. Only the former
would cause problems.
Installing git-annex with stack rsync won't be available.
Also, using the git-annex installer with 64 bit git installs a non-working
rsync binary because it's linked with libraries provided by 32 bit git.
Like with the network-uri split, cabal will automatically turn off the flag
when building with an old network.
I have not tested building with the new network-3.0.0.0 yet; several
other dependencies including aws are still pinned on network-2.*
xporting files with '#' or '?' in their name won't work because urls get
truncated on those. Fail in a better way in this case, and avoid failing
when removing such files from the export, so after the user has renamed the
problem files the export will succeed.
Avoid performing repository fixups for submodules and git-worktrees
when there's a .noannex file that will prevent git-annex from being
used in the repository.
This change is ok as long as the .noannex file is really going to prevent
git-annex from being used. But, init --force could override the file.
Which would result in the repo being initialized without the fixups
having run.
To avoid that situation decided to change init, to not let --force be used
to override a .noannex file. Instead the user can just delete the file.
* fromkey: Added --json.
* fromkey --batch output changed to support using it with --json.
The old output was not parseable for any useful information, so
this is not expected to break anything.
If the worktree file already exists, and is annexed and uses the same
key, avoid failing, nothing needs to be done.
Had to add lookupFileNotHidden to handle the case where an adjust --hide-missing
is in use, and the worktree file was hidden due to the object content
being missing. lookupFile would return the key of the hidden file,
but it makes sense that after fromkey succeeds, the worktree must
contain the file it was supposed to set up.
Need to create the directory after the lock is held, not before.
The other racing process would need to shut down at just the wrong time,
running cleanupOtherTmp.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
This gets back any speed lost in commit
9cebfd7002, and speeds up all uses of S3
remotes that operate on them more than once.
This commit was sponsored by Brett Eisenberg on Patreon.
When key-based retrieval from a S3 remote with exporttree=yes
appendonly=yes fails, fall back to trying to retrieve from the exported
tree. This allows downloads of files that were exported to such a remote
before versioning was enabled on it.
This is useful at least for a transition for users who got into that
situation, so they can download content from their S3 remote. May want to
remove this in the future though, since normally trying to download the
second time is only extra work.
This commit was sponsored by Brock Spratlen on Patreon.
Like the earlier fixed one in Command.Export, it occurred when the same
tree was exported by multiple clones. Previous fix was incomplete since
several other places looked at the list of exported trees to detect when
there was an export conflict. Added a single unified function to avoid
missing any places it needed to be fixed.
This commit was sponsored by mo on Patreon.
Because when git-annex lacks S3 version IDs for files stored in the bucket,
deleting them would cause data loss.
Also because git-annex is not able to download unversioned objects from a bucket
when versioning=yes.
This also prevents setting versioning=no. While that would perhaps be
possible to do safely, it would add complexity, and would mean that if
the user accidentially did enableremote versioning=no, they would not be
able to undo it.
This commit was sponsored by Trenton Cronholm on Patreon.
Needs not yet released version 0.22 of aws library; with older versions
asks the user to configure the bucket versioning themselves.
Note that S3 endpoints that don't support versioning will cause putBucketVersioning
to throw an exception, so initremote will fail.
This commit was sponsored by Jake Vosloo on Patreon.
* webapp: Remove configurator for box.com repository, since their
webdav support is going away at the end of this January.
* webapp: Remove configurator for gitlab, which stopped supporting git-annex
some time ago.
This commit was sponsored by Brock Spratlen on Patreon.
However, rsync still won't work with 64 bit git and
this is still not the documented way to install it.
So, if both 64 and 32 are installed, go with 32.
And if neither git can be found, default to 32.
* Switch to using .git/annex/othertmp for tmp files other than partial
downloads, and make stale files left in that directory when git-annex
is interrupted be cleaned up promptly by subsequent git-annex processes.
* The .git/annex/misctmp directory is no longer used and git-annex will
delete anything lingering in there after it's 1 week old.
Also, in Annex.Ingest, made the filename it uses in the tmp dir be
prefixed with "ingest-" to avoid potentially using a filename used by
some other code.
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.
It means that every place a Key has any of its fields changed, the cache
has to be dropped. I've grepped and found them all. But, it would be
better to avoid that gotcha somehow..
A keyName could contain "/", though this is unlikely and certianly only
ever could happen with WORM keys.
The change to addunused to escape that is no problem at all.
The change to VariantFile to escape it means that different versions of
git-annex could resolve a merge conflict differently in this case, which
is unfortunate. There would be different .variant files used, so the two
resolutions would themselves merge together without additional
conflicts, but the user would have to clean up the extra .variant
files.
Not likely to be any speed gain here, but this completes porting every
log file over.
And, it let me get rid of code copied from ghc and modified, so
simplifying the licensing.
This preserves the workaround for the old bug that caused NoUUID items
to be stored in the log, prefixing log lines with " ". It's now handled
implicitly, by using takeWhile1 (/= ' ') to get the uuid.
There is a behavior change from the old parser, which split the value
into words and then recombined it. That meant that "foo bar" and "foo\tbar"
came out as "foo bar". That behavior was not documented, and seems
surprising; it meant that after a git-annex describe here "foo bar",
you wouldn't get that same string back out when git-annex displayed repo
descriptions.
Otoh, some other parsers relied on the old behavior, and the attoparsec
rewrites had to deal with the issue themselves...
For group.log, there are some edge cases around the user providing a
group name with a leading or trailing space. The old parser would ignore
such excess whitespace. The new parser does too, because the alternative
is to refuse to parse something like " group1 group2 " due to excess
whitespace, which would be even more confusing behavior.
The only git-annex branch log file that is not converted to attoparsec
and bytestring-builder now is transitions.log.
Mostly didn't push the ByteStrings down very deep, but all of these log
files are not written to frequently at all, so slight remaining
innefficiency doesn't matter.
In Logs.UUID, removed the fixBadUUID code that cleaned up after a bug in
git-annex versions 3.20111105-3.20111110. In the unlikely event that a repo was
last touched by that ancient git-annex version, the descriptions of remotes
would appear missing when used with this version of git-annex. That is such minor
breakage, and so unlikely to still be a problem for any repos, that it was not
worth forward-porting that code to ByteString.