A content directory can be frozen in direct mode. One way this can happen
is if the content is transferred before direct mode has a mapping for it,
so it's stored in the content directory.
So, we need to thaw the content directory before doing things with it.
This looks at the string one char at a time, which is hardly efficient..
but more than good enough for expanding variables in
relatively short command lines.
* since this is a crippled filesystem anyway, git-annex doesn't use
symlinks on it
* so there's no reason to use the mixed case hash directories that we're
stuck using to avoid breaking everyone's symlinks to the content
* so we can do what is already done for all bare repos, and make non-bare
repos on crippled filesystems use the all-lower case hash directories
* which are, happily, all 3 letters long, so they cannot conflict with
mixed case hash directories
* so I was able to 100% fix this and even resuming `git annex add` in the
test case will recover and it will all just work.
This avoids commit churn by the assistant when eg,
replacing a file with a symlink.
But, just as importantly, it prevents the working tree being left with a
deleted file if git-annex, or perhaps the whole system, crashes at the
wrong time.
(It also probably avoids confusing displays in file managers.)
Now getKeysPresent checks that the key's content, not only its directory,
exists. In direct mode, the inode cache file is used as a standin for the
content.
removeAnnex always removes the inode cache file, and drop and move --from
always call removeAnnex, even if the object does not seem to be inAnnex,
to ensure it's always deleted.
This reverts commit 57780cb3a4.
This was buggy, it caused the direct mode cache to be lost when dropping
keys, so when the file is gotten back, it's stored in indirect mode.
Note to self: Do not attempt bug fixes at 6 am!
git annex init probes for crippled filesystems, and sets direct mode, as
well as `annex.crippledfilesystem`.
Avoid manipulating permissions of files on crippled filesystems.
That would likely cause an exception to be thrown.
Very basic support in Command.Add for cripped filesystems; avoids the lock
down entirely since doing it needs both permissions and hard links.
Will make this better soon.
These files were left behind, and made getKeysPresent find keys that were
not present. It would be expensive to make getKeysPresent check that the
actual key files are present (it just lists the directories). But that's not
needed if we just clean up the stale cache and mapping files.
To handle systems that were in direct mode and got switched back with stale
direct mode files, made cleanObjectLoc remove all files in the key's directory.
git annex unused will still list keys that are gone but for which the stale
direct mode files exists. To deal with that, made dropunused remove the key's
directory even if the key does not seem to be present.
The most common way for a mapping to be stale is when a file was deleted,
or renamed. Nothing updates the mappings for deletions yet.
But they can also become stale in other ways. For example a file can
be modified.
So, the mapping is not trusted to be consistent. When we get a key,
only replace symlinks that still point to that key with its content.
When we drop a key, only put back symlinks for files that still have
the direct mode content.
Now there's a Config type, that's extracted from the git config at startup.
Note that laziness means that individual config values are only looked up
and parsed on demand, and so we get implicit memoization for all of them.
So this is not only prettier and more type safe, it optimises several
places that didn't have explicit memoization before. As well as getting rid
of the ugly explicit memoization code.
Not yet done for annex.<remote>.* configuration settings.
However, I don't yet have a reliable way to deal with files being modified
while they're being transferred. I have code that detects it on the sending
side, but the receiver is still free to move the wrong content into its
annex, and record that it has the content. So that's not acceptable, and
I'll need to work on it some more.
However, at this point I can use a direct mode repository as a remote and
transfer files from and to it.
Also for dropping objects in direct mode.
Checking presence reliably needs a cache of mtime, size, and inode.
This way, if a file is modified, keys that point to it are no longer
present.
Also, the code for restoring the symlink when removing objects is
unnecessarily messy. calcGitLink was generating links starting with
"../../remote/.git/", when running "git annex move --from remote".
I put in a workaround, but calcGitLink should probably be fixed.
There is not yet support for getting objects from repositories in direct
mode; it still looks for content in .git/annex/objects, and there's no
once place I can change to fix that.
Also, getting objects from direct mode repositories is problematic since
the can be changed while the object is being transferred. It probably needs
to quarantine it first.
Branch.get is not able to see changes that have been staged to the index
but not committed. This is a limitation of git cat-file --batch; when
reading from the index, as opposed to from a branch, it does not notice
changes made after the first time it reads the index.
So, had to revert the changes made in 1f73db3469
to make annex.alwayscommit=false stage changes.
Also, ensure that Branch.change and Branch.get always see changes
at all points during a commit, by not deleting journal files when
staging to the index. Delete them only after committing the branch.
Before, there was a race during commits where a different git-annex
could see out-of-date info from the branch while a commit was in progress.
That's also done when updating the branch to merge in remote branches.
In the case where the local git-annex branch has had changes pushed into it
that are not yet reflected in the index, and there are journalled changes
as well, a merge commit has to be done.
Baked into the code was an assumption that a repository's git directory
could be determined by adding ".git" to its work tree (or nothing for bare
repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are
used to separate the two.
This was attacked at the type level, by storing the gitdir and worktree
separately, so Nothing for the worktree means a bare repo.
A complication arose because we don't learn where a repository is bare
until its configuration is read. So another Location type handles
repositories that have not had their config read yet. I am not entirely
happy with this being a Location type, rather than representing them
entirely separate from the Git type. The new code is not worse than the
old, but better types could enforce more safety.
Added support for core.worktree. Overriding it with -c isn't supported
because it's not really clear what to do if a git repo's config is read, is
not bare, and is then overridden to bare. What is the right git directory
in this case? I will worry about this if/when someone has a use case for
overriding core.worktree with -c. (See Git.Config.updateLocation)
Also removed and renamed some functions like gitDir and workTree that
misused git's terminology.
One minor regression is known: git annex add in a bare repository does not
print a nice error message, but runs git ls-files in a way that fails
earlier with a less nice error message. This is because before --work-tree
was always passed to git commands, even in a bare repo, while now it's not.
annex.ssh-options, annex.rsync-options, annex.bup-split-options.
And adjust types to avoid the bugs that broke several config settings
recently. Now "annex." prefixing is enforced at the type level.
A bit tricky to avoid printing it twice in a row when there are queued git
commands to run and journal to stage.
Added a generic way to run an action that may output multiple side
messages, with only the first displayed.
This is incomplete, it does not honor it yet for hash directories
and other annex bookkeeping files. Some of that is not needed for a bare
repo; some of it may be.