The assistant should help the user recover their repository when things go
wrong.

[[!toc ]]

## dangling lock files

There are a few ways a git repository can get broken that are easily fixed.
One is leftover index.lock files. When a commit to a repository fails,
check that nothing else is using the repository, fix the problem, and redo
the commit.

* **done** for .git/annex/index.lock, can be handled safely and automatically.
* **done** for .git/index.lock, only when the assistant is starting up.
* What about local remotes, eg removable drives? git-annex does attempt
  to commit to the git-annex branch of those. It will use the automatic
  fix if any are dangling. It does not commit to the master branch; indeed
  a removable drive typically has a bare repository. So I think nothing to
  do here.
* What about git-annex-shell? If the ssh remote has the assistant running,
  it can take care of it, and if not, it's a server, and perhaps the user
  should be required to fix up if it crashes during a commit. This should
  not affect the assistant anyway.
* **done** Seems that refs can also have stale lock files, for example
  '/storage/emulated/legacy/DCIM/.git/refs/remotes/flick_phonecamera/synced/git-annex.lock'
  All git lock files are now handled (except gc lock files).

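A minimal sketch of the stale lock cleanup described above, not the assistant's actual code. The function names, the `in_use` callback, and the rule of skipping file names that start with `gc.` (a guess at what "gc lock files" covers) are all assumptions made for illustration. A real implementation has to check that nothing else is using the repository before deleting anything.

```python
import os


def find_lock_files(git_dir):
    """Return sorted paths of *.lock files under git_dir.

    Assumption: every file ending in ".lock" is a git lock file, and
    names starting with "gc." are gc locks, which are left alone.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(git_dir):
        for name in filenames:
            if name.endswith(".lock") and not name.startswith("gc."):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def remove_stale_locks(git_dir, in_use=lambda path: False):
    """Delete lock files that in_use() reports as not held.

    in_use is a hypothetical hook; the default treats every lock as stale,
    which is only safe when nothing else is using the repository.
    Returns the list of removed paths.
    """
    removed = []
    for path in find_lock_files(git_dir):
        if not in_use(path):
            os.remove(path)
            removed.append(path)
    return removed
```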
## incremental fsck

Add webapp UI to enable incremental fsck **done**

Of course, incremental fsck will run as a niced (and ioniced) background
job. There will need to be a button in the webapp to stop it, in case it's
annoying. **done**

When fsck finds a damaged file, queue a download of the file from a
remote. **done**

Detect when a removable drive is connected in the Cronner, and
try to run its remote fsck jobs. **done** (Same mechanism will work for
network remotes becoming connected.)

TODO: If no accessible remote has a file that fsck reported missing,
prompt the user to eg, connect a drive containing it. Or perhaps this is a
special case of a general problem, and the webapp should prompt the user
when any desired file is available only on a remote that's not mounted?

## git-annex-shell remote fsck

TODO: git-annex-shell fsck support, which would allow cheap fast fscks
of ssh remotes.

Would be nice; otherwise remote fsck is too expensive (downloads
everything) to have the assistant do.

Note that Remote.Git already tries to use this, but the assistant does not
call it for non-local remotes.

## git fsck

Have the sanity checker run git fsck periodically (it's fairly inexpensive,
but still not too often, and should be ioniced and niced).

If committing to the repository fails, after resolving any dangling lock
files (see above), it should run git fsck.

If git fsck finds problems, launch git repository repair.

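One way the niced and ioniced `git fsck` run could look, as a sketch only. `low_priority` and `run_git_fsck` are hypothetical names, the priority values are arbitrary choices, and `ionice` is assumed to be the Linux tool; where it is missing the prefix is simply left out.

```python
import shutil
import subprocess


def low_priority(cmd, have=shutil.which):
    """Prefix cmd with nice and ionice when those tools are installed."""
    prefix = []
    if have("nice"):
        prefix += ["nice", "-n", "19"]    # lowest CPU priority
    if have("ionice"):
        prefix += ["ionice", "-c", "3"]   # idle I/O class (Linux only)
    return prefix + list(cmd)


def run_git_fsck(repo):
    """Run git fsck in repo at low priority.

    Returns (exit_code, output) with stderr folded into the output, so the
    caller can decide whether to launch a repair.
    """
    proc = subprocess.run(
        low_priority(["git", "fsck"]),
        cwd=repo,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode, proc.stdout
```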
## git repository repair

There are several ways git repositories can get damaged.

The most common is empty files in .git/annex/objects and commits that refer
to those objects, when the objects have not yet been pushed anywhere.
I've several times recovered from this manually by
removing the bad files and resetting to before the commits that referred to
them, then re-staging any divergence in the working tree. This could
perhaps be automated.

As long as the git repository has at least one remote, another method is to
clone the remote, sync from all other remotes, move over .git/config and
.git/annex/objects, and tar up the old broken git repo and `git annex add`
it. This should be automatable and get the user back on their feet. User
could just click a button and have this be done.

This is useful outside git-annex as well, so make it a
git-recover-repository command.

### detailed design

Run `git fsck` and parse its output to find bad objects, and determine
from that output whether each is a commit, a tree, or a blob.

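A rough sketch of the parsing step, covering only lines of the form `missing <type> <id>`. `git fsck` reports problems in other shapes too (empty or corrupt object files, broken links), and those are deliberately ignored here. The regular expression and the `parse_missing` name are assumptions for illustration.

```python
import re

# Assumed line shape: "missing <type> <40 or 64 hex digits>".
MISSING = re.compile(
    r"^missing (commit|tree|blob|tag) ([0-9a-f]{40}|[0-9a-f]{64})$"
)


def parse_missing(fsck_output):
    """Map object id -> object type for each "missing" line."""
    bad = {}
    for line in fsck_output.splitlines():
        match = MISSING.match(line.strip())
        if match:
            bad[match.group(2)] = match.group(1)
    return bad


# Usage with a made-up object id:
sample = "missing blob " + "ab" * 20 + "\ndangling commit " + "cd" * 20
print(parse_missing(sample))  # only the missing blob is reported
```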
Check if there's a remote. If so, and if the bad objects are all
present on it, can simply get all bad objects from the remote,
and inject them back into .git/objects to recover:

1. If the local repository contains packs, the packs may be corrupt.
   So, start by using `git unpack-objects` to unpack all
   packs it can handle (which may include parts of corrupt packs)
   back to loose objects. Then delete all packs.
2. Delete all loose corrupt objects.
3. Make a new (bare) clone from the remote. Use `--reference` pointing
   at the broken repository, to avoid re-downloading objects that
   are present in it. (git does not seem to provide an easy way to just
   fetch specific missing objects from a remote; `git fetch-pack` only
   operates on refs... but this clone method should be pretty efficient)
4. Unpack any packs in the clone, so we can operate on loose objects.
5. Copy each missing object from the new clone's objects directory to the
   repository.
6. If each bad object was able to be repaired this way, we're done!
   (If not, can reuse the clone for getting objects from the next remote.)

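Steps 5 and 6 above come down to file copies once everything is a loose object, since a loose object lives at `<objects>/<first two hex digits>/<remaining digits>`. The sketch below assumes both sides have already been unpacked; `loose_path` and `copy_missing_objects` are hypothetical names, and the returned list is what would be tried against the next remote.

```python
import os
import shutil


def loose_path(objects_dir, sha):
    """Path of a loose object: two-hex-digit directory, then the rest."""
    return os.path.join(objects_dir, sha[:2], sha[2:])


def copy_missing_objects(clone_objects, repo_objects, shas):
    """Copy the loose objects named in shas from the clone to the repository.

    Returns the ids that were not found in the clone, so the caller can
    try the next remote for them.
    """
    still_missing = []
    for sha in shas:
        src = loose_path(clone_objects, sha)
        dst = loose_path(repo_objects, sha)
        if not os.path.isfile(src):
            still_missing.append(sha)
            continue
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copyfile(src, dst)
    return still_missing
```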
If some missing objects cannot be recovered from remotes, find commits in each
local branch that are broken by any of the remaining missing objects. Some of
this can be parsed from git fsck output, but for eg blobs, the commits and
their trees need to be walked to find trees that refer to the blobs.

For each branch that is affected, look in the reflog and/or `git log
$branch` to find the last good commit that predates all broken commits. (If
the head commit of a branch is broken, git log is not going to show
anything useful, but the reflog can be used to find past refs for the
branch -- have to first delete the .git/HEAD file if it points to the
broken ref.)

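The "last good commit that predates all broken commits" rule can be sketched as below. It assumes the branch's commits have already been collected newest first (from `git log` or the reflog) and treats that list as linear, which reflog entries in particular do not guarantee. `last_good_commit` is a hypothetical helper.

```python
def last_good_commit(history, broken):
    """Pick the newest commit that is older than every broken commit.

    history: commit ids for one branch, newest first.
    broken:  set of commit ids known to be broken.
    Returns None when no commit predates all the broken ones.
    """
    oldest_broken = -1
    for index, commit in enumerate(history):
        if commit in broken:
            oldest_broken = index
    candidate = oldest_broken + 1
    return history[candidate] if candidate < len(history) else None


# Usage with placeholder ids: c2 is broken, so c3 is discarded as well.
print(last_good_commit(["c3", "c2", "c1"], {"c2"}))  # c1
```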
The basic idea then is to reset the branch to the last good commit
that was found for it.

* For the HEAD branch, can just reset it. (If no last good commit was found
  for the HEAD branch, reset it to a dummy empty commit.) This will
  leave git showing any changes made since then as staged in the index and
  uncommitted. Or if the index is missing/corrupt, any files in the tree will
  show as modified and uncommitted. User (or git-annex assistant) can then
  commit as appropriate. Print appropriate warning message.
* Special handling for git-annex branch: Reset to last good commit
  (or to dummy empty commit if there is not one), and
  then commit `.git/annex/index` over top of that, and then run a
  `git annex fsck --fast` to fix up any object location info.
* Remote tracking branches can just be removed, and then `git fetch`
  from the remote, which will re-download missing objects from it and
  reinstate the tracking branch.
* For other branches (or tags), it's best to not rewrite them, because
  that could get really confusing. Instead, delete the old broken branch,
  and make a "recovered/$branch" that holds the last good commit (if one
  was found).

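The per-ref rules in the list above can be written as a small planning function that only decides what to do and leaves running git to the caller. The step names, the `EMPTY` placeholder for the dummy empty commit, and the `refs/heads/recovered/` location are all assumptions for illustration, not an existing interface.

```python
EMPTY = "EMPTY"  # placeholder for the dummy empty commit


def recovery_plan(ref, head_ref, last_good):
    """Return the list of (step, argument) pairs for one damaged ref.

    ref:       full ref name, eg "refs/heads/master"
    head_ref:  the ref that HEAD points to
    last_good: last good commit id for ref, or None if none was found
    """
    if ref.startswith("refs/remotes/"):
        # Tracking branches are dropped and recreated by fetching.
        return [("delete-ref", ref), ("fetch", ref.split("/")[2])]
    if ref == "refs/heads/git-annex":
        return [
            ("reset", last_good or EMPTY),
            ("commit-index", ".git/annex/index"),
            ("run", "git annex fsck --fast"),
        ]
    if ref == head_ref:
        return [("reset", last_good or EMPTY), ("warn", ref)]
    # Other branches and tags are not rewritten in place.
    name = ref.split("/", 2)[-1]
    plan = [("delete-ref", ref)]
    if last_good is not None:
        plan.append(("create-ref", "refs/heads/recovered/" + name, last_good))
    return plan
```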