a list of problems i had with git-annex
parent
5ac99e6b5c
commit
cdac57b5bd
1 changed files with 106 additions and 0 deletions
106
doc/tips/antipatterns.mdwn
Normal file
@@ -0,0 +1,106 @@
This page collects a set of Really Bad Ideas people have had with
git-annex in the past that can lead to catastrophic data loss, excessive
disk usage, improper swearing and other unfortunate experiences.

This could also be called the "git-annex worst practices", but it
differs from [[what git annex is not|not]] in that it covers normal
use cases of git-annex, just implemented in the wrong way. Ideally,
git-annex should make it as hard as possible to do those things, but
sometimes you just can't help it: people figure out the worst
possible ways of doing things.

[[!toc]]

.git/annex symlink
==================

Antipattern
-----------

Symlinking the `.git/annex` directory, in the hope of saving
disk space, is a horrible idea. The general antipattern is:

    git clone repoA repoB
    mv repoB/.git/annex repoB/.git/annex.bak
    ln -s repoA/.git/annex repoB/.git/annex

This is bad because git-annex will believe it has two copies of the
files and will then let you drop what is really the single copy, leading
to data loss.

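One way to check whether an existing clone has fallen into this trap is to test whether its `.git/annex` is a symlink. The following is only a minimal sketch and not part of git-annex: the function name `annex_is_symlinked` is made up for illustration, and it assumes a non-bare repository whose git directory is a plain `.git`.

```shell
# Hypothetical helper (not a git-annex command): succeeds if the
# repository's .git/annex is a symbolic link.
# Assumes a non-bare repository with a plain .git directory.
annex_is_symlinked() {
    repo="${1:-.}"
    [ -L "$repo/.git/annex" ]
}
```

For example, `annex_is_symlinked repoB && echo "repoB shares its annex through a symlink"` would flag a clone set up as in the antipattern above.
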
Proper pattern
--------------

The proper way of doing this is through git-annex's hardlink support,
by cloning the repository with the `--shared` option:

    git clone --shared repoA repoB

This will set up repoB as an "untrusted" repository and use hardlinks
to copy files between the two repos, so the space is used only once. This
works, of course, only on filesystems that support hardlinks, but
that's usually the case for filesystems that support symlinks.

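To convince yourself that a file's content really is shared rather than copied, you can compare the corresponding object files in both repositories. This is a minimal sketch, assuming a shell whose `test` supports the `-ef` operator (bash and dash do); `same_file` is a hypothetical helper name, not part of git-annex.

```shell
# Hypothetical helper: succeeds if both paths refer to the same file
# on disk (same device and inode), which is the case for hardlinks.
# Note that -ef follows symlinks, so a symlink pointing at the other
# path also counts as "the same file".
same_file() {
    [ "$1" -ef "$2" ]
}
```

You would pass it the path of one object under `repoA/.git/annex/objects` and the matching path under `repoB/.git/annex/objects`. The exact object paths depend on the key and are not shown here.
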
Real world cases
----------------

 * [[forum/share_.git__47__annex__47__objects_across_multiple_repositories_on_one_machine/]]
 * at least one IRC discussion

Fixes
-----

There is probably no way to fix this in git-annex: if users want to shoot
themselves in the foot by messing with the backend, there's not much
we can do to change that in this case.

Using reinit with an existing UUID without fsck
===============================================

To quote the manpage:

> Normally, initializing a repository generates a new, unique
> identifier (UUID) for that repository. Occasionally it may be useful
> to reuse a UUID -- for example, if a repository got deleted, and
> you're setting it back up.

Antipattern
-----------

[[git-annex-reinit]] can be used to reuse UUIDs for deleted
repositories. But what happens if you reuse the UUID of an *existing*
repository, or a repository that hasn't been properly emptied before
being declared dead? This can lead to data loss because, in that case,
git-annex may think some files are still present in the revived
repository when they may not actually be there.

Proper pattern
--------------

The proper way of using reinit is to make sure you run
[[git-annex-fsck]] (optionally with `--fast` to save time) on the
revived repo right after running reinit. This will ensure that at
least the location log will be updated, and git-annex will notice if
files are missing.

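The sequence above can be sketched as a small shell wrapper, so that reinit is never run without the follow-up fsck. This is only an illustration: the function `reinit_and_fsck` and the `ANNEX` override variable are made up here and are not part of git-annex. Only `git annex reinit` and `git annex fsck --fast` are real commands.

```shell
# Hypothetical wrapper: reinit with a reused UUID (or description),
# then immediately fsck so the location log matches what is on disk.
# ANNEX is an assumed override, handy for dry runs
# (for example ANNEX="echo git annex").
reinit_and_fsck() {
    # Left unquoted on purpose so "git annex" splits into two words.
    ${ANNEX:-git annex} reinit "$1" &&
        ${ANNEX:-git annex} fsck --fast
}
```

Run it from inside the revived repository, for example `reinit_and_fsck "$old_uuid"`, where `old_uuid` is a placeholder for the UUID being reused. Dropping `--fast` in the function trades speed for a full content check.
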
Real world cases
----------------

 * [[bugs/remotes_disappeared]]

Fixes
-----

An improvement to git-annex here would be to allow
[[reinit to work without arguments|todo/reinit_should_work_without_arguments]]
to at least not encourage UUID reuse. reinit could also recommend
running fsck explicitly. It could even trigger an fsck directly.

Other cases
===========

Feel free to add your lessons in catastrophe here! It's educational
and fun, and will improve git-annex for everyone.

PS: should this be a toplevel page instead of being drowned in the
[[tips]] section? Where should it be linked to? -- [[anarcat]]