git-annex/doc/tips/cloning_a_repository_privately.mdwn

64 lines
2.9 KiB
Text
Raw Normal View History

2021-04-21 21:01:03 +00:00
Normally, when you clone a git-annex repository, and use git-annex in it,
and then push or otherwise send changes back to origin, information gets
committed to the git-annex branch about your clone. Things like the annexed
files that are in it, its description, etc.
If you don't want the world to know about your clone, either for privacy
reason or only because the clone is a temporary copy of the repository,
here's how.
Recently git-annex got a new config setting, `annex.private`.
2021-04-21 21:01:03 +00:00
Set it before you start using git-annex in a repository, and git-annex
will avoid recording any information about the repository into the
git-annex branch.
git clone ssh://... myclone
cd myclone
git config annex.private true
2021-04-21 21:01:03 +00:00
git annex init
Now you can use git-annex as usual, adding files to the repository,
getting the contents of files, etc.
When you push changes back to origin, do still push the git-annex branch,
since git-annex still uses it to record anything it needs to keep track of
that does not involve your private repository.
And be sure, when adding or editing annexed files, that you `git-annex copy`
them to a publically accessible repository. Otherwise, to everyone else,
there will seem to be no copies of that file availble anywhere, since they
won't know about your private repo's copy.
## private special remotes
You can also make private special remotes, by using `git annex initremote
--private`.
Like a private repository, git-annex avoids storing any information about
a private special remote to the git-annex branch. It will only be available in
the repository where the special remote was created.
Bear in mind that, if you lose the repository where the private special
remote was created, you'll lose the information git-annex needs to access
that special remote, and that will likely mean you'll not be able to
recover any files stored in it.
## private git remotes
When the git config "remote.name.private" is set, git-annex will avoid
recording anything in the git-annex branch about the remote. This is
set by `git-annex initremote --private`, and could also be set for
2024-10-08 03:58:55 +00:00
git remotes. This may be useful if, for example, you are trying to deduplicate content,
bifurcate repositories, or reinject it using a temporary annex as a staging area.
Git annex is excellent for these tasks because it naturally hashes all file content,
therefore if a 'copy' appears in one repo that should belong in another, you can drop
its content or move it to deduplicate. However, in this case, git-annex logging a relationship
between the two repos is undesirable. Especially if the repos are otherwise unrelated or
one of them is temporary (to be deleted once emptied), `remote.private` is preferrable to declaring
the repo dead and doing a `forget --drop-dead --force` operation.
## where the data is actually stored
The private data gets stored in .git/annex/journal-private/ rather
than in the git-annex branch.