idea
This commit is contained in:
parent
3378f74fb0
commit
90eb649e73
1 changed files with 115 additions and 0 deletions
115
doc/todo/hiding_a_repository.mdwn
Normal file
115
doc/todo/hiding_a_repository.mdwn
Normal file
|
@ -0,0 +1,115 @@
|
||||||
|
In some situations it can be useful for a reposistory to not store its
|
||||||
|
state on the git-annex branch. One example is a temporary clone that's
|
||||||
|
going to be deleted. One way is just to not push the git-annex branch
|
||||||
|
from the repository to anywhere, but that limits what can be done in the
|
||||||
|
repository. For example, files might be added, and copied to public
|
||||||
|
repositories, but then the git-annex branch would need to be pushed to
|
||||||
|
publish them.
|
||||||
|
|
||||||
|
There could be a git config to hide the current repository.
|
||||||
|
|
||||||
|
(There could also be per-remote git configs, but it seems likely that,
|
||||||
|
if location tracking and other data is not stored for a remote, it will be
|
||||||
|
hard to do much useful with that remote. For example, git-annex get from
|
||||||
|
that remote would not work. (Without setting annex-speculate-present.) Also
|
||||||
|
some remotes depend on state being stored, like chunking data, encryption,
|
||||||
|
etc. So setting up networks of repos that know about one-another but are
|
||||||
|
hidden from the wider world would need some kind of public/private
|
||||||
|
git-annex branch separation. Which would be a very large complication.)
|
||||||
|
|
||||||
|
## location tracking effects
|
||||||
|
|
||||||
|
The main logs this would affect are for location tracking.
|
||||||
|
|
||||||
|
git-annex will mostly work the same without location tracking information
|
||||||
|
being recorded for the local repo. Often, git-annex uses inAnnex to
|
||||||
|
directly check if an annex object is present, rather than looking at
|
||||||
|
location tracking. For example --in=here uses inAnnex so would still work;
|
||||||
|
|
||||||
|
Of course, `git annex whereis`reports on location tracking info, so if a
|
||||||
|
file were added to such a repo, whereis on it would report no copies. And
|
||||||
|
I said "of course", but this may not be obvious to all users.
|
||||||
|
|
||||||
|
And there are parts of git-annex that do look at location tracking for
|
||||||
|
the current repo, even though it's generally slower than inAnnex. Since the
|
||||||
|
two are generally equivilant now, some general-purpose code that looks at
|
||||||
|
locations generally has no real need to use inAnnex. One example of this
|
||||||
|
is --copies.
|
||||||
|
|
||||||
|
One thing that would certainly need to be changed is git-annex
|
||||||
|
fsck, which notices when the location tracking information is wrong/missing and
|
||||||
|
corrects it. (Note that unsetting the git config followed by a fsck would
|
||||||
|
update the location logs, which could be useful to stop hiding the repo,
|
||||||
|
but if other stuff like annex.uuid is also affected, fsck would not do
|
||||||
|
anything about that stuff.)
|
||||||
|
|
||||||
|
git-annex info is a bit of a mess from this perspecitve. Its repo list
|
||||||
|
would not include the repo (if it was also hidden from uuid.log), but it
|
||||||
|
would report on the number of locally present objects, while other info
|
||||||
|
like numcopies stats and combined size of repositories are based on
|
||||||
|
location tracking, so would not include the current repo.
|
||||||
|
|
||||||
|
Looks like git-annex drop --from remote relies on the location log
|
||||||
|
to see if there's a local copy, so a hidden repo would not be treated as a
|
||||||
|
copy. This could be changed, but checking inAnnex here would actually slow
|
||||||
|
it down. It could also be argued that this is a good thing since dropping
|
||||||
|
from a remote could leave the only copy in a hidden repo. But then move
|
||||||
|
--from should also prevent that, and I think it might not.
|
||||||
|
|
||||||
|
So the question is, would adding this feature complicate git-annex too
|
||||||
|
much, in needing to pick the right choice of inAnnex or location log
|
||||||
|
querying, or in the user's mental model of how git-annex works?
|
||||||
|
|
||||||
|
## uuid.log
|
||||||
|
|
||||||
|
To really hide a repository, it needs to not be written to uuid.log.
|
||||||
|
|
||||||
|
So the config would need to be set before git-annex init.
|
||||||
|
|
||||||
|
If a repository is also hidden from uuid.log, it follows that this option
|
||||||
|
is not given a name specific to location tracking. Eg annex.hidden rather
|
||||||
|
than annex.omit-location-logs. But that does raise the
|
||||||
|
question about all the other places a repo's uuid could crop up in the
|
||||||
|
git-annex branch.
|
||||||
|
|
||||||
|
## everything else
|
||||||
|
|
||||||
|
* remote.log: A special remote is not usable without this, and this does
|
||||||
|
not seem to be a config that affects what is stored about remotes, but only
|
||||||
|
the current repo.
|
||||||
|
|
||||||
|
* trust.log: If the user sets this config, are things
|
||||||
|
like `git-annex trust here` supposed to refuse to work? Seems limiting,
|
||||||
|
and a significant source of scope creep. Maybe it would be better to
|
||||||
|
let the uuid be written to these, if the user chooses to set a trust.
|
||||||
|
After all, having some uuids in these logs, that are not described in
|
||||||
|
uuid.log, does not tell anyone else much, except that someone had a
|
||||||
|
hidden repository.
|
||||||
|
|
||||||
|
* group.log, preferred-content.log, required-content.log: Same as trust.log;
|
||||||
|
Names in group.log do hint about how a hidden repo might be used, but if
|
||||||
|
the user is concerned about that they can not add their repo to groups
|
||||||
|
that expose information.
|
||||||
|
|
||||||
|
* export.log: Same as remote.log
|
||||||
|
|
||||||
|
* `*.log.rmt`, `*.log.rmet`, `*.log.cid`, `*.log.cnk`: Same as remote.log
|
||||||
|
|
||||||
|
* schedule.log: Same as trust.log
|
||||||
|
|
||||||
|
* activity.log: A user might be surprised that fscking a hidden repo
|
||||||
|
mentions its uuid here. Also it seems unnecessary info to log for a
|
||||||
|
hidden repo. Should be special cased if uuid.log is.
|
||||||
|
|
||||||
|
* multicast.log: This includes the uuid of the current repo, when using
|
||||||
|
git-annex multicast. That could be surprising to a user, so probably
|
||||||
|
git-annex multicast would need to refuse to run in hidden repos.
|
||||||
|
|
||||||
|
* difference.log: Surprisingly to me, in a clone of a repo that was initialized
|
||||||
|
with tunings, git-annex init adds the new repo's uuid to this log file.
|
||||||
|
Should be special cased if uuid.log is. Unsure yet if it will be possible
|
||||||
|
to avoid writing it, or if tunings and hidden repos need to be
|
||||||
|
incompatible features.
|
||||||
|
|
||||||
|
That seems to be all. --[[Joey]]
|
||||||
|
|
Loading…
Add table
Add a link
Reference in a new issue