update

parent c5c7eaf009
commit 69c14d130b
7 changed files with 70 additions and 2 deletions
13 doc/distributed_version_control.mdwn (Normal file)
@@ -0,0 +1,13 @@
In git, there can be multiple clones of a repository, each clone can
be independently modified, and clones can push or pull changes to
one another to get back in sync.

git-annex preserves that fundamental distributed nature of git, while
dropping the requirement that, once in sync, each clone contains all the data
that was committed to each other clone. Instead of storing the content
of a file in the repository, git-annex stores a pointer to the content.

Each git-annex repository is responsible for storing some of the content,
and can copy it to or from other repositories. [[Location_tracking]]
information is committed to git, to let repositories inform other
repositories what file contents they have available.
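
The location tracking idea described in this file can be sketched as a toy model. This is only an illustration under stated assumptions, not git-annex's actual log format or merge logic: the `LocationLog` class, its method names, and the key and repository names are all hypothetical, and the merge is a plain set union that ignores removals.

```python
from collections import defaultdict


class LocationLog:
    """Toy model: maps a content key to the repositories known to hold it."""

    def __init__(self):
        self._holders = defaultdict(set)

    def record(self, key, repo, present=True):
        """Note that `repo` does (or no longer does) hold the content for `key`."""
        if present:
            self._holders[key].add(repo)
        else:
            self._holders[key].discard(repo)

    def merge(self, other):
        """Combine another clone's knowledge (simplified: union of holders)."""
        for key, repos in other._holders.items():
            self._holders[key] |= repos

    def whereis(self, key):
        """Return the sorted names of repositories believed to hold `key`."""
        return sorted(self._holders.get(key, set()))


# Two clones each record what they hold, then sync their logs.
laptop, server = LocationLog(), LocationLog()
laptop.record("key-a", "laptop")
server.record("key-a", "server")
server.record("key-b", "server")
laptop.merge(server)
print(laptop.whereis("key-a"))
```

After the merge, the laptop's log can answer where any content lives, even for files it does not store itself.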
24 doc/future_proofing.mdwn (Normal file)
@@ -0,0 +1,24 @@
Imagine putting a git-annex drive in a time capsule. In 20, or 50, or 100
years, you'd like its contents to be as accessible as possible to whoever
digs it up.

This is a hard problem. git-annex cannot completely solve it, but it does
its best to not contribute to the problem. Here are some aspects of the
problem:

* How are files accessed? Git-annex carefully adds minimal complexity
to access files in a repository. Nothing needs to be done to extract
files from the repository; they are there on disk in the usual way,
with just some symlinks pointing at the annexed file contents.
Neither git-annex nor git is needed to get at the file contents.

* What file formats are used? Will they still be readable? To deal with
this, it's best to stick to plain text files, and the most common
image, sound, etc. formats. Consider storing the same content in multiple
formats.

* What filesystem is used on the drive? Will that filesystem still be
available?

* What is the hardware interface of the drive? Will hardware still exist
to talk to it?
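
The first point in the list, that annexed files are ordinary symlinks readable without git-annex or git, can be illustrated with nothing but standard file operations. This is a minimal sketch: the helper name `read_annexed` and the paths in the usage note are hypothetical, and no particular directory layout inside the repository is assumed.

```python
import os


def read_annexed(path):
    """Read a file that may be a symlink to annexed content.

    Returns the resolved location of the content and its bytes.
    Only ordinary filesystem calls are used.
    """
    target = os.path.realpath(path)  # follow the symlink chain, if any
    with open(target, "rb") as handle:
        return target, handle.read()
```

Calling `read_annexed("photos/2010/beach.jpg")` on a hypothetical symlinked file returns the path the link resolves to and the file's contents. Opening the path directly with `open()` would work equally well, since the operating system follows symlinks on its own.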
@@ -30,3 +30,11 @@
situations. It lacks git-annex's support for widely distributed storage,
using only a single backend data store. It also does not support
partial checkouts of file contents, like git-annex does.

* git-annex is also not [boar](http://code.google.com/p/boar/),
although it shares many of its goals and characteristics. Boar implements
its own version control system, rather than simply embracing and
extending git. And while boar supports distributed clones of a repository,
it does not support keeping different files in different clones of the
same repository, which git-annex does, and which is an important feature for
large-scale archiving.
BIN doc/repomap.png (Normal file)
Binary file not shown. Size after: 126 KiB
14 doc/transferring_data.mdwn (Normal file)
@@ -0,0 +1,14 @@
git-annex can transfer data to or from any of a repository's git remotes.
Depending on where the remote is, the data transfer is done using rsync
(over ssh, with automatic resume), or plain cp (with copy-on-write
optimisations on supported filesystems).

It's equally easy to transfer a single file to or from a repository,
or to launch a retrieval of a massive pile of files from whatever
repositories they are scattered among.

git-annex automatically uses whatever remotes are currently accessible,
preferring ones that are less expensive to talk to.

[[!img repomap.png caption="A real-world repository interconnection map
(generated by git-annex map)"]]
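
The remote selection rule stated in this file (use whatever is accessible, prefer what is cheaper to talk to) can be sketched as follows. This is a hypothetical model, not git-annex's implementation: the `Remote` type, its fields, the `pick_remote` function, and the numeric costs are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Remote:
    name: str
    cost: int          # lower means cheaper to talk to (hypothetical scale)
    accessible: bool   # can it be reached right now?
    has_content: bool  # is it believed to hold the wanted file?


def pick_remote(remotes: Iterable[Remote]) -> Optional[Remote]:
    """Return the cheapest accessible remote holding the content, or None."""
    candidates = [r for r in remotes if r.accessible and r.has_content]
    return min(candidates, key=lambda r: r.cost, default=None)


remotes = [
    Remote("usbdrive", cost=100, accessible=False, has_content=True),
    Remote("server", cost=200, accessible=True, has_content=True),
    Remote("offsite", cost=300, accessible=True, has_content=True),
]
chosen = pick_remote(remotes)
print(chosen.name if chosen else "no remote available")
```

Here the USB drive would be cheapest, but it is not plugged in, so the sketch falls back to the next cheapest remote that is reachable.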
@@ -10,9 +10,11 @@ When she has 1 bar on her cell, Alice queues up interesting files on her
server for later. At a coffee shop, she has git-annex download them to her
USB drive. High in the sky or in a remote cabin, she catches up on
podcasts, videos, and games, first letting git-annex copy them from
her USB drive to the netbook (this saves battery power).
([[more about transferring data|transferring_data]])

When she's done, she tells git-annex which to keep and which to remove.
They're all removed from her netbook to save space, and Alice knows
that next time she syncs up to the net, her changes will be synced back
to her server.
([[more about distributed version control|distributed_version_control]])
@@ -11,8 +11,15 @@ without worry about accidentally deleting anything.
When Bob needs access to some files, git-annex can tell him which drive(s)
they're on, and easily make them available. Indeed, every drive knows what
is on every other drive.
([[more about location tracking|location_tracking]])

Bob thinks long-term, and so he's glad that git-annex uses a simple
repository format. He knows his files will be accessible in the future
even if the world has forgotten about git-annex and git.
([[more about future-proofing|future_proofing]])

Run in a cron job, git-annex adds new files to archival drives at night. It
also helps Bob keep track of intentional and unintentional copies of
files, and logs information he can use to decide when it's time to duplicate
the content of old drives.
([[more about backup copies|copies]])