update
parent c5c7eaf009
commit 69c14d130b
7 changed files with 70 additions and 2 deletions
13
doc/distributed_version_control.mdwn
Normal file
@@ -0,0 +1,13 @@
In git, there can be multiple clones of a repository, each clone can
be independently modified, and clones can push or pull changes to
one another to get back in sync.

git-annex preserves that fundamental distributed nature of git, while
dropping the requirement that, once in sync, each clone contains all the data
that was committed to each other clone. Instead of storing the content
of a file in the repository, git-annex stores a pointer to the content.

Each git-annex repository is responsible for storing some of the content,
and can copy it to or from other repositories. [[Location_tracking]]
information is committed to git, to let repositories inform other
repositories what file contents they have available.
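As a rough illustration of the location tracking idea described in this file, the shared information can be modeled as a map from content keys to the repositories that hold them. This is only a toy sketch: the `LocationLog` class, its method names, and the key strings are hypothetical and do not reflect git-annex's actual on-disk format or commands.

```python
from collections import defaultdict


class LocationLog:
    """Toy model of which repositories hold which content (hypothetical)."""

    def __init__(self):
        # Maps a content key to the set of repository names holding it.
        self._holders = defaultdict(set)

    def record(self, key, repo, present=True):
        """Note that `repo` gained (or dropped) the content identified by `key`."""
        if present:
            self._holders[key].add(repo)
        else:
            self._holders[key].discard(repo)

    def whereis(self, key):
        """Return the sorted names of repositories known to hold `key`."""
        return sorted(self._holders.get(key, set()))

    def merge(self, other):
        """Fold in another clone's log, as after a pull.

        Simplification: this only unions presence information, so it
        cannot propagate the fact that a repository dropped some content.
        """
        for key, repos in other._holders.items():
            self._holders[key] |= repos
```

After two clones each record what they hold and one merges the other's log, `whereis` on either key lists both repositories, which is the sense in which clones "inform other repositories what file contents they have available".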
24
doc/future_proofing.mdwn
Normal file
@@ -0,0 +1,24 @@
Imagine putting a git-annex drive in a time capsule. In 20, or 50, or 100
years, you'd like its contents to be as accessible as possible to whoever
digs it up.

This is a hard problem. git-annex cannot completely solve it, but it does
its best to not contribute to the problem. Here are some aspects of the
problem:

* How are files accessed? git-annex carefully adds minimal complexity
  to access files in a repository. Nothing needs to be done to extract
  files from the repository; they are there on disk in the usual way,
  with just some symlinks pointing at the annexed file contents.
  Neither git-annex nor git is needed to get at the file contents.

* What file formats are used? Will they still be readable? To deal with
  this, it's best to stick to plain text files, and the most common
  image, sound, etc. formats. Consider storing the same content in multiple
  formats.

* What filesystem is used on the drive? Will that filesystem still be
  available?

* What is the hardware interface of the drive? Will hardware still exist
  to talk to it?
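The first point in the list above (files reachable through symlinks, with no special tool needed) can be illustrated with a small sketch. The directory layout used here, an `objects` directory beside the working files, is made up for the example and is not git-annex's real layout; the function names are likewise hypothetical. It assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile


def make_toy_annex(root, name, data):
    """Store `data` in a content directory and expose it at `name` via a symlink.

    The "objects" directory is an assumption made for this example only.
    """
    objects = os.path.join(root, "objects")
    os.makedirs(objects, exist_ok=True)
    content_name = "content-of-" + name
    with open(os.path.join(objects, content_name), "wb") as f:
        f.write(data)
    link_path = os.path.join(root, name)
    # Relative target, so the link still resolves if the drive is mounted elsewhere.
    os.symlink(os.path.join("objects", content_name), link_path)
    return link_path


def read_file(path):
    """Ordinary file access; the operating system follows the symlink."""
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        link = make_toy_annex(root, "notes.txt", b"hello from the past\n")
        print(os.path.islink(link), read_file(link))
```

The point of the sketch is that `read_file` contains nothing specific to any version control tool: whoever opens the time capsule only needs a system that understands symlinks.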
@@ -30,3 +30,11 @@
situations. It lacks git-annex's support for widely distributed storage,
using only a single backend data store. It also does not support
partial checkouts of file contents, like git-annex does.

* git-annex is also not [boar](http://code.google.com/p/boar/),
  although it shares many of its goals and characteristics. Boar implements
  its own version control system, rather than simply embracing and
  extending git. And while boar supports distributed clones of a repository,
  it does not support keeping different files in different clones of the
  same repository, which git-annex does and which is an important feature for
  large-scale archiving.
BIN
doc/repomap.png
Normal file
Binary file not shown.
Size: 126 KiB
14
doc/transferring_data.mdwn
Normal file
@@ -0,0 +1,14 @@
git-annex can transfer data to or from any of a repository's git remotes.
Depending on where the remote is, the data transfer is done using rsync
(over ssh, with automatic resume), or plain cp (with copy-on-write
optimisations on supported filesystems).

It's equally easy to transfer a single file to or from a repository,
or to launch a retrieval of a massive pile of files from whatever
repositories they are scattered among.

git-annex automatically uses whatever remotes are currently accessible,
preferring ones that are less expensive to talk to.

[[!img repomap.png caption="A real-world repository interconnection map
(generated by git-annex map)"]]
@@ -10,9 +10,11 @@ When she has 1 bar on her cell, Alice queues up interesting files on her
server for later. At a coffee shop, she has git-annex download them to her
USB drive. High in the sky or in a remote cabin, she catches up on
podcasts, videos, and games, first letting git-annex copy them from
her USB drive to the netbook (this saves battery power).
([[more about transferring data|transferring_data]])

When she's done, she tells git-annex which to keep and which to remove.
They're all removed from her netbook to save space, and Alice knows
that next time she syncs up to the net, her changes will be synced back
to her server.
([[more about distributed version control|distributed_version_control]])
@@ -11,8 +11,15 @@ without worry about accidentally deleting anything.
When Bob needs access to some files, git-annex can tell him which drive(s)
they're on, and easily make them available. Indeed, every drive knows what
is on every other drive.
([[more about location tracking|location_tracking]])

Bob thinks long-term, and so he's glad that git-annex uses a simple
repository format. He knows his files will be accessible in the future
even if the world has forgotten about git-annex and git.
([[more about future-proofing|future_proofing]])

Run in a cron job, git-annex adds new files to archival drives at night. It
also helps Bob keep track of intentional and unintentional copies of
files, and logs information he can use to decide when it's time to duplicate
the content of old drives.
([[more about backup copies|copies]])