split centralized_git_repository_tutorial into 3

Joey Hess 2015-07-22 17:16:42 -04:00
parent 7c290e8b98
commit 728540e5e5
4 changed files with 305 additions and 137 deletions


@@ -1,142 +1,17 @@
The [[walkthrough]] builds up a decentralized git repository setup, but
git-annex can also be used with a centralized bare repository, just like
git can. This tutorial shows how to set up a centralized repository hosted on
GitHub, GitLab, or your own git server.
git-annex can also be used with a centralized git repository.
## set up the repository, and make a checkout
We have separate tutorials depending on where the centralized git
repository is hosted.
I've created a repository for technical talk videos, which you can
[fork on GitHub](https://github.com/joeyh/techtalks).
Or make your own repository on GitHub (or GitLab, or elsewhere) now.
* You can use GitHub. However, GitHub does not currently let git-annex
store the contents of large files there. So, things get a little more
complicated. See [[centralized_git_repository_tutorial/on_GitHub]]
for a tutorial for using git-annex with GitHub.
On your laptop, [[install]] git-annex, and clone the repository:
* You can use GitLab. This service is similar to GitHub, but supports
git-annex. See [[centralized_git_repository_tutorial/on_GitLab]].
# git clone git@github.com:joeyh/techtalks.git
# cd techtalks
Tell git-annex to use the repository, and describe where this clone is
located:
# git annex init 'my laptop'
init my laptop ok
Let's tell git-annex that GitHub doesn't support running git-annex-shell there.
# git config remote.origin.annex-ignore true
This means you can't store annexed file *contents* on GitHub; it would
really be better to host the bare repository on your own server, which
would not have this limitation. (If you want to do that, check out
[[using_gitolite_with_git-annex]].) Or, you could use GitLab, which
*does* [support git-annex on their servers](https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/).
## add files to the repository
Add some files, obtained however.
# youtube-dl -t 'http://www.youtube.com/watch?v=b9FagOVqxmI'
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
This file is available directly from the web; so git-annex can download it:
# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
(checksum...) ok
(Recording state in git...)
# git commit -a -m 'added a screencast I made'
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
# git commit -m 'better filenames'
Now push your changes back to the central repository. As well as pushing
the master branch, remember to push the git-annex branch, which is used to
track the file contents.
# git push origin master git-annex
To git@github.com:joeyh/techtalks.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to GitHub.
To check this, you can ask git-annex where the contents of the videos are:
# git annex whereis
whereis Haskell_Amuse_Bouche.mp4 (1 copy)
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
whereis git-annex_coding_in_haskell.ogg (2 copies)
00000000-0000-0000-0000-000000000001 -- web
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
## make more checkouts
So far you have a central repository, and a checkout on a laptop.
Let's make another checkout that's used as a backup. You can put it anywhere
you like, as long as it is somewhere your laptop can access. A few options:
* Put it on a USB drive that you can plug into the laptop.
* Put it on a desktop.
* Put it on some server in the local network.
* Put it on a remote VPS.
I'll use the VPS option, but these instructions should work for
any of the above.
# ssh server
server# sudo apt-get install git-annex
Clone the central repository as before. (If the clone fails, you need
to add your server's ssh public key to GitHub -- see
[this page](http://help.github.com/ssh-issues/).)
server# git clone git@github.com:joeyh/techtalks.git
server# cd techtalks
server# git config remote.origin.annex-ignore true
server# git annex init 'backup'
init backup (merging origin/git-annex into git-annex...) ok
Notice that the server does not have the contents of any of the files yet.
If you run `ls`, you'll see broken symlinks. We want to populate this
backup with the file contents, by copying them from your laptop.
Back on your laptop, you need to configure a git remote for the backup.
Adjust the ssh url as needed to point to wherever the backup is. (If it
was on a local USB drive, you'd use the path to the repository instead.)
# git remote add backup ssh://server/~/techtalks
Now git-annex on your laptop knows how to reach the backup repository,
and can do things like copy files to it:
# git annex copy --to backup git-annex_coding_in_haskell.ogg
copy git-annex_coding_in_haskell.ogg (checking backup...)
12877824 2% 255.11kB/s 00:00
ok
You can also `git annex move` files to it, to free up space on your laptop.
And then you can `git annex get` files back to your laptop later on, as
desired.
After you use git-annex to move files around, remember to push,
which will broadcast its updated location information.
# git push origin master git-annex
## take it farther
Of course you can create as many checkouts as you desire. If you have a
desktop machine too, you can make a checkout there, and use `git remote
add` to also let your desktop access the backup repository.
You can add remotes for each direct connection between machines you find you
need -- so make the laptop have the desktop as a remote, and the desktop
have the laptop as a remote, and then on either machine git-annex can
access files stored on the other.
* You can use your own git server, which can be any unix system with
ssh and git and git-annex installed. A VPS, a home server, etc.
See [[centralized_git_repository_tutorial/on_your_own_server]].


@@ -0,0 +1,129 @@
This tutorial shows how to set up a centralized repository hosted on
GitHub.
GitHub does not currently let git-annex store the contents of large files
there. This doesn't prevent using git-annex with GitHub; it just means you
have to set up some other centralized location for the large files.
## set up the repository, and make a checkout
I've created a repository for technical talk videos, which you can
[fork on GitHub](https://github.com/joeyh/techtalks).
Or make your own repository on GitHub now.
On your laptop, [[install]] git-annex, and clone the repository:
# git clone git@github.com:joeyh/techtalks.git
# cd techtalks
Tell git-annex to use the repository, and describe where this clone is
located:
# git annex init 'my laptop'
init my laptop ok
## add files to the repository
Add some files, obtained however.
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
This file is available on the web; so git-annex can download it:
# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
(checksum...) ok
(Recording state in git...)
# git commit -a -m 'added a screencast I made'
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
# git commit -m 'better filenames'
Now push your changes back to the central repository on GitHub. As well as
pushing the master branch, remember to push the git-annex branch, which is
used to track the file contents. You can do this push manually as shown
below, or you can just run `git annex sync` to do the same thing.
# git push origin master git-annex
To git@github.com:joeyh/techtalks.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to GitHub.
To check this, you can ask git-annex where the contents of the videos are:
# git annex whereis
whereis Haskell_Amuse_Bouche.mp4 (1 copy)
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
whereis git-annex_coding_in_haskell.ogg (2 copies)
00000000-0000-0000-0000-000000000001 -- web
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
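To double-check that both branches really reached the central repository, something like the following sketch may help. It runs against throwaway repositories in a temporary directory, and the `git-annex` branch there is only a stand-in created with plain git (normally `git annex init` creates it). The helper name `remote_has_branch` is made up for this example.

```shell
# Hypothetical helper: succeed if remote $1 has a branch named $2.
remote_has_branch() {
    git ls-remote --exit-code --heads "$1" "refs/heads/$2" >/dev/null 2>&1
}

# Throwaway stand-ins for the central repository and the laptop checkout.
work=$(mktemp -d)
git init -q --bare "$work/central.git"
git init -q "$work/laptop"
cd "$work/laptop"
git symbolic-ref HEAD refs/heads/master
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m 'first commit'
git branch git-annex    # stand-in for the branch git-annex maintains
git remote add origin "$work/central.git"

# Push only master, forgetting the git-annex branch on purpose.
git push -q origin master
if remote_has_branch origin git-annex; then before=yes; else before=no; fi

# Now push the git-annex branch too and check again.
git push -q origin git-annex
if remote_has_branch origin git-annex; then after=yes; else after=no; fi

echo "git-annex branch on origin: before=$before after=$after"
```

In a real checkout, only the helper and a call like `remote_has_branch origin git-annex` would be needed.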
## make more checkouts
So far you have a central repository, and a checkout on a laptop.
You, or anyone you allow to, can clone the central repository, and
use git-annex with it.
But, since GitHub doesn't currently support storing large files there
with git-annex, other checkouts of your repository won't be able to
access the files you added to the repository on your laptop.
# git clone git@github.com:myrepo/techtalks.git
# cd techtalks
# git annex get Haskell_Amuse_Bouche.mp4
get Haskell_Amuse_Bouche.mp4
Try making some of these repositories available:
767e8558-0955-11e1-be83-cbbeaab7fff8 -- my laptop
failed
## add a special remote
So, to complete your setup, you need to set up a repository where git-annex
can store the contents of large files. This is often done by setting up
a [[special_remote|special_remotes]]. One free option is explained in
[[using_box.com_as_a_special_remote]]. Another useful approach is
explained in [[public_Amazon_S3_remote]].
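The linked pages cover Box.com and Amazon S3. As a smaller illustration, here is a sketch using the directory special remote, which stores file contents in a plain directory such as a removable drive or a network mount. The helper name `init_directory_remote` and the path in the usage note are made up for this example. Passing `encryption=none` is an assumption that the contents do not need to be encrypted.

```shell
# Hypothetical helper: create a directory special remote.
# $1 = name for the remote, $2 = directory that will hold file contents.
# Must be run inside a repository where `git annex init` has been done.
init_directory_remote() {
    mkdir -p "$2" &&
    git annex initremote "$1" type=directory directory="$2" encryption=none
}
```

It would be called on the laptop as, for example, `init_directory_remote myspecialremote /mnt/usbdrive/annex`. Other clones can only use such a remote if they can reach the same directory.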
Once you have the special remote set up on your laptop, you can
send files to it:
# git annex copy --to myspecialremote Haskell_Amuse_Bouche.mp4
copy Haskell_Amuse_Bouche.mp4 (to myspecialremote...)
100% 255.11kB/s
ok
You can also `git annex move` files to it, to free up space on your laptop.
And then you can `git annex get` files back to your laptop later on, as
desired.
After you use git-annex to move files around, remember to sync,
which will broadcast its updated location information.
# git annex sync
After setting up the special remote and storing some files on it,
you can download them on other clones. You'll first need to enable the same
special remote on the clones.
# git annex sync
# git annex enableremote myspecialremote
# git annex get git-annex_coding_in_haskell.ogg
100% 255.11kB/s
ok
## take it farther
You can add remotes for each direct connection between machines you find you
need -- so make the laptop have the desktop as a remote, and the desktop
have the laptop as a remote, and then on either machine git-annex can
access files stored on the other.
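Adding such a remote is an ordinary `git remote add`. The sketch below uses a throwaway repository and a made-up host name, `desktop.example`, to stand in for the laptop's checkout and the desktop machine.

```shell
# Throwaway repository standing in for the checkout on the laptop.
repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical ssh url; adjust it to wherever the desktop's checkout lives.
git -C "$repo" remote add desktop ssh://desktop.example/~/techtalks

# Show what git recorded for the new remote.
desktop_url=$(git -C "$repo" config remote.desktop.url)
echo "desktop remote: $desktop_url"
```

The same command, run on the desktop with the laptop's url, sets up the other direction.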


@@ -0,0 +1,76 @@
This tutorial shows how to set up a centralized repository hosted on
GitLab.
Since GitLab has [added support for git-annex on their servers](https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/),
you can store your large files on GitLab, quite easily.
Note that as I'm writing this, GitLab is providing this service for free,
and I don't know how much data they're willing to host for free.
## create the repository
Go to <https://gitlab.com/> and sign up for an account, and create the
repository there. Take note of the SSH clone url for the repository, which
will be something like `git@gitlab.com:yourlogin/annex.git`.
We want to clone this locally, on your laptop. (If the clone fails, you
need to generate an ssh key and add it to GitLab.)
# git clone git@gitlab.com:yourlogin/annex.git
# cd annex
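If the clone did fail for lack of a key, generating one might look like this sketch. It writes the key into a temporary directory with an empty passphrase so the example is harmless to run. A real key would normally live in `~/.ssh` and have a passphrase. The key comment is a placeholder.

```shell
# Assumption: keydir is a scratch location, not where a real key would go.
keydir=$(mktemp -d)

# Generate an ed25519 key pair without prompting.
ssh-keygen -q -t ed25519 -N '' -C 'laptop key for GitLab' -f "$keydir/id_ed25519"

# The public half is what gets pasted into the account's SSH key settings.
cat "$keydir/id_ed25519.pub"
```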
Tell git-annex to use the repository, and describe where this clone is
located:
# git annex init 'my laptop'
init my laptop ok
Add some files, obtained however.
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git commit -m 'better filenames'
## push to GitLab
Now make a first push to the GitLab repository.
As well as pushing the master branch, remember to push the git-annex
branch, which is used to track the file contents.
# git push origin master git-annex
To git@gitlab.com:yourlogin/annex.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload the large file contents yet.
So, to finish up, tell git-annex to sync all the data in the repository
to GitLab:
# git annex sync --content
...
## make more checkouts
So far you have a central repository on GitLab, and a checkout on a laptop.
Let's make another checkout elsewhere. Clone the central repository as before.
(If the clone fails, you need to generate an ssh key and add it to GitLab.)
elsewhere# git clone git@gitlab.com:yourlogin/annex.git
elsewhere# cd annex
Notice that your clone does not have the contents of any of the files yet.
If you run `ls`, you'll see broken symlinks. It's easy to download them from
GitLab either by running `git annex sync --content`, or by asking
git-annex to download individual files:
elsewhere# git annex get Haskell_Amuse_Bouche.mp4
get Haskell_Amuse_Bouche.mp4 (from origin...)
12877824 2% 255.11kB/s 00:00
ok
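To see which files are still broken symlinks, a small helper along these lines can be handy. This is a plain-shell sketch that does not use git-annex itself. The function name `list_missing_content` and the demo files are made up, and the demo directory only imitates a checkout with one present and one missing file.

```shell
# Hypothetical helper: print symlinks under directory $1 whose targets
# do not exist, which is how annexed files without content show up.
list_missing_content() {
    find "$1" -type l ! -exec test -e {} \; -print
}

# Demo directory imitating a checkout.
demo=$(mktemp -d)
echo data > "$demo/present.target"
ln -s present.target "$demo/present.mp4"              # content is here
ln -s .git/annex/objects/missing "$demo/missing.mp4"  # content is not here

missing=$(list_missing_content "$demo")
echo "$missing"
```

In a real checkout, `list_missing_content .` should shrink to nothing once all the contents have been downloaded.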


@@ -0,0 +1,88 @@
This tutorial shows how to set up a centralized git repository
hosted on your own git server, which can be any unix system with
ssh and git and git-annex installed. A VPS, a home server, etc.
This sets up a very simple git server. More complex setups are possible.
See for example [[using_gitolite_with_git-annex]].
## set up the server
On the server, you'll want to [[install]] git, and git-annex, if you haven't
already.
server# sudo apt-get install git git-annex
Decide where to put the repository on the server, and create a bare git repo
there. In your home directory is a simple choice:
server# cd
server# git init annex.git --bare --shared
That's the server setup done!
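To check what those two options did, git can be asked about the new repository. This sketch creates the repository in a temporary directory rather than the home directory, so it can be tried without touching a real server.

```shell
# Scratch location standing in for the home directory on the server.
srv=$(mktemp -d)
git init -q --bare --shared "$srv/annex.git"

# --bare: there is no working tree, only the git data.
is_bare=$(git -C "$srv/annex.git" rev-parse --is-bare-repository)

# --shared: git records a sharing mode so group members can push.
shared=$(git -C "$srv/annex.git" config core.sharedrepository)

echo "bare=$is_bare sharedrepository=$shared"
```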
## make a checkout
Now on your laptop, clone the git repository from the server:
laptop# git clone ssh://example.com/~/annex.git
Cloning into 'annex'...
warning: You appear to have cloned an empty repository.
Checking connectivity... done.
Tell git-annex to use the repository, and describe where this clone is
located:
laptop# cd annex
laptop# git annex init 'my laptop'
init my laptop ok
## add files to the repository
Add some files, obtained however.
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9OVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9OVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git commit -m 'better filenames'
Now push your changes back to the central repository on your server. As
well as pushing the master branch, remember to push the git-annex branch,
which is used to track the file contents.
# git push origin master git-annex
To ssh://example.com/~/annex.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to the server.
So, to finish up, tell git-annex to sync all the data in the repository
to your server:
# git annex sync --content
...
## make more checkouts
So far you have a central repository on your server, and a checkout on a laptop.
Let's make another checkout elsewhere. Clone the central repository as before.
elsewhere# git clone ssh://example.com/~/annex.git
elsewhere# cd annex
Notice that your clone does not have the contents of any of the files yet.
If you run `ls`, you'll see broken symlinks. It's easy to download them from
your server either by running `git annex sync --content`, or by asking
git-annex to download individual files:
elsewhere# git annex get Haskell_Amuse_Bouche.mp4
get Haskell_Amuse_Bouche.mp4 (from origin...)
12877824 2% 255.11kB/s 00:00
ok