git-annex/doc/tips/centralized_git_repository_tutorial.mdwn

142 lines
5.4 KiB
Markdown

The [[walkthrough]] builds up a decentralized git repository setup, but
git-annex can also be used with a centralized bare repository, just like
git can. This tutorial shows how to set up a centralized repository hosted on
GitHub on GitLab or your own git server.
## set up the repository, and make a checkout
I've created a repository for technical talk videos, which you can
[fork on Github](https://github.com/joeyh/techtalks).
Or make your own repository on GitHub (or GitLab elsewhere) now.
On your laptop, [[install]] git-annex, and clone the repository:
# git clone git@github.com:joeyh/techtalks.git
# cd techtalks
Tell git-annex to use the repository, and describe where this clone is
located:
# git annex init 'my laptop'
init my laptop ok
Let's tell git-annex that GitHub doesn't support running git-annex-shell there.
# git config remote.origin.annex-ignore true
This means you can't store annexed file *contents* on GitHub; it would
really be better to host the bare repository on your own server, which
would not have this limitation. (If you want to do that, check out
[[using_gitolite_with_git-annex]].) Or, you could use GitLab, which
*does* [support git-annex on their servers](https://about.gitlab.com/2015/02/17/gitlab-annex-solves-the-problem-of-versioning-large-binaries-with-git/).
## add files to the repository
Add some files, obtained however.
# youtube-dl -t 'http://www.youtube.com/watch?v=b9FagOVqxmI'
# git annex add *.mp4
add Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 (checksum) ok
(Recording state in git...)
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
This file is available directly from the web; so git-annex can download it:
# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
(checksum...) ok
(Recording state in git...)
# git commit -a -m 'added a screencast I made'
Feel free to rename the files, etc, using normal git commands:
# git mv Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 Haskell_Amuse_Bouche.mp4
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
# git commit -m 'better filenames'
Now push your changes back to the central repository. As well as pushing
the master branch, remember to push the git-annex branch, which is used to
track the file contents.
# git push origin master git-annex
To git@github.com:joeyh/techtalks.git
* [new branch] master -> master
* [new branch] git-annex -> git-annex
That push went fast, because it didn't upload large videos to GitHub.
To check this, you can ask git-annex where the contents of the videos are:
# git annex whereis
whereis Haskell_Amuse_Bouche.mp4 (1 copy)
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
whereis git-annex_coding_in_haskell.ogg (2 copies)
00000000-0000-0000-0000-000000000001 -- web
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
ok
## make more checkouts
So far you have a central repository, and a checkout on a laptop.
Let's make another checkout that's used as a backup. You can put it anywhere
you like, just make it be somewhere your laptop can access. A few options:
* Put it on a USB drive that you can plug into the laptop.
* Put it on a desktop.
* Put it on some server in the local network.
* Put it on a remote VPS.
I'll use the VPS option, but these instructions should work for
any of the above.
# ssh server
server# sudo apt-get install git-annex
Clone the central repository as before. (If the clone fails, you need
to add your server's ssh public key to github -- see
[this page](http://help.github.com/ssh-issues/).)
server# git clone git@github.com:joeyh/techtalks.git
server# cd techtalks
server# git config remote.origin.annex-ignore true
server# git annex init 'backup'
init backup (merging origin/git-annex into git-annex...) ok
Notice that the server does not have the contents of any of the files yet.
If you run `ls`, you'll see broken symlinks. We want to populate this
backup with the file contents, by copying them from your laptop.
Back on your laptop, you need to configure a git remote for the backup.
Adjust the ssh url as needed to point to wherever the backup is. (If it
was on a local USB drive, you'd use the path to the repository instead.)
# git remote add backup ssh://server/~/techtalks
Now git-annex on your laptop knows how to reach the backup repository,
and can do things like copy files to it:
# git annex copy --to backup git-annex_coding_in_haskell.ogg
copy git-annex_coding_in_haskell.ogg (checking backup...)
12877824 2% 255.11kB/s 00:00
ok
You can also `git annex move` files to it, to free up space on your laptop.
And then you can `git annex get` files back to your laptop later on, as
desired.
After you use git-annex to move files around, remember to push,
which will broadcast its updated location information.
# git push origin master git-annex
## take it farther
Of course you can create as many checkouts as you desire. If you have a
desktop machine too, you can make a checkout there, and use `git remote
add` to also let your desktop access the backup repository.
You can add remotes for each direct connection between machines you find you
need -- so make the laptop have the desktop as a remote, and the desktop
have the laptop as a remote, and then on either machine git-annex can
access files stored on the other.