git-annex (5.20131127) unstable; urgency=low
  * webapp: Detect when upgrades are available, and upgrade if the user desires. (Only when git-annex is installed using the prebuilt binaries from git-annex upstream, not from e.g. Debian.)
  * assistant: Detect when the git-annex binary is modified or replaced, and either prompt the user to restart the program, or automatically restart it.
  * annex.autoupgrade configures both the above upgrade behaviors.
  * Added support for quvi 0.9. Slightly suboptimal due to limitations in its interface compared with the old version.
  * Bug fix: annex.version did not get set on automatic upgrade to v5 direct mode repo, so the upgrade was performed repeatedly, slowing commands down.
  * webapp: Fix bug that broke switching between local repositories that use the new guarded direct mode.
  * Android: Fix stripping of the git-annex binary.
  * Android: Make terminal app show git-annex version number.
  * Android: Re-enable XMPP support.
  * reinject: Allow to be used in direct mode.
  * Further improvements to git repo repair. Has now been tested in tens of thousands of intentionally damaged repos, and successfully repaired them all.
  * Allow use of --unused in bare repository.

# imported from the archive
This commit is contained in:
commit
7189dfd77d
6383 changed files with 204042 additions and 0 deletions
|
@ -0,0 +1,111 @@
|
|||
I've been wrestling with git-annex to try to make it build on Debian, or more specifically, wrestling with Haskell dependencies.
|
||||
|
||||
After a fair amount of futzing around, and pestering a bunch of people in the process (thanks for the help! :) ) I finally managed to make it build.
|
||||
|
||||
I figured I would post the steps here, since it's not completely trivial, and I expect that a few others might be interested in building newer versions as well.
|
||||
|
||||
There currently appear to be two methods:
|
||||
|
||||
* Debian packages on Wheezy plus Sid
|
||||
* Starting out on Wheezy, and then picking the rest from Sid (it seems at least libghc-safesemaphore-dev from Sid is critical for newer git-annex)
|
||||
* WebDAV support will not be available with this method
|
||||
* Cabal packages
|
||||
|
||||
|
||||
#Debian packages on Wheezy plus Sid
|
||||
|
||||
##Start off with a clean wheezy chroot
|
||||
|
||||
sudo debootstrap wheezy debian-wheezy
|
||||
sudo chroot debian-wheezy
|
||||
|
||||
##Install some build tools
|
||||
|
||||
apt-get update
|
||||
apt-get install devscripts git
|
||||
|
||||
##Get git-annex (either by cloning or simply moving the source into the chroot)
|
||||
|
||||
mkdir /src
|
||||
cd /src
|
||||
git clone git://git-annex.branchable.com/source.git git-annex
|
||||
cd git-annex
|
||||
|
||||
##Remove WebDAV dependency which can't be satisfied anywhere
|
||||
|
||||
sed '/libghc-dav-dev/d' -i debian/control
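To see what that sed command removes before running it on the real file, it can be tried on a scratch copy first. This is only a minimal sketch, assuming GNU sed (as shipped on Debian); the control-file contents below are made up for illustration and are not git-annex's actual Build-Depends.

```shell
# Hypothetical stand-in for debian/control; the entries are only examples.
tmp=$(mktemp -d)
cat > "$tmp/control" <<'EOF'
Build-Depends:
 debhelper (>= 9),
 libghc-dav-dev (>= 0.3),
 libghc-safesemaphore-dev,
 rsync,
EOF

# Keep a backup, then drop every line that mentions the WebDAV binding.
cp "$tmp/control" "$tmp/control.orig"
sed -i '/libghc-dav-dev/d' "$tmp/control"

# Show what changed; diff exits non-zero when the files differ, so ignore that.
diff -u "$tmp/control.orig" "$tmp/control" || true

# grep -c exits non-zero when nothing matches, hence the "|| true".
remaining=$(grep -c 'libghc-dav-dev' "$tmp/control" || true)
echo "lines still mentioning libghc-dav-dev: $remaining"
```

Working on a copy makes it easy to confirm that only the intended dependency line disappears.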
|
||||
|
||||
##Create dummy build-depends package and install all available Wheezy dependencies using it
|
||||
|
||||
mk-build-deps
|
||||
dpkg -i git-annex-build-deps*.deb
|
||||
apt-get install -f
|
||||
|
||||
(this will remove the build-depends package)
|
||||
|
||||
##Add Sid sources and install all available Sid dependencies
|
||||
|
||||
echo "deb http://http.debian.net/debian sid main" >>/etc/apt/sources.list
|
||||
apt-get update
|
||||
dpkg -i git-annex-build-deps*.deb
|
||||
apt-get install -f
|
||||
|
||||
(the build-depends package should now be fully installed)
|
||||
|
||||
##Disable the 'make test' that fails due to missing hothasktags
|
||||
|
||||
echo >>debian/rules
|
||||
echo "override_dh_auto_test:" >>debian/rules
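Re-running the two echo lines appends the override again each time. A small sketch that only appends it once; the helper name `disable_dh_auto_test` and the sample rules file are assumptions made for this example, not part of git-annex's packaging.

```shell
# Append an empty override_dh_auto_test target, but only if it is not there yet.
disable_dh_auto_test() {
    rules=$1
    if ! grep -q '^override_dh_auto_test:' "$rules"; then
        printf '\noverride_dh_auto_test:\n' >> "$rules"
    fi
}

# Try it on a scratch rules file rather than the real debian/rules.
tmp=$(mktemp -d)
printf '#!/usr/bin/make -f\n%%:\n\tdh $@\n' > "$tmp/rules"

disable_dh_auto_test "$tmp/rules"
disable_dh_auto_test "$tmp/rules"   # second call changes nothing

count=$(grep -c '^override_dh_auto_test:' "$tmp/rules")
echo "override targets in rules file: $count"
```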
|
||||
|
||||
##Build!
|
||||
|
||||
debuild -us -uc -Igit
|
||||
|
||||
|
||||
#Cabal packages
|
||||
|
||||
##Start off with a clean Sid(/Wheezy) chroot
|
||||
|
||||
sudo debootstrap sid debian-sid
|
||||
sudo chroot debian-sid
|
||||
|
||||
##Install a smaller set of tools and build-depends from Debian (cabal needs these to compile the Haskell stuff)
|
||||
|
||||
apt-get update
|
||||
apt-get install ghc cabal-install devscripts libz-dev pkg-config c2hs libgsasl7-dev libxml2-dev libgnutls-dev git debhelper ikiwiki perlmagick uuid rsync openssh-client fakeroot
|
||||
|
||||
##Get git-annex (either by cloning or simply moving the source into the chroot)
|
||||
|
||||
mkdir /src
|
||||
cd /src
|
||||
git clone git://git-annex.branchable.com/source.git git-annex
|
||||
cd git-annex
|
||||
|
||||
##Install the Haskell build-dependencies from cabal
|
||||
|
||||
cabal update
|
||||
cabal install --only-dependencies
|
||||
|
||||
##Optional step which doesn't work (might in the future)
|
||||
If we want to run the 'make test' after build we need hothasktags, which is only available via cabal
|
||||
|
||||
apt-get install happy
|
||||
cabal install hothasktags
|
||||
export PATH=$PATH:~/.cabal/bin
|
||||
|
||||
But this currently fails silently inside make test->fast->tags, and if you dig a bit (manually edit the makefile to be more verbose) you see
|
||||
|
||||
hothasktags: ./Command/AddUnused.hs: hGetContents: invalid argument (invalid byte sequence)
|
||||
|
||||
##Disable the 'make test' that fails
|
||||
|
||||
echo >>debian/rules
|
||||
echo "override_dh_auto_test:" >>debian/rules
|
||||
|
||||
##Remove all Debian package haskell depends (taken care of by cabal instead)
|
||||
|
||||
sed '/\tlibghc/d' -i debian/control
|
||||
|
||||
## Build!
|
||||
|
||||
debuild -us -uc -Igit
|
|
@ -0,0 +1,27 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkCw26IdxXXPBoLcZsQFslM67OJSJynb1w"
|
||||
nickname="Alexander"
|
||||
subject="can't install git-annex on OS X Mountain Lion without disabling WebDAV support"
|
||||
date="2013-04-29T17:57:03Z"
|
||||
content="""
|
||||
possibly related to this Debian issue:
|
||||
|
||||
trying to install git-annex with cabal on OS X 10.8.3, the build fails with
|
||||
|
||||
|
||||
Loading package DAV-0.4 ... linking ... ghc:
|
||||
lookupSymbol failed in relocateSection (relocate external)
|
||||
~/.cabal/lib/DAV-0.4/ghc-7.4.2/HSDAV-0.4.o: unknown symbol `_DAVzm0zi4_PathszuDAV_version1_closure'
|
||||
ghc: unable to load package `DAV-0.4'
|
||||
Failed to install git-annex-4.20130417
|
||||
cabal: Error: some packages failed to install:
|
||||
git-annex-4.20130417 failed during the building phase. The exception was:
|
||||
ExitFailure 1
|
||||
|
||||
|
||||
This was after following all of the instructions for the Homebrew install at [http://git-annex.branchable.com/install/OSX/](http://git-annex.branchable.com/install/OSX/)
|
||||
I was able to work around this issue by installing with the WebDAV flag disabled (ie, added the option --flags=\"-WebDAV\" to last command in the OS X install instructions):
|
||||
|
||||
cabal install git-annex --bindir=$HOME/bin --flags=\"-WebDAV\"
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
nickname="joey"
|
||||
subject="comment 2"
|
||||
date="2013-04-30T21:51:50Z"
|
||||
content="""
|
||||
@Alexander that DAV-0.4 problem is a bug in DAV, not git-annex. I've informed its author and it should be fixed soon, in a new version of DAV.
|
||||
"""]]
|
35
doc/tips/Crude_Windows_Sync.mdwn
Normal file
|
@ -0,0 +1,35 @@
|
|||
Here's a workaround to start syncing folders on Windows right now. It's a bit command line heavy, so you might need to set this up for your users. But I would much rather do this than use some other syncing solution and then have to migrate.
|
||||
|
||||
(1) Create a remote server git annex repository with the assistant on Linux or Mac.
|
||||
|
||||
(2) [Install git](http://git-scm.com/) on the Windows machine.
|
||||
|
||||
(3) [Install git-annex for Windows](http://git-annex.branchable.com/install/Windows/) on the Windows machine. Don't forget to run the installer as administrator.
|
||||
|
||||
(4) Run _Git Bash_ from the system menu, and run these commands to clone your repository.
|
||||
|
||||
ssh-keygen
|
||||
cat .ssh/id_rsa.pub | ssh username@my-server.com "cat >> ~/.ssh/authorized_keys"
|
||||
git clone username@my-server.com:/path/to/annex
|
||||
cd annex
|
||||
git annex init
|
||||
|
||||
(5) Create a script that will trigger a full sync
|
||||
|
||||
echo '#!/bin/bash
|
||||
git annex sync
|
||||
git annex get *
|
||||
git annex add .
|
||||
git annex sync
|
||||
git annex copy * --to origin
|
||||
' > sync.sh
|
||||
chmod +x sync.sh
|
||||
./sync.sh
|
||||
|
||||
(6) Copy the "Git Bash" shortcut from your windows menu to your desktop, and change the link target to:
|
||||
|
||||
"C:\Program Files\Git\bin\sh.exe" --login -i "annex/sync.sh"
|
||||
|
||||
Now ask your users to run this shortcut before and after they change files. You can also put it into the "autostart" folder to sync at boot.
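The script from step (5) can also be written with a here-document, which keeps the `#!/bin/bash` line first in the file. The following is only a sketch: `set -e` and the use of `.` instead of `*` are changes made for this example, not part of the original tip, and the scratch directory stands in for the annex.

```shell
# Write the script into a scratch directory; in practice this would be the annex.
dir=$(mktemp -d)
cat > "$dir/sync.sh" <<'EOF'
#!/bin/bash
# Stop at the first failing command instead of carrying on.
set -e
git annex sync
git annex get .
git annex add .
git annex sync
git annex copy . --to origin
EOF
chmod +x "$dir/sync.sh"

# Parse the script without executing it, to catch syntax errors early.
bash -n "$dir/sync.sh" && echo "sync.sh parses cleanly"

first_line=$(head -n 1 "$dir/sync.sh")
echo "first line: $first_line"
```

Quoting the here-document delimiter (`'EOF'`) stops the shell from expanding anything inside the script while it is being written.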
|
||||
|
59
doc/tips/Decentralized_repository_behind_a_Firewall.mdwn
Normal file
|
@ -0,0 +1,59 @@
|
|||
If you're anything like me¹, you have a copy of your annex on a computer running at home², set up so you can access it from anywhere like this:
|
||||
|
||||
ssh myhome.no-ip.org
|
||||
|
||||
This is totally great! Except, there is no way for your home computer to pull your changes, because there is no *on-the-go.no-ip.org*. You can get clunky and use a *bare git repository and git push*, but there is a better way.
|
||||
|
||||
First, install *openssh-server* on your *on-the-go* computer
|
||||
|
||||
sudo apt-get install openssh-server # Adjust to your flavor of unix
|
||||
|
||||
Then, log into your *home* computer, with *port forwarding*:
|
||||
|
||||
ssh me@myhome.no-ip.org -R 2201:localhost:22
|
||||
|
||||
Your *home* computer can now ssh into your *on-the-go* computer, as long as you keep the above shell running.
|
||||
|
||||
You can now add your *on-the-go* computer as a remote on your *home* computer. Use the port forwarding shell you just connected with the command above, if you like.
|
||||
|
||||
ssh-keygen -t rsa
|
||||
ssh-copy-id "me@localhost -p 2201"
|
||||
cd ~/annex
|
||||
git remote add on-the-go ssh://me@localhost:2201/home/myuser/annex
|
||||
|
||||
Now you can run normal annex operations, as long as the port forwarding shell is running³.
|
||||
|
||||
git annex sync
|
||||
git annex get --from on-the-go some/big/file
|
||||
git annex info
|
||||
|
||||
You can add more computers by repeating with a different port, e.g. 2202 or 2203 (or any other).
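As a rough illustration of how the ports pair up, the helper below prints (but does not run) the two commands needed per machine. The function name `show_setup` and the remote name `laptop2` are made up for this sketch; the host, user, and path are the placeholders already used in this tip.

```shell
# Print the commands for one on-the-go machine; nothing is executed here.
# Arguments: remote name, forwarded port.
show_setup() {
    name=$1
    port=$2
    echo "on $name: ssh me@myhome.no-ip.org -R ${port}:localhost:22"
    echo "at home: git remote add $name ssh://me@localhost:${port}/home/myuser/annex"
}

show_setup on-the-go 2201
show_setup laptop2 2202
```

Each machine gets its own port, and the port in the remote URL on the home computer has to match the one forwarded by that machine.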
|
||||
|
||||
If you're security paranoid (like me), read on. If you're not, that's it! Thanks for reading!
|
||||
|
||||
---
|
||||
Paranoid Area
|
||||
|
||||
Note you're granting passwordless access to your on-the-go computer to your home computer. I believe that's all right, as long as:
|
||||
|
||||
* Your home computer is really in your home, and not at a friend's house or some datacenter
|
||||
* Your home computer can be accessed only by ssh, and not HTTP or Samba or NTP or (shoot me now!) FTP
|
||||
* Only you (and perhaps trustworthy family) have access to your home computer
|
||||
* You have reasonably strong passwords or key-only logins on both your home and on-the-go computers.
|
||||
* You regularly install security updates on both computers (sudo apt-get update && sudo apt-get upgrade)
|
||||
|
||||
In any case, the setup is much, much, much more secure than Dropbox. With Dropbox, you have exactly the same setup, but:
|
||||
|
||||
* Your data is stored in some datacenter. It's supposed to be encrypted. It might not be.
|
||||
* Lots of people have routine access to your files, and plausible reason to. Bored employees might regularly be doing some 'maintenance work' involving your pictures.
|
||||
* The dropbox software can do anything it likes on your computer, and it's closed source so you don't know if it does. A disgruntled employee could put a trojan into it.
|
||||
* Dropbox might have a backdoor for employee access to any file on your computer. This might be done with the best of intentions, but a mal-intentioned or careless employee might still erase things or send sensitive files from your computer by email.
|
||||
* A truly huge number of eyes connected to incredibly smart brains have looked at openssh and found it secure. Everybody trusts openssh. With dropbox, there is, well, dropbox. Whoever that is.
|
||||
|
||||
-----
|
||||
|
||||
¹ Me=Carlo, not Joey. I'm pretty sure doing what I wrote here is a good idea, but in case it turns out to be catastrophically dumb, it's my fault, not his.
|
||||
|
||||
² My always-on computer at home is a raspberry pi with a 32GB USB stick. Best self-hosted dropbox you could imagine.
|
||||
|
||||
³ You can forward the port without opening a shell by adding the -N option. This could be useful for connecting on startup, e.g. in /etc/rc.local. I prefer to open the shell to forward the ports, maybe use it, and close it to stop the forwarding.
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.6.49"
|
||||
subject="comment 1"
|
||||
date="2012-11-30T16:25:58Z"
|
||||
content="""
|
||||
If you don't trust your home computer with shell access, you can lock it down in `.ssh/authorized_keys` to only be able to run git-annex-shell. See [[forum/Restricting_git-annex-shell_to_a_specific_repository]]
|
||||
"""]]
|
13
doc/tips/Delay_Assistant_Startup_on_Login.mdwn
Normal file
|
@ -0,0 +1,13 @@
|
|||
# Problem
|
||||
I noticed that after installing git-annex assistant, my start up times greatly increased because the assistant does a startup scan while everything else is loading.
|
||||
# Solution (for people using Gnome)
|
||||
The solution I came up with is to delay the assistant's startup, as well as setting its IO priority as idle. To do this in Gnome 3, run:
|
||||
|
||||
gnome-session-properties
|
||||
Find the "Git Annex Assistant" entry in the Startup Programs tab, then click edit. Change this:
|
||||
|
||||
/usr/local/bin/git-annex assistant --autostart (your location of git-annex may be different)
|
||||
to this:
|
||||
|
||||
bash -c "sleep 30; ionice -c3 /usr/local/bin/git-annex assistant --autostart" (replace /usr/local/bin to wherever git-annex is installed)
|
||||
The "sleep 30" command delays the startup of the assistant by 30 seconds, and "ionice -c3" sets git-annex's IO priority to "idle," the lowest level.
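The same idea can be put into a small wrapper that falls back to a plain start when `ionice` is missing or refuses the idle class. This is a sketch only; the function name `run_delayed_idle` is hypothetical, and the real autostart entry would pass the git-annex command instead of `echo`.

```shell
# Wait, then run the given command at idle IO priority when that is possible.
# Arguments: delay in seconds, followed by the command to run.
run_delayed_idle() {
    delay=$1
    shift
    sleep "$delay"
    # Probe ionice with a harmless command before relying on it.
    if command -v ionice >/dev/null 2>&1 && ionice -c3 true 2>/dev/null; then
        ionice -c3 "$@"
    else
        "$@"
    fi
}

# Harmless demonstration with no delay; a real entry might use something like:
#   run_delayed_idle 30 /usr/local/bin/git-annex assistant --autostart
run_delayed_idle 0 echo "assistant would start here"
```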
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://launchpad.net/~alphapapa"
|
||||
nickname="alphapapa"
|
||||
subject="ionice not supported by deadline scheduler"
|
||||
date="2013-06-28T17:43:47Z"
|
||||
content="""
|
||||
Linux's deadline I/O scheduler does not support ionice. It is now the default on some distros, including Ubuntu. CFQ does support ionice.
|
||||
"""]]
|
120
doc/tips/Git_annex_and_Calibre.mdwn
Normal file
|
@ -0,0 +1,120 @@
|
|||
The problem
|
||||
===========
|
||||
|
||||
[Calibre](http://calibre-ebook.com/) is an ebook manager that is
available in [Debian](http://packages.debian.org/sid/calibre). I use
it to maintain my library, but also to download an epub version of a
French newspaper every day and then put it on my Kobo.
|
||||
|
||||
Configuring git annex for this
|
||||
==============================
|
||||
|
||||
I wanted to use git-annex, so
|
||||
|
||||
$ git init
|
||||
$ git annex init "some useful name"
|
||||
|
||||
But I don't want everything in the annex, because Calibre uses some text
files to save metadata, so I used:
|
||||
|
||||
$ git config annex.largefiles "include=* exclude=*.opf exclude=*.json"
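To get a feel for which files that setting sends where, the rule can be imitated with a shell `case` statement. This is only an approximation written for illustration: it uses the shell's own globbing rather than git-annex's matcher, and the function name `classify` and the sample file names are made up.

```shell
# Rough imitation of "include=* exclude=*.opf exclude=*.json":
# everything is annexed except .opf and .json files, which stay in plain git.
classify() {
    case "$1" in
        *.opf|*.json) echo git ;;
        *) echo annex ;;
    esac
}

for f in "Some Author/Some Book/metadata.opf" "Some Author/Some Book/book.epub" metadata.db; do
    echo "$f -> $(classify "$f")"
done
```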
|
||||
|
||||
Then let's add everything:
|
||||
|
||||
$ git annex add *
|
||||
$ git add *
|
||||
$ git commit -m "first commit"
|
||||
|
||||
Calibre needs read and write access to its database, so let's unlock it:
|
||||
|
||||
$ git annex unlock metadata.db
|
||||
|
||||
On my other computer I only need to do
|
||||
|
||||
$ git clone $user@$host:Calibre\ library
|
||||
$ cd Calibre\ library
|
||||
$ git annex init "another useful name"
|
||||
$ git annex get .
|
||||
$ git annex unlock metadata.db
|
||||
|
||||
The problem is that every time you run `git annex sync`, git annex
locks metadata.db again, so let's unlock it automatically. I
use git hooks; in `.git/hooks/post-commit` I have:
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
git annex edit metadata.db
|
||||
|
||||
Don't forget to make this file executable:
|
||||
|
||||
$ chmod a+x .git/hooks/post-commit
|
||||
|
||||
Day to day operation
|
||||
====================
|
||||
|
||||
$ git annex add .
|
||||
|
||||
Will put new files into the annex
|
||||
|
||||
$ git add .
|
||||
|
||||
Will take care of the files that should not go into the annex
|
||||
|
||||
$ git annex sync
|
||||
|
||||
Will make the repositories exchange information about all this, and
bring remote changes into the local repository
|
||||
|
||||
$ git annex get .
|
||||
|
||||
Will make remote books locally available
|
||||
|
||||
Merge conflict
|
||||
--------------
|
||||
You should not run calibre on the two computers simultaneously, or
without syncing first. If you do, you will have a conflict that
git-annex will automatically *solve* by renaming both of the files.
|
||||
|
||||
You can then either:
|
||||
|
||||
- Choose one. If no books have been changed or added on one of the
  computers, using the other `metadata.db` will not make you lose
  any information.
- Rebuild it. `calibredb restore_database` won't do it, but will tell
  you how to do it.
|
||||
|
||||
Checking the library
|
||||
--------------------
|
||||
You can use `calibredb check_library` to check that your library is
correct. If you use git for it, it will always tell you that it is not
correct: there is this author ".git" it doesn't know about. Just
ignore it.
|
||||
|
||||
Maybe this can be solved by using `vcsh`, but apparently
`vcsh`+`git annex` is not well tested yet.
|
||||
|
||||
Automatic stuff
|
||||
---------------
|
||||
I use `mr` to automatically run all this, but some config could be
|
||||
done (I believe) to have `git annex copy --auto` do what it should.
|
||||
|
||||
There is also the git annex assistant for this kind of automatic
synchronization of contents, but I don't know if my automatic
unlocking of one file will break it.
|
||||
|
||||
It might be interesting to find some way to unlock and lock the library
only when running calibre; a simple script that launches calibre would do
that. Note that each time you lock and unlock, you will have a
new commit in git.
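Such a launcher script could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `LIBRARY` path, the `DRY_RUN` switch, and the function names are hypothetical, and it assumes that the `calibre` command blocks until the program exits. With `DRY_RUN=1` it only prints the commands, so it can be tried without touching a real library.

```shell
# Hypothetical location of the Calibre library; adjust to taste.
LIBRARY=${LIBRARY:-"$HOME/Calibre library"}

# With DRY_RUN=1, print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unlock the database, run Calibre, then put everything back and sync.
calibre_session() {
    run cd "$LIBRARY" || return 1
    run git annex unlock metadata.db
    run calibre
    run git annex add .
    run git add .
    run git annex sync
}

# Show what a session would do without running any of it.
DRY_RUN=1 calibre_session
```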
|
||||
|
||||
Another solution
|
||||
===================
|
||||
You could also use direct mode in place of the auto unlock feature:
|
||||
|
||||
git annex direct
|
||||
|
||||
Then remove the `post-commit` git hook (or do not add it). It's a
simpler solution, but remember that interactions between git annex direct
repositories and plain git are complex and sometimes downright dangerous. See [[direct mode]] for details.
|
||||
|
||||
In particular, do *not* call `git add *` in the above steps, as that will commit all books into git.
|
|
@ -0,0 +1,19 @@
|
|||
I worked out how to retroactively annex a large file that had been checked into a git repo some time ago. I thought this might be useful for others, so I am posting it here.
|
||||
|
||||
Suppose you have a git repo where somebody checked in a large file that you would like to have annexed. There are a bunch of commits after it and you don't want to lose history, but you also don't want everybody to have to retrieve the large file when they clone the repo. This will rewrite history as if the file had been annexed when it was originally added.
|
||||
|
||||
This command works for me. It relies on the current behavior of git, which is to use a directory named .git-rewrite/t/ at the top of the git tree for the extracted tree. It will not be fast and it will rewrite history, so be sure that everybody who has a copy of your repo is OK with accepting the new history. If the behavior of git changes, you can specify the directory to use with the -d option. Currently, the t/ directory is created inside the directory you specify, so "-d ./.git-rewrite/" should be roughly equivalent to the default.
|
||||
|
||||
Enough with the explanation, on to the command:
|
||||
<pre>
|
||||
git filter-branch --tree-filter 'for FILE in file1 file2 file3;do if [ -f "$FILE" ] && [ ! -L "$FILE" ];then git rm --cached "$FILE";git annex add "$FILE";ln -sf `readlink "$FILE"|sed -e "s:^../../::"` "$FILE";fi;done' --tag-name-filter cat -- --all
|
||||
</pre>
|
||||
|
||||
Replace file1 file2 file3... with whatever paths you want retroactively annexed. For example, to retroactively annex bigfile1.bin in the top dir and subdir1/bigfile2.bin, try:
|
||||
<pre>
|
||||
git filter-branch --tree-filter 'for FILE in bigfile1.bin subdir1/bigfile2.bin;do if [ -f "$FILE" ] && [ ! -L "$FILE" ];then git rm --cached "$FILE";git annex add "$FILE";ln -sf `readlink "$FILE"|sed -e "s:^../../::"` "$FILE";fi;done' --tag-name-filter cat -- --all
|
||||
</pre>
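The `sed` part of the filter is the least obvious piece. While filter-branch runs, the tree is checked out under .git-rewrite/t/, so the symlinks created by `git annex add` point two directories too far up; stripping the leading `../../` makes them correct for the real work tree. A small sketch of just that step, with a made-up key name and a hypothetical helper `strip_rewrite_prefix`:

```shell
# Strip the two leading "../" components that come from .git-rewrite/t/.
strip_rewrite_prefix() {
    printf '%s\n' "$1" | sed -e 's:^\.\./\.\./::'
}

# Hypothetical link targets as they might look during the rewrite.
top=$(strip_rewrite_prefix '../../.git/annex/objects/aa/bb/KEY/KEY')
sub=$(strip_rewrite_prefix '../../../.git/annex/objects/aa/bb/KEY/KEY')

echo "top-level file: $top"
echo "file in subdir: $sub"
```

Only the first two components are removed, so a file one directory down still keeps the single `../` it needs.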
|
||||
|
||||
**If your repo has tags** then you should take a look at the git-filter-branch man page about the --tag-name-filter option and decide what you want to do. By default this will re-write the tags "nearly properly".
|
||||
|
||||
You'll probably also want to look at the git-filter-branch man page's section titled "CHECKLIST FOR SHRINKING A REPOSITORY" if you want to free up the space in the existing repo that you just changed history on.
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://edheil.wordpress.com/"
|
||||
ip="173.162.44.162"
|
||||
subject="comment 1"
|
||||
date="2012-12-16T00:11:38Z"
|
||||
content="""
|
||||
Man, I wish you'd written this a couple weeks ago. :) I was never able to figure that incantation out and ended up unannexing and re-annexing the whole thing to get rid of the file I inadvertently checked into git instead of the annex.
|
||||
"""]]
|
|
@ -0,0 +1,45 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://launchpad.net/~arand"
|
||||
nickname="arand"
|
||||
subject="comment 2"
|
||||
date="2013-03-13T12:05:49Z"
|
||||
content="""
|
||||
Based on the hints given here I've worked on a filter to both annex and add urls via filter-branch:
|
||||
|
||||
[https://gitorious.org/arand-scripts/arand-scripts/blobs/master/annex-filter](https://gitorious.org/arand-scripts/arand-scripts/blobs/master/annex-filter)
|
||||
|
||||
The script above is very specific but I think there are a few ideas that can be used in general, the general structure is
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
# links that already exist
|
||||
links=$(mktemp)
|
||||
find . -type l >\"$links\"
|
||||
|
||||
# remove from staging area first to not block and then annex
|
||||
git rm --cached --ignore-unmatch -r bin*
|
||||
git annex add -c annex.alwayscommit=false bin*
|
||||
|
||||
# compare links before and after annexing, remove links that existed before
|
||||
newlinks=$(mktemp -u)
|
||||
mkfifo \"$newlinks\"
|
||||
comm -13 <(sort \"$links\") <(find . -type l | sort) > \"$newlinks\" &
|
||||
|
||||
# rewrite links
|
||||
while IFS= read -r file
|
||||
do
|
||||
# link is created below .git-rewrite/t/ during filter-branch, strip two parents for correct target
|
||||
ln -sf \"$(readlink \"$file\" | sed -e 's%^\.\./\.\./%%')\" \"$file\"
|
||||
done < \"$newlinks\"
|
||||
|
||||
git annex merge
|
||||
|
||||
which would be run using
|
||||
|
||||
git filter-branch --tree-filter path/annex-filter --tag-name-filter cat -- --all
|
||||
|
||||
or similar.
|
||||
|
||||
* I'm using `find` to make sure the only rewritten symlinks are for the newly annexed files, this way it is possible to annex an unknown set of filenames
|
||||
* When running several git annex commands, using `-c annex.alwayscommit=false` on each and doing a single `git annex merge` at the end might be faster.
|
||||
"""]]
|
|
@ -0,0 +1,36 @@
|
|||
[[!comment format=mdwn
|
||||
username="arand"
|
||||
ip="130.238.245.202"
|
||||
subject="comment 3"
|
||||
date="2013-03-18T14:39:52Z"
|
||||
content="""
|
||||
One thing I noticed is that git-annex needs to checksum each file even if they were previously annexed (rather obviously since there is no general way to tell if the file is the same as the old one without checksumming), but in the specific case that we are replacing files that are already in git, we do actually have the sha1 checksum for each file in question, which could be used.
|
||||
|
||||
So, trying to work with this, I wrote a filter script that starts out annexing stuff in the first commit, and continuously writes out sha1<->filename<->git-annex-object triplets to a global file. When it then starts with the next commit, it compares the sha1s of the index with those of the global file, and any matches are manually symlinked directly to the corresponding git-annex-object without checksumming.
|
||||
|
||||
I've done a few tests and this seems to be considerably faster than letting git-annex checksum everything.
|
||||
|
||||
This is from a git-svn import of the (free software) Red Eclipse game project, there are approximately 3500 files (images, maps, models, etc.) being annexed in each commit (and around 5300 commits, hence why I really, really care about speed):
|
||||
|
||||
10 commits: ~7min
|
||||
|
||||
100 commits: ~38min
|
||||
|
||||
For comparison, the old and new method (the difference should increase with the amount of commits):
|
||||
|
||||
old, 20 commits ~32min
|
||||
|
||||
new, 20 commits: ~11min
|
||||
|
||||
The script itself is a bit of a monstrosity in bash(/grep/sed/awk/git), and the files that are annexed are hardcoded (removed in forming $oldindexfiles), but should be fairly easy to adapt:
|
||||
|
||||
[https://gitorious.org/arand-scripts/arand-scripts/blobs/master/annex-ffilter](https://gitorious.org/arand-scripts/arand-scripts/blobs/master/annex-ffilter)
|
||||
|
||||
The usage would be something like:
|
||||
|
||||
rm /tmp/annex-ffilter.log; git filter-branch --tree-filter 'ANNEX_FFILTER_LOG=/tmp/annex-ffilter.log ~/utv/scripts/annex-ffilter' --tag-name-filter cat -- branchname
|
||||
|
||||
I suggest you use it with at least two orders of magnitude more caution than normal filter-branch.
|
||||
|
||||
Hope it might be useful for someone else wrestling with filter-branch and git-annex :)
|
||||
"""]]
|
|
@ -0,0 +1,9 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawknOATcOkmzX4jKuET5Z2RsaFUNnLKnQsU"
|
||||
nickname="Stephen"
|
||||
subject="comment 4"
|
||||
date="2013-06-22T07:43:09Z"
|
||||
content="""
|
||||
Thanks for the tip :) One question though: how do I push this new history out throughout my other Annexes?
|
||||
All I managed to make it do was revert the rewrite so the raw file appeared again...
|
||||
"""]]
|
58
doc/tips/Internet_Archive_via_S3.mdwn
Normal file
|
@ -0,0 +1,58 @@
|
|||
[The Internet Archive](http://www.archive.org/) allows members to upload
|
||||
collections using an Amazon S3
|
||||
[compatible API](http://www.archive.org/help/abouts3.txt), and this can
|
||||
be used with git-annex's [[special_remotes/S3]] support.
|
||||
|
||||
So, you can locally archive things with git-annex, define remotes that
|
||||
correspond to "items" at the Internet Archive, and use git-annex to upload
|
||||
your files to there. Of course, your use of the Internet Archive must
|
||||
comply with their [terms of service](http://www.archive.org/about/terms.php).
|
||||
|
||||
A nice added feature is that whenever git-annex sends a file to the
|
||||
Internet Archive, it records its url, the same as if you'd run `git annex
|
||||
addurl`. So any users who can clone your repository can download the files
|
||||
from archive.org, without needing any login or password info. This makes
|
||||
the Internet Archive a nice way to publish the large files associated with
|
||||
a public git repository.
|
||||
|
||||
----
|
||||
|
||||
Sign up for an account, and get your access keys here:
|
||||
<http://www.archive.org/account/s3.php>
|
||||
|
||||
# export AWS_ACCESS_KEY_ID=blahblah
|
||||
# export AWS_SECRET_ACCESS_KEY=xxxxxxx
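Since the keys are handed over through the environment, it can help to check that both variables are actually set in the current shell before going on to `initremote`. A minimal sketch; the helper name `require_env` is made up for this example.

```shell
# Return non-zero (and say which one) if any named variable is unset or empty.
require_env() {
    for name in "$@"; do
        # Indirect lookup of the variable called $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            return 1
        fi
    done
}

if require_env AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; then
    echo "credentials are set; ready for git annex initremote"
else
    echo "export the keys first"
fi
```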
|
||||
|
||||
Specify `host=s3.us.archive.org` when doing `initremote` to set up
|
||||
a remote at the Archive. This will enable a special Internet Archive mode:
|
||||
Encryption is not allowed; you are required to specify a bucket name
|
||||
rather than having git-annex pick a random one; and you can optionally
|
||||
specify `x-archive-meta*` headers to add metadata as explained in their
|
||||
[documentation](http://www.archive.org/help/abouts3.txt).
|
||||
|
||||
# git annex initremote archive-panama type=S3 \
|
||||
host=s3.us.archive.org bucket=panama-canal-lock-blueprints \
|
||||
x-archive-meta-mediatype=texts x-archive-meta-language=eng \
|
||||
x-archive-meta-title="original Panama Canal lock design blueprints"
|
||||
initremote archive-panama (Internet Archive mode) ok
|
||||
# git annex describe archive-panama "a man, a plan, a canal: panama"
|
||||
describe archive-panama ok
|
||||
|
||||
Then you can annex files and copy them to the remote as usual:
|
||||
|
||||
# git annex add photo1.jpeg --backend=SHA1E
|
||||
add photo1.jpeg (checksum...) ok
|
||||
# git annex copy photo1.jpeg --fast --to archive-panama
|
||||
copy (to archive-panama...) ok
|
||||
|
||||
Once a file has been stored on archive.org, it cannot be (easily) removed
|
||||
from it. Also, git-annex whereis will tell you a public url for the file
|
||||
on archive.org. (It may take a while for archive.org to make the file
|
||||
publicly visible.)
|
||||
|
||||
Note the use of the SHA1E [[backend|backends]] when adding files. That is
|
||||
the default backend used by git-annex, but even if you don't normally use
|
||||
it, it makes most sense to use the WORM or SHA1E backend for files that
|
||||
will be stored in the Internet Archive, since the key name will be exposed
|
||||
as the filename there, and since the Archive does special processing of
|
||||
files based on their extension.
|
|
@ -0,0 +1,34 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://id.koumbit.net/anarcat"
|
||||
ip="72.0.72.144"
|
||||
subject="how to use with simply addurl?"
|
||||
date="2013-10-09T22:27:27Z"
|
||||
content="""
|
||||
It doesn't seem like git annex addurl by itself supports the archive.org urls...
|
||||
|
||||
[[!format txt \"\"\"
|
||||
anarcat@marcos:presentations$ git annex addurl --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://archive.org/download/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
|
||||
addurl re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
|
||||
failed to verify url exists: http://archive.org/download/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
|
||||
failed
|
||||
git-annex: addurl: 1 failed
|
||||
\"\"\"]]
|
||||
|
||||
I also tried the \"details\" url (<http://archive.org/details/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia>) - but that just downloads the webpage, not the video either...
|
||||
|
||||
Even the ultimate video URL doesn't work:
|
||||
|
||||
[[!format txt \"\"\"
|
||||
anarcat@marcos:presentations$ git annex addurl --debug --file=re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
|
||||
[2013-10-09 18:26:30 EDT] call: quvi [\"-v\",\"mute\",\"--support\",\"http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm\"]
|
||||
addurl re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm [2013-10-09 18:26:30 EDT] read: curl [\"-s\",\"--head\",\"-L\",\"http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm\",\"-w\",\"%{http_code}\"]
|
||||
|
||||
failed to verify url exists: http://ia601009.us.archive.org/9/items/Republica2012-EbenMoglen-FreedomOfThoughtRequiresFreeMedia/re_publica_2012___Eben_Moglen___Freedom_of_Thought_Requires_Free_Media.webm
|
||||
failed
|
||||
git-annex: addurl: 1 failed
|
||||
\"\"\"]]
|
||||
|
||||
... even though that URL actually gives out a proper 200 OK response code.
|
||||
|
||||
Any ideas? --[[anarcat]]
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.4.22"
|
||||
subject="comment 2"
|
||||
date="2013-10-11T17:08:27Z"
|
||||
content="""
|
||||
This was a misleading error message. The url you are trying to add to the file does not match the size recorded for the file already in the annex. (Or possibly the file's key has no recorded size.) If you really want to add the url to the file despite it being a different encoding, you can use --relaxed, although fsck may not like the result if you ever end up downloading that url.
|
||||
|
||||
(Please file bug reports for problems in the future, rather than posting comments on only vaguely related pages which as we can see here can turn out to be entirely offtopic.)
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://id.koumbit.net/anarcat"
|
||||
ip="72.0.72.144"
|
||||
subject="still a bug, filed separately!"
|
||||
date="2013-10-11T18:49:06Z"
|
||||
content="""
|
||||
Aaah, of course, sorry for the noise here. It turns out that this is *not* because the filesize (or even the checksum, for that matter) are different, so there's clearly a bug there, and i filed it in [[bugs/addurl_fails_on_the_internet_archive]]. Thanks!
|
||||
"""]]
|
|
@ -0,0 +1,32 @@
|
|||
I have an annex that syncs my personal files on all my computers. It works great. Phones are different.
|
||||
|
||||
For one, everything's a bit slower to sync, there's battery considerations, and I just don't need every last old file on my phone. Then there's some files I explicitly don't want on my phone in case it gets lost, like family pictures, passport scans, or private keys.
|
||||
|
||||
But I still want photos, videos and voice recordings I make on my phone to be synced to my server. A transfer repo would work, but I want to keep them. Then there's my PDF book collection; that would certainly be nice to always have around in case I have half an hour on a bus. And my music collection ought to be around as well.
|
||||
|
||||
So I came up with this solution, and I'm very happy with it.
|
||||
|
||||
include=Music/* or include=Books/* or present
|
||||
|
||||
This will sync my music and book collections to my phone whenever I add something new on my computers, and it will sync and keep anything I add to the annex on my phone. Best of all worlds! Impressed how flexible preferred content is. More full-sync folders can be added like this:
|
||||
|
||||
include=Music/* or include=Books/* or include=Notes/* or present
|
||||
|
||||
To add them, I first had to figure out the uuid of my phone repo. So I added a new tab on android, and did
|
||||
|
||||
cd /sdcard/annex
|
||||
git config annex.uuid
|
||||
|
||||
Then I went to one of my computers, and did
|
||||
|
||||
git annex vicfg
|
||||
|
||||
And changed the line
|
||||
|
||||
content [phone-uuid] = standard
|
||||
|
||||
to
|
||||
|
||||
content [phone-uuid] = include=Music/* or include=Books/* or include=Notes/* or present
|
||||
|
||||
And waited for it to sync.
|
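The expression can also be generated instead of typed by hand. This is only a minimal sketch, assuming a POSIX shell. The function name `build_phone_expr` is made up for illustration, and the commented `git annex wanted` call assumes a git-annex version that provides that command (otherwise `git annex vicfg` works as described above).

```sh
#!/bin/sh
# Hypothetical helper: build a preferred content expression that wants
# every listed top-level folder, plus anything already present.
build_phone_expr() {
    result=""
    for folder in "$@"; do
        result="${result}include=${folder}/* or "
    done
    printf '%s\n' "${result}present"
}

# Example (not run here): set the expression for the phone from a computer.
# Assumes $phone_uuid holds the output of `git config annex.uuid` on the phone.
#   git annex wanted "$phone_uuid" "$(build_phone_expr Music Books Notes)"
```

Adding another full-sync folder is then just one more argument to the function.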
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="209.250.56.246"
|
||||
subject="comment 1"
|
||||
date="2013-11-16T17:29:03Z"
|
||||
content="""
|
||||
That's great, that's how I hoped people would be able to use preferred content settings.
|
||||
|
||||
I'd suggest adding support for archive directories to this. So if you create a file on the phone and are done with it, you can move it to an archive directory, and it will then be dropped from the phone once it reaches an archive repository.
|
||||
|
||||
This should accomplish that. (Untested)
|
||||
|
||||
`((exclude=*/archive/* and exclude=archive/*) or (not (copies=archive:1 or copies=smallarchive:1))) and (include=Music/* or include=Books/* or present)`
|
||||
"""]]
|
46
doc/tips/Using_Git-annex_as_a_web_browsing_assistant.mdwn
Normal file
|
@ -0,0 +1,46 @@
|
|||
[[todo/wishlist: an "assistant" for web-browsing -- tracking the sources of the downloads]] suggests using git-annex as a tool to store downloads tied
|
||||
to their URLs. This also enables people to have their files stored offline,
|
||||
while being able to git annex drop them at any time and redownload them
|
||||
with git annex get. Additionally, a clone of the repo can be used to
|
||||
download whatever files are desired from online.
|
||||
|
||||
This tip explains how to implement a similar system to the one described in
|
||||
the linked wishlist with existing software and features of git-annex.
|
||||
|
||||
The first step is to install the Firefox plugin
|
||||
[FlashGot](http://flashgot.net/). We will use it to provide the Firefox
|
||||
shortcuts to add things to our annex.
|
||||
|
||||
We also need a normal download manager, if we want to get status updates as
|
||||
the download is done. We'll need to configure git-annex to use it by
|
||||
setting `annex.web-download-command` as Joey describes in his comment on
|
||||
[[todo/wishlist: allow configuration of downloader for addurl]]. See the
|
||||
manpage [[git-annex]] for more information on setting configuration.
|
||||
|
||||
Once we have installed all that, we need a script that has an interface
|
||||
which FlashGot can treat as a downloader, but which calls git-annex to do
|
||||
the actual downloading. Such a script is available from
|
||||
<https://gist.github.com/andyg0808/5342434>. Download it and store it
|
||||
somewhere it can live, or cut and paste:
|
||||
|
||||
[[!format sh """
|
||||
#!/bin/bash
|
||||
# $1=folder to cd to (must be a git annex repo)
|
||||
# $2=URL to download
|
||||
|
||||
cd "$1"
|
||||
git-annex addurl "$2"
|
||||
"""]]
|
||||
|
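If FlashGot is ever misconfigured, the script above will happily run `git-annex addurl` in whatever directory it starts in. A slightly more defensive variant might look like the sketch below. It is not the script from the gist; the function name `annex_download` and the exit codes are assumptions made for illustration.

```sh
#!/bin/sh
# Hypothetical variant of the downloader script: same two arguments
# (FOLDER, URL), but it refuses to run with the wrong argument count
# or with a folder it cannot enter.
annex_download() {
    if [ "$#" -ne 2 ]; then
        echo "usage: annex_download FOLDER URL" >&2
        return 2
    fi
    # Use a subshell so the caller's working directory is left alone.
    (
        cd "$1" || exit 1
        git annex addurl "$2"
    )
}

# When saved as a standalone script, end the file with:
#   annex_download "$@"
```

The command line arguments template in FlashGot stays the same, since the arguments are unchanged.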
||||
Finally, we need to configure FlashGot to use the script as a downloader.
|
||||
Go to Tools > Add-ons in Firefox. Click "Preferences" on FlashGot. Click
|
||||
the Add button next to the list of download managers. Enter a name for the
|
||||
git-annex downloader. Choose the script that was downloaded from the
|
||||
"Locate executable file" dialog that appears. Now set the command line
|
||||
arguments template to be "[FOLDER] [URL]" (you can find more substitution
|
||||
expressions in the Placeholders dropdown above the Command line arguments
|
||||
template field). You're done!
|
||||
|
||||
Go ahead and test it by trying to download a file using FlashGot. It should
|
||||
offer as one of its available download managers the new manager you created
|
||||
just above. Select it and have fun!
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
nickname="joey"
|
||||
subject="comment 1"
|
||||
date="2013-04-11T20:16:02Z"
|
||||
content="""
|
||||
As of my last commit, you don't really need a separate download manager. The webapp will now display urls that `git annex addurl` is downloading among the other transfers.
|
||||
"""]]
|
31
doc/tips/assume-unstaged.mdwn
Normal file
|
@ -0,0 +1,31 @@
|
|||
[[!meta title="using assume-unchanged to speed up git with large trees of annexed files"]]
|
||||
|
||||
Git update-index's assume-unchanged feature can be used to speed
|
||||
up `git status` and similar commands by not statting the whole tree looking for changed
|
||||
files.
|
||||
|
||||
This feature works quite well with git-annex. Especially because git
|
||||
annex's files are immutable, so aren't going to change out from under it,
|
||||
this is a nice fit. If you have a very large tree and `git status` is
|
||||
annoyingly slow, you can turn it on:
|
||||
|
||||
git config core.ignoreStat true
|
||||
|
||||
When `git mv` and `git rm` are used, those changes *do* get noticed, even
|
||||
on assume-unchanged files. When new files are added, eg by `git annex add`,
|
||||
they are also noticed.
|
||||
|
||||
There are two gotchas. Both occur because `git add` does not stage
|
||||
assume-unchanged files.
|
||||
|
||||
1. When an annexed file is moved to a different directory, git-annex updates
|
||||
the symlink, and runs `git add` on it. So the file will move,
|
||||
but the changed symlink will not be noticed by git and it will commit a
|
||||
dangling symlink.
|
||||
2. When using `git annex migrate`, it changes the symlink and runs `git add` on
|
||||
it. Again this won't be committed.
|
||||
|
||||
These can be worked around by running `git update-index --really-refresh`
|
||||
after performing such operations. I hope that `git add` will be changed
|
||||
to stage changes to assume-unchanged files, which would remove this
|
||||
only complication. --[[Joey]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://me.yahoo.com/a/2djv2EYwk43rfJIAQXjYt_vfuOU-#a11a6"
|
||||
nickname="Olivier R"
|
||||
subject="It doesn't work 100%"
|
||||
date="2012-05-03T21:42:54Z"
|
||||
content="""
|
||||
When you remove tracked files... it doesn't show the new status. It's as if the file were ignored.
|
||||
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnxlx1UrzVhdy6_gFjzmF42x6QXxBUxg00"
|
||||
nickname="Jakukyo"
|
||||
subject="comment 2"
|
||||
date="2013-09-05T12:14:42Z"
|
||||
content="""
|
||||
> There are two gotchas...
|
||||
|
||||
So just always run `git annex add` after editing a file
|
||||
and `git update-index --really-refresh` after migrating
|
||||
backend?
|
||||
|
||||
"""]]
|
15
doc/tips/automatically_getting_files_on_checkout.mdwn
Normal file
|
@ -0,0 +1,15 @@
|
|||
Normally git-annex does not retrieve file contents when checking out a
|
||||
tree. In some use cases, it makes sense to always have the contents of
|
||||
files available after a `git checkout` or `git update`. This can be
|
||||
accomplished by installing the following as `.git/hooks/post-checkout`
|
||||
|
||||
#!/bin/sh
|
||||
# Uses git-annex to get all files in the specified directories
|
||||
# (relative to the top of the repository) on checkout.
|
||||
dirs=.
|
||||
top="$(git rev-parse --show-toplevel)"
|
||||
for dir in $dirs; do git annex get "$top/$dir"; done
|
||||
|
||||
By default, all files in the whole repository will be made available. The
|
||||
`dirs` setting can be configured if you only want to get files in certain
|
||||
directories.
|
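Git ignores hooks that are not executable, so the hook file needs its executable bit set. A small sketch of an installer, assuming a POSIX shell and a normal (non-bare) checkout; the function name `install_post_checkout_hook` is hypothetical:

```sh
#!/bin/sh
# Hypothetical helper: write the post-checkout hook into a repository
# (default: the current directory) and make it executable.
install_post_checkout_hook() {
    repo="${1:-.}"
    hook="$repo/.git/hooks/post-checkout"
    mkdir -p "$repo/.git/hooks" || return 1
    cat > "$hook" <<'EOF'
#!/bin/sh
# Uses git-annex to get all files in the specified directories
# (relative to the top of the repository) on checkout.
dirs=.
top="$(git rev-parse --show-toplevel)"
for dir in $dirs; do git annex get "$top/$dir"; done
EOF
    chmod +x "$hook"
}
```

Run it as `install_post_checkout_hook /path/to/repo`, then edit `dirs` in the installed hook (space separated) if you only want some directories.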
|
@ -0,0 +1,2 @@
|
|||
When git annex does fsck on (for example) a GPG-encrypted special directory remote, it first transfers the whole file into .git/annex/tmp directory.
|
||||
If your annex is on an SSD, it's a good idea to make .git/annex/tmp a symlink to, say, /var/tmp so that the SSD isn't worn down. This may actually be a better default.
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawln3ckqKx0x_xDZMYwa9Q1bn4I06oWjkog"
|
||||
nickname="Michael"
|
||||
subject="comment 1"
|
||||
date="2013-07-31T15:15:41Z"
|
||||
content="""
|
||||
Of course, this only works when /var/tmp isn't on SSD itself. Perhaps tmpfs (e.g. a /tmp on many distros) is good -- after checking that there's enough space to transfer a particular file.
|
||||
"""]]
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawln3ckqKx0x_xDZMYwa9Q1bn4I06oWjkog"
|
||||
nickname="Michael"
|
||||
subject="there's a problem"
|
||||
date="2013-08-04T17:15:05Z"
|
||||
content="""
|
||||
If .git/annex/tmp is a symlink to another fs, then adding doesn't work:
|
||||
|
||||
add file1.jpg (checksum...)
|
||||
git-annex: /path/to/.git/annex/tmp/tmp30148: rename: unsupported operation (Invalid cross-device link)
|
||||
|
||||
It looks like it would be good to have two types of tmp directories here, one for adding, another one for verifying (and that one could be redirected off SSD).
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="guilhem"
|
||||
ip="46.239.117.180"
|
||||
subject="comment 3"
|
||||
date="2013-08-19T01:05:40Z"
|
||||
content="""
|
||||
A nice feature would be to perform the `fsck` on the (encrypted) remote itself, as it would avoid cluttering either the network or the tmpdir. However, that requires some changes in git-annex's backend. Indeed it would no longer be enough to store a single digest per (plain) file: a new digest needs to be stored for each encrypted copy. It is not necessarily a big deal, but the backend would need to be reorganized carefully.
|
||||
"""]]
|
75
doc/tips/centralised_repository:_starting_from_nothing.mdwn
Normal file
|
@ -0,0 +1,75 @@
|
|||
If you are starting from nothing (no existing `git` or `git-annex` repository) and want to use a server as a centralised repository, try the following steps.
|
||||
|
||||
On the server where you'll hold the "master" repository:
|
||||
|
||||
server$ cd /one/git
|
||||
server$ mkdir m
|
||||
server$ cd m
|
||||
server$ git init --bare
|
||||
Initialized empty Git repository in /one/git/m/
|
||||
server$ git annex init origin
|
||||
init origin ok
|
||||
server$
|
||||
|
||||
Clone that to the laptop:
|
||||
|
||||
laptop$ cd /other
|
||||
laptop$ git clone ssh://server//one/git/m
|
||||
Cloning into 'm'...
|
||||
Warning: No xauth data; using fake authentication data for X11 forwarding.
|
||||
remote: Counting objects: 5, done.
|
||||
remote: Compressing objects: 100% (3/3), done.
|
||||
remote: Total 5 (delta 0), reused 0 (delta 0)
|
||||
Receiving objects: 100% (5/5), done.
|
||||
warning: remote HEAD refers to nonexistent ref, unable to checkout.
|
||||
|
||||
laptop$ cd m
|
||||
laptop$ git annex init laptop
|
||||
init laptop ok
|
||||
laptop$
|
||||
|
||||
Merge the `git-annex` repository (this is the bit that is often
|
||||
overlooked!):
|
||||
|
||||
laptop$ git annex merge
|
||||
merge . (merging "origin/git-annex" into git-annex...)
|
||||
ok
|
||||
laptop$
|
||||
|
||||
Add some content:
|
||||
|
||||
laptop$ git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
|
||||
"kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg"
|
||||
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg (downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...) --2011-12-15 08:13:10-- http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
|
||||
Resolving kitenet.net (kitenet.net)... 2001:41c8:125:49::10, 80.68.85.49
|
||||
Connecting to kitenet.net (kitenet.net)|2001:41c8:125:49::10|:80... connected.
|
||||
HTTP request sent, awaiting response... 200 OK
|
||||
Length: 39362757 (38M) [audio/ogg]
|
||||
Saving to: `/other/m/.git/annex/tmp/URL--http&c%%kitenet.net%~joey%screencasts%git-annex_coding_in_haskell.ogg'
|
||||
|
||||
100%[======================================>] 39,362,757 2.31M/s in 17s
|
||||
|
||||
2011-12-15 08:13:27 (2.21 MB/s) - `/other/m/.git/annex/tmp/URL--http&c%%kitenet.net%~joey%screencasts%git-annex_coding_in_haskell.ogg' saved [39362757/39362757]
|
||||
|
||||
(checksum...) ok
|
||||
(Recording state in git...)
|
||||
laptop$ git commit -m 'See Joey play.'
|
||||
[master (root-commit) 106e923] See Joey play.
|
||||
1 files changed, 1 insertions(+), 0 deletions(-)
|
||||
create mode 120000 kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
|
||||
laptop$
|
||||
|
||||
All fine, now push it back to the centralised master:
|
||||
|
||||
laptop$ git push
|
||||
Counting objects: 20, done.
|
||||
Delta compression using up to 4 threads.
|
||||
Compressing objects: 100% (11/11), done.
|
||||
Writing objects: 100% (18/18), 1.50 KiB, done.
|
||||
Total 18 (delta 1), reused 1 (delta 0)
|
||||
To ssh://server//one/git/m
|
||||
3ba1386..ad3bc9e git-annex -> git-annex
|
||||
laptop$
|
||||
|
||||
You can add more "client" repositories by following the `laptop`
|
||||
sequence of operations.
|
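The `laptop` sequence can be wrapped up for reuse on further clients. This is just a sketch under a few assumptions: a POSIX shell, and a hypothetical function name `setup_client` with an argument order chosen for illustration.

```sh
#!/bin/sh
# Hypothetical helper that replays the laptop steps for a new client.
#   $1 = clone url, $2 = directory to clone into, $3 = description
setup_client() {
    if [ "$#" -ne 3 ]; then
        echo "usage: setup_client URL DIR DESCRIPTION" >&2
        return 2
    fi
    git clone "$1" "$2" || return 1
    (
        cd "$2" || exit 1
        git annex init "$3" || exit 1
        # The step that is often overlooked: merge the git-annex branch.
        git annex merge
    )
}

# Example (not run here):
#   setup_client ssh://server//one/git/m m desktop
```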
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joey.kitenet.net/"
|
||||
nickname="joey"
|
||||
subject="comment 1"
|
||||
date="2011-12-23T19:19:53Z"
|
||||
content="""
|
||||
See also: [[centralized_git_repository_tutorial]]
|
||||
"""]]
|
140
doc/tips/centralized_git_repository_tutorial.mdwn
Normal file
|
@ -0,0 +1,140 @@
|
|||
The [[walkthrough]] builds up a decentralized git repository setup, but
|
||||
git-annex can also be used with a centralized bare repository, just like
|
||||
git can. This tutorial shows how to set up a centralized repository hosted on
|
||||
GitHub.
|
||||
|
||||
## set up the repository, and make a checkout
|
||||
|
||||
I've created a repository for technical talk videos, which you can
|
||||
[fork on Github](https://github.com/joeyh/techtalks).
|
||||
Or make your own repository on GitHub (or elsewhere) now.
|
||||
|
||||
On your laptop, [[install]] git-annex, and clone the repository:
|
||||
|
||||
# git clone git@github.com:joeyh/techtalks.git
|
||||
# cd techtalks
|
||||
|
||||
Tell git-annex to use the repository, and describe where this clone is
|
||||
located:
|
||||
|
||||
# git annex init 'my laptop'
|
||||
init my laptop ok
|
||||
|
||||
Let's tell git-annex that GitHub doesn't support running git-annex-shell there.
|
||||
This means you can't store annexed file *contents* on GitHub; it would
|
||||
really be better to host the bare repository on your own server, which
|
||||
would not have this limitation. (If you want to do that, check out
|
||||
[[using_gitolite_with_git-annex]].)
|
||||
|
||||
# git config remote.origin.annex-ignore true
|
||||
|
||||
## add files to the repository
|
||||
|
||||
Add some files, obtained however.
|
||||
|
||||
# youtube-dl -t 'http://www.youtube.com/watch?v=b9FagOVqxmI'
|
||||
# git annex add *.mp4
|
||||
add Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 (checksum) ok
|
||||
(Recording state in git...)
|
||||
# git commit -m "added a video. I have not watched it yet but it sounds interesting"
|
||||
|
||||
This file is available directly from the web; so git-annex can download it:
|
||||
|
||||
# git annex addurl http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg
|
||||
addurl kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg
|
||||
(downloading http://kitenet.net/~joey/screencasts/git-annex_coding_in_haskell.ogg ...)
|
||||
(checksum...) ok
|
||||
(Recording state in git...)
|
||||
# git commit -a -m 'added a screencast I made'
|
||||
|
||||
Feel free to rename the files, etc, using normal git commands:
|
||||
|
||||
# git mv Haskell_Amuse_Bouche-b9FagOVqxmI.mp4 Haskell_Amuse_Bouche.mp4
|
||||
# git mv kitenet.net_~joey_screencasts_git-annex_coding_in_haskell.ogg git-annex_coding_in_haskell.ogg
|
||||
# git commit -m 'better filenames'
|
||||
|
||||
Now push your changes back to the central repository. This first time,
|
||||
remember to push the git-annex branch, which is used to track the file
|
||||
contents.
|
||||
|
||||
# git push origin master git-annex
|
||||
To git@github.com:joeyh/techtalks.git
|
||||
* [new branch] master -> master
|
||||
* [new branch] git-annex -> git-annex
|
||||
|
||||
That push went fast, because it didn't upload large videos to GitHub.
|
||||
To check this, you can ask git-annex where the contents of the videos are:
|
||||
|
||||
# git annex whereis
|
||||
whereis Haskell_Amuse_Bouche.mp4 (1 copy)
|
||||
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
|
||||
ok
|
||||
whereis git-annex_coding_in_haskell.ogg (2 copies)
|
||||
00000000-0000-0000-0000-000000000001 -- web
|
||||
767e8558-0955-11e1-be83-cbbeaab7fff8 -- here
|
||||
ok
|
||||
|
||||
## make more checkouts
|
||||
|
||||
So far you have a central repository, and a checkout on a laptop.
|
||||
Let's make another checkout that's used as a backup. You can put it anywhere
|
||||
you like, just make it be somewhere your laptop can access. A few options:
|
||||
|
||||
* Put it on a USB drive that you can plug into the laptop.
|
||||
* Put it on a desktop.
|
||||
* Put it on some server in the local network.
|
||||
* Put it on a remote VPS.
|
||||
|
||||
I'll use the VPS option, but these instructions should work for
|
||||
any of the above.
|
||||
|
||||
# ssh server
|
||||
server# sudo apt-get install git-annex
|
||||
|
||||
Clone the central repository as before. (If the clone fails, you need
|
||||
to add your server's ssh public key to github -- see
|
||||
[this page](http://help.github.com/ssh-issues/).)
|
||||
|
||||
server# git clone git@github.com:joeyh/techtalks.git
|
||||
server# cd techtalks
|
||||
server# git config remote.origin.annex-ignore true
|
||||
server# git annex init 'backup'
|
||||
init backup (merging origin/git-annex into git-annex...) ok
|
||||
|
||||
Notice that the server does not have the contents of any of the files yet.
|
||||
If you run `ls`, you'll see broken symlinks. We want to populate this
|
||||
backup with the file contents, by copying them from your laptop.
|
||||
|
||||
Back on your laptop, you need to configure a git remote for the backup.
|
||||
Adjust the ssh url as needed to point to wherever the backup is. (If it
|
||||
was on a local USB drive, you'd use the path to the repository instead.)
|
||||
|
||||
# git remote add backup ssh://server/~/techtalks
|
||||
|
||||
Now git-annex on your laptop knows how to reach the backup repository,
|
||||
and can do things like copy files to it:
|
||||
|
||||
# git annex copy --to backup git-annex_coding_in_haskell.ogg
|
||||
copy git-annex_coding_in_haskell.ogg (checking backup...)
|
||||
12877824 2% 255.11kB/s 00:00
|
||||
ok
|
||||
|
||||
You can also `git annex move` files to it, to free up space on your laptop.
|
||||
And then you can `git annex get` files back to your laptop later on, as
|
||||
desired.
|
||||
|
||||
After you use git-annex to move files around, remember to push,
|
||||
which will broadcast its updated location information.
|
||||
|
||||
# git push
|
||||
|
||||
## take it farther
|
||||
|
||||
Of course you can create as many checkouts as you desire. If you have a
|
||||
desktop machine too, you can make a checkout there, and use `git remote
|
||||
add` to also let your desktop access the backup repository.
|
||||
|
||||
You can add remotes for each direct connection between machines you find you
|
||||
need -- so make the laptop have the desktop as a remote, and the desktop
|
||||
have the laptop as a remote, and then on either machine git-annex can
|
||||
access files stored on the other.
|
|
@ -0,0 +1,33 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkC0W3ZQERUaTkHoks6k68Tsp1tz510nGo"
|
||||
nickname="Georg"
|
||||
subject="sync, push, pull with/to/from centralized bare repository"
|
||||
date="2013-10-07T06:45:19Z"
|
||||
content="""
|
||||
Hi Joey,
|
||||
|
||||
thanks for the tutorial with the centralized repo. I am currently trying to set up a central bare repo for two clients (they cannot communicate directly with each other). I am not sure if I am pushing/pulling the right way.
|
||||
|
||||
On the server I did:
|
||||
|
||||
git init --bare
|
||||
git annex init origin
|
||||
|
||||
On Client Alice (I want to give Bob a chance to call \"git annex get\" from \"origin\"):
|
||||
|
||||
git clone ssh://tktest@192.168.56.104/~/annex .
|
||||
git annex init Alice
|
||||
git annex merge
|
||||
git annex add .
|
||||
git commit -a -m \"Added tutorial\"
|
||||
git push origin master git-annex
|
||||
git annex copy . --to origin
|
||||
|
||||
On Client Bob I have called \"clone, init, merge, add, push, copy\" also.
|
||||
|
||||
Now the tricky part - do I have to call \"git annex sync\" at Alice's side to get the updates from Bob over origin?
|
||||
I ran into trouble when I called \"copy --to origin\" before \"git push origin master git-annex\". How can I resolve a non-fast-forward on the git-annex branch?
|
||||
Some notes about how to sync over a central bare repo would be nice here =)
|
||||
|
||||
Thanks a lot, Georg
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.153.253.80"
|
||||
subject="How can I resolve a non-fast-forward on the git-annex branch?"
|
||||
date="2013-10-07T17:08:32Z"
|
||||
content="""
|
||||
By either running `git annex sync`, or if you want to pull and push yourself, by running `git annex merge` before pushing.
|
||||
"""]]
|
63
doc/tips/downloading_podcasts.mdwn
Normal file
|
@ -0,0 +1,63 @@
|
|||
You can use git-annex as a podcatcher, to download podcast contents.
|
||||
No additional software is required, but your git-annex must be built
|
||||
with the Feeds feature (run `git annex version` to check).
|
||||
|
||||
All you need to do is put something like this in a cron job:
|
||||
|
||||
`cd somerepo && git annex importfeed http://url/to/podcast http://other/podcast/url`
|
||||
|
||||
This downloads the urls, and parses them as RSS, Atom, or RDF feeds.
|
||||
All enclosures are downloaded and added to the repository, the same as if you
|
||||
had manually run `git annex addurl` on each of them.
|
||||
|
||||
git-annex will avoid downloading a file from a feed if its url has already
|
||||
been stored in the repository before. So once a file is downloaded,
|
||||
you can move it around, delete it, `git annex drop` its content, etc,
|
||||
and it will not be downloaded again by repeated runs of
|
||||
`git annex importfeed`. Just how a podcatcher should behave.
|
||||
|
||||
## templates
|
||||
|
||||
To control the filenames used for items downloaded from a feed,
|
||||
there's a --template option. The default is
|
||||
`--template='${feedtitle}/${itemtitle}${extension}'`
|
||||
|
||||
Other available template variables:
|
||||
feedauthor, itemauthor, itemsummary, itemdescription, itemrights, itemid
|
||||
|
||||
## catching up
|
||||
|
||||
To catch up on a feed without downloading its contents,
|
||||
use `git annex importfeed --relaxed`, and delete the symlinks it creates.
|
||||
Next time you run `git annex importfeed` it will only fetch any new items.
|
||||
|
||||
## fast mode
|
||||
|
||||
To add a feed without downloading its contents right now,
|
||||
use `git annex importfeed --fast`. Then you can use `git annex get` as
|
||||
usual to download the content of an item.
|
||||
|
||||
## storing the podcast list in git
|
||||
|
||||
You can check the list of podcast urls into git right next to the
|
||||
files it downloads. Just make a file named feeds and add one podcast url
|
||||
per line.
|
||||
|
||||
Then you can run git-annex on all the feeds:
|
||||
|
||||
`xargs git-annex importfeed < feeds`
|
||||
|
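If you would like to keep notes in the feeds file, a small filter can strip them before the urls reach git-annex. This is only a sketch: comment lines starting with `#` are an assumption of this example, not something git-annex itself understands, and the function name `list_feeds` is made up.

```sh
#!/bin/sh
# Hypothetical helper: print the feed urls from a feeds file (default:
# ./feeds), skipping blank lines and lines that start with '#'.
list_feeds() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "${1:-feeds}"
}

# Example (not run here), e.g. from a cron job:
#   cd somerepo && list_feeds feeds | xargs git annex importfeed
```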
||||
## distributed podcatching
|
||||
|
||||
A nice benefit of using git-annex as a podcatcher is that you can
|
||||
run `git annex importfeed` on the same url in different clones
|
||||
of a repository, and `git annex sync` will sync it all up.
|
||||
|
||||
## centralized podcatching
|
||||
|
||||
You can also have a designated machine which always fetches all podcasts
|
||||
to local disk and stores them. That way, you can archive podcasts with
|
||||
time-delayed deletion of upstream content. You can also work around slow
|
||||
downloads upstream by podcatching to a server with ample bandwidth or work
|
||||
around a slow local Internet connection by podcatching to your home server
|
||||
and transferring to your laptop on demand.
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="2001:4978:f:21a::2"
|
||||
subject="comment 10"
|
||||
date="2013-08-05T16:47:30Z"
|
||||
content="""
|
||||
`cabal install feed` should get the necessary library installed so that git-annex will build with feeds support.
|
||||
"""]]
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://a-or-b.myopenid.com/"
|
||||
ip="220.244.41.108"
|
||||
subject="comment 11"
|
||||
date="2013-08-06T04:20:16Z"
|
||||
content="""
|
||||
$ cabal install feed
|
||||
Resolving dependencies...
|
||||
All the requested packages are already installed:
|
||||
feed-0.3.9.1
|
||||
Use --reinstall if you want to reinstall anyway.
|
||||
|
||||
Then I reinstalled `git-annex` but it still doesn't find the feeds flag.
|
||||
|
||||
$ git annex version
|
||||
git-annex version: 4.20130802
|
||||
build flags: Assistant Webapp Pairing Testsuite S3 WebDAV FsEvents XMPP DNS
|
||||
|
||||
Do I need to do something like:
|
||||
|
||||
cabal install git-annex --bindir=$HOME/bin -f\"-assistant -webapp -webdav -pairing -xmpp -dns -feed\"
|
||||
|
||||
...but what are the default flags to include in addition to `-feed`
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="2001:4978:f:21a::2"
|
||||
subject="comment 12"
|
||||
date="2013-08-06T04:24:10Z"
|
||||
content="""
|
||||
-f-Feed will disable the feature. -fFeed will try to force it on.
|
||||
|
||||
You can probably work out what's going wrong using cabal install -v3
|
||||
"""]]
|
|
@ -0,0 +1,18 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://a-or-b.myopenid.com/"
|
||||
ip="220.244.41.108"
|
||||
subject="comment 13"
|
||||
date="2013-08-06T05:42:45Z"
|
||||
content="""
|
||||
So I ran `cabal install -v3` and looked at the output,
|
||||
|
||||
Flags chosen: feed=True, tdfa=True, testsuite=True, android=False,
|
||||
production=True, dns=True, xmpp=True, pairing=True, webapp=True,
|
||||
assistant=True, dbus=True, inotify=True, webdav=True, s3=True
|
||||
|
||||
This looks like feed should be on.
|
||||
|
||||
There don't appear to be any errors in the compile either.
|
||||
|
||||
Is it as simple as a bug where this flag just doesn't show in the `git annex version` command?
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="2001:4978:f:21a::2"
|
||||
subject="comment 14"
|
||||
date="2013-08-07T16:03:12Z"
|
||||
content="""
|
||||
Yes, it did turn out to be as simple as my having forgotten that I have to manually add features to the version list.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://23.gs/"
|
||||
ip="46.165.197.5"
|
||||
subject="No file extension?"
|
||||
date="2013-08-12T13:21:50Z"
|
||||
content="""
|
||||
It seems git-annex is a bit overzealous when sanitizing the file extension, currently I get: \"Nerdkunde/Let_s_go_to_the_D_M_C_A_m4a\" from http://www.nerdkunde.de/episodes.m4a.rss with the default template and only \"Nerdkunde/Let_s_go_to_the_D_M_C_A._m4a\" if I add the \".\" in the template myself...
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="arand"
|
||||
ip="130.243.226.21"
|
||||
subject="comment 16"
|
||||
date="2013-08-12T13:32:46Z"
|
||||
content="""
|
||||
The filename extension is a known issue and already fixed in the development version, see <http://git-annex.branchable.com/bugs/importfeed_uses___34____95__foo__34___as_extension/>
|
||||
"""]]
|
|
@ -0,0 +1,9 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawlpKmTa1OPwy5Jk24pOoD8Vlo2jahzTPnw"
|
||||
nickname="Stephen"
|
||||
subject="rss authentication"
|
||||
date="2013-08-13T13:32:52Z"
|
||||
content="""
|
||||
If a podcast requires authentication, is there a way to pass credentials through? I tried `http://user:pass@site.com/rss.xml` but it didn't work.
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,15 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://www.joachim-breitner.de/"
|
||||
nickname="nomeata"
|
||||
subject="--fast and --relaxed"
|
||||
date="2013-08-16T07:27:59Z"
|
||||
content="""
|
||||
Hi,
|
||||
|
||||
the explanations to --fast and --relaxed on this page could be extended a bit. I looked it up in the man page, but it is not yet clear to me when I would use one or the other with feeds. Also, does “Next time you run git annex addurl it will only fetch any new items.” really only apply to --relaxed, and not --fast?
|
||||
|
||||
Furthermore, it would be good if there were a template variable `itemnum` that I can use to ensure that `ls` prints the casts in the right order, even when the titles of the items are not helpful.
|
||||
|
||||
Greetings,
|
||||
Joachim
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.0.63"
|
||||
subject="comment 19"
|
||||
date="2013-08-22T15:25:02Z"
|
||||
content="""
|
||||
I would expect user:pass@site.com to work if the site is using http basic auth. `importfeed` just runs `wget` (or `curl`) to do all downloads, and wget's documentation says that works. It also says you can use ~/.netrc to store the password for a site.
|
||||
"""]]
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="ckeen"
|
||||
ip="79.249.110.228"
|
||||
subject="Filename too long"
|
||||
date="2013-07-30T14:39:44Z"
|
||||
content="""
|
||||
It seems that some of my feeds get stored into keys that generate a filename that is too long:
|
||||
|
||||
podcasts/.git/annex/tmp/b1f_325_URL-s143660317--http&c%%feedproxy.google.com%~r%mixotic%~5%urTIRWQK2OQ%Mixotic__258__-__Michael__Miller__-__Galactic__Technolgies.mp3.log.web:
|
||||
openBinaryFile: invalid argument (File name too long)
|
||||
|
||||
Is there a way to work around this?
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.0.63"
|
||||
subject="comment 20"
|
||||
date="2013-08-22T15:29:11Z"
|
||||
content="""
|
||||
The git-annex man page has a bit more to say about --relaxed and --fast. Their behavior when used with `importfeed` is the same as with `addurl`.
|
||||
|
||||
If the podcast feed provides an `itemid`, you can use that in the filename template. I don't know how common that is. Due to the way `importfeed` works, it cannot keep track of eg, an incrementing item number itself.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.0.21"
|
||||
subject="comment 2"
|
||||
date="2013-07-30T17:16:07Z"
|
||||
content="""
|
||||
@ckeen You seem to be using a filesystem that does not support filenames 150 characters long. This is unusual -- even windows and android can support a filename up to 255 characters in length. `git-annex addurl` already deals with this sort of problem by limiting the filename to 255 characters. If you'd like to file a bug report with details about your system, I can try to make git-annex support its limitations, I suppose.
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://www.joachim-breitner.de/"
|
||||
nickname="nomeata"
|
||||
subject="Great stuff!"
|
||||
date="2013-07-30T21:21:57Z"
|
||||
content="""
|
||||
Looking forward to seeing it in Debian unstable; where it will definitely replace my hpodder setup.
|
||||
|
||||
I guess there is no easy way to re-use the files already downloaded with hpodder? At first I thought that `git annex importfeed --relaxed` followed by adding the files to the git annex would work, but `importfeed` stores URLs, not content-based hashes, so it wouldn’t match up.
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.0.21"
|
||||
subject="comment 4"
|
||||
date="2013-07-30T21:29:50Z"
|
||||
content="""
|
||||
@nomeata, well, you can, but it has to download the files again.
|
||||
|
||||
When run without --fast, `importfeed` does use content based hashes, so if you run it in a temporary directory, it will download the content redundantly, hash it and see it's the same, and add the url to that hash. You can then delete the temporary directory, and the files hpodder had downloaded will have the url attached to them now. I don't know if this really buys you anything over deleting the hpodder files and starting over though.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="ckeen"
|
||||
ip="79.249.110.228"
|
||||
subject="Force a reload of a feed?"
|
||||
date="2013-07-31T10:35:50Z"
|
||||
content="""
|
||||
Currently I have my podcasts imported with --fast. For some reason there are podcast episodes missing. This probably happened while I was toying with the feature. If I retry on a clean annex I see all episodes. My suspicion is that git-annex was interrupted while downloading a feed but now somehow thinks it's already there. How can I debug this situation and/or force git annex to retry all the links in a feed?
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.154.0.21"
|
||||
subject="use the force"
|
||||
date="2013-07-31T16:20:39Z"
|
||||
content="""
|
||||
The only way it can skip downloading a file is if its url has already been seen before. Perhaps you deleted them?
|
||||
|
||||
I've made `importfeed --force` re-download files it's seen before.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="ckeen"
|
||||
ip="78.108.63.46"
|
||||
subject="--force reload all URLs"
|
||||
date="2013-08-01T09:47:34Z"
|
||||
content="""
|
||||
Is it intentionally saving URLs with a prefixed 2_? I have sorted out all missing URLs and renamed it, so no harm done, but it has been a bit of a hassle to get there.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.152.108.145"
|
||||
subject="comment 8"
|
||||
date="2013-08-01T16:05:10Z"
|
||||
content="""
|
||||
I've now made importfeed --force a bit smarter about reusing existing files.
|
||||
"""]]
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://a-or-b.myopenid.com/"
|
||||
ip="220.244.41.108"
|
||||
subject="How do I switch on the 'feeds' feature?"
|
||||
date="2013-08-05T04:52:41Z"
|
||||
content="""
|
||||
Joey - your initial post said:
|
||||
|
||||
git-annex must be built with the Feeds feature (run git annex version to check).
|
||||
|
||||
...but how do I actually switch on the feeds feature?
|
||||
|
||||
I install git-annex from cabal, so I do
|
||||
|
||||
cabal update
|
||||
cabal install git-annex
|
||||
|
||||
which I did this morning and now `git annex version` gives me:
|
||||
|
||||
git-annex version: 4.20130802
|
||||
build flags: Assistant Webapp Pairing Testsuite S3 WebDAV FsEvents XMPP DNS
|
||||
|
||||
So it is the latest version, but without Feeds. :-(
|
||||
"""]]
|
28
doc/tips/dropboxannex.mdwn
Normal file
|
@ -0,0 +1,28 @@
|
|||
dropboxannex
|
||||
=========
|
||||
|
||||
Hook program for gitannex to use dropbox as backend
|
||||
|
||||
# Requirements:
|
||||
|
||||
python2
|
||||
|
||||
Credit for the Dropbox api interface goes to Dropbox.
|
||||
|
||||
# Install
|
||||
Clone the git repository in your home folder.
|
||||
|
||||
git clone git://github.com/TobiasTheViking/dropboxannex.git
|
||||
|
||||
This should make a ~/dropboxannex folder
|
||||
|
||||
# Setup
|
||||
Run the program once to set it up.
|
||||
|
||||
cd ~/dropboxannex; python2 dropboxannex.py
|
||||
|
||||
# Commands for gitannex:
|
||||
|
||||
git config annex.dropbox-hook '/usr/bin/python2 ~/dropboxannex/dropboxannex.py'
|
||||
git annex initremote dropbox type=hook hooktype=dropbox encryption=shared
|
||||
git annex describe dropbox "the dropbox library"
|
20
doc/tips/emacs_integration.mdwn
Normal file
|
@ -0,0 +1,20 @@
|
|||
bergey has developed an emacs mode for browsing git-annex repositories,
|
||||
dired style.
|
||||
|
||||
<https://gitorious.org/emacs-contrib/annex-mode>
|
||||
|
||||
Locally available files are colored differently, and pressing g runs
|
||||
`git annex get` on the file at point.
|
||||
|
||||
----
|
||||
|
||||
John Wiegley has developed a brand new git-annex interaction mode for
|
||||
Emacs, which aims to integrate with the standard facilities
|
||||
(C-x C-q, M-x dired, etc) rather than invent its own interface.
|
||||
|
||||
<https://github.com/jwiegley/git-annex-el>
|
||||
|
||||
He has also added support to org-attach; if
|
||||
`org-attach-git-annex-cutoff` is non-nil and smaller than the size
|
||||
of the file you're attaching then org-attach will `git annex add` the
|
||||
file; otherwise it will `git add` it.
|
21
doc/tips/finding_duplicate_files.mdwn
Normal file
|
@ -0,0 +1,21 @@
|
|||
Maybe you had a lot of files scattered around on different drives, and you
|
||||
added them all into a single git-annex repository. Some of the files are
|
||||
surely duplicates of others.
|
||||
|
||||
While git-annex stores the file contents efficiently, it would still
|
||||
help in cleaning up this mess if you could find, and perhaps remove
|
||||
the duplicate files.
|
||||
|
||||
Here's a command line that will show duplicate sets of files grouped together:
|
||||
|
||||
git annex find --include '*' --format='${file} ${escaped_key}\n' | \
|
||||
sort -k2 | uniq --all-repeated=separate -f1 | \
|
||||
sed 's/ [^ ]*$//'
|
||||
|
||||
Here's a command line that will remove one of each duplicate set of files:
|
||||
|
||||
git annex find --include '*' --format='${file} ${escaped_key}\n' | \
|
||||
sort -k2 | uniq --repeated -f1 | sed 's/ [^ ]*$//' | \
|
||||
xargs -d '\n' git rm
|
||||
|
||||
--[[Joey]]
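
The pipelines above split fields on spaces, so they can misbehave on filenames that contain spaces. Since keys never contain spaces, one workaround is to print the key first and group on it. The following is only a sketch under that assumption: `group_dups` is a made-up helper name, and it still assumes filenames contain no newlines.

```shell
# Reads "KEY FILE" lines on stdin and prints the files of each duplicate
# set, one per line, with a blank line between sets.
# group_dups is a hypothetical helper, not part of git-annex.
group_dups() {
    # Sort on the key only, in byte order, so equal keys end up adjacent.
    LC_ALL=C sort -k1,1 | awk '
        {
            key = $1
            file = substr($0, length(key) + 2)   # everything after "KEY "
            if (key == prev) {
                if (count == 1) {
                    # Second file with this key: start a new set.
                    if (printed) print ""
                    print first
                    printed = 1
                }
                print file
                count++
            } else {
                prev = key; first = file; count = 1
            }
        }'
}
```

It could be fed with `git annex find --include '*' --format='${key} ${file}\n' | group_dups`. Files whose key appears only once are not printed.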
|
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmTNrhkVQ26GBLaLD5-zNuEiR8syTj4mI8"
|
||||
nickname="Juan"
|
||||
subject="comment 10"
|
||||
date="2013-08-31T18:20:58Z"
|
||||
content="""
|
||||
I'm already spreading the word. Handling scientific papers, data, simulations and code has been quite a challenge during my academic career. While code was solved long ago, the first three items remained a huge problem.
|
||||
I'm sure many of my colleagues will be happy to use it.
|
||||
Is there any hashtag or twitter account? I've seen that you collected some of my tweets, but I don't know how you did it. Did you search for git-annex?
|
||||
Best,
|
||||
Juan
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://adamspiers.myopenid.com/"
|
||||
nickname="Adam"
|
||||
subject="Cool"
|
||||
date="2011-12-23T19:16:50Z"
|
||||
content="""
|
||||
Very nice :) Just for reference, here's [my Perl implementation](https://github.com/aspiers/git-config/blob/master/bin/git-annex-finddups). As per [this discussion](http://git-annex.branchable.com/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/#comment-fb15d5829a52cd05bcbd5dc53edaffb2) it would be interesting to benchmark these two approaches and see if one is substantially more efficient than the other w.r.t. CPU and memory usage.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="bremner"
|
||||
ip="156.34.89.108"
|
||||
subject="problems with spaces in filenames"
|
||||
date="2012-09-05T02:12:18Z"
|
||||
content="""
|
||||
note that the sort -k2 doesn't work right for filenames with spaces in them. On the other hand, git-rm doesn't seem to like the escaped names from escaped_file.
|
||||
"""]]
|
39
doc/tips/finding_duplicate_files/comment_3._comment
Normal file
|
@ -0,0 +1,39 @@
|
|||
[[!comment format=mdwn
|
||||
username="mhameed"
|
||||
ip="82.32.202.53"
|
||||
subject="problems with spaces in filenames"
|
||||
date="Wed Sep 5 09:38:56 BST 2012"
|
||||
content="""
|
||||
|
||||
Spaces and other special chars can make filename handling ugly.
|
||||
If you don't have a restriction on keeping the exact filenames, then
|
||||
it might be easiest just to get rid of the problematic chars.
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
function process() {
|
||||
dir="$1"
|
||||
echo "processing $dir"
|
||||
pushd "$dir" >/dev/null 2>&1 || return
|
||||
|
||||
for fileOrDir in *; do
|
||||
nfileOrDir=`echo "$fileOrDir" | sed -e 's/\[//g' -e 's/\]//g' -e 's/ /_/g' -e "s/'//g" `
|
||||
if [ "$fileOrDir" != "$nfileOrDir" ]; then
|
||||
echo "renaming $fileOrDir to $nfileOrDir"
|
||||
git mv "$fileOrDir" "$nfileOrDir"
|
||||
else
|
||||
echo "skipping $fileOrDir, no need to rename."
|
||||
fi
|
||||
done
|
||||
|
||||
find ./ -mindepth 1 -maxdepth 1 -type d ! -name .git | while IFS= read -r d; do
|
||||
process "$d"
|
||||
done
|
||||
popd >/dev/null 2>&1
|
||||
}
|
||||
|
||||
process .
|
||||
|
||||
Maybe you can run something like this before checking for duplicates.
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="bremner"
|
||||
ip="156.34.89.108"
|
||||
subject="more about spaces..."
|
||||
date="2012-09-09T19:33:01Z"
|
||||
content="""
|
||||
Ironically, previous renaming to remove spaces, plus some syncing, is how I ended up with these duplicates. For what it is worth, aspiers' Perl script worked out for me with a small modification. I just printed out only the duplicates with spaces in them (quoted).
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkaBh9VNJ-RZ26wJZ4BEhMN1IlPT-DK6JA"
|
||||
nickname="Alex"
|
||||
subject="printing keys first is the easiest workaround"
|
||||
date="2013-04-01T23:32:23Z"
|
||||
content="""
|
||||
Since the keys are sure to have no spaces in them, putting them first makes working with the output with tools like sort, uniq, and awk simpler.
|
||||
"""]]
|
|
@ -0,0 +1,16 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnkBYpLu_NOj7Uq0-acvLgWhxF8AUEIJbo"
|
||||
nickname="Chris"
|
||||
subject="Find files by key"
|
||||
date="2013-05-03T04:14:55Z"
|
||||
content="""
|
||||
Is there any simple way to search for files with a given key?
|
||||
|
||||
At the moment, the best I've come up with is this:
|
||||
|
||||
````
|
||||
git annex find --include '*' --format='${key} ${file}\n' | grep <KEY>
|
||||
````
|
||||
|
||||
where `<KEY>` is the key. This seems like an awfully longwinded approach, but I don't see anything in the docs indicating a simpler way to do it. Am I missing something?
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
nickname="joey"
|
||||
subject="comment 7"
|
||||
date="2013-05-13T18:42:14Z"
|
||||
content="""
|
||||
@Chris I guess there's no really easy way because searching for a given key is not something many people need to do.
|
||||
|
||||
However, git does provide a way. Try `git log --stat -S $KEY`
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmTNrhkVQ26GBLaLD5-zNuEiR8syTj4mI8"
|
||||
nickname="Juan"
|
||||
subject="This is an awesome feature"
|
||||
date="2013-08-28T13:40:23Z"
|
||||
content="""
|
||||
Thanks. I have quite a lot of papers in PDF formats. Now I'm saving space, have them controlled, synchronized with many devices and found more than 200 duplicates.
|
||||
Is there a way to donate to the project? You really deserve it.
|
||||
Thanks.
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="4.153.8.7"
|
||||
subject="comment 9"
|
||||
date="2013-08-28T20:25:20Z"
|
||||
content="""
|
||||
@Juan the best thing to do is tell people about git-annex, help them use it, and file bug reports. Just generally be part of the git-annex community.
|
||||
|
||||
(If you really want to donate to me, <http://campaign.joeyh.name/> is still open.)
|
||||
"""]]
|
62
doc/tips/flickrannex.mdwn
Normal file
|
@ -0,0 +1,62 @@
|
|||
# Latest version 0.1.10
|
||||
Hook program for gitannex to use flickr as backend.
|
||||
|
||||
This allows storing any type of file on flickr, not only images and movies.
|
||||
|
||||
# Requirements:
|
||||
|
||||
python2
|
||||
|
||||
Credit for the flickr api interface goes to: <http://stuvel.eu/flickrapi>
|
||||
Credit for the png library goes to: <https://github.com/drj11/pypng>
|
||||
Credit for the png tEXt patch goes to: <https://code.google.com/p/pypng/issues/detail?id=65>
|
||||
|
||||
# Install
|
||||
|
||||
Clone the git repository in your home folder.
|
||||
|
||||
git clone git://github.com/TobiasTheViking/flickrannex.git
|
||||
|
||||
This should make a ~/flickrannex folder
|
||||
|
||||
# Setup
|
||||
|
||||
Run the program once to set it up.
|
||||
|
||||
cd ~/flickrannex; python2 flickrannex.py
|
||||
|
||||
After the setup has finished, it will print the git-annex configure lines.
|
||||
|
||||
# Configuring git-annex
|
||||
|
||||
git config annex.flickr-hook '/usr/bin/python2 ~/flickrannex/flickrannex.py'
|
||||
git annex initremote flickr type=hook hooktype=flickr encryption=shared
|
||||
git annex describe flickr "the flickr library"
|
||||
|
||||
# Notes
|
||||
|
||||
## Unencrypted mode
|
||||
The photo name on flickr is currently the GPGHMACSHA1 version.
|
||||
|
||||
Run the following command in your annex directory
|
||||
git annex wanted flickr "include=*.jpg or include=*.jpeg or include=*.gif or include=*.png"
|
||||
|
||||
## Encrypted mode
|
||||
The current version base64 encodes all the data, which results in ~35% larger filesize.
|
||||
|
||||
I might look into yEnc instead. I'm not sure if it will work in the tEXt field.
|
||||
|
||||
Run the following command in your annex directory
|
||||
git annex wanted flickr "not largerthan=30mb"
|
||||
|
||||
## Including directories as tags
|
||||
To get each of the directories below the top level git directory added as tags to uploads:
|
||||
|
||||
git config annex.flickr-hook 'GIT_TOP_LEVEL=`git rev-parse --show-toplevel` /usr/bin/python2 %s/flickrannex.py'
|
||||
|
||||
In this case the image:
|
||||
/home/me/annex-photos/holidays/2013/Greenland/img001.jpg
|
||||
would get the following tags: "holidays" "2013" "Greenland"
|
||||
(assuming "/home/me/annex-photos" is the top level in the annex...)
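
To illustrate the mapping described above, here is a rough sketch of how a path could be turned into tags. It is not the hook's actual code: `path_tags` is a hypothetical helper, and it assumes the file path starts with the top-level directory.

```shell
# Print one tag per line: each directory between the top level and the file.
# path_tags is a made-up name for illustration only.
path_tags() {
    top=${1%/}              # top-level directory, trailing slash removed
    file=$2
    rel=${file#"$top"/}     # path relative to the top level
    dir=$(dirname "$rel")
    # A file sitting directly in the top level gets no tags.
    [ "$dir" = "." ] && return 0
    printf '%s\n' "$dir" | tr '/' '\n'
}
```

Called as `path_tags /home/me/annex-photos /home/me/annex-photos/holidays/2013/Greenland/img001.jpg`, it prints the three directory names from the example, one per line.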
|
||||
|
||||
Caveat Emptor - Tags will *always* be NULL for indirect repos - we don't (easily) know the human-readable file name.
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmkBwMWvNKZZCge_YqobCSILPMeK6xbFw8"
|
||||
nickname="develop"
|
||||
subject="comment 10"
|
||||
date="2013-06-07T09:39:59Z"
|
||||
content="""
|
||||
I'm not even sure if chunksize is exposed to the hooks at all.
|
||||
|
||||
As it is, the hook will check the filesize, and if the filesize is more than 30mbyte it will exit 1.
|
||||
|
||||
Chunking may be implemented down the road. I do believe joeyh might have some plans that will touch this issue, so I'd rather wait than re-invent the wheel yet again.
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,46 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="git annex get failed"
|
||||
date="2013-08-02T14:29:30Z"
|
||||
content="""
|
||||
Hi, I am coming back to this and testing Flickr as a repository for moving files about and have run into what may be my very basic misunderstanding with vanilla annex.
|
||||
|
||||
I copied one file to Flickr and dropped it elsewhere (--force). I assumed that the file was on Flickr ok but that the numcopies setting required the force because of the semi-trust level of the Flickr remote.
|
||||
|
||||
Then I find I can't get the file back, even though there is a record of it from whereis.
|
||||
|
||||
Can you help enlighten me as to what I am missing? I assumed whereis would only report files that exist and can be copied back. If it is not my error, I can raise a bug or search for logs. Thanks in advance for any help.
|
||||
|
||||
[[!format perl \"\"\"
|
||||
|
||||
|
||||
nrb@nrb-ThinkPad-T61:~/tmp$ git annex whereis
|
||||
whereis libpeerconnection.log (3 copies)
|
||||
31124688-0792-4214-9e00-7ed115aa6b8e -- flickr (the flickr library)
|
||||
3e3d40d7-de8f-4591-a4ab-747d74a3b278 -- origin (my laptop)
|
||||
ec2d64fc-30d6-48b4-99bf-7b1bc22d420d -- portable USB drive
|
||||
ok
|
||||
whereis test.cgi (1 copy)
|
||||
31124688-0792-4214-9e00-7ed115aa6b8e -- flickr (the flickr library)
|
||||
ok
|
||||
whereis walkthrough.sh (3 copies)
|
||||
31124688-0792-4214-9e00-7ed115aa6b8e -- flickr (the flickr library)
|
||||
3e3d40d7-de8f-4591-a4ab-747d74a3b278 -- origin (my laptop)
|
||||
ec2d64fc-30d6-48b4-99bf-7b1bc22d420d -- portable USB drive
|
||||
ok
|
||||
whereis walkthrough.sh~ (3 copies)
|
||||
31124688-0792-4214-9e00-7ed115aa6b8e -- flickr (the flickr library)
|
||||
3e3d40d7-de8f-4591-a4ab-747d74a3b278 -- origin (my laptop)
|
||||
ec2d64fc-30d6-48b4-99bf-7b1bc22d420d -- portable USB drive
|
||||
ok
|
||||
nrb@nrb-ThinkPad-T61:~/tmp$ git annex get test.cgi
|
||||
get test.cgi (from flickr...)
|
||||
|
||||
git-annex: /home/nrb/tmp/.git/annex/tmp/SHA256E-s48--a01eedbee949120aeda41e566f9ae8faef1c2bacaa6d7bb8e45050fb8df6d09d.cgi: rename: does not exist (No such file or directory)
|
||||
failed
|
||||
git-annex: get: 1 failed
|
||||
nrb@nrb-ThinkPad-T61:~/tmp$
|
||||
|
||||
\"\"\"]]
|
||||
"""]]
|
|
@ -0,0 +1,58 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="re: git annex get failed"
|
||||
date="2013-08-02T15:02:14Z"
|
||||
content="""
|
||||
Another try - this time a slightly simpler setup using my version of the walkthrough commands
|
||||
|
||||
[[!format bash \"\"\"
|
||||
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$ git annex drop walkthrough.sh --from usbdrive
|
||||
drop usbdrive walkthrough.sh ok
|
||||
(Recording state in git...)
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$ git annex move walkthrough.sh --to flickr
|
||||
move walkthrough.sh (gpg) (checking flickr...) (to flickr...)
|
||||
/home/nrb/repos/gits/flickrannex/flickrannex.py:92: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
|
||||
if res:
|
||||
/home/nrb/repos/gits/flickrannex/flickrannex.py:100: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
|
||||
if res:
|
||||
ok
|
||||
(Recording state in git...)
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$ git annex whereis
|
||||
whereis walkthrough.sh (1 copy)
|
||||
161b7af0-2075-4314-9767-308a49b86018 -- flickr (the flickr library)
|
||||
ok
|
||||
whereis walkthrough.sh~ (3 copies)
|
||||
161b7af0-2075-4314-9767-308a49b86018 -- flickr (the flickr library)
|
||||
7803d853-d231-4bb4-b696-f12a950fb96b -- here (my laptop)
|
||||
d60d75f9-d878-4214-af20-fa055134ae77 -- usbdrive (portable USB drive)
|
||||
ok
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$ git annex get walkthrough.sh
|
||||
get walkthrough.sh (from flickr...) (gpg)
|
||||
git-annex: /home/nrb/repos/annex/laptop-annex/.git/annex/tmp/GPGHMACSHA1--02f600d7e8b071d2945270fd5e7fc26dd066ff31: openBinaryFile: does not exist (No such file or directory)
|
||||
gpg: decrypt_message failed: eof
|
||||
|
||||
Unable to access these remotes: flickr
|
||||
|
||||
Try making some of these repositories available:
|
||||
161b7af0-2075-4314-9767-308a49b86018 -- flickr (the flickr library)
|
||||
failed
|
||||
git-annex: get: 1 failed
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$ git annex fsck --from flickr
|
||||
fsck walkthrough.sh (gpg) (checking flickr...) (fixing location log)
|
||||
** Based on the location log, walkthrough.sh
|
||||
** was expected to be present, but its content is missing.
|
||||
|
||||
** No known copies exist of walkthrough.sh
|
||||
failed
|
||||
fsck walkthrough.sh~ (checking flickr...) (fixing location log)
|
||||
** Based on the location log, walkthrough.sh~
|
||||
** was expected to be present, but its content is missing.
|
||||
failed
|
||||
(Recording state in git...)
|
||||
git-annex: fsck: 2 failed
|
||||
nrb@nrb-ThinkPad-T61:~/repos/annex/laptop-annex$
|
||||
|
||||
\"\"\" ]]
|
||||
"""]]
|
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmkBwMWvNKZZCge_YqobCSILPMeK6xbFw8"
|
||||
nickname="develop"
|
||||
subject="Version 0.1.10 pushed"
|
||||
date="2013-09-11T20:31:25Z"
|
||||
content="""
|
||||
Since the initial release of this hook a lot of issues have been fixed, and a few features added.
|
||||
|
||||
I would highly suggest that everyone who is using this hook update to the latest version, as I would consider one of the bugs to be fairly major.
|
||||
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmkBwMWvNKZZCge_YqobCSILPMeK6xbFw8"
|
||||
nickname="develop"
|
||||
subject="comment 2"
|
||||
date="2013-06-05T21:33:42Z"
|
||||
content="""
|
||||
Get the statically linked version from here http://git-annex.branchable.com/install/Linux_standalone/
|
||||
|
||||
I believe the new hook format was introduced in version 4.20130521
|
||||
"""]]
|
|
@ -0,0 +1,30 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="missing configuration for flickr-checkpresent-hook"
|
||||
date="2013-06-05T20:44:25Z"
|
||||
content="""
|
||||
<https://github.com/TobiasTheViking/flickrannex/issues/3>
|
||||
|
||||
9 days ago: [the annex] \"hook format a few versions ago, and this is using the new hook format\".
|
||||
|
||||
Looks very handy. I am just starting with this, but can't seem to get it working as a remote after following the simple walkthrough. All goes well until:
|
||||
|
||||
$ git annex copy . --to flickr
|
||||
copy walkthrough.sh (checking flickr...)
|
||||
missing configuration for flickr-checkpresent-hook
|
||||
git-annex: checkpresent hook misconfigured
|
||||
|
||||
my Ubuntu 12.04:
|
||||
|
||||
$ git annex version
|
||||
git-annex version: 4.20130516.1
|
||||
build flags: Assistant Webapp Pairing Testsuite S3 WebDAV Inotify DBus XMPP
|
||||
local repository version: 3
|
||||
default repository version: 3
|
||||
supported repository versions: 3 4
|
||||
upgrade supported from repository versions: 0 1 2
|
||||
|
||||
I guess my \"git-annex version is still too old\"? Any idea what version is needed? Even better if I can figure out which Linux distribution/release has the most up to date version of annex.
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmkBwMWvNKZZCge_YqobCSILPMeK6xbFw8"
|
||||
nickname="develop"
|
||||
subject="comment 4"
|
||||
date="2013-06-05T22:02:29Z"
|
||||
content="""
|
||||
The path for the binary \"/usr/bin/python2\" is wrong.
|
||||
|
||||
It could be any of /usr/bin/python /usr/bin/python2.6 /usr/bin/python2.7
|
||||
|
||||
Or maybe in /usr/local/bin
|
||||
|
||||
you can try running \"which python\" or \"which python2\" to get the real path.
|
||||
"""]]
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="missing configuration for flickr-checkpresent-hook"
|
||||
date="2013-06-05T22:00:48Z"
|
||||
content="""
|
||||
Many thanks.
|
||||
|
||||
I used gitannex-install and was left with a slight anomaly:
|
||||
|
||||
Installing...........done
|
||||
git-annex version 4.20130601 has been installed
|
||||
$ git-annex version
|
||||
git-annex version: 4.20130531-g5df09b5
|
||||
|
||||
But I guess this includes the new hook format. I get a bit further:
|
||||
|
||||
$ git annex copy . --to flickr
|
||||
copy walkthrough.sh (checking flickr...) (user error (sh [\"-c\",\"/usr/bin/python2 /home/nrb/repos/gits/flickrannex/flickrannex.py\"] exited 1)) failed
|
||||
copy walkthrough.sh~ (checking flickr...) (user error (sh [\"-c\",\"/usr/bin/python2 /home/nrb/repos/gits/flickrannex/flickrannex.py\"] exited 1)) failed
|
||||
git-annex: copy: 2 failed
|
||||
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,17 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="comment 5"
|
||||
date="2013-06-05T22:11:14Z"
|
||||
content="""
|
||||
Thanks, but on my machine I get:
|
||||
|
||||
$ which python2
|
||||
/usr/bin/python2
|
||||
|
||||
I have scripted all my walkthrough commands, blowing away the test repositories and flickr settings first each time. This re-runs the flickr scripts and git config annex.flickr-hook etc.
|
||||
|
||||
I can't spot anything here.
|
||||
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmWg4VvDTer9f49Y3z-R0AH16P4d1ygotA"
|
||||
nickname="Tobias"
|
||||
subject="comment 6"
|
||||
date="2013-06-06T09:44:11Z"
|
||||
content="""
|
||||
That's weird...
|
||||
|
||||
You could try adding \"--dbglevel 1 --stderr\" arguments to the hook command and give me the output. But the way I read the log it seems like it doesn't even launch the python interpreter. I might be wrong though.
|
||||
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawnaH44G3QbxBAYyDwy0PbvL0ls60XoaR3Y"
|
||||
nickname="Nigel"
|
||||
subject="Unencrypted flickr can only accept picture and video files"
|
||||
date="2013-06-06T10:24:58Z"
|
||||
content="""
|
||||
Thanks, and sorry to trouble you; it is my error. I picked the unencrypted option (thinking it would be less of an issue) and am using a text file for the test, which gave this error line:
|
||||
|
||||
10:53:07 [flickrannex-0.1.5] main : 'Unencrypted flickr can only accept picture and video files'
|
||||
|
||||
I've not looked through your code yet, but could that message be printed when not in debug mode?
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmWg4VvDTer9f49Y3z-R0AH16P4d1ygotA"
|
||||
nickname="Tobias"
|
||||
subject="comment 8"
|
||||
date="2013-06-06T10:51:39Z"
|
||||
content="""
|
||||
I'll make it so in the next version I push.
|
||||
"""]]
|
|
@ -0,0 +1,9 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawleVyKk2kQsB_HgEdS7w1s0BmgRGy1aay0"
|
||||
nickname="Milan"
|
||||
subject="chunksize"
|
||||
date="2013-06-07T09:09:56Z"
|
||||
content="""
|
||||
Hi! Does this backend support chunksize option? If yes, is it possible to set it after the remote has been added to the repository?
|
||||
Thanks, Milan.
|
||||
"""]]
|
127
doc/tips/fully_encrypted_git_repositories_with_gcrypt.mdwn
Normal file
|
@ -0,0 +1,127 @@
|
|||
[git-remote-gcrypt](https://github.com/joeyh/git-remote-gcrypt/)
|
||||
adds support for encrypted remotes to git. The git-annex
|
||||
[[gcrypt special remote|special_remotes/gcrypt]] allows git-annex to
|
||||
also store its files in such repositories. Naturally, git-annex encrypts
|
||||
the files it stores too, so everything stored on the remote is encrypted.
|
||||
|
||||
Here are some ways you can use this awesome stuff.
|
||||
|
||||
[[!toc ]]
|
||||
|
||||
This page will show how to set it up at the command line, but the git-annex
|
||||
[[assistant]] can also be used to help you set up encrypted git
|
||||
repositories.
|
||||
|
||||
## prerequisites
|
||||
|
||||
* Install
|
||||
[git-remote-gcrypt](https://github.com/joeyh/git-remote-gcrypt/)
|
||||
* Install git-annex version 4.20130909 or newer.
|
||||
|
||||
## encrypted backup drive
|
||||
|
||||
Let's make a USB drive into an encrypted backup repository. It will contain
|
||||
both the full contents of your git repository, and all the files you
|
||||
instruct git-annex to store on it, and everything will be encrypted so that
|
||||
only you can see it.
|
||||
|
||||
First, you need to set up a gpg key. You might consider generating a
|
||||
special purpose key just for this use case, since you may end up wanting to
|
||||
put the key on multiple machines that you would not trust with your
|
||||
main gpg key.
|
||||
|
||||
You need to tell git-annex the keyid of the key when setting up the
|
||||
encrypted repository:
|
||||
|
||||
git init --bare /mnt/encryptedbackup
|
||||
git annex initremote encryptedbackup type=gcrypt gitrepo=/mnt/encryptedbackup keyid=$mykey
|
||||
git annex sync encryptedbackup
|
||||
|
||||
Now you can copy (or even move) files to the repository. After
|
||||
sending files to it, you'll probably want to do a sync, which pushes
|
||||
the git repository changes to it as well.
|
||||
|
||||
git annex copy --to encryptedbackup ...
|
||||
git annex sync encryptedbackup
|
||||
|
||||
Note that if you lose your gpg key, it will be *impossible* to get the
|
||||
data out of your encrypted backup. You need to find a secure way to store a
|
||||
backup of your gpg key. Printing it out and storing it in a safe deposit box,
|
||||
for example.
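
As a minimal sketch of the key backup mentioned above, the function below exports a secret key in ASCII-armored form so it can be printed or stored offline. The function name `export_key_backup` and the output path are illustrative assumptions, not part of git-annex; `--armor` and `--export-secret-keys` are standard gpg options.

```shell
# Hypothetical helper: write an ASCII-armored copy of a secret key to a file.
# $1 = gpg key id, $2 = output file (both chosen by you).
export_key_backup() {
    keyid=$1
    outfile=$2
    # Tighten the umask first, since the file will hold secret key material.
    ( umask 077 && gpg --armor --export-secret-keys "$keyid" > "$outfile" )
}

# Example use (not run here; the path is a placeholder):
#   export_key_backup "$mykey" /path/to/offline/media/annex-key.asc
```

If the key has a passphrase, the exported copy is still protected by it, so the passphrase needs to be remembered or stored separately.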
|
||||
|
||||
You can actually specify keyid= as many times as you like to allow any one
|
||||
of a set of gpg keys to access this repository. So you could add a friend's
|
||||
key, or another gpg key you have.
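
To illustrate repeating keyid=, here is a small sketch that builds the arguments from a list of key ids and prints the resulting command for review instead of running it. The helper name `keyid_args` and the two key ids are placeholders made up for this example.

```shell
# Hypothetical helper: print one keyid= argument per gpg key id given.
keyid_args() {
    for key in "$@"; do
        printf 'keyid=%s\n' "$key"
    done
}

# Placeholder key ids; substitute your own and your friend's.
MY_KEY=AAAAAAAA
FRIEND_KEY=BBBBBBBB

# Print the command rather than executing it. The unquoted $(...) is
# deliberate: the shell splits it into one word per keyid= argument.
echo git annex initremote encryptedbackup type=gcrypt \
    gitrepo=/mnt/encryptedbackup $(keyid_args "$MY_KEY" "$FRIEND_KEY")
```

Once the printed command looks right, it can be run by dropping the leading `echo`.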
|
||||
|
||||
To restore from the backup, just plug the drive into any machine that has
|
||||
the gpg key used to encrypt it, and then:
|
||||
|
||||
git clone gcrypt::/mnt/encryptedbackup restored
|
||||
cd restored
|
||||
git annex enableremote encryptedbackup gitrepo=/mnt/encryptedbackup
|
||||
git annex get --from encryptedbackup
|
||||
|
||||
## encrypted git-annex repository on a ssh server
|
||||
|
||||
If you have a ssh server that has rsync installed, you can set up an
|
||||
encrypted repository there. Works just like the encrypted drive except
|
||||
without the cable.
|
||||
|
||||
First, on the server, run:
|
||||
|
||||
git init --bare encryptedrepo
|
||||
|
||||
(Also, install git-annex on the server if it's possible & easy to do so.
|
||||
While this will work without git-annex being installed on the server, it
|
||||
is recommended to have it installed.)
|
||||
|
||||
Now, in your existing git-annex repository, set up the encrypted remote:
|
||||
|
||||
git annex initremote encryptedrepo type=gcrypt gitrepo=ssh://my.server/home/me/encryptedrepo keyid=$mykey
|
||||
git annex sync encryptedrepo
|
||||
|
||||
If you're going to be sharing this repository with others, be sure to also
|
||||
include their keyids, by specifying keyid= repeatedly.
|
||||
|
||||
Now you can copy (or even move) files to the repository. After
|
||||
sending files to it, you'll probably want to do a sync, which pushes
|
||||
the git repository changes to it as well.
|
||||
|
||||
git annex copy --to encryptedrepo ...
|
||||
git annex sync encryptedrepo
|
||||
|
||||
Anyone who has access to the repo and has one of the keys
|
||||
used to encrypt it can check it out:
|
||||
|
||||
git clone gcrypt::ssh://my.server/home/me/encryptedrepo myrepo
|
||||
cd myrepo
|
||||
git annex enableremote encryptedrepo gitrepo=ssh://my.server/home/me/encryptedrepo
|
||||
git annex get --from encryptedrepo
|
||||
|
||||
## private encrypted git remote on hosting site
|
||||
|
||||
You can use gcrypt to store your git repository in encrypted form on any
|
||||
hosting site that supports git. Only you can decrypt its contents.
|
||||
Using it this way, git-annex does not store large files on the hosting site; it's
|
||||
only used to store your git repository itself.
|
||||
|
||||
git remote add encrypted gcrypt::ssh://hostingsite/myrepo.git
|
||||
git push encrypted master git-annex
|
||||
|
||||
Now you can carry on using git-annex with your new repository. For example,
|
||||
`git annex sync` will sync with it.
|
||||
|
||||
To check out the repository from the hosting site, use the same gcrypt::
|
||||
url you used when setting it up:
|
||||
|
||||
git clone gcrypt::ssh://hostingsite/myrepo.git
|
||||
|
||||
## multiuser encrypted git remote on hosting site
|
||||
|
||||
Suppose you and a friend want to share an encrypted git remote. Both of you
|
||||
need to set up the remote, and configure gcrypt to encrypt it so that both
|
||||
of you can see it.
|
||||
|
||||
git remote add sharedencrypted gcrypt::ssh://hostingsite/myrepo.git
|
||||
git config remote.sharedencrypted.gcrypt-participants "$mykey $friendkey"
|
||||
git push sharedencrypted master git-annex
|
|
@ -0,0 +1,15 @@
|
|||
[[!comment format=mdwn
|
||||
username="tanen"
|
||||
ip="83.128.159.25"
|
||||
subject="comment 10"
|
||||
date="2013-11-04T17:58:36Z"
|
||||
content="""
|
||||
> \"We could symetrically encrypt the repository with a keyfile that's stored in the repository itself\"
|
||||
> Then you would need to decrypt the repository in order get the key you need to decrypt the repository. The impossibility of this design is why I didn't do that!
|
||||
|
||||
Sorry, I meant that the file containing the symmetric encryption key should obviously not be used to encrypt itself; it would be stored in the repository \"unencrypted\" (but protected with a passphrase).
|
||||
|
||||
> store a non-encrypted gpg key alongside the repsitory encrypted with it, but then you have to rely on a passphrase for all your security.
|
||||
|
||||
Exactly. I think such a mode would be a great addition. It might not be as secure as encryption based on a private key, depending on the passphrase strength, but it would certainly be a lot more convenient and portable (and still much more secure than the shared encryption method).
|
||||
"""]]
|
|
@ -0,0 +1,18 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkbpbjP5j8MqWt_K4NASwv0WvB8T4rQ-pM"
|
||||
nickname="Fabrice"
|
||||
subject="Is there a way to specify a preferred pgp key?"
|
||||
date="2013-11-01T18:57:38Z"
|
||||
content="""
|
||||
Hi,
|
||||
|
||||
I think the current behavior of the special remote is a bit annoying when one has several pgp keys.
|
||||
|
||||
Indeed, I've followed the encrypted backup drive example specifying the id of a dedicated key in the initremote step, so far so good. Doing that, I was prompted for my key phrase by the gnome keyring daemon, as expected.
|
||||
|
||||
The annoying part starts right at the git annex sync step. Indeed, when git-remote-gcrypt tries to decrypt the manifest from the encrypted remote, rather than trying only the key specified during the initremote step, it tries all my (secret) keys. This means that I get prompted for the key phrase of all those keys (minus the correct one which is already unlocked...).
|
||||
|
||||
In the future, this might be possible to avoid by allowing gcrypt to fetch a preferred key from git config and use it with the --try-secret-key option available in gnupg 2.1.x. But for 1.x or 2.0.x, the simpler option --default-key does not seem to alter the order in which keys are tried to decrypt the manifest. Also, it does not seem to be a problem with the gnome keyring daemon, but rather a gpg problem, since the same problem occurs when the daemon is replaced by the standard gpg-agent.
|
||||
|
||||
Meanwhile, is there any way to avoid this problem?
|
||||
"""]]
|
|
@ -0,0 +1,21 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkbpbjP5j8MqWt_K4NASwv0WvB8T4rQ-pM"
|
||||
nickname="Fabrice"
|
||||
subject="A possible solution"
|
||||
date="2013-11-02T14:22:13Z"
|
||||
content="""
|
||||
I'm answering myself :-). A possible solution to the annoying passphrase prompting with current gnupg is to use a specialized secret keyring. One first exports the secret key used for this repository into a specific keyring as follows:
|
||||
|
||||
`gpg --export-secret-keys keyid | gpg --import --no-default-keyring --secret-keyring mygitannexsecret.gpg`
|
||||
|
||||
This will create a keyring in $HOME/.gnupg with only the specific key.
|
||||
|
||||
Then, in the git-remote-gcrypt shell script, gpg should be called as follows
|
||||
|
||||
`gpg --no-default-keyring --secret-keyring mygitannexsecret.gpg -q -d ...`
|
||||
|
||||
when decrypting the manifest in order to try only the specific key. This behavior can be easily triggered via some git configuration variable.
|
||||
|
||||
Any comment?
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="209.250.56.47"
|
||||
subject="comment 3"
|
||||
date="2013-11-02T17:32:28Z"
|
||||
content="""
|
||||
Fabrice, I've filed a bug report about this: <https://github.com/blake2-ppc/git-remote-gcrypt/issues/9>
|
||||
"""]]
|
|
@ -0,0 +1,18 @@
|
|||
[[!comment format=mdwn
|
||||
username="tanen"
|
||||
ip="83.128.159.25"
|
||||
subject="comment 4"
|
||||
date="2013-11-03T22:35:07Z"
|
||||
content="""
|
||||
The way I would want to setup git-annex (assistant) is \"Wuala/Spideroak style\": two computers with a full checkout of the repository, changes automatically being synced between them, even if the two computers are never online simultaneously, and encryption should be done locally: the (special) remote should not be able to view file listings or content.
|
||||
|
||||
Do I understand it correctly that the gcrypt remote is the only way to make this happen? I tried to create such a setup via the webapp but failed. Adding the repository and remote (via \"Encrypt with GnuPG key\") on the first computer went OK*, but trying to enable that remote on the other computer fails: clicking enable asks me for the SSH password, but after that I just get redirected to a blank screen, with nothing to see in the logfile after the successful call to ssh-keygen. No entry for the second computer is being added to authorized_keys on the remote.
|
||||
|
||||
Perhaps this is because at this point the assistant is unable to actually parse the content of the encrypted repository? I tried importing the private key that was used while creating the repository on the other computer, but that made no difference.
|
||||
|
||||
Thinking about this for a while, I believe gpg keys aren't actually particularly suited for this use case. Even without the bug above, one would either have to awkwardly copy a private key to all hosts that are syncing to the repository; or, every time a new (or reinstalled) host wants to sync the repository, you would manually have to add the new keyid to the config and do the forced push + GCRYPT_FULL_REPACK, presumably having to reupload your entire history. Apart from this, having to back up a private key (outside of your git-annex based backups!) would be quite inconvenient.
|
||||
|
||||
How would you feel about adding a new mode of operation where encryption is simply based on a passphrase? We could symmetrically encrypt the repository with a keyfile that's stored in the repository itself, protecting the keyfile with a passphrase which - if stored at all - would be stored on the individual computers, outside of the repository.
|
||||
|
||||
*although it erroneously used \"E0D2F776E7F674E3\" as key-id while the actual id is E7F674E3; where did that other half come from?
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawmNu4V5fvpLlBhaCUfXXOB0MI5NXwh8SkU"
|
||||
nickname="Adam"
|
||||
subject="comment 5"
|
||||
date="2013-11-04T04:40:53Z"
|
||||
content="""
|
||||
> How would you feel about adding a new mode of operation where encryption is simply based on a passphrase? We could symetrically encrypt the repository with a keyfile that's stored in the repository itself, protecting the keyfile with a passphrase which - if stored at all - would be stored on the individual computers, outside of the repository.
|
||||
|
||||
Isn't that what the regular shared-encryption remote already does? Except it doesn't put a passphrase on the key, because anyone who has access to the local repo wouldn't need access to the remote one anyway.
|
||||
"""]]
|
|
@ -0,0 +1,16 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkbpbjP5j8MqWt_K4NASwv0WvB8T4rQ-pM"
|
||||
nickname="Fabrice"
|
||||
subject="comment 6"
|
||||
date="2013-11-04T07:39:21Z"
|
||||
content="""
|
||||
> _How would you feel about adding a new mode of operation where encryption is simply based on a passphrase? We could symetrically encrypt the repository with a keyfile that's stored in the repository itself, protecting the keyfile with a passphrase which - if stored at all - would be stored on the individual computers, outside of the repository._
|
||||
|
||||
As Adam wrote, without a passphrase, this is the shared encryption method. With an encrypted key, this is more or less the hybrid (default) scheme. The thing is that you have to share a secret to have an encrypted remote. I don't use the webapp, so I don't know what's happening in your case, but this is how it should work with the command line tools. First, Alice creates the encrypted remote with her pgp key. As far as I understand, git annex creates (via gpg) a key for a symmetric cipher, which is stored in the repository, encrypted with Alice's public key. If Alice wants to share the repository with Bob, she must either give a key pair (so the private key also, of course) to Bob or ask Bob for his public key. In the first case, Bob can clone the repository directly (upon reception of the key pair), while in the second case, Alice has to activate Bob's public key (with `git annex enableremote myremote keyid+=bobsId`). In this case, again as far as I understand, the symmetric key is reencrypted for both Alice and Bob in the repo.
|
||||
|
||||
I understand that you tried the first case with the webapp and that it did not work. I had a similar problem documented in this [bug](http://git-annex.branchable.com/bugs/git-annex-shell:_gcryptsetup_permission_denied). Maybe you could add some comments to this bug description?
|
||||
|
||||
> _*although it erroneously used \"E0D2F776E7F674E3\" as key-id while the actual id is E7F674E3; where did that other half come from?_
|
||||
|
||||
This is the long id of your pgp key (16 characters as opposed to 8 for the short id).
|
||||
"""]]
|
|
@ -0,0 +1,16 @@
|
|||
[[!comment format=mdwn
|
||||
username="tanen"
|
||||
ip="83.128.159.25"
|
||||
subject="comment 7"
|
||||
date="2013-11-04T09:01:13Z"
|
||||
content="""
|
||||
Thanks for the responses. Please correct me if I'm wrong, but the way I understood it, using the shared encryption scheme creates a conflict between \"changes being synced between them, even if the two computers are never online simultaneously\" and \"encryption should be done locally: the (special) remote should not be able to view file listings or content.\"
|
||||
|
||||
- If I use shared encryption \"the webapp way\", only the file contents will be rsynced to the remote, not the repository itself. This means that different hosts are unable to sync unless they are online simultaneously, so that commit data can be sent directly between them via XMPP. In practice, this would mean my hosts are never synced (because I don't keep my home computer running when I leave for work, and vice versa)
|
||||
|
||||
- If I use shared encryption and additionally put the repository itself on a remote, that remote would have the keys to fully decrypt the repository, that's not acceptable.
|
||||
|
||||
Reading through the docs again, the hybrid scheme actually seems to be closer to what I want than the shared scheme, but it still has a major downside: the encryption only applies to the files themselves, so in order to get \"offline sync\" there still has to be a 'remote' for the repository itself, which will contain all your metadata unencrypted. It would also depend on the user being able to manually set up and back up a set of gpg keys instead of just memorizing a secure passphrase.
|
||||
|
||||
@Fabrice Looks like the bug you found could very well be the cause of the problem I had; I'll try it again when a new version is available.
|
||||
"""]]
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawkbpbjP5j8MqWt_K4NASwv0WvB8T4rQ-pM"
|
||||
nickname="Fabrice"
|
||||
subject="comment 8"
|
||||
date="2013-11-04T10:31:56Z"
|
||||
content="""
|
||||
I think you are (at least partially) right. Of course, the only way to completely sync computers that are never on at the same time is to use either a usb drive or a third, always-on computer. (I have to confess I did not understand this at first when I read the git annex docs, shame on me ;-) If you don't want to completely trust this computer (I don't, for instance), you must:
|
||||
|
||||
* use an encrypted git repository on this computer;
|
||||
|
||||
* and use either hybrid or pubkey encryption.
|
||||
|
||||
But contrary to what you seem to imply (I hope I understand you correctly), if you do that, the third computer can still figure out a few things (usage patterns, such as where connections come from), but that's all. You've got full sync and everything is encrypted, both the git part and the files handled by the annex. This applies only to encrypted git special remotes, as other remotes do not store the git part.
|
||||
"""]]
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="209.250.56.47"
|
||||
subject="comment 9"
|
||||
date="2013-11-04T17:07:55Z"
|
||||
content="""
|
||||
\"We could symetrically encrypt the repository with a keyfile that's stored in the repository itself\"
|
||||
|
||||
Then you would need to decrypt the repository in order to get the key you need to decrypt the repository. The impossibility of this design is why I didn't do that!
|
||||
|
||||
It would certainly be possible to store a non-encrypted gpg key alongside the repository encrypted with it, but then you have to rely on a passphrase for all your security.
|
||||
|
||||
You should file a bug report for the bug you saw..
|
||||
"""]]
|
28
doc/tips/googledriveannex.mdwn
Normal file
|
@ -0,0 +1,28 @@
|
|||
googledriveannex
|
||||
=========
|
||||
|
||||
Hook program for git-annex to use Google Drive as backend
|
||||
|
||||
# Requirements:
|
||||
|
||||
python2
|
||||
|
||||
Credit for the googledrive api interface goes to google
|
||||
|
||||
## Install
|
||||
Clone the git repository in your home folder.
|
||||
|
||||
git clone git://github.com/TobiasTheViking/googledriveannex.git
|
||||
|
||||
This should make a ~/googledriveannex folder
|
||||
|
||||
## Setup
|
||||
Run the program once to make an empty config file
|
||||
|
||||
cd ~/googledriveannex; python2 googledriveannex.py
|
||||
|
||||
## Commands for git-annex:
|
||||
|
||||
git config annex.googledrive-hook '/usr/bin/python2 ~/googledriveannex/googledriveannex.py'
|
||||
git annex initremote googledrive type=hook hooktype=googledrive encryption=shared
|
||||
git annex describe googledrive "the googledrive library"
|
27
doc/tips/imapannex.mdwn
Normal file
|
@ -0,0 +1,27 @@
|
|||
imapannex
|
||||
=========
|
||||
|
||||
Hook program for git-annex to use imap as backend
|
||||
|
||||
# Requirements:
|
||||
|
||||
python2
|
||||
|
||||
# Install
|
||||
Clone the git repository in your home folder.
|
||||
|
||||
git clone git://github.com/TobiasTheViking/imapannex.git
|
||||
|
||||
This should make a ~/imapannex folder
|
||||
|
||||
# Setup
|
||||
Run the program once to set it up.
|
||||
|
||||
cd ~/imapannex; python2 imapannex.py
|
||||
|
||||
# Commands for git-annex:
|
||||
|
||||
git config annex.imap-hook '/usr/bin/python2 ~/imapannex/imapannex.py'
|
||||
git annex initremote imap type=hook hooktype=imap encryption=shared
|
||||
git annex describe imap "the imap library"
|
||||
git annex wanted imap exclude=largerthan=30mb
|
41
doc/tips/megaannex.mdwn
Normal file
|
@ -0,0 +1,41 @@
|
|||
[Megaannex](https://github.com/TobiasTheViking/megaannex)
|
||||
is a hook program for git-annex to use mega.co.nz as backend
|
||||
|
||||
# Requirements:
|
||||
|
||||
python2
|
||||
requests>=0.10
|
||||
pycrypto
|
||||
|
||||
Credit for the mega api interface goes to:
|
||||
<https://github.com/richardasaurus/mega.py>
|
||||
|
||||
## Install
|
||||
|
||||
Clone the git repository in your home folder.
|
||||
|
||||
git clone git://github.com/TobiasTheViking/megaannex.git
|
||||
|
||||
This should make a ~/megaannex folder
|
||||
|
||||
## Setup
|
||||
|
||||
Run the program once to make an empty config file.
|
||||
|
||||
cd ~/megaannex; python2 megaannex.py
|
||||
|
||||
Edit the megaannex.conf file. Add your mega.co.nz username, password, and folder name.
|
||||
|
||||
## Configuring git-annex
|
||||
|
||||
git config annex.mega-hook '/usr/bin/python2 ~/megaannex/megaannex.py'
|
||||
|
||||
git annex initremote mega type=hook hooktype=mega encryption=shared
|
||||
git annex describe mega "the mega.co.nz library"
|
||||
|
||||
## Notes
|
||||
|
||||
You may need to use a different command than "python2", depending
|
||||
on your python installation.
|
||||
|
||||
-- Tobias
|
16
doc/tips/migrating_data_to_a_new_backend.mdwn
Normal file
|
@ -0,0 +1,16 @@
|
|||
Maybe you started out using the WORM backend, and have now configured
|
||||
git-annex to use SHA1. But files you added to the annex before still
|
||||
use the WORM backend. There is a simple command that can migrate that
|
||||
data:
|
||||
|
||||
# git annex migrate my_cool_big_file
|
||||
migrate my_cool_big_file (checksum...) ok
|
||||
|
||||
You can only migrate files whose content is currently available. Other
|
||||
files will be skipped.
|
||||
|
||||
After migrating a file to a new backend, the old content in the old backend
|
||||
will still be present. That is necessary because multiple files
|
||||
can point to the same content. The `git annex unused` subcommand can be
|
||||
used to clear up that detritus later. Note that hard links are used,
|
||||
to avoid wasting disk space.
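
The migrate-then-clean-up sequence described above might be wrapped as in the sketch below. The function name `migrate_and_report` is made up for this example; it only runs `git annex migrate` and `git annex unused`, and leaves the actual removal to you.

```shell
# Hypothetical wrapper: migrate every file under a directory, then list
# content that is no longer referenced by any file.
migrate_and_report() {
    dir=${1:-.}
    # Stop here if the migration fails, so the report is not misleading.
    git annex migrate "$dir" || return 1
    # Prints a numbered list of unreferenced content; nothing is deleted.
    git annex unused
}

# After reviewing the numbered list, individual entries can be removed with
# dropunused, for example:
#   git annex dropunused 1
```

Keeping the removal step manual means the old content stays available until the listing has been checked.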
|
|
@ -0,0 +1,77 @@
|
|||
Scenario
|
||||
--------
|
||||
|
||||
You are a new git-annex user. You already have files spread around many computers and wish to migrate those into git-annex, without having to recopy all files all over the place.
|
||||
|
||||
Let's say, for example, you have a server named `marcos` and a workstation named `angela`. You have your audio collection stored in `/srv/mp3` on `marcos` and `~/mp3` on `angela`, but only `marcos` has all the files, and `angela` only has a subset.
|
||||
|
||||
We also assume that `marcos` has an SSH server.
|
||||
|
||||
How do you add all this stuff to git-annex?
|
||||
|
||||
Create the biggest git-annex repository
|
||||
---------------------------------------
|
||||
|
||||
Start with `marcos`, with the complete directory:
|
||||
|
||||
cd /srv/mp3
|
||||
git init
|
||||
git annex init
|
||||
git annex add .
|
||||
git commit -m"git annex yay"
|
||||
|
||||
This will checksum all files and add them to the `git-annex` branch of the git repository. Wait for this process to complete.
|
||||
|
||||
Create the smaller repo and synchronise
|
||||
---------------------------------------
|
||||
|
||||
On `angela`, we want to synchronise the git annex metadata with `marcos`. We need to initialize a git repo with `marcos` as a remote:
|
||||
|
||||
cd ~/mp3
|
||||
git init
|
||||
git remote add marcos marcos.example.com:/srv/mp3
|
||||
git fetch marcos
|
||||
git annex info # this should display the two repos
|
||||
git annex add .
|
||||
|
||||
This will, again, checksum all files and add them to git annex. Once that is done, you can verify that the files are really the same as marcos with `whereis`:
|
||||
|
||||
git annex whereis
|
||||
|
||||
This should display something like:
|
||||
|
||||
whereis Orange Seeds/I remember.wav (2 copies)
|
||||
b7802161-c984-4c9f-8d05-787a29c41cfe -- marcos (anarcat@marcos:/srv/mp3)
|
||||
c2ca4a13-9a5f-461b-a44b-53255ed3e2f9 -- here (anarcat@angela)
|
||||
ok
|
||||
|
||||
Once you are sure things went okay, you can synchronise this with `marcos`:
|
||||
|
||||
git annex sync
|
||||
|
||||
This will push the metadata information to marcos, so it knows which files are available on `angela`. From there on, you can freely get and move files between the two repos!
|
||||
|
||||
Importing files from a third directory
|
||||
--------------------------------------
|
||||
|
||||
Say that some files on `angela` are actually spread out outside of the `~/mp3` directory. You can use the `git annex import` command to add those extra directories:
|
||||
|
||||
cd ~/mp3
|
||||
git annex import ~/music/
|
||||
|
||||
(!) Be careful that `~/music` is not a git-annex repository, or this will [[destroy it!|bugs/git annex import destroys a fellow git annex repository]].
|
||||
|
||||
Deleting deleted files
|
||||
----------------------
|
||||
|
||||
It is quite possible some files were removed (or renamed!) on `marcos` but not on `angela`, since it was synchronised only some time ago. A good way to find out about those files is to use the `--not --in` argument, for example, on `angela`:
|
||||
|
||||
git annex whereis --in here --not --in marcos
|
||||
|
||||
This will show files that are on `angela` and not on `marcos`. They could be new files that were only added on `angela`, so be careful! A manual review is necessary, but if you are certain those files are no longer relevant, you can delete them from `angela`:
|
||||
|
||||
git annex drop <file>
|
||||
|
||||
If the file is a renamed or modified version from the original, you may need to use `--force`, but be careful! If you delete the wrong file, it will be lost forever!
|
||||
|
||||
> (!) Maybe this wouldn't happen with [[direct mode]] and an fsck? --[[anarcat]]
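
The review step in this section could be sketched as a helper that only lists candidate files and never drops anything. The function name `only_here` is an assumption for this example; it relies on `git annex find` with the same `--in` / `--not --in` matching options used above.

```shell
# Hypothetical helper: list files whose content is present in this
# repository but not in the named remote. It does not delete anything.
only_here() {
    remote=$1
    git annex find --in here --not --in "$remote"
}

# Example use on angela (not run here): save the list for manual review.
#   only_here marcos > /tmp/only-on-angela.txt
```

Reviewing the saved list by hand before any `git annex drop` keeps the risky step separate from the listing.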
|
69
doc/tips/offline_archive_drives.mdwn
Normal file
|
@ -0,0 +1,69 @@
|
|||
After you've used git-annex for a while, you will have data in your repository
|
||||
that you don't want to keep in the limited disk space of a laptop or a server,
|
||||
but that you don't want to entirely delete.
|
||||
|
||||
This is where git-annex's support for offline archive drives shines.
|
||||
You can move old files to an archive drive, which can be kept offline if
|
||||
it's not practical to keep it spinning. Better, you can move old files to
|
||||
two or more archive drives, in case one of them later fails to spin up.
|
||||
(One consideration when [[future_proofing]] your archive.)
|
||||
|
||||
To set up an archive drive, you can take any removable drive, format
|
||||
it with a filesystem you'll be able to read some years later, and then follow
|
||||
the [[walkthrough]] to set up a repository on it that is a git remote of
|
||||
the repository in your computer you want to archive. In short:
|
||||
|
||||
cd /media/archive
|
||||
git clone ~/annex
|
||||
cd ~/annex
|
||||
git remote add archivedrive /media/archive/annex
|
||||
git annex sync archivedrive
|
||||
|
||||
Don't forget to tell git-annex this is an archive drive (or perhaps a backup
|
||||
drive). Also, give the drive a description that matches something you write on
|
||||
its label, so you can find it later:
|
||||
|
||||
git annex group archivedrive archive
|
||||
git annex wanted archivedrive standard
|
||||
git annex describe archivedrive "my first archive drive (SATA)"
|
||||
|
||||
Or you can use the assistant to set up the drive for you.
|
||||
(Nice video tutorial here: [[videos/git-annex_assistant_archiving]])
|
||||
|
||||
(Keeping the archive drive in an offsite location? Consider encrypting
|
||||
it! See [[fully_encrypted_git_repositories_with_gcrypt]].)
|
||||
|
||||
Then, when the archive drive is plugged in, you can easily copy files to
|
||||
it:
|
||||
|
||||
cd ~/annex
|
||||
git-annex copy --auto --to archivedrive
|
||||
|
||||
Or, if you're using the assistant, it will automatically notice when the drive
|
||||
gets plugged in and copy files that need to be archived.
|
||||
|
||||
When you want to get rid of the local file, leaving only the copy on the
|
||||
archive, you can just:
|
||||
|
||||
git annex drop file
|
||||
|
||||
The archive drive has to be plugged in for this to work, so git-annex
|
||||
can verify it still has the file. If you had configured git-annex to
|
||||
always store 2 [[copies]], it will need 2 archive drives plugged in.
|
||||
You may find it useful to configure a [[trust]] setting for the drive to
|
||||
avoid needing to haul it out of storage to drop a file.
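
As a rough sketch combining the copy and drop steps above, the function below drops the local copy only if the copy to the archive succeeded. The name `archive_file` is made up for this example, and `archivedrive` is simply the remote name used on this page.

```shell
# Hypothetical helper: copy a file to the archive remote, then drop the
# local copy only when the copy succeeded.
# $1 = file, $2 = remote name (defaults to the archivedrive remote above).
archive_file() {
    file=$1
    remote=${2:-archivedrive}
    git annex copy --to "$remote" "$file" && git annex drop "$file"
}

# Example use (not run here):
#   archive_file old-recordings/2010.tar
```

git-annex still performs its own checks when dropping, so this wrapper adds convenience rather than extra safety.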
|
||||
|
||||
Now the really nice thing. When your archive drive gets filled up, you
|
||||
can simply remove it, store it somewhere safe, and replace it with a new
|
||||
drive, which can be mounted at the same location for simplicity. Set up
|
||||
the new drive the same way described above, and use it to archive even more
|
||||
files.
|
||||
|
||||
Finally, when you want to access one of the files you archived, you can
|
||||
just ask for it:
|
||||
|
||||
git annex get file
|
||||
|
||||
If necessary git-annex will tell you which archive drive you need to
|
||||
pull out of storage to get the file back. This is where the description
|
||||
you entered earlier comes in handy.
|
Some files were not shown because too many files have changed in this diff.