remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were last edited before 2014. Command line used: for f in $(grep -l '\[\[done\]\]' *.mdwn); do if [ -z $(git log --since=2014 --pretty=oneline "$f") ]; then git rm $f; git rm -rf $(echo "$f" | sed 's/.mdwn$//'); fi; done
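The recorded one-liner works, but it is fragile: the unquoted `$(git log …)` inside `[ -z … ]` word-splits, and `git rm -rf` fails on pages that have no attachment directory. A safer sketch (not the command that was actually run; `--since=2014-01-01` is spelled out explicitly, and the function wrapper is added here for illustration):

```shell
# Hedged rewrite of the cleanup loop from the commit message, wrapped
# in a function so it can be pointed at any wiki checkout. Quoting the
# command substitution avoids the word-splitting pitfall; -1 stops
# git log at the first match; --ignore-unmatch tolerates pages that
# have no attachment directory.
prune_done_pages() {
    cd "$1" || return 1
    for f in $(grep -l '\[\[done\]\]' -- *.mdwn); do
        # keep pages touched since 2014; remove the rest plus attachments
        if [ -z "$(git log -1 --since=2014-01-01 --pretty=oneline -- "$f")" ]; then
            git rm -q -- "$f"
            git rm -q -r -f --ignore-unmatch -- "${f%.mdwn}"
        fi
    done
}
```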
This commit is contained in:
parent
e157467f92
commit
222f78e9ea
1970 changed files with 0 additions and 56952 deletions
@ -1,8 +0,0 @@
It seems that currently, syncing will result in every branch winding
up everywhere within the network of git annex nodes. It would be great
if one could keep some branches purely local.

The «fetch» part of «sync» seems to respect the fetch refspec in the
git config, but the push part seems to always push everything.

> [[done]]
@ -1,18 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="108.236.230.124"
 subject="comment 1"
 date="2014-05-15T19:51:48Z"
 content="""
No, it does not:

<pre>
push wren
[2014-05-15 15:50:33 JEST] call: git [\"--git-dir=/home/joey/lib/big/.git\",\"--work-tree=/home/joey/lib/big\",\"push\",\"wren\",\"+git-annex:synced/git-annex\",\"master:synced/master\"]
[2014-05-15 15:50:39 JEST] read: git [\"--git-dir=/home/joey/lib/big/.git\",\"--work-tree=/home/joey/lib/big\",\"push\",\"wren\",\"master\"]
</pre>

That is the entirety of what's pushed: the git-annex branch, and the currently checked out branch.

I don't see a bug here.
"""]]
@ -1,18 +0,0 @@
[[!comment format=mdwn
 username="zardoz"
 ip="92.227.51.179"
 subject="comment 2"
 date="2014-05-16T08:40:47Z"
 content="""
Joey, thanks for clearing that up. In my test-case I only had two
branches, and I mistook it for pushing everything. Actually, what I
wanted to achieve was the following:

Have a main repo M with branches A and A-downstream, and have a
downstream repo D with just A-downstream. What confused me was that
the main repo always pushed A to D. I suppose if I just have the two
branches, I would achieve the desired effect by not using «annex
sync», and instead just pushing the git-annex branch manually; would
that be the way to go?

"""]]
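For the two-branch setup zardoz describes, the manual alternative to `git annex sync` could be sketched as a small helper (hypothetical remote name `D` and branch name `A-downstream`; this is not code from the original thread):

```shell
# Sketch: sync by hand instead of using `git annex sync`, pushing only
# the branches the downstream remote should see (the git-annex metadata
# branch plus A-downstream, never A).
sync_downstream() {
    remote=$1
    git push "$remote" git-annex A-downstream  # deliberately not A
    git fetch "$remote"
}
```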
@ -1,4 +0,0 @@
It would be wonderful if a pre-built package were available for Synology NAS. Basically, this is an ARM-based Linux. It has most of the required shell commands either out of the box or easily available (through ipkg). But I think it would be difficult to install the Haskell compiler and all the required modules, so it would probably be better to cross-compile targeting ARM.

> [[done]]; the standalone armel tarball has now been tested working on
> Synology. --[[Joey]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 10"
 date="2013-06-02T17:23:43Z"
 content="""
I updated the C program to simplify it, so it uses a static path for `_chrooter`. In the previous version, I suspect that one could play with symlinks and use it to get a root shell. So, if `_chrooter` is not installed in `/opt/bin`, this file has to be edited too before compilation.
"""]]
@ -1,9 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 11"
 date="2013-06-03T09:55:54Z"
 content="""
One last update, and then I'll stop spamming this thread: I've implemented access control and simplified customisation. All this has been moved to https://bitbucket.org/franckp/gasp

"""]]
@ -1,9 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawlJEI45rGczFAnuM7gRSj4C6s9AS9yPZDc"
 nickname="Kevin"
 subject="SynoCommunity"
 date="2013-06-26T18:12:39Z"
 content="""
Creating an installable git-annex package available via [SynoCommunity](http://www.synocommunity.com/) would be awesome. They have created [cross-compilation tools](https://github.com/SynoCommunity/spksrc) to help build the packages and integrate the start/stop scripts with the package manager.

"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawnrP-0DGtHDJbWSXeiyk0swNkK1aejoN3c"
 nickname="sebastien"
 subject="comment 13"
 date="2013-08-06T12:18:35Z"
 content="""
I posted an issue to the SynoCommunity GitHub for that; I hope someone has some time to package this great feature.
"""]]
@ -1,30 +0,0 @@
[[!comment format=mdwn
 username="lorenzo"
 ip="84.75.27.69"
 subject="Running Debian squeeze binaries on libc 2.5 based NAS"
 date="2013-10-27T23:56:26Z"
 content="""
Following the suggestions on this page, I tried to run the binaries that Debian provides on my Lacie NetworkSpace, which is another one of these NAS devices with an old libc. After uploading the binaries and required libraries and using `LD_LIBRARY_PATH` to force the loader to use the versions I uploaded, I was still getting a segfault (similar to what Franck was experiencing), while running git-annex in a chroot was working.

It turns out that it is possible to solve the problem without having to use a chroot, by not loading the binary directly but substituting it with a script that calls the correct `ld-linux.so.3`. Assume you have uncompressed the files from the deb packages in `/opt/git-annex`.

First create a directory `/opt/git-annex/usr/bin/git-annex.exec` and copy the executable `/opt/git-annex/usr/bin/git-annex` there.

Then create script `/opt/git-annex/usr/bin/git-annex` with the following contents:

    #!/bin/bash

    PREFIX=/opt/git-annex

    export GCONV_PATH=$PREFIX/usr/lib/gconv

    exec $PREFIX/lib/ld-linux.so.3 --library-path $PREFIX/lib/:$PREFIX/usr/lib/ $PREFIX/usr/bin/git-annex.exec/git-annex \"$@\"

The `GCONV_PATH` setting is important to prevent the app from failing with the message:

    git-annex.exec: mkTextEncoding: invalid argument (Invalid argument)

The original executable is moved to a different directory instead of being simply renamed, to make sure that `$0` is correct when the executable starts. The linker parameter `--library-path` is used instead of the environment variable `LD_LIBRARY_PATH` to make sure that the programs exec'ed by git-annex do not have the variable set.

Some more info about the approach: [[http://www.novell.com/coolsolutions/feature/11775.html]]
"""]]
@ -1,10 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="209.250.56.87"
 subject="comment 15"
 date="2013-12-16T05:55:29Z"
 content="""
Following the example of @lorenzo, I have made all the git-annex Linux standalone builds include glibc and shims to make the linker use it.

Now that there's a [[forum/new_linux_arm_tarball_build]], it may *just work* on Synology.
"""]]
@ -1,10 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 nickname="joey"
 subject="comment 1"
 date="2013-05-24T15:55:42Z"
 content="""
There are already git-annex builds for arm available from eg, Debian. There's a good chance that, assuming you match up the arm variant (armel, armhf, etc) and that the NAS uses glibc and does not have too old a version, the binary could just be copied in, possibly with some other libraries, and work. This is what's done for the existing Linux standalone builds.

So, I look at this bug report as \"please add a standalone build for arm\", not as a request to support a specific NAS which I don't have ;)
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 2"
 date="2013-05-24T21:31:44Z"
 content="""
I tried to run the binary from the Debian package; unfortunately, after installing tons of libraries, git-annex fails complaining that GLIBC is not recent enough. Perhaps a static build for ARM (armel) can solve the problem? Thanks again for your help!
"""]]
@ -1,10 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 nickname="joey"
 subject="comment 3"
 date="2013-05-25T04:42:22Z"
 content="""
Which Debian package? Different ones link to different libcs.

(It's not really possible to statically link something with as many dependencies as git-annex on linux anymore, unfortunately.)
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 4"
 date="2013-05-25T07:40:13Z"
 content="""
I've actually tried several: 4.20130521 on sid, 3.20120629~bpo60+2 on squeeze-backports, 3.20120629 on wheezy and jessie, plus a package for Ubuntu 11.02. All of them try to load GLIBC 2.6/2.7 while my system has 2.5 only... I'll try a different approach: install Debian in a chroot on the NAS and extract all the required files, including all libraries.
"""]]
@ -1,23 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 5"
 date="2013-05-25T10:03:24Z"
 content="""
Unfortunately, the chroot approach does not work either. While git-annex works fine when I'm in the chroot, it doesn't work any more outside. If I don't copy libc, I get a version error (just like before, so this is normal):

    git-annex: /lib/libc.so.6: version `GLIBC_2.7' not found (required by /opt/share/git-annex/bin/git-annex)
    git-annex: /lib/libc.so.6: version `GLIBC_2.6' not found (required by /opt/share/git-annex/bin/git-annex)
    git-annex: /lib/libc.so.6: version `GLIBC_2.7' not found (required by /opt/share/git-annex/lib/libgmp.so.10)

When I copy libc from the Debian chroot, then it complains about libpthread:

    git-annex: relocation error: /lib/libpthread.so.0: symbol __default_rt_sa_restorer, version GLIBC_PRIVATE not defined in file libc.so.6 with link time reference

If I then copy libpthread also, I get:

    Illegal instruction (core dumped)

So, I'm stuck... :-(
I'll try to find a way using the version in the chroot instead of trying to export it to the host system...
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawln3ckqKx0x_xDZMYwa9Q1bn4I06oWjkog"
 nickname="Michael"
 subject="bind mount"
 date="2013-05-25T15:55:52Z"
 content="""
You could bind-mount (e.g. `mount -o bind /data /chroot/data`) your main Synology fs into the chroot for git-annex to use.
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 7"
 date="2013-05-25T19:01:29Z"
 content="""
This is indeed what I'm doing. But I need to make a wrapper that will call the command in the chroot. Thanks for the tip anyway. :-)
"""]]
@ -1,12 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawmqz6wCn-Q1vzrsHGvEJHOt_T5ZESilxhc"
 nickname="Sören"
 subject="comment 8"
 date="2013-05-26T13:50:31Z"
 content="""
I have a Synology NAS too, so I thought I could try to run git-annex in a Debian chroot.
As it [turns out](http://forum.synology.com/wiki/index.php/What_kind_of_CPU_does_my_NAS_have), my model (DS213+) runs on a PowerPC CPU instead of ARM. Unfortunately, it isn't compatible with PPC in Debian either because it is a different PowerPC variant.
There is an unofficial Debian port called [powerpcspe](http://wiki.debian.org/PowerPCSPEPort), but ghc doesn't build there yet for [some reason](http://buildd.debian-ports.org/status/package.php?p=git-annex&suite=sid).

Any chance that there will be a build for this architecture at some point in the future, or should I better look for another NAS? ;-)
"""]]
@ -1,29 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkwjBDXkP9HAQKhjTgThGOxUa1B99y_WRA"
 nickname="Franck"
 subject="comment 9"
 date="2013-06-02T13:14:56Z"
 content="""
Hi, I finally succeeded! :-)

Here are the main steps:

1. install `debian-chroot` on the NAS
2. create an account `gitannex` in Debian
3. configure git on this account (this is important, otherwise git complains and fails): `git config --global user.email YOUR_EMAIL` and `git config --global user.name YOUR_NAME`
4. install `gcc` on the NAS (using `ipkg`)
5. download the files here: https://www.dropbox.com/sh/b7z68a730aj3mnm/95nFOzE1QP
6. edit `_chrooter` to fit your settings (probably there is nothing to change if your Debian is freshly installed)
7. run `make install`; everything goes to `/opt/bin`, and if you change this, you should also edit line 17 in file `gasp`
8. create an account `gitannex` on the NAS (it doesn't need to be the same name as in Debian, but I feel it is easier)
9. edit its `.ssh/authorized_keys` to prefix lines as follows: `command=\"gasp\" THE_PUBLIC_KEY_AS_USUAL`
10. it should work
11. the repositories will be in the Debian account, but it's easy to symlink them into the NAS account if you wish

The principle is as follows: `command=\"gasp\"` causes `gasp` to be launched on SSH connection instead of the original command given to `ssh`. This command is retrieved by `gasp` and prefixed with `chrooter-` (so, eg, running `ssh git` on the client results in running `chrooter-git` on the NAS). `chrooter-*` commands are symlinks to `chrooter`, a setuid-root binary that launches `_chrooter`. (This intermediary binary is necessary because `_chrooter` is a script, which cannot be setuid, and setuid is required for the chroot and identity change.) Finally, `_chrooter` starts the `debian-chroot` service, chroots to the target dir, changes identity, and eventually launches the original command as if it had been launched directly by the `gitannex` user in Debian. `_chrooter` and `gasp` are Python scripts; I did not use shell in order to avoid error-prone issues with spaces in arguments (which need to be passed around several times in the process).

I'll now try to add command-line parameters to `gasp` in order to restrict the commands that can be run through SSH and the repositories allowed.

Cheers,
Franck
"""]]
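The dispatch step Franck describes (take the command the SSH client asked for and re-run it with a `chrooter-` prefix) can be sketched in shell. This is a hypothetical illustration, not the real `gasp`, which is a Python script precisely because it handles argument quoting robustly, while this naive sketch word-splits `$SSH_ORIGINAL_COMMAND`:

```shell
# Hedged sketch of a gasp-like forced command: sshd sets
# SSH_ORIGINAL_COMMAND to whatever the client asked to run; we dispatch
# it to the matching chrooter-* helper in a given directory.
gasp_dispatch() {
    prefix=$1                      # directory holding the chrooter-* helpers
    set -- $SSH_ORIGINAL_COMMAND   # NB: naive word splitting, unlike real gasp
    cmd=$1; shift
    "$prefix/chrooter-$cmd" "$@"
}
```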
@ -1,5 +0,0 @@
Especially on Mac OSX (and Windows, and maybe Android), it would be great to be able to check in the webapp if an upgrade is available. A deeper integration with these OSes would be even better: for example on Mac OSX, an icon in the status bar lists available upgrades for some programs, including LibreOffice and others which are not installed by default.

Also, it would be great to be able to download and install git-annex upgrades directly from the webapp.

> comprehensively [[done]]; [[design/assistant/upgrading]] --[[Joey]]
@ -1,17 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="209.250.56.246"
 subject="comment 1"
 date="2013-11-15T20:51:18Z"
 content="""
I have thought about doing this, especially if there is ever a security hole in git-annex.

All it needs is a file containing the version number to be written alongside the git-annex build, and git-annex knowing if it was built as a standalone build, and should check that.

As for actually performing the upgrade:

* Easy on Linux
* Not sure on OSX. Is it possible to use hdiutil attach to replace a dmg while a program contained in it is currently running?
* Probably impossible on Android, at least not without using double the space. Probably better to get git-annex into an app store.
* Doable on Windows, but would need git-annex to be distributed in a form that was not an installer.exe.
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawlzlNQbf6wBgv9j6-UqfpXcQyAYMF8S3t4"
 nickname="Tim"
 subject="comment 2"
 date="2014-01-12T09:19:31Z"
 content="""
I am pretty sure you know about it, but have you seen https://f-droid.org/? I was rather surprised that git-annex isn't yet listed in that \"store\".
"""]]
@ -1,19 +0,0 @@
### Please describe the problem.

Great work on git-annex! One possible enhancement occurred to me: it would be very useful if the "whereis" command supported looking up the location of files by arbitrary keys. This way one could inspect the location of old content which is not currently checked out in the tree.

In a related vein, the "unused" command could report old filenames or describe the associated commits. Tracking old versions is a great feature of your git-based approach, but currently, tasks such as pruning selected content seem unwieldy. Though I might be missing existing solutions. You can easily "cut off" the history by forcing a drop of all unused content. It would be cool if one could somehow "address" old versions by filename and commit/date and selectively drop just these. The same could go for the "whereis" command, where one could e.g. query which remote holds content which was stored under some filename at some specific date.

Thanks, cheers!

> I agree that it's useful to run whereis on a specific key. This can
> now be done using `git annex whereis --key KEY`.
> [[done]] --[[Joey]]
>
> To report old filenames, unused would have to search back through the
> contents of symlinks in old versions of the repo, to find symlinks that
> referred to a key. The best way I know how to do that is `git log -S$KEY`,
> which is what unused suggests you use. But this is slow --
> searching for a single key in one of my repos takes 25 seconds.
> That's why it doesn't do it for you.
>
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="zardoz"
 ip="92.227.51.179"
 subject="comment 1"
 date="2014-05-13T20:34:33Z"
 content="""
I suppose that makes sense. Is it more affordable to just retrieve the most recent filename? That would seem to be enough for many practical purposes. But I guess this would still possibly have to go through many revisions. I wonder if such a restricted search can be done by git, though. Maybe using non-porcelain commands.
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="zardoz"
 ip="134.147.14.84"
 subject="comment 2"
 date="2014-05-15T13:03:47Z"
 content="""
Okay, I suppose one way of doing a search that works like that would be to do a «git log --stat -S'KEY' $commit», starting with HEAD and then walking the parents.
"""]]
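The walk zardoz sketches is essentially git's pickaxe search, which works with plain git and no git-annex at all. A minimal runnable sketch (the key string is a placeholder; `head -n 1` keeps only the most recent commit that touched it, i.e. the "most recent filename" case from comment 1):

```shell
# Pickaxe search: find the most recent commit whose diff adds or
# removes the given string (for git-annex, the string would be a key
# reported by `git annex unused`).
find_key_commit() {
    git log --oneline -S"$1" | head -n 1
}
```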
@ -1,3 +0,0 @@
One problem I am having is that I could never get the XMPP pairing to work, so whenever I switch machines I have to manually run sync once on the command line to get the changes. Is it possible to have a "sync now" button of some sort that will trigger a sync on the repos?

> moved from forum; [[done]] --[[Joey]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawnR6E5iUghMWdUGlbA9CCs8DKaoigMjJXw"
 nickname="Efraim"
 subject="comment 1"
 date="2014-03-06T20:37:36Z"
 content="""
Not quite a sync button, but when I want to force a sync now, I turn sync off and back on for one of the repos from the webapp, and then it syncs.
"""]]
@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="209.250.56.146"
 subject="comment 2"
 date="2014-03-06T22:12:27Z"
 content="""
I've added a \"Sync now\" item to the menu for each remote. It can be used to sync with an individual remote, or, if picked from the menu for the local repository, it causes it to try to sync with every one of its remotes at once.
"""]]
@ -1,117 +0,0 @@
Hi, I am planning to use git-annex-assistant for two use cases, but I
would like to ask about the options or planned roadmap for
dropped/removed files from the repository.

Use cases:

1. sync a working directory between laptop, home computer, and work computer
2. archive functionality for my photographs

Both use cases have one common factor. Some files might become obsolete, and
in the long time frame nobody is interested in keeping their revisions. Let's
take photographs. The usual workflow I follow is to import all photographs to
the filesystem, then assess (select) the good ones I want to keep, and then
process them whatever way.

The problem I have with git-annex(-assistant) is that it starts to revision all
of the files at the time they are added to the directory. This is welcome at
first, but might be an issue if you are used to putting 80% of the size of your
imported files in the trash.

I am aware of what git-annex is not. I have been reading the documentation for
the "git-annex drop" and "unused" options, including the forums. I do understand
that I am actually able to delete all revisions of a file if I drop
it, remove it, and then run git annex unused 1..### (on all synced
repositories).

What I miss is the option to have the above process automated/replicated to the other synced repositories.

I would formulate the 'use case' requirements for git-annex as:

* a command to drop a file, including revisions, from all annex repositories
  (for example, moving a file to a /trash folder that will schedule
  its deletion)
* an option to keep, e.g., a maximum of the 10 last revisions of a file
* an option to keep only previous revisions younger than 6 months from now

Finally, how does one file a feature request for git-annex?

> By moving it here ;-) --[[Joey]]

> So, let's spec out a design.
>
> * Add a preferred content terminal to configure whether a repository wants
>   to hang on to unused content. Simply `unused`.
>   (It cannot include a timestamp, because there's
>   no way repos can agree on when a key became unused.) **done**
> * In order to quickly match that terminal, the Annex monad will need
>   to keep a Set of unused Keys. This should only be loaded on demand.
>   **done**
>   NB: There is some potential for a great many unused Keys to cause
>   memory usage to balloon.
> * Client repositories will end their preferred content with
>   `and (not unused)`. Transfer repositories too, because typically
>   only client repos connect to them, and so otherwise unused files
>   would build up there. Backup repos would want unused files. I
>   think that archive repos would too. **done**
> * Make the assistant check for unused files periodically. Exactly
>   how often may need to be tuned, but once per day seems reasonable
>   for most repos. Note that the assistant could also notice on the
>   fly when files are removed and mark their keys as unused if that was
>   the last associated file. (Only currently possible in direct mode.)
>   **done**
> * After scanning for unused files, it makes sense for the
>   assistant to queue transfers of unused files to any remotes that
>   do want them (eg, backup remotes). If the files can successfully be
>   sent to a remote, that will lead to them being dropped locally as
>   they're not wanted.
> * Add a git config setting like annex.expireunused=7d. This causes
>   *deletion* of unused files after the specified time period if they are
>   not able to be moved to a repo that wants them.
>   (The default should be annex.expireunused=false.)
> * How to detect how long a file has been unused? We can't look at the
>   time stamp of the object; we could use the mtime of the .map file,
>   but that's direct mode only and may be replaced with a database
>   later. Seems best to just keep an unused log file with timestamps.
>   **done**
> * After the assistant scans for unused files, if annex.expireunused
>   is not set, and there is some significant quantity of unused files
>   (eg, more than 1000, or more than 1 gb, or more than the amount of
>   remaining free disk space),
>   it can pop up a webapp alert asking to configure it. **done**
> * Webapp interface to configure annex.expireunused. Reasonable values
>   are no expiring, or any number of days. **done**
>
> [[done]] This does not cover every use case that was requested.
> But I don't see a cheap way to ensure it keeps eg the past 10 versions of
> a file. I guess that if you care about that, you leave
> annex.expireunused=false, and set up a backup repository where the unused
> files will be moved to.
>
> Note that since the assistant uses direct mode by default, old versions
> of modified files are not guaranteed to be retained. But they very well
> might be. For example, if a file is replicated to 2 clients, and one
> client directly edits it, or deletes it, it loses the old version,
> but the other client will still be storing that old version.
>
> ## Stability analysis for unused in preferred content expressions
>
> This is tricky, because two repos that are otherwise entirely
> in sync may have differing opinions about whether a key is unused,
> depending on when each last scanned for unused keys.
>
> So, this preferred content terminal is *not stable*.
> It may be possible to write preferred content expressions
> that constantly moved such keys around without reaching a steady state.
>
> Example:
>
> A and B are clients directly connected, and both also connected
> to BACKUP.
>
> A deletes F. B syncs with A, and runs an unused check; decides F
> is unused. B sends F to BACKUP. B will then think A doesn't want F,
> and will drop F from A. Next time A runs a full transfer scan, it will
> *not* find F (because the file was deleted!). So it won't get F back from
> BACKUP.
>
> So, it looks like the fact that unused files are not going to be
> looked for on the full transfer scan makes this work out ok.
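The `annex.expireunused` setting described in the design above is an ordinary git config key. Assuming the value formats the design gives (a period like `7d`, or `false` to disable), enabling it could look like this hypothetical helper:

```shell
# Sketch: turn on unused-file expiry for a given repository, using the
# value format described in the design notes (default here: one week).
enable_expireunused() {
    git -C "$1" config annex.expireunused "${2:-7d}"
}
```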
@ -1,7 +0,0 @@
Firefox is my default browser, but as we all know, it doesn't load quickly. If I don't have Firefox running but I want to access the git-annex webapp, I'd rather launch the webapp in some small, quick browser like QupZilla than wait for Firefox to load.

Could git-annex have a setting, maybe a "webapp --browser" option and/or a setting in the config file, to specify the browser to launch?

> git-annex uses the standard `git config web.browser` if you set it.
> [[done]]
> --[[Joey]]
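Assuming `web.browser` behaves as the answer above states, pointing it at a lightweight browser could be wrapped like this (the browser name is just an example; any command on `$PATH` should do):

```shell
# Sketch: set the browser git (and hence, per the note above, the
# git-annex webapp) will launch for a given repository.
set_webapp_browser() {
    git -C "$1" config web.browser "$2"
}
```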
@ -1,7 +0,0 @@
A failure during "make test" should be signalled to the caller by means of
a non-zero exit code. Without that signal, it's very hard to run the
regression test suite in an automated fashion.

> git-annex used to have a Makefile that ignored make test exit status,
> but that was fixed in commit dab5bddc64ab4ad479a1104748c15d194e138847,
> on October 6th. [[done]] --[[Joey]]
@ -1,9 +0,0 @@
Git-annex doesn't compile with the latest version of monad-control. Would it be hard to support that new version?

> I have been waiting for it to land in Debian before trying to
> deal with its changes.
>
> There is now a branch in git called `new-monad-control` that will build
> with the new monad-control. --[[Joey]]

>> Now merged to master. [[done]] --[[Joey]]
@ -1,16 +0,0 @@
Please provide a command that basically performs something like:

    git annex get --auto
    for i in `git remote`; do git annex copy --to $i --auto; done

The use case is this:
I have a very large repo (300,000 files) in three places. Now I want the fastest possible way to ensure that every file exists in annex.numcopies. This should scan every file one time and then get it or copy it to other repos as needed. Right now, I run one "git annex get --auto" in every repo, which is a waste of time, since most of the files never change anyway!

> Now `git annex sync --content` does effectively just what the shown for
> loop does. [[done]]
>
> The only difference is that copy --auto proactively downloads otherwise
> unwanted files to satisfy numcopies, and sync --content does not.
> We need a [[preferred_content_numcopies_check]] to solve that.
> --[[Joey]]
@ -1,24 +0,0 @@
Support Amazon S3 as a file storage backend.

There's a haskell library that looks good. Not yet in Debian.

Multiple ways of using S3 are possible. Currently implemented as
a special type of git remote.

Before this can be closed, I need to fix:

## encryption

TODO

## unused checking

One problem is `git annex unused`. Currently it only looks at the local
repository, not remotes. But if something is dropped from the local repo,
and you forget to drop it from S3, cruft can build up there.

This could be fixed by adding a hook to list all keys present in a remote.
Then unused could scan remotes for keys, and if they were not used locally,
offer the possibility to drop them from the remote.

[[done]]
@ -1,10 +0,0 @@
It's very confusing to me that the same repo viewed from different client systems can have different names and descriptions. This implies that making changes to a remote repo from one system only affects how that system sees the repo, but it seems to affect how the entire git-annex "pair" or "network of repos" sees it.

I think it would be good if the names and descriptions of repos were synced across clients.

> The descriptions of repositories are synced. (They're stored in git-annex:uuid.log)
>
> git allows for the same repository to be referred to using as many different remote names as you want to set up. git-annex inherits this,
> and I can't see this changing; there are very good reasons for remotes to
> have this flexibility. [[done]]
> --[[Joey]]
@@ -1,18 +0,0 @@
This is just an idea, and I have no idea if it would work (that's why I'm asking):

**Would it be possible to use ASICs made for Bitcoin mining inside git-annex to offload the hashing of files?**

I got the idea, because I have two RaspberryPis here:

- one runs my git-annex archive. It is really slow at hashing, so I resorted to using the WORM backend
- another one runs 2 old-ish ASIC miners. They are just barely "profitable" right now, so in a few months they will be obsolete

Both devices do some kind of `SHA256`. I have a feeling this is either extremely easy or extremely complicated to do… :)

> git-annex uses binaries such as `sha256sum` for hashing large files (large is
> currently hardcoded as bigger than 1MB). If you insert a binary with the same
> interface as `sha256sum` into your `$PATH`, git-annex will automatically use
> it. If you want to use ASIC hashing even for small files, you need to tweak
> `Backend/Hash.hs`. --[[HelmutGrohne]]

>> [[done]] --[[Joey]]
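To illustrate the `$PATH` shim Helmut describes: any binary that prints output in `sha256sum`'s format can stand in for it. This sketch just uses `hashlib` in place of the imagined ASIC offload; a real shim would delegate the digest computation to the hardware.

```python
#!/usr/bin/env python3
# Hypothetical sha256sum-compatible shim. A real ASIC offload would replace
# hashlib with calls to the mining hardware; this only demonstrates the
# "hexdigest, two spaces, path" output format a sha256sum-style binary emits.
import hashlib
import sys

def sha256sum_line(path):
    """Return one line of `sha256sum`-style output for path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return "%s  %s" % (h.hexdigest(), path)

if __name__ == "__main__":
    for p in sys.argv[1:]:
        print(sha256sum_line(p))
```

Dropped into `$PATH` ahead of the real `sha256sum`, this is (per the comment above) picked up automatically for large files.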
@@ -1,10 +0,0 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.172"
subject="comment 1"
date="2014-02-20T17:42:10Z"
content="""
I feel that Helmut has the right approach to this general type of thing.

I doubt that bitcoin ASICs feature a fast data transfer bus, because bitcoin is a pretty low-data-volume protocol. Additionally AIUI, bitcoin ASICs get their speed by hashing in parallel, which allows them to try many variations of a block at once. So they probably rely on most of the data remaining the same and only a small amount changing. So it's doubtful this would be a win.
"""]]
@@ -1,30 +0,0 @@
Hi,

it would be great if the importfeed command would be able to read feeds generated by youtube (like for playlists). The youtube playlist feed contains links to separate youtube video pages, which quvi handles just fine. Currently I use the following python script:

    #!/usr/bin/env python
    import feedparser
    import sys
    d = feedparser.parse('http://gdata.youtube.com/feeds/api/playlists/%s' % sys.argv[1])
    for entry in d.entries:
        print entry.link

and then

    kasimon@pc:~/annex/YouTube/debconf13$ youtube-playlist-urls PLz8ZG1e9MPlzefklz1Gv79icjywTXycR- | xargs git annex addurl --fast
    addurl Welcome_talk.webm ok
    addurl Bits_from_the_DPL.webm ok
    addurl Debian_Cosmology.webm ok
    addurl Bits_from_the_DPL.webm ok
    addurl Debian_Cosmology.webm ok
    addurl Debian_on_Google_Compute_Engine.webm ok
    ^C

to create a backup of youtube media I'd like to keep.

It would be great if this functionality could be integrated directly into git annex.

Best
Karsten

> [[done]] --[[Joey]]
@@ -1,9 +0,0 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.227"
subject="comment 1"
date="2013-12-29T18:21:32Z"
content="""
Ok, so importfeed looks for items in a feed with enclosures, but this feed is not a podcast feed. So it needs to look for some of the `<links>`
to find pages that quvi supports. (There might be other links that are not video pages, for all I know.) Looks like `getItemLink` finds the right links, and then I just need to filter them through quvi.
"""]]
@@ -1,29 +0,0 @@
As per IRC:

    22:13:10 < RichiH> joeyh: btw, i have been pondering a `git annex import --lazy` or some such which basically goes through a directory and deletes everything i find in the annex it run from
    22:50:39 < joeyh> not sure of the use case
    23:41:06 < RichiH> joeyh: the use case is "i have imported a ton of data into my annexes. now, i am going through the usual crud of cp -ax'ed, rsync'ed, and other random 'new disk, move stuff around and just put a full dump over there' file dumps and would like to delete everything that's annexed already"
    23:41:33 < RichiH> joeyh: that would allow me to spend time on dealing with the files which are not yet annexed
    23:41:54 < RichiH> instead of verifying file after file which has been imported already
    23:43:19 < joeyh> have you tried just running git annex import in a subdirectory and then deleting the dups?
    23:45:34 < joeyh> or in a separate branch for that matter, which you could then merge in, etc
    23:54:08 < joeyh> Thinking about it some more, it would need to scan the whole work tree to see what keys were there, and populate a lookup table. I prefer to avoid things that need git-annex to do such a large scan and use arbitrary amounts of memory.
    00:58:11 < RichiH> joeyh: that would force everything into the annex, though
    00:58:20 < RichiH> a plain import, that is
    00:58:53 < RichiH> in a usual data dump directory, there's tons of stuff i will never import
    00:59:00 < RichiH> i want to delete large portions of it
    00:59:32 < RichiH> but getting rid of duplicates first allows me to spend my time focused on stuff humans are good at: deciding
    00:59:53 < RichiH> whereas the computer can focus on stuff it's good at: mindless comparison of bits
    01:00:15 < RichiH> joeyh: as you're saying this is complex, maybe i need to rephrase
    01:01:40 < RichiH> what i envision is git annex import --foo to 1) decide what hashing algorithm should be used for this file 2) hash that file 3) look into the annex if that hash is annexed 3a) optionally verify numcopies within the annex 4) delete the file in the source directory
    01:01:47 < RichiH> and then move on to the next file
    01:02:00 < RichiH> if the hash does not exist in the annex, leave it alone
    01:02:50 < RichiH> if the hash exists in annex, but numcopies is not fulfilled, just import it as a normal import would
    01:03:50 < RichiH> that sounds quite easy, to me; in fact i will prolly script it if you decide not to implement it
    01:04:07 < RichiH> but i think it's useful for a _lot_ of people who migrate tons of data into annexes
    01:04:31 < RichiH> thus i would rather see this upstream and not hacked locally

The only failure mode I see in the above is "file has been dropped elsewhere, numcopies not fulfilled, but that info is not synced to the local repo, yet" -- this could be worked around by always importing the data.

> [[done]] as `git annex import --deduplicate`.
> --[[Joey]]
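The workflow RichiH lays out at 01:01:40 (hash each file, delete it if that content is already annexed, otherwise leave it alone) can be sketched in a few lines. This is an illustration of the idea, not git-annex's `import --deduplicate` implementation; `annexed_hashes` stands in for a precomputed set of digests of annexed content.

```python
import hashlib
import os

def dedup_import(src_dir, annexed_hashes):
    """Delete files under src_dir whose content is already annexed.

    Files whose SHA-256 digest appears in annexed_hashes are removed;
    everything else is left for a human to sort through.
    Returns the list of deleted paths.
    """
    deleted = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            path = os.path.join(root, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            if h.hexdigest() in annexed_hashes:
                os.unlink(path)
                deleted.append(path)
    return deleted
```

The numcopies check from step 3a is deliberately omitted here; as the paragraph above notes, location log info may be stale, so a cautious version would import rather than delete.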
@@ -1,20 +0,0 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
nickname="Richard"
subject="comment 1"
date="2013-08-06T14:22:03Z"
content="""
To expand a bit on the use case:

I have several migration directories which I simply moved to new systems or disks with the help of `cp -ax` or `rsync`.
As I don't _need_ the data per se and merely want to hold on to it in case I ever happen to need it again, and as disk space is laughably cheap, I have a lot of duplicates.
While I can at least detect bit flips with the help of checksum lists, cleaning those duplicates of duplicated duplicates is quite some effort.
To make things worse, photos, music, videos, letters and whatnot are thrown into the same container directories.

All in all, getting data out of those data dumps and into a clean structure is quite an effort.
`git annex import --lazy` would help with this effort as I could start with the first directory, sort stuff by hand, and annex it.
As soon as data lives in any of my annexes, I could simply run `git annex import --lazy` to get rid of all duplicates while retaining the unannexed files.
Iterating through this process a few times, I will be left with clean annexes on the one hand and stuff I can simply delete on the other hand.

I could script all this by hand on my own machine, but I am _certain_ that others would find easy, integrated, and unit tested support for whittling down data dumps over time useful.
"""]]
@@ -1,6 +0,0 @@
That would make assessing weird reports like [[bugs/Should_UUID__39__s_for_Remotes_be_case_sensitive__63__/]] easier and quicker.

> No, if people want to file a bug report, it's up to them to tell me
> relevant details about their OS. I'm not going down the rathole
> of making git-annex muck about trying to gather such information.
> [[done]] --[[Joey]]
@@ -1,6 +0,0 @@
As per DebConf13: Introduce a one-shot command to synchronize everything,
including data, with the other remotes.

Especially useful for the debconf annex.

> [[done]]; `git annex sync --content` --[[Joey]]
@@ -1,4 +0,0 @@
Seems pretty self-explanatory.

> This was already implemented, the --exclude option can be used
> for find as well as most any other subcommand. --[[Joey]] [[done]]
@@ -1,22 +0,0 @@
`--all` would make git-annex operate on every key with content
present (or, in some cases like `get` and `copy --from`, on
every key with content not present).

This would be useful when a repository has a history with deleted files
whose content you want to keep (so you're not using `dropunused`).
Or when you have a lot of branches and just want to be able to fsck
every file referenced in any branch (or indeed, any file referenced in any
ref). It could also be useful (or even a
good default) in a bare repository.

A problem with the idea is that `.gitattributes` values for keys not
currently in the tree would not be available (without horrific amounts of
grubbing thru history to find where/when the key used to exist). So
`numcopies` set via `.gitattributes` would not work. This would be a
particular problem for `drop` and for `--auto`.

--[[Joey]]

> [[done]]. The .gitattributes problem was solved simply by not
> supporting `drop --all`. `--auto` also cannot be mixed with --all for
> similar reasons. --[[Joey]]
@@ -1,18 +0,0 @@
There should be a backend where the file content is stored.. in a git
repository!

This way, you know your annexed content is safe & versioned, but you only
have to deal with the pain of git with large files in one place, and can
use all of git-annex's features everywhere else.

> Speaking as a future user, do very, very much want. -- RichiH

>> Might also be interesting to use `bup` in the git backend, to work
>> around git's big file issues there. So git-annex would pull data out
>> of the git backend using bup. --[[Joey]]

>>> Very much so. Generally speaking, having one or more versioned storage back-ends with current data in the local annexes sounds incredibly useful. Still being able to get at old data via the back-end and/or making offline backups of the full history are excellent use cases. -- RichiH

[[done]], the bup special remote type is written! --[[Joey]]

> Yay! -- RichiH
@@ -1,3 +0,0 @@
Maybe add the icon /usr/share/doc/git-annex/html/logo.svg to the .desktop file.

> [[done]] long ago.. --[[Joey]]
@@ -1,14 +0,0 @@
I would like to attach metadata to annexed files (objects) without
cluttering the workdir with files containing this metadata. A common use
case would be to add titles to my photo collection that could then end up
in a generated photo album.

Depending on the implementation it might also be possible to use the metadata facility for a threaded commenting system.

The first question is whether the metadata is attached to the objects, and
thus shared by all paths pointing to the same data object, or to paths in
the worktree. I've no preference here at this point.

> This is [[done]]; see [[design/metadata]].
> The metadata is attached to objects, not to files.
> --[[Joey]]
@@ -1,10 +0,0 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.63"
subject="comment 1"
date="2013-08-24T19:58:54Z"
content="""
I don't know if git-annex is the right vehicle to fix this. It seems that a more generic fix that would work in non-git-annex repos would be better.

I can answer your question though: The metadata such as urls and locations that git-annex stores in the git-annex branch is attached to objects, and not to work tree paths.
"""]]
@@ -1,10 +0,0 @@
When the [[design/assistant]] is running on a pair of remotes, I've seen
them get out of sync, such that every pull and merge results in a conflict,
that then has to be auto-resolved.

This seems similar to the laddering problem described in this old bug:
[[bugs/making_annex-merge_try_a_fast-forward]]

--[[Joey]]

Think I've fixed this. [[done]] --[[Joey]]
@@ -1,31 +0,0 @@
Client repos do not want files in archive directories. This can turn
out to be confusing to users who are using archive directories for their
own purposes and are not aware of this special case in the assistant. It can
seem like the assistant is failing to sync their files.

I thought, first, that it should have a checkbox to enable the archive
directory behavior.

However, I think I have a better idea. Change the preferred content
expression for clients, so they want files in archive directories, *until*
those files land in an archive.

This way, only users who set up an archive repo get this behavior. And they
asked for it by setting up that repo!

Also, the new behavior will mean that files in archive directories still
propagate around to clients. Consider this topology:

    client A ---- client B ---- archive

If a file is created in client A, and moved to an archive directory before
it syncs to B, it will never get to the archive, and will continue wasting
space on A. With the new behavior, A and B effectively serve as transfer
repositories for archived content.

Something vaguely like this should work as the preferred content
expression for the clients:

    exclude=archive/* or (include=archive/* and (not (copies=archive:1 or copies=smallarchive:1)))

> [[done]] --[[Joey]]
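Read as a predicate, the expression above says: a client wants a file unless it sits under an archive directory *and* some archive repository already holds a copy. A rough illustration in code (the helpers and the exact glob semantics here are assumptions; git-annex's real preferred-content matcher is far more general):

```python
def client_wants(path, copies_in_groups):
    """Approximate: exclude=archive/* or (include=archive/* and
    (not (copies=archive:1 or copies=smallarchive:1)))

    copies_in_groups maps a repository group name to the number of
    copies of this file's content held by repos in that group.
    """
    in_archive_dir = path.startswith("archive/") or "/archive/" in path
    if not in_archive_dir:
        return True  # exclude=archive/* matches: always wanted
    # Under archive/: wanted only until an archive repo holds a copy.
    archived = (copies_in_groups.get("archive", 0) >= 1
                or copies_in_groups.get("smallarchive", 0) >= 1)
    return not archived
```

This captures why A and B keep transferring an archived file until it reaches the archive repo, and then drop it.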
@@ -1,40 +0,0 @@
The [[design/assistant]] would be better if git-annex used ghc's threaded
runtime (`ghc -threaded`).

Currently, whenever the assistant code runs some external command, all
threads are blocked waiting for it to finish.

For transfers, the assistant works around this problem by forking separate
upload processes, and not waiting on them until it sees an indication that
they have finished the transfer. While this works, it's messy.. threaded
would be better.

When pulling, pushing, and merging, the assistant runs external git
commands, and this does block all other threads. The threaded runtime would
really help here.

[[done]]; the assistant now builds with the threaded runtime.
Some work still remains to run certain long-running external git commands
in their own threads to prevent them blocking things, but that is easy to
do, now. --[[Joey]]

---

Currently, git-annex seems unstable when built with the threaded runtime.
The test suite tends to hang when testing add. `git-annex` occasionally
hangs, apparently in a futex lock. This is not the assistant hanging, and
git-annex does not otherwise use threads, so this is surprising. --[[Joey]]

> I've spent a lot of time debugging this, and trying to fix it, in the
> "threaded" branch. There are still deadlocks. --[[Joey]]

>> Fixed, by switching from `System.Cmd.Utils` to `System.Process`
>> --[[Joey]]

---

It would be possible to not use the threaded runtime. Instead, we could
have a child process pool, with associated continuations to run after a
child process finishes. Then periodically do a nonblocking waitpid on each
process in the pool in turn (waiting for any child could break anything not
using the pool!). This is probably a last resort...
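The fallback in the last paragraph (a pool of children, each with a continuation, polled without ever blocking on an arbitrary child) can be sketched like this. It is an illustration of the scheme in Python rather than the Haskell git-annex would use:

```python
import subprocess
import sys
import time

class ChildPool:
    """Pool of child processes with continuations, reaped by a
    nonblocking poll of each child in turn -- never a blocking
    wait-for-any, which could reap children the pool does not own."""

    def __init__(self):
        self.children = []  # list of (Popen, continuation) pairs

    def spawn(self, argv, then):
        """Start argv; call then(returncode) once it exits."""
        self.children.append((subprocess.Popen(argv), then))

    def reap(self):
        """One nonblocking pass: run continuations of finished children.
        Returns the number of children still running."""
        still_running = []
        for proc, then in self.children:
            if proc.poll() is None:  # nonblocking status check
                still_running.append((proc, then))
            else:
                then(proc.returncode)
        self.children = still_running
        return len(still_running)
```

A caller would invoke `reap()` periodically from its event loop, which is exactly the "periodically do a nonblocking waitpid" step described above.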
@@ -1,29 +0,0 @@
It should be possible for clones to learn how to contact
each other without remotes needing to always be explicitly set
up. Say that `.git-annex/remote.log` is maintained by git-annex
to contain:

    UUID hostname URI

The URI comes from configured remotes and maybe from
`file://$(pwd)`, or even `ssh://$(hostname -f)`
for the current repo. This format will merge without
conflicts or data loss.

Then when content is believed to be in a UUID, and no
configured remote has it, the remote.log can be consulted and
URIs that look likely tried. (file:// ones if the hostname
is the same (or maybe always -- a removable drive might tend
to be mounted at the same location on different hosts),
otherwise ssh:// ones.)

Question: When should git-annex update the remote.log?
(If not just on init.) Whenever it reads in a repo's remotes?

> This sounds useful and the log should be updated every time any remote is being accessed. A counter or timestamp (yes, distributed times may be wrong/different) could be used to auto-prune old entries via a global and per-remote config setting. -- RichiH

---

I no longer think I'd use this myself; I find that my repositories quickly
grow the paths I actually use, somewhat organically. Unofficial paths
across university quads come to mind. [[done]] --[[Joey]]
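The "URIs that look likely" selection above can be sketched over the proposed `UUID hostname URI` line format. Everything here is hypothetical, since the feature was never built; the sketch just orders local `file://` entries ahead of `ssh://` ones as the page suggests:

```python
import socket

def candidate_uris(uuid, remote_log_lines, localhost=None):
    """Pick URIs worth trying for a UUID from remote.log-style lines
    of the proposed "UUID hostname URI" format.

    file:// URIs recorded on this host come first, then ssh:// URIs;
    malformed lines and other UUIDs are ignored.
    """
    localhost = localhost or socket.gethostname()
    local, remote = [], []
    for line in remote_log_lines:
        fields = line.split()
        if len(fields) != 3 or fields[0] != uuid:
            continue
        _uuid, host, uri = fields
        if uri.startswith("file://") and host == localhost:
            local.append(uri)
        elif uri.startswith("ssh://"):
            remote.append(uri)
    return local + remote
```

Because each line is independent and repositories only append entries for themselves, merging two such logs would indeed be conflict-free, as the page claims.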
@@ -1,7 +0,0 @@
Remotes log should probably be stored in ".git/annex/remote.log"
instead of ".git-annex/remote.log" to prevent leaking credentials.

> The idea is to distribute the info between repositories, which is
> why it'd go in `.git-annex`. Of course that does mean that repository
> location information would be included, and if that's not desirable
> this feature would need to be turned off. --[[Joey]]
@@ -1,15 +0,0 @@
A "git annex watch" command would help make git-annex usable by users who
don't know how to use git, or don't want to bother typing the git commands.
It would run, in the background, watching via inotify for changes, and
automatically annexing new files, etc.

The blue sky goal would be something automated like dropbox, except fully
distributed. All files put into the repository would propagate out
to all the other clones of it, as network links allow. Note that while
dropbox allows modifying files, git-annex freezes them upon creation,
so this would not be 100% equivalent to dropbox. --[[Joey]]

This is a big project with its own [[design pages|design/assistant]].

> [[done]].. at least, we have a watch command and an assistant, which
> is still being developed. --[[Joey]]
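The core of such a watcher is "notice new files, then annex them". The real implementation is event-driven via inotify; as a rough substitute, the same detection can be shown as a snapshot diff (polling), which is what inotify saves you from doing:

```python
import os

def scan(root):
    """Snapshot the tree: map each file path to its mtime."""
    seen = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            seen[path] = os.path.getmtime(path)
    return seen

def detect_new(before, after):
    """Paths present in the new snapshot but not the old one --
    the files a watch command would hand off to `git annex add`."""
    return sorted(set(after) - set(before))
```

inotify replaces the repeated `scan` with kernel events, so the assistant reacts immediately instead of rescanning the whole tree.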
@@ -1,20 +0,0 @@
Some commands cause a union merge unnecessarily. For example, `git annex add`
modifies the location log, which first requires reading the current log (if
any), which triggers a merge.

Would be good to avoid these unnecessary union merges. First because it's
faster and second because it avoids a possible delay when a user might
ctrl-c and leave the repo in an inconsistent state. In the case of an add,
the file will be in the annex, but no location log will exist for it (fsck
fixes that).

It may be that all that's needed is to modify Annex.Branch.change
to read the current value, without merging. Then commands like `get`, that
query the branch, will still cause merges, and commands like `add` that
only modify it, will not. Note that for a command like `get`, the merge
occurs before it has done anything, so ctrl-c should not be a problem
there.

This is a delicate change, I need to take care.. --[[Joey]]

> [[done]] (assuming I didn't miss any cases where this is not safe!) --[[Joey]]
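For reference, the union merge these commands trigger is conceptually simple: each file ends up with every line that appeared in either version, duplicates removed. A minimal sketch (the log-line format shown in the test is illustrative only):

```python
def union_merge(ours, theirs):
    """Line-union merge of two versions of a file: keep every line that
    appears in either version, dropping duplicates while preserving
    first-seen order."""
    seen = set()
    merged = []
    for line in ours + theirs:
        if line not in seen:
            seen.add(line)
            merged.append(line)
    return merged
```

Cheap as this is per file, doing it on every branch read is still wasted work when the command only writes, which is the optimization the page describes.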
@@ -1,7 +0,0 @@
This backend is not finished.

In particular, while files can be added using it, git-annex will not notice
when their content changes, and will not create a new key for the new sha1
of the new content.

[[done]]; use unlock subcommand and commit changes with git
@@ -1,159 +0,0 @@
[[done]] !!!

The use of `.git-annex` to store logs means that if a repo has branches
and the user switches between them, git-annex will see different logs in
the different branches, and so may miss info about what remotes have which
files (though it can re-learn).

An alternative would be to store the log data directly in the git repo
as `pristine-tar` does. Problem with that approach is that git won't merge
conflicting changes to log files if they are not in the currently checked
out branch.

It would be possible to use a branch with a tree like this, to avoid
conflicts:

    key/uuid/time/status

As long as new files are only added, and old timestamped files deleted,
there would be no conflicts.

A related problem though is the size of the tree objects git needs to
commit. Having the logs in a separate branch doesn't help with that.
As more keys are added, the tree object size will increase, and git will
take longer and longer to commit, and use more space. One way to deal with
this is simply by splitting the logs among subdirectories. Git then can
reuse trees for most directories. (Check: Does it still have to build
dup trees in memory?)

Another approach would be to have git-annex *delete* old logs. Keep logs
for the currently available files, or something like that. If other log
info is needed, look back through history to find the first occurrence of a
log. Maybe even look at other branches -- so if the logs were on master,
a new empty branch could be made and git-annex would still know where to
get keys in that branch.

Would have to be careful about conflicts when deleting and bringing back
files with the same name. And would need to avoid expensive searching thru
all history to try to find an old log file.

## fleshed out proposal

Let's use one branch per uuid, named git-annex/$UUID.

- I came to realize this would be a good idea when thinking about how
  to upgrade. Each individual annex will be upgraded independently,
  so each will want to make a branch, and if the branches aren't distinct,
  they will merge conflict for sure.
- TODO: What will need to be done to git to make it push/pull these new
  branches?
- A given repo only ever writes to its UUID branch. So no conflicts.
- **problem**: git annex move needs to update log info for other repos!
  (possibly solvable by having git-annex-shell update the log info
  when content is moved using it)
- (BTW, UUIDs probably don't compress well, and this reduces the bloat of having
  them repeated lots of times in the tree.)
- Per UUID branches mean that if it wants to find a file's location
  among configured remotes, it can examine only their branches, if
  desired.
- It's important that the per-repo branches propagate beyond immediate
  remotes. If there is a central bare repo, that means push --all. Without
  one, it means that when repo B pulls from A, and then C pulls from B,
  C needs to get A's branch -- which means that B should have a tracking
  branch for A's branch.

In the branch, only one file is needed. Call it locationlog. git-annex
can cache location log changes and write them all to locationlog in
a single git operation on shutdown.

- TODO: what if it's ctrl-c'd with changes pending? Perhaps it should
  collect them to .git/annex/locationlog, and inject that file on shutdown?
- This will be less overhead than the current staging of all the log files.

The log is not appended to, so in git we have a series of commits each of
which replaces the log's entire contents.

To find locations of a key, all (or all relevant) branches need to be
examined, looking backward through the history of each until a log
with an indication of the presence/absence of the key is found.

- This will be less expensive for files that have recently been added
  or transferred.
- It could get pretty slow when digging deeper.
- Only 3 places in git-annex will be affected by any slowdown: move --from,
  get and drop. (Update: Now also unused, whereis, fsck)

## alternate

As above, but use a single git-annex branch, and keep the per-UUID
info in their own log files. Hope that git can auto-merge as long as
each observing repo only writes to its own files. (Well, it can, but for
non-fast-forward merges, the git-annex branch would need to be checked out,
which is problematic.)

Use filenames like:

    <observing uuid>/<location uuid>

That allows one repo to record another's state when doing a
`move`.

## outside the box approach

If the problem is limited to only that the `.git-annex/` files make
branching difficult (and not to the related problem that commits to them
and having them in the tree are sorta annoying), then a simple approach
would be to have git-annex look in other branches for location log info
too.

The problem would then be that any locationlog lookup would need to look in
all other branches (any branch could have more current info after all),
which could get expensive.

## way outside the box approach

Another approach I have been mulling over is keeping the log file
branch checked out in .git/annex/logs/ -- this would be a checkout of a git
repository inside a git repository, using "git fake bare" techniques. This
would solve the merge problem, since git auto merge could be used. It would
still mean all the log files are on-disk, which annoys some. It would
require some tighter integration with git, so that after a pull, the log
repo is updated with the data pulled. --[[Joey]]

> Seems I can't use git fake bare exactly. Instead, the best option
> seems to be `git clone --shared` to make a clone that uses
> `.git/annex/logs/.git` to hold its index etc, but (mostly) uses
> objects from the main repo. There would be some bloat,
> as commits to the logs made in there would not be shared with the main
> repo. Using `GIT_OBJECT_DIRECTORY` might be a way to avoid that bloat.

## notes

Another approach could be to use git-notes. It supports merging branches
of notes, with union merge strategy (a hook would have to do this after
a pull, it's not done automatically).

Problem: Notes are usually attached to git
objects, and there are no git objects corresponding to git-annex keys.

Problem: Notes are not normally copied when cloning.

------

## eliminating the merge problem

Most of the above options are complicated by the problem of how to merge
changes from remotes. It should be possible to deal with the merge
problem generically. Something like this:

* We have a local branch `B`.
* For remotes, there are also `origin/B`, `otherremote/B`, etc.
* To merge two branches `B` and `foo/B`, construct a merge commit that
  makes each file have all lines that were in either version of the file,
  with duplicates removed (probably). Do this without checking out a tree.
  -- now implemented as git-union-merge
* As a `post-merge` hook, merge `*/B` into `B`. This will ensure `B`
  is always up-to-date after a pull from a remote.
* When pushing to a remote, nothing needs to be done, except ensure
  `B` is either successfully pushed, or the push fails (and a pull needs to
  be done to get the remote's changes merged into `B`).
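The lookup step in the "fleshed out proposal" (walk each UUID branch's history backward until a log mentions the key, newest first) can be sketched over an abstract history. All names here are illustrative; `histories` stands in for per-branch commit chains, each commit reduced to a key-to-status mapping:

```python
def locate_key(key, histories):
    """Find which UUID branches most recently recorded the key as present.

    histories: {branch_uuid: [commit, ...]} with commits newest-first,
    each commit a dict mapping key -> "present" or "absent".
    Returns the set of branch UUIDs whose most recent mention of the
    key says it is present.
    """
    located = set()
    for uuid, commits in histories.items():
        for commit in commits:  # newest first; stop at first mention
            if key in commit:
                if commit[key] == "present":
                    located.add(uuid)
                break
    return located
```

This makes the cost model above concrete: recently transferred keys are found in the newest commits, while old keys force a deep walk of every branch.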
@@ -1,23 +0,0 @@
The checkout subcommand replaces the symlink that normally points at a
file's content, with a copy of the file. Once you've checked a file out,
you can edit it, and `git commit` it. On commit, git-annex will detect
if the file has been changed, and if it has, `add` its content to the
annex.

> Internally, this will need to store the original symlink to the file, in
> `.git/annex/checkedout/$filename`.
>
> * git-annex uncheckout moves that back
> * git-annex pre-commit hook checks each file being committed to see if
>   it has a symlink there, and if so, removes the symlink and adds the new
>   content to the annex.
>
> And it seems the file content should be copied, not moved or hard linked:
>
> * Makes sure other annexes can find it if transferring it from
>   this annex.
> * Ensures it's always available for uncheckout.
> * Avoids the last copy of a file's content being lost when
>   the checked out file is modified.

[[done]]
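The checkout/uncheckout bookkeeping described above can be sketched with a plain directory standing in for `.git/annex/checkedout/`. The layout and helpers are illustrative only, not git-annex's actual code; note the copy-not-move choice from the bullet list is preserved:

```python
import os
import shutil

def checkout(path, stash_dir):
    """Replace the symlink at path with a copy of its target, remembering
    the original link target so uncheckout can restore it."""
    target = os.readlink(path)
    content = os.path.join(os.path.dirname(path), target)
    os.makedirs(stash_dir, exist_ok=True)
    with open(os.path.join(stash_dir, os.path.basename(path)), "w") as f:
        f.write(target)  # remember where the symlink pointed
    os.unlink(path)
    shutil.copyfile(content, path)  # copy, not move: content stays available

def uncheckout(path, stash_dir):
    """Put the original symlink back in place of the checked-out copy."""
    stash = os.path.join(stash_dir, os.path.basename(path))
    with open(stash) as f:
        target = f.read()
    os.unlink(path)
    os.symlink(target, path)
    os.unlink(stash)
```

The pre-commit hook case is the remaining piece: if a stash entry exists for a file being committed, the new content is added to the annex and the symlink regenerated to point at it.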
@@ -1,105 +0,0 @@
Currently [[/direct_mode]] allows the user to point many normally safe
git commands at his foot and pull the trigger. At LCA2013, a git-annex
user suggested modifying direct mode to make this impossible.

One way to do it would be to move the .git directory. Instead, make there
be a .git-annex directory in direct mode repositories. git-annex would know
how to use it, and would be extended to support all known safe git
commands, passing parameters through, and in some cases verifying them.

So, for example, `git annex commit` would run `git commit --git-dir=.git-annex`

However, `git annex commit -a` would refuse to run, or even do something
intelligent that does not involve staging every direct mode file.

----

One source of problems here is that there is some overlap between git-annex
and git commands. Ie, `git annex add` cannot be a passthrough for `git
add`. The git wrapper could instead be another program, or it could be
something like `git annex git add`

--[[Joey]]

----

Or, no git wrapper could be provided. Limit the commands to only git-annex
commands. This should be all that is needed to manage a direct mode
repository simply, and if the user is doing something complicated that
needs git access, they can set `GIT_DIR=.git-annex` and be careful not to
shoot off their foot. (Or can just switch to indirect mode!)

This wins on simplicity, and if it's the wrong choice a git wrapper
can be added later. --[[Joey]]

---

Implementation: Pretty simple really. Already did the hard lifting to
support `GIT_DIR`, so only need to override the default git directory
in direct mode when that's not set to `.git-annex`.

A few things hardcode ".git", including Assistant.Threads.Watcher.ignored
and `Seek.withPathContents`, and parts of `Git.Construct`.

---

Transition: git-annex should detect when it's in a direct mode repository
with a .git directory and no .git-annex directory, and transparently
do the move to transition to the new scheme. (And remember that `git annex
indirect` needs to move it back.)

# alternative approach: move index

Rather than moving .git, maybe move .git/index?

This would cause git to think that all files in the tree were deleted.
So git commit -a would make a commit that removes them from git history.
But, the files in the work tree are not touched by this.

Also, git checkout, git merge, and other things that manipulate the work
tree refuse to do anything if they'd change a file that they think is
untracked.

Hmm, this doesn't solve the user accidentally running git add on an annexed
file; the whole file still gets added.

# alternative approach: fake bare repo

Set core.bare to true. This prevents all work tree operations,
so prevents any foot shooting. It still lets the user run commands like
git log, even on files in the tree, and git fetch, and push, and git
config, etc.

Even better, it integrates with other tools, like `mr`, so they know
|
||||
it's a git repo.
|
||||
|
||||
This seems really promising. But of course, git-annex has its own set of
|
||||
behaviors in a bare repo, so will need to recognise that this repo is not
|
||||
really bare, and avoid them.
|
||||
|
||||
> [[done]]!! --[[Joey]]
|
||||
|
||||
(Git may also have some bare repo behaviors that are unwanted. One example
|
||||
is that git allows pushes to the current branch in a bare repo,
|
||||
even when `receive.denyCurrentBranch` is set.)
|
||||
|
||||
> This is indeed a problem. Indeed, `git annex sync` successfully
|
||||
> pushes changes to the master branch of a fake bare direct mode repo.
|
||||
>
|
||||
> And then, syncing in the repo that was pushed to causes the changes
|
||||
> that were pushed to the master branch to get reverted! This happens
|
||||
> because sync commits; commit sees that files are staged in index
|
||||
> differing from the (pushed) master, and commits the "changes"
|
||||
> which revert it.
|
||||
>
|
||||
> Could fix this using an update hook, to reject the updated of the master
|
||||
> branch. However, won't work on crippled filesystems! (No +x bit)
|
||||
>
|
||||
> Could make git annex sync detect this. It could reset the master
|
||||
> branch to the last one committed, before committing. Seems very racy
|
||||
> and hard to get right!
|
||||
>
|
||||
> Could make direct mode operate on a different branch, like
|
||||
> `annex/direct/master` rather than `master`. Avoid pushing to that
|
||||
> branch (`git annex sync` can map back from it to `master` and push there
|
||||
> instead). A bit clumsy, but works.
@@ -1,10 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawn-KDr_Z4CMkjS0v_TxQ08SzAB5ecHG3K0"
 nickname="Glen"
 subject="This sounds good"
 date="2013-06-25T10:30:07Z"
 content="""
I think we might have been talking about this feature. Seems like a good idea to me.

Glen
"""]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawm7AuSfii_tCkLyspL6Mr0ATlO6OxLNYOo"
 nickname="Georg"
 subject="comment 2"
 date="2013-09-20T11:29:04Z"
 content="""
Maybe make a git sub-namespace of commands. Yeah, I know, something like git annex git-add sounds a bit on the verbose side, but it would allow access to possibly all git commands regardless of name clashes.
"""]]

@@ -1,7 +0,0 @@
I've an external USB hard disc attached to my (fritzbox) router that is only accessible through SMB/CIFS. I'd like to have all my annexed files on this drive in a kind of direct mode so that I can also access the files without git-annex.

I tried to put a direct-mode repo on the drive but this is painfully slow. The git-annex process then runs on my desktop and accesses the repo over SMB over the slow fritzbox over USB.

I wish that git-annex could be told to just use a (mounted) folder as a direct-mode remote.

> [[done]]; dup. --[[Joey]]

@@ -1,10 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="209.250.56.64"
 subject="comment 1"
 date="2013-11-23T19:03:58Z"
 content="""
It's not clear to me what you are requesting here.

You seem to say that running git-annex inside a mountpoint is slow. Ok. So, what possible changes to git-annex could make it fast, given that the bottleneck is the SMB/USB?
"""]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawnR6E5iUghMWdUGlbA9CCs8DKaoigMjJXw"
 nickname="Efraim"
 subject="comment 2"
 date="2013-11-26T09:26:53Z"
 content="""
Perhaps he's looking to be able to expand the addurl option to include file://path/to/video.mp4, or smb://..., to import a file without changing its location to being inside the annex.
"""]]

@@ -1,11 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawmicVKRM8vJX4wPuAwlLEoS2cjmFXQkjkE"
 nickname="Thomas"
 subject="never mind"
 date="2013-12-01T18:34:05Z"
 content="""
grossmeier.net did a much better job of explaining what I want:
[[New special remote suggeston - clean directory]]

Please close this issue as a duplicate of the above.
"""]]

@@ -1,18 +0,0 @@
Say I have some files on remote A. But I'm away from it, and transferring
files from B to C. I'd like to avoid transferring any files I already have
on A.

Something like:

	git annex copy --to C --exclude-on A

This would not contact A, just use its cached location log info.

I suppose I might also sometimes want to only act on files that are
thought/known to be on A.

	git annex drop --only-on A

--[[Joey]]

[[done]]
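For the record, a hedged sketch of how this reads against git-annex as shipped: there is no literal `--exclude-on` or `--only-on` option, but the generic matching options express the same thing, using only cached location log information (remote names `A` and `C` are placeholders):

```sh
# Copy to C only files not believed (per the cached location log) to be
# on A; A itself is never contacted.
git annex copy --to C --not --in A

# Act only on files thought/known to be on A.
git annex drop --in A
```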
@@ -1,9 +0,0 @@
Apparently newer gnupg has support for hardware-accelerated AES-NI. It
would be good to have an option to use that. I also wonder if using the
same symmetric key for many files presents a security issue (and whether
using GPG keys directly would be more secure).

> [[done]]; you can now use encryption=pubkey when setting up a special
> remote to use pure public keys without the hybrid symmetric key scheme.
> Which you choose is up to you. Also, annex.gnupg-options can configure
> the ciphers used. --[[Joey]]
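A configuration sketch of the two schemes the resolution mentions (remote names, directory paths, and the key id are placeholder assumptions):

```sh
# Hybrid scheme (the default): a shared symmetric key, itself encrypted
# to the listed GPG key(s); more keys can be granted access later.
git annex initremote mybackup type=directory directory=/mnt/backup encryption=hybrid keyid=DEADBEEF

# Pure public-key scheme added by this todo's resolution: each file is
# encrypted directly to the listed keys; access cannot be widened later.
git annex initremote mybackup2 type=directory directory=/mnt/backup2 encryption=pubkey keyid=DEADBEEF
```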
@@ -1,14 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="4.152.108.145"
 subject="comment 1"
 date="2013-08-01T17:10:56Z"
 content="""
There is a remote.name.annex-gnupg-options git-config setting that can be used to pass options to gpg on a per-remote basis.

> also wonder if using the same symmetric key for many files presents a security issue (and whether using GPG keys directly would be more secure).

I am not a cryptographer, but I have today run this question by someone with a good amount of crypto knowledge. My understanding is that reusing a symmetric key is theoretically vulnerable to eg known-plaintext or chosen-plaintext attacks. And that modern ciphers like AES and CAST (gpg default) are designed to resist such attacks.

If someone was particularly concerned about these attack vectors, it would be pretty easy to add a mode where git-annex uses public key encryption directly. With the disadvantage, of course, that once a file was sent to a special remote and encrypted for a given set of public keys, other keys could not later be granted access to it.
"""]]

@@ -1,12 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
 nickname="Richard"
 subject="comment 2"
 date="2013-08-02T07:21:50Z"
 content="""
Using symmetric keys is significantly cheaper, computation-wise.

The scheme of encrypting symmetric keys with asymmetric ones is ancient, well-proven, and generally accepted as a good approach.

Using per-key files makes access control more fine-grained and is only a real performance issue once, while creating the private key, and a little bit every time more than one file needs to be decrypted, as more than one symmetric key needs to be taken care of.
"""]]

@@ -1,17 +0,0 @@
[[!comment format=mdwn
 username="guilhem"
 ip="129.16.20.209"
 subject="comment 3"
 date="2013-08-19T13:44:35Z"
 content="""
AES-NI acceleration will be used by default providing you're using
the new modularized GnuPG (v2.x) and libgcrypt ≥ 1.5.0. Of course it
only speeds up AES encryption, while GnuPG uses CAST by default; you can
either set `personal-cipher-preferences` to AES or AES256 in your
`gpg.conf` or, as joeyh hinted at, set `remote.<name>.annex-gnupg-options`
as described in the manpage.

By the way, I observed a significant speed up when using `--compress-algo none`.
Image, music and video files are typically hard to compress further, and it seems
that's where gpg spent most of its time, at least on the few files I benchmarked.
"""]]

@@ -1,4 +0,0 @@
Using an rsync remote is currently very slow when there are a lot of files, since rsync appears to be called for each file copied. It would be awesome if each call to rsync was amortized to copy many files; rsync is very good at copying many small files quickly.

> [[done]]; bug submitter was apparently not using a version
> with rsync connection caching. --[[Joey]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="4.152.108.145"
 subject="comment 1"
 date="2013-08-01T16:06:42Z"
 content="""
I cannot see a way to do this using rsync's current command-line interface. Ideas on how to do it are welcomed.
"""]]

@@ -1,24 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawln4uCaqZRd5_nRQ-iLcJyGctIdw8ebUiM"
 nickname="Edward"
 subject="Just put multiple source files"
 date="2013-08-01T16:29:04Z"
 content="""
It seems like you can just put multiple source files on the command line:

	ed@ed-Ubu64 /tmp$ touch a b c d
	ed@ed-Ubu64 /tmp$ mkdir test
	ed@ed-Ubu64 /tmp$ rsync -avz a b c d test
	sending incremental file list
	a
	b
	c
	d

	sent 197 bytes  received 88 bytes  570.00 bytes/sec
	total size is 0  speedup is 0.00
	ed@ed-Ubu64 /tmp$ ls test
	a  b  c  d

It also appears to work with remote transfers too.
"""]]

@@ -1,14 +0,0 @@
[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="4.152.108.145"
 subject="comment 3"
 date="2013-08-01T16:58:49Z"
 content="""
git-annex needs to build a specific directory structure on the rsync remote though. It seems it would need to build the whole tree locally, containing only the files it wants to send.

When using encryption, it would need to encrypt all the files it's going to send and store them locally until it's built the tree. That could use a lot of disk space.

Also, there's the problem of checking which files are already present on the remote, to avoid re-encrypting and re-sending them. Currently this is done by running rsync with the url of the file, and checking its exit code. rsync does not seem to have an interface that would allow checking multiple files in one call. So any optimisation of the number of rsync calls would only eliminate 1/2 of the current number.

When using ssh:// urls, the rsync special remote already uses ssh connection caching, which I'd think would eliminate most of the overhead. (If you have a version of git-annex older than 4.20130417, you should upgrade to get this feature.) It should not take very long to start up a new rsync over a cached ssh connection. rsync:// is probably noticeably slower.
"""]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawln4uCaqZRd5_nRQ-iLcJyGctIdw8ebUiM"
 nickname="Edward"
 subject="Thanks"
 date="2013-08-01T17:03:23Z"
 content="""
I am using an old version of git-annex. I'll try the newer one and see if the connection caching helps!
"""]]

@@ -1,5 +0,0 @@
Find a way to copy a file with a progress bar, while still preserving
stat. Easiest way might be to use pv and fix up the permissions etc
after?

[[done]]

@@ -1,11 +0,0 @@
add a git annex fsck that finds keys that have no referring file

(done)

* Need per-backend fsck support. sha1 can checksum all files in the annex.
  WORM can check filesize.

* Both can check that annex.numcopies is satisfied. Probably only
  querying the locationlog, not doing an online verification.

[[done]]

@@ -1,13 +0,0 @@
`git annex fsck --from remote`

Basically, this needs to receive each file in turn from the remote, to a
temp file, and then run the existing fsck code on it. Could be quite
expensive, but sometimes you really want to check.

An unencrypted directory special remote could be optimised, by not actually
copying the file, just dropping a symlink, etc.

The WORM backend doesn't care about file content, so it would be nice to
avoid transferring the content at all, and only send the size.

> [[done]] --[[Joey]]

@@ -1,15 +0,0 @@
[[done]]

I've been considering adding a `git-annex-shell` command. This would
be similar to `git-shell` (and in fact would pass unknown commands off to
`git-shell`).

## Reasons

* Allows locking down an account to only be able to use git-annex (and
  git).
* Avoids needing to construct complex shell commands to run on the remote
  system. (Mostly already avoided by the plumbing level commands.)
* Could possibly allow multiple things to be done with one ssh connection
  in future.
* Allows expanding `~` and `~user` in repopath on the remote system.

@@ -1,32 +0,0 @@
`git-annex unused` has to compare large sets of data
(all keys with content present in the repository,
with all keys used by files in the repository), and so
uses more memory than git-annex typically needs.

It used to be a lot worse (hundreds of megabytes).

Now it only needs enough memory to store a Set of all Keys that currently
have content in the annex. On a lightly populated repository, it runs in
quite low memory use (like 8 mb) even if the git repo has 100 thousand
files. On a repository with lots of file contents, it will use more.

Still, I would like to reduce this to a purely constant memory use,
as running in constant memory no matter the repo size is a git-annex design
goal.

One idea is to use a bloom filter.
For example, construct a bloom filter of all keys used by files in
the repository. Then for each key with content present, check if it's
in the bloom filter. Since there can be false positives, this might
miss finding some unused keys. The probability/size of filter
could be tunable.

> Fixed in `bloom` branch in git. --[[Joey]]
>> [[done]]! --[[Joey]]

Another way might be to scan the git log for files that got removed
or changed what key they pointed to. Correlate with keys with content
currently present in the repository (possibly using a bloom filter again),
and that would yield a shortlist of keys that are probably not used.
Then scan through all files in the repo to make sure that none point to keys
on the shortlist.
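The bloom-filter idea above can be sketched compactly (in Python, not git-annex's actual Haskell): keys referenced by the work tree go into the filter, and a present key is reported unused only when the filter definitely never saw it, so false positives can hide some unused keys but never produce a false "unused" report. The hash scheme and sizes are illustrative assumptions.

```python
import hashlib

class Bloom:
    def __init__(self, nbits=1 << 20, nhashes=4):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = bytearray(nbits // 8)

    def _positions(self, key):
        # Derive nhashes bit positions from one SHA-256 digest of the key.
        digest = hashlib.sha256(key.encode()).digest()
        for i in range(self.nhashes):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.nbits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

def probably_unused(present_keys, referenced_keys):
    # One pass to fill the filter, then a constant-memory membership test
    # per present key; only keys the filter is *certain* about are reported.
    bloom = Bloom()
    for k in referenced_keys:
        bloom.add(k)
    return [k for k in present_keys if k not in bloom]
```

The false-positive rate (and hence how many unused keys may be missed) is tunable via `nbits` and `nhashes`, matching the "probability/size of filter could be tunable" note above.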
@@ -1,13 +0,0 @@
Would help a lot when having to add large(ish) numbers of remotes.

Maybe detect this kind of commit message and ask the user whether to automatically add them? See [[auto_remotes]]:
> Question: When should git-annex update the remote.log? (If not just on init.) Whenever it reads in a repo's remotes?

----

I'm not sure that the above suggestion is going down a path that really
makes sense. If you want a list of repository UUIDs and descriptions,
it's there in machine-usable form in `.git-annex/uuid.log`, there is no
need to try to pull this info out of git commit messages. --[[Joey]]

[[done]]

@@ -1,39 +0,0 @@
gitosis and gitolite should support git-annex being used to send/receive
files from the repositories they manage. Users with read-only access
could only get files, while users with write access could also put and drop
files.

Doing this right requires modifying both programs, to add [[git-annex-shell]]
to the list of things they can run, and only allow through appropriate
git-annex-shell subcommands to read-only users.

I have posted an RFC for modifying gitolite to the
[gitolite mailing list](http://groups.google.com/group/gitolite?lnk=srg).

> I have not developed a patch yet, but all that git-annex needs is a way
> to ssh to the server and run the git-annex-shell command there.
> git-annex-shell is very similar to git-shell. So, one way to enable
> it is simply to set GL_ADC_PATH to a directory containing git-annex-shell.
>
> But, that's not optimal, since git-annex-shell will send off receive-pack
> commands to git, which would bypass gitolite's permissions checking.
> Also, it makes sense to limit readonly users to only download, not
> upload/delete files from git-annex. Instead, I suggest adding something
> like this to gitolite's config:

	# If set, users with W access can write file contents into the git-annex,
	# and users with R access can read file contents from the git-annex.
	$GL_GIT_ANNEX = 0;

> If this makes sense, I'm sure I can put a patch together for your
> review. It would involve modifying gl-auth-command so it knows how
> to run git-annex-shell, and how to parse out the "verb" from a
> git-annex-shell command line, and modifying R_COMMANDS and W_COMMANDS.

As I don't write python, someone else is needed to work on gitosis.
--[[Joey]]

> [[done]]; support for gitolite is in its `pu` branch, and some changes
> were made to git-annex. Word is gitosis is not being maintained so I won't
> worry about trying to support it. --[[Joey]]

@@ -1,5 +0,0 @@
how to handle git rm file? (should try to drop keys that have no
referring file, if it seems safe..)

[[done]] -- I think that git annex unused and dropunused are the best
solution to this.

@@ -1,18 +0,0 @@
A repository like http://annex.debconf.org/debconf-share/ has a git repo
published via http. When getting files from such a repo, git-annex tries
two urls. One url would be used by a bare repo, and the other by a non-bare
repo. (This is due to the directory hashing change.) The result is that every
file download from a non-bare http repo starts with a 404 and then it retries
with the right url.

Since git-annex already downloads the .git/config to find the uuid of the
http repo, it could also look at it to see if the repo is bare. If not,
set a flag, and try the two urls in reverse order, which would almost
always avoid this 404 problem.

(The real solution is probably to flag day and get rid of the old-style
directory hashing, but that's been discussed elsewhere.)

--[[Joey]]

[[done]]
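The fix described above is just a preference ordering over the two candidate URL styles. A minimal sketch (hypothetical helpers in Python, not git-annex's actual code):

```python
def ordered_candidates(bare_style_url, nonbare_style_url, repo_is_bare):
    """Order the two candidate object URLs so the likely-correct style is
    tried first; a wrong guess costs one 404 and a retry, as described above."""
    if repo_is_bare:
        return [bare_style_url, nonbare_style_url]
    return [nonbare_style_url, bare_style_url]

def fetch_first(urls, fetch):
    """Try each URL in turn; fetch() returns None on a 404-style miss."""
    for url in urls:
        body = fetch(url)
        if body is not None:
            return body
    return None
```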
@@ -1,8 +0,0 @@
The IA would find it useful to be able to control the http headers
git-annex get, addurl, etc use. This will allow setting cookies, for
example.

* annex-web-headers=blah
* Perhaps also annex-web-headers-command=blah

[[done]]

@@ -1,8 +0,0 @@
> josh: Do you do anything in git-annex to try to make the files immutable?
> For instance, removing write permission, or even chattr?
> joey: I don't, but that's a very good idea
> josh: Oh, I just thought of another slightly crazy but handy idea.
> josh: I'd hate to run into a program which somehow followed the symlink and then did an unlink to replace the file.
> josh: To break that, you could create a new directory under annex's internal directory for each file, and make the directory have no write permission.

[[done]] and done --[[Joey]]

@@ -1,24 +0,0 @@
Justin Azoff realized git-annex should have an incremental fsck.

This requires storing the last fsck time of each object.

I would not be strongly opposed to sqlite, but I think there are other
places the data could be stored. One possible place is the mode or mtime
of the .git/annex/objects/xx/yy/$key directories (the parent directories
of where the content is stored). Perhaps the sticky bit could be used to
indicate the content has been fscked, and the mtime could indicate the time
of the last fsck. Anything that dropped or put in content would need to
clear the sticky bit. --[[Joey]]

> Basic incremental fsck is done now.
>
> Some enhancements would include:
>
> * --max-age=30d Once the incremental fsck completes and was started 30 days ago,
>   start a new one.
> * --time-limit --size-limit --file-limit: Limit how long the fsck runs.

>> Calling this [[done]]. The `--incremental-schedule` option
>> allows scheduling time between incremental fscks. `--time-limit` is
>> done. I implemented `--smallerthan` independently. Not clear what
>> `--file-limit` would be. --[[Joey]]
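The sticky-bit/mtime bookkeeping proposed above can be sketched like this (a Python illustration of the idea only; git-annex's shipped implementation uses different bookkeeping, and the object-directory path is a placeholder):

```python
import os
import stat
import time

def mark_fscked(objdir):
    # Sticky bit records "content has been fscked"; mtime records when.
    mode = os.stat(objdir).st_mode
    os.chmod(objdir, mode | stat.S_ISVTX)
    now = time.time()
    os.utime(objdir, (now, now))

def clear_fscked(objdir):
    # Anything that drops or puts content must clear the flag again.
    mode = os.stat(objdir).st_mode
    os.chmod(objdir, mode & ~stat.S_ISVTX)

def needs_fsck(objdir, max_age_seconds):
    st = os.stat(objdir)
    if not st.st_mode & stat.S_ISVTX:
        return True  # never fscked (or content changed since)
    return time.time() - st.st_mtime > max_age_seconds
```

This stores the per-object state entirely in existing inode metadata, which is what makes it attractive compared to a separate sqlite database.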
@@ -1,14 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawmBUR4O9mofxVbpb8JV9mEbVfIYv670uJo"
 nickname="Justin"
 subject="comment 1"
 date="2012-09-20T14:11:57Z"
 content="""
I have a [proof of concept written in python](https://github.com/JustinAzoff/git-annex-background-fsck/blob/master/git-annex-background-fsck).

You can run it and point it at the root of an annex or at a subdirectory. In my brief testing it seems to work :-)

The goal would be to have options like

	git annex fsck /data/annex --check-older-than 1w --check-for 2h --max-load-avg 0.5
"""]]

@@ -1,52 +0,0 @@
I have two repos, using the SHA1 backend and both using git.
The first one is a laptop, the second one is a usb drive.

When I drop a file on the laptop repo, the file is not available on that repo until I run *git annex get*.
But when the usb drive is plugged in, the file is actually available.

How about adding a feature to link some/all files to the remote repo?

e.g.
We have the *railscasts/196-nested-model-form-part-1.mp4* file added to git, and only available on the usb drive:

	$ git annex whereis 196-nested-model-form-part-1.mp4
	whereis 196-nested-model-form-part-1.mp4 (1 copy)
	  	a7b7d7a4-2a8a-11e1-aebc-d3c589296e81 -- origin (Portable usb drive)

I can see the link with:

	$ cd railscasts
	$ ls -ls 196*
	8 lrwxr-xr-x  1 framallo  staff  193 Dec 20 05:49 196-nested-model-form-part-1.mp4 -> ../.git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e

I save this in a variable just to make the example more clear:

	ID=".git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e"

The file doesn't exist on the local repo:

	$ ls ../$ID
	ls: ../$ID: No such file or directory

However, I can create a link to access that file on the remote repo.
First I create the needed dir:

	$ mkdir ../.git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/

Then I link to the remote file:

	$ ln -s /mnt/usb_drive/repo_folder/$ID ../$ID

Now I can open the file in the laptop repo.

I think it could be easy to implement. Maybe it's a naive approach, but it looks appealing.
Checking if it's a real file or a link shouldn't impact performance.
The limitation is that it would work only with remote repos on local dirs.

It also allows you to have one directory structure like AFS or other distributed filesystems. If the file is not local, I go to the remote server.
This is great for apps like Picasa, iTunes, and friends that depend on the file location.

> This is a duplicate of [[union_mounting]]. So closing it: [[done]].
>
> It's a good idea, but making sure git-annex correctly handles these links in all cases is a subtle problem that has not yet been tackled. --[[Joey]]

@@ -1,18 +0,0 @@
Some podcasts don't include a sortable date as the first thing in their episode title, which makes listening to them in order challenging if not impossible.

The date the item was posted is part of the RSS standard, so we should parse that and provide a new importfeed template option "itemdate".

(For the curious, I tried "itemid" thinking that might give me something close, but it doesn't. I used --template='${feedtitle}/${itemid}-${itemtitle}${extension}' and got:

	http___openmetalcast.com__p_1163-Open_Metalcast_Episode__93__Headless_Chicken.ogg

or

	http___www.folkalley.com_music_podcasts__name_2013_08_21_alleycast_6_13.mp3-Alleycast___06.13.mp3

That "works" but is ugly. :)

Would love to be able to put a YYYYMMDD at the beginning and then the title.

> [[done]]; itempubdate will use the form YYYY-MM-DD (or the raw date string
> if the feed does not use a parsable form). --[[Joey]]
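The resolution's behavior can be sketched in a few lines (Python rather than git-annex's Haskell): format the RSS pubDate as YYYY-MM-DD, falling back to the raw string when it is not parsable.

```python
from email.utils import parsedate_to_datetime

def itempubdate(raw):
    """Format an RSS pubDate (RFC 822 style) as YYYY-MM-DD, or return the
    raw string unchanged when it cannot be parsed."""
    try:
        return parsedate_to_datetime(raw).strftime("%Y-%m-%d")
    except (TypeError, ValueError):
        return raw
```

With this, a template like `${feedtitle}/${itempubdate}-${itemtitle}${extension}` yields filenames that sort chronologically, which is exactly what the request asks for.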
|
|
@ -1,16 +0,0 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://grossmeier.net/"
|
||||
nickname="greg"
|
||||
subject="Without knowing Haskell"
|
||||
date="2014-04-06T04:55:31Z"
|
||||
content="""
|
||||
Maybe this just requires adding:
|
||||
|
||||
, fieldMaybe \"itemdate\" $ getFeedPubDate $ item i
|
||||
|
||||
on line 214 in Command/ImportFeed.hs ??
|
||||
|
||||
It is supported by [Text.Feed.Query](http://hackage.haskell.org/package/feed-0.3.9.2/docs/Text-Feed-Query.html)
|
||||
|
||||
I have no haskell dev env so I can't test this, but if my suggestion is true, I might set one up :)
|
||||
"""]]
|
|
@ -1,8 +0,0 @@
|
|||
[[!comment format=mdwn
|
||||
username="http://joeyh.name/"
|
||||
ip="209.250.56.244"
|
||||
subject="comment 2"
|
||||
date="2014-04-07T19:51:27Z"
|
||||
content="""
|
||||
https://github.com/sof/feed/issues/6
|
||||
"""]]
|
|
@ -1,25 +0,0 @@
|
|||
The `Makefile` should respect a `PREFIX` passed on the commandline so git-annex can be installed in (say) `$HOME`.
|
||||
|
||||
Simple patch:
|
||||
|
||||
[[!format diff """
|
||||
diff --git a/Makefile b/Makefile
|
||||
index b8995b2..5b1a6d4 100644
|
||||
--- a/Makefile
|
||||
+++ b/Makefile
|
||||
@@ -3,7 +3,7 @@ all=git-annex $(mans) docs
|
||||
|
||||
GHC?=ghc
|
||||
GHCMAKE=$(GHC) $(GHCFLAGS) --make
|
||||
-PREFIX=/usr
|
||||
+PREFIX?=/usr
|
||||
CABAL?=cabal # set to "./Setup" if you lack a cabal program
|
||||
|
||||
# Am I typing :make in vim? Do a fast build.
|
||||
"""]]
|
||||
|
||||
--[[anarcat]]
|
||||
|
||||
> [[done]] --[[Joey]]
|
||||
|
||||
> > thanks! ;) --[[anarcat]]
|
|
@ -1,22 +0,0 @@
|
|||
The traditionnal way of marking commandline flags in a manpage is with a `.B` (for Bold, I guess). It doesn't seem to be used by mdwn2man, which makes the manpage look a little more dull than it could.
|
||||
|
||||
The following patch makes those options come out more obviously:
|
||||
|
||||
[[!format diff """
|
||||
diff --git a/Build/mdwn2man b/Build/mdwn2man
|
||||
index ba5919b..7f819ad 100755
|
||||
--- a/Build/mdwn2man
|
||||
+++ b/Build/mdwn2man
|
||||
@@ -8,6 +8,7 @@ print ".TH $prog $section\n";
|
||||
|
||||
while (<>) {
|
||||
s{(\\?)\[\[([^\s\|\]]+)(\|[^\s\]]+)?\]\]}{$1 ? "[[$2]]" : $2}eg;
|
||||
+ s/\`([^\`]*)\`/\\fB$1\\fP/g;
|
||||
s/\`//g;
|
||||
s/^\s*\./\\&./g;
|
||||
if (/^#\s/) {
|
||||
"""]]
|
||||
|
||||
I tested it against the git-annex manpage and it seems to work well. --[[anarcat]]
|
||||
|
||||
> [[done]], thanks --[[Joey]]
|
|
@ -1,5 +0,0 @@
|
|||
Support for remote git repositories (ssh:// specifically can be made to
|
||||
work, although the other end probably needs to have git-annex
|
||||
installed..)
|
||||
|
||||
[[done]], at least get and put work..
|
|
@ -1,100 +0,0 @@
|
|||
We had some informal discussions on IRC about improving the output of the `whereis` command.
|
||||
|
||||
[[!toc levels=2]]
|
||||
|
||||
First version: columns
|
||||
======================
|
||||
|
||||
[[mastensg]] started by implementing a [simple formatter](https://gist.github.com/mastensg/6500982) that would display things in columns [screenshot](http://www.ping.uio.no/~mastensg/whereis.png)
|
||||
|
||||
Second version: Xs
|
||||
==================
|
||||
|
||||
After some suggestions from [[joey]], [[mastensg]] changed the format slightly ([screenshot](http://www.ping.uio.no/~mastensg/whereis2.png)):
|
||||
|
||||
[[!format txt """
|
||||
17:01:34 <joeyh> foo
|
||||
17:01:34 <joeyh> |bar
|
||||
17:01:34 <joeyh> ||baz (untrusted)
|
||||
17:01:34 <joeyh> |||
|
||||
17:01:34 <joeyh> XXx 3? img.png
|
||||
17:01:36 <joeyh> _X_ 1! bigfile
|
||||
17:01:37 <joeyh> XX_ 2 zort
|
||||
17:01:39 <joeyh> __x 1?! maybemissing
|
||||
17:02:09 * joeyh does a s/\?/+/ in the above
|
||||
17:02:24 <joeyh> and decrements the counters for untrusted
|
||||
17:03:37 <joeyh> __x 0+! maybemissing
|
||||
"""]]

Third version: incremental
==========================

Finally, [[anarcat]] worked on making it run faster on large repositories, in a [fork](https://gist.github.com/anarcat/6502988) of that first gist. Then paging was added (so headers are repeated).

Fourth version: tuning and blocked
==================================

[[TobiasTheViking]] provided some bugfixes, and the next step was to implement the trusted/untrusted detection and have a counter.

This required more advanced parsing of the remotes; instead of starting to do some JSON parsing, [[anarcat]] figured it was time to learn some Haskell instead.

Current status: needs merge
===========================

So right now, the most recent version of the Python script is in [anarcat's gist](https://gist.github.com/anarcat/6502988) and works reasonably well. However, it doesn't distinguish between trusted and untrusted repos and so on.

Furthermore, we'd like to see this factored into the `whereis` command directly. A [raw.hs](http://codepad.org/miVJb5oK) file has been written by `mastensg`, and is now available in the above gist. It fits the desired output and prototypes, and has been Haskellized thanks to [[guilhem]].

Now we just need to merge those marvelous functions into `Whereis.hs` - but I can't quite figure out where to throw that code, so I'll leave it to someone more familiar with the internals of git-annex. The most recent version is still in [anarcat's gist](https://gist.github.com/anarcat/6502988). --[[anarcat]]

Desired output
--------------

The output we're aiming for is:

    foo
    |bar
    ||baz (untrusted)
    |||
    XXx 2+ img.png
    _X_ 1! bigfile
    XX_ 2 zort
    __x 0+! maybemissing

Legend:

* `_` - file missing from repo
* `x` - file may be present in untrusted repo
* `X` - file is present in trusted repo
* `[0-9]` - number of copies present in trusted repos
* `+` - indicates there may be more copies present
* `!` - indicates only one copy is left

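As an illustration of how the legend maps onto code, here is a small Python sketch (hypothetical; not git-annex's implementation, and the exact flag semantics are inferred from the four sample rows above) that renders one file's presence row:

```python
def whereis_flags(presence):
    """Render one file's row: presence columns plus count flags.

    presence: list of (trusted, has_copy) booleans, one per repository,
    in display order. Returns e.g. 'XXx 2+' per the legend.
    """
    cols = ''.join(
        ('X' if trusted else 'x') if has_copy else '_'
        for trusted, has_copy in presence
    )
    trusted_copies = sum(1 for t, h in presence if t and h)
    untrusted_copies = sum(1 for t, h in presence if not t and h)
    flags = str(trusted_copies)          # count only trusted copies
    if untrusted_copies:
        flags += '+'                     # there may be more copies
    if trusted_copies + untrusted_copies == 1:
        flags += '!'                     # only one copy left anywhere
    return cols + ' ' + flags

print(whereis_flags([(True, True), (True, True), (False, True)]))      # XXx 2+
print(whereis_flags([(True, False), (True, True), (True, False)]))     # _X_ 1!
print(whereis_flags([(False, False), (False, False), (False, True)]))  # __x 0+!
```

This reproduces all four example rows, including Joey's rule that untrusted copies are excluded from the counter and signalled with `+` instead.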
Implementation notes
--------------------

[[!format txt """
20:48:18 <joeyh> if someone writes me a headerWhereis :: [(RemoteName, TrustLevel)] -> String and a formatWhereis :: [(RemoteName, TrustLevel, UUID)] -> [UUD] -> FileName -> String , I can do the rest ;)
20:49:22 <joeyh> make that second one formatWhereis :: [(RemoteName, TrueLevel, Bool)] -> FileName -> String
20:49:37 <joeyh> gah, typos
20:49:45 <joeyh> suppose you don't need the RemoteName either
"""]]

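Joey asked for these functions in Haskell; purely as an illustration, a rough Python transliteration of the `headerWhereis` idea (names and exact layout are assumptions based on the screenshots above) would be:

```python
def header_whereis(remotes):
    """Build the staircase header from (name, trusted) pairs:
    each remote's label is indented one pipe column further, and a
    final all-pipes line closes the header."""
    lines = []
    for depth, (name, trusted) in enumerate(remotes):
        suffix = '' if trusted else ' (untrusted)'
        lines.append('|' * depth + name + suffix)
    lines.append('|' * len(remotes))
    return '\n'.join(lines)

print(header_whereis([('foo', True), ('bar', True), ('baz', False)]))
```

This prints the `foo` / `|bar` / `||baz (untrusted)` / `|||` header shown in the desired output.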

> So, I incorporated this, in a new remotes command.
> Showing all known repositories seemed a bit much
> (I have 30-some known repositories in some cases),
> so just showing configured remotes seems a good simplification.
> [[done]]
> --[[Joey]]

> > I would have preferred this to be optional, since I don't explicitly configure all remotes in git, especially if I can't reach them all the time (e.g. my laptop). It seems to me this should at least be an option, but I am confused as to why `Remote.List.remoteList` doesn't list all remotes the same way `Remote.remote_list` does... Also, it's unfortunate that the +/!/count flags have been dropped; they would have been useful... Thanks for the merge anyways! --[[done]]
> >
> > The more I look at this, the more I think there are a few things wrong with the new `remotes` command.
> >
> > 1. The name is confusing: being a git addict, I would expect the `git annex remote` command to behave like the `git remote` command: list remotes, add remotes, remove remotes and so on. It would actually be useful to have such a command (which would replace `initremote`, I guess). I recommend replacing the current `whereis` command, even if enabled through a special flag.
> >
> > 2. Its behavior is inconsistent with other git-annex commands: `git annex status`, for example, lists information about all remotes, regardless of whether they are configured in git. `remotes` (whatever it's called) should do the same, or at least provide an option to allow the user to list files on all remotes. The way things stand, there is no way to list files on non-git remotes, even if they are added explicitly as a remote, if the remote is not actually reachable: the files are just marked as absent (even though `whereis` actually finds them). I recommend showing all remotes regardless, either opt-in or opt-out using a flag.
> >
> > 3. Having the `!` flag, at least, would be useful because it would allow users to intuitively grep for problematic files without having to learn extra syntax. Same with `+`, and having an explicit count.
> >
> > Thanks. --[[anarcat]]

@@ -1,25 +0,0 @@
Several things suggest now would be a good time to reorganize the object
directory. This would be annex.version=2. It will be slightly painful for
all users, so this should be the *last* reorg in the foreseeable future.

1. Remove colons from filenames, for [[bugs/fat_support]].

2. Add hashing, since some filesystems do suck (like, er, fat at least :)
   [[forum/hashing_objects_directories]]
   (Also, may as well hash .git-annex/* while at it -- that's what
   really gets big.)

3. Add filesize metadata for [[bugs/free_space_checking]]. (Currently only
   present in WORM, and in an ad-hoc way.)

4. Perhaps use a generic format that will allow further metadata to be
   added later. For example,
   "bSHA1,s101111,kf3101c30bb23467deaec5d78c6daa71d395d1879"

   (Probably everything after ",k" should be part of the key, even if it
   contains the "," separator character. Otherwise an escaping mechanism
   would be needed.)
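Assuming the format proposed in item 4, a parser sketch (hypothetical Python, not git-annex's actual key parser) shows why splitting at the first ",k" makes an escaping mechanism unnecessary:

```python
def parse_key(s):
    """Parse a 'bBACKEND,sSIZE,kKEY' string per the proposed format.

    Everything after the first ',k' is the key, even if the key itself
    contains ',' -- so no escaping of the separator is needed.
    """
    meta, sep, key = s.partition(',k')   # split once, at the first ',k'
    if not sep:
        raise ValueError('no key field in: ' + s)
    fields = {}
    for part in meta.split(','):
        fields[part[0]] = part[1:]       # leading tag letter -> value
    return fields.get('b'), int(fields['s']), key

print(parse_key('bSHA1,s101111,kf3101c30bb23467deaec5d78c6daa71d395d1879'))
# → ('SHA1', 101111, 'f3101c30bb23467deaec5d78c6daa71d395d1879')
```

Tag letters before ",k" can safely be split on commas because backend names and sizes never contain them; only the trailing key is free-form.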

[[done]] now!

Although [[bugs/free_space_checking]] is not quite there. --[[Joey]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
 nickname="Richard"
 subject="comment 1"
 date="2011-03-16T01:16:48Z"
 content="""
If you support generic meta-data, keep in mind that you will need to do conflict resolution. Timestamps may not be synched across all systems, so keeping a log of old metadata could be used, sorting by history and using the latest. Which leaves the situation of two incompatible changes. This would probably mean manual conflict resolution. You will probably have thought of this already, but I still wanted to make sure this is recorded. -- RichiH
"""]]

@@ -1,8 +0,0 @@
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
 nickname="Richard"
 subject="comment 2"
 date="2011-03-16T01:19:25Z"
 content="""
Hmm, I added quite a few comments at work, but they are stuck in moderation. Maybe I forgot to log in before adding them. I am surprised this one appeared immediately. -- RichiH
"""]]

Some files were not shown because too many files have changed in this diff.