Added a comment
parent
6808b08c1a
commit
30cf6ce81c
1 changed files with 14 additions and 0 deletions
@@ -0,0 +1,14 @@
[[!comment format=mdwn
username="http://joey.kitenet.net/"
nickname="joey"
subject="comment 6"
date="2011-12-22T16:39:24Z"
content="""
My main concern with putting this in git-annex is that finding duplicates necessarily involves storing a list of every key and file in the repository, and git-annex is very carefully built to avoid things that require non-constant memory use, so that it can scale to very big repositories. (The only exception is the `unused` command, and reducing its memory usage is a continuing goal.)

So I would rather come at this from a different angle, such as providing a way to output a list of files and their associated keys, which the user can then use in their own shell pipelines to find duplicate keys:

    git annex find --include '*' --format=\"%f %k\n\" | sort --key 2 | uniq --all-repeated --skip-fields=1

(Making that properly handle filenames with spaces is left as an exercise for the reader.)
"""]]
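
One possible take on the "exercise for the reader", offered only as a sketch: if the listing put the key first and the filename last, the filename could contain spaces without confusing the comparison. The function name `find_duplicate_keys` is hypothetical, and the input format (one `KEY FILENAME` pair per line, key without spaces) is an assumption rather than anything git-annex is confirmed to emit here. Filenames containing newlines are still not handled.

```shell
# Hypothetical helper, not part of git-annex.
# Reads lines of the form "KEY FILENAME" on stdin (key first, so the
# filename may contain spaces) and prints every line whose key occurs
# more than once. After sorting, awk only remembers the previous line.
find_duplicate_keys() {
    LC_ALL=C sort | awk '
        NR > 1 && $1 == prev_key {
            # Print the first member of a duplicate group once, then this line.
            if (!in_group) print prev_line
            print
            in_group = 1
        }
        NR == 1 || $1 != prev_key { in_group = 0 }
        { prev_key = $1; prev_line = $0 }
    '
}
```

Assuming the proposed `--format` option existed and accepted the key before the filename, the listing from `git annex find` could be piped straight into `find_duplicate_keys`. Sorting is left to `sort`, which can spill to disk, so the awk stage itself stays at constant memory.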