Commit graph

24478 commits

Author SHA1 Message Date
Joey Hess
5b801fcad9 on second thought, sync --content --unused is probably not useful, remove 2015-06-16 19:01:06 -04:00
Joey Hess
8b0549b408 Merge branch 'master' of ssh://git-annex.branchable.com 2015-06-16 18:56:20 -04:00
Joey Hess
2c77fb5cae devblog 2015-06-16 18:54:35 -04:00
Joey Hess
adba0595bd use bloom filter in second pass of sync --all --content
This is needed because when preferred content matches on files,
the second pass would otherwise want to drop all keys. Using a bloom filter
avoids this, and in the case of a false positive, a key will be left
undropped that preferred content would allow dropping. Chances of that
happening are a mere 1 in 1 million.
2015-06-16 18:50:13 -04:00
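
A minimal sketch of the idea in this commit, not git-annex's actual (Haskell) code: the first pass records every key it handled in a Bloom filter, and the second pass skips any key the filter reports as seen. The class `BloomFilter`, the function `second_pass`, and the sizing values are hypothetical names and numbers chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting a SHA-256 hash.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def second_pass(all_keys, seen_in_first_pass):
    """Return the keys the second pass should still consider.

    Keys the filter reports as seen are skipped, so a false positive
    only means a key is left alone that could have been dropped.
    """
    return [key for key in all_keys if key not in seen_in_first_pass]


# First pass: keys that matched files in the work tree (made-up key names).
seen = BloomFilter(num_bits=1 << 16, num_hashes=7)
seen.add("key-with-file")

remaining = second_pass(["key-with-file", "key-without-file"], seen)
```

Because a Bloom filter has no false negatives, a key handled in the first pass is never reconsidered by the second pass; the only failure mode is the harmless one the commit message describes.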
dev@c21308d8de79665e508a8f95f6f68ef82d56f698
0c69e6055d 2015-06-16 22:43:15 +00:00
Joey Hess
a0a8127956 instance Hashable Key for bloomfilter 2015-06-16 18:37:41 -04:00
Joey Hess
8b74aec3ea Increased the default annex.bloomaccuracy from 1000 to 10000000
This makes git annex unused use around 48 MB more memory than it did before,
but the massive increase in accuracy makes this worthwhile for all but the
smallest systems.

Also, I want to use the bloom filter for sync --all --content, to avoid
dropping files that the preferred content doesn't want, and 1/1000
false positives would be far too many in that use case, even if it were
acceptable for unused.

Actual memory use numbers:

1000: 21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k
1000000: 21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k
10000000: 21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k

Based on these numbers, 10 million seemed a better pick than 1 million.
2015-06-16 18:12:00 -04:00
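
The memory comparison above can be worked through from the quoted timing lines. This sketch assumes the `maxresident` field is in kilobytes, as GNU time's default output format reports it; the helper name `max_resident_kb` is made up for illustration.

```python
import re

# The three timing lines from the commit message, keyed by annex.bloomaccuracy.
TIME_OUTPUT = {
    1000: "21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k",
    1000000: "21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k",
    10000000: "21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k",
}


def max_resident_kb(line):
    """Extract the peak resident set size (assumed to be in KB) from a time(1) line."""
    match = re.search(r"(\d+)maxresident", line)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1))


baseline_kb = max_resident_kb(TIME_OUTPUT[1000])

# Extra memory used relative to the old default accuracy of 1000.
deltas_kb = {
    accuracy: max_resident_kb(line) - baseline_kb
    for accuracy, line in TIME_OUTPUT.items()
}

for accuracy, delta in deltas_kb.items():
    print(f"{accuracy}: +{delta} KB over the 1000 baseline")
```

Going from an accuracy of 1 million to 10 million costs only a few hundred KB more in this measurement, which is consistent with the commit's conclusion that 10 million is the better pick.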
Joey Hess
f7350b7c33 wording 2015-06-16 17:32:41 -04:00
Joey Hess
8268f7951e adjust standard preferred content to work better with git annex sync --all --content
backup: Use new "anything" terminal. This means that content that
is not unused, but has no associated file will be wanted by backup repos.

unwanted: "not anything" will result in any and all content moving
off of these repos.

incremental backup: Remove the "(include=* or unused)",
so it matches content that has no associated files
but is not unused.

client: Add an include=* to the expression. This limits it to matching
only files in the work tree. Without this change, sync --all --content
would match a key against the expression, and since it matches
exclude=archive/*, the client repo would have wanted the file content.
The "and not unused" would have kept unused objects out, but not
objects that were not known to be unused, or objects that another branch
referred to. In practice, everything would have flooded into client repos
without this change.
2015-06-16 17:18:53 -04:00
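
The client change can be illustrated with a toy matcher. This is a heavily simplified sketch, not git-annex's expression evaluator: it assumes a key with no associated file is represented as `None`, reduces the client expression to just the two terms discussed above, and uses hypothetical function names throughout.

```python
from fnmatch import fnmatchcase


def include(glob, path):
    # include=glob can only match when there is a file to match against.
    return path is not None and fnmatchcase(path, glob)


def exclude(glob, path):
    # exclude=glob matches anything that is not a file matching the glob,
    # which (in this sketch) includes keys that have no file at all.
    return path is None or not fnmatchcase(path, glob)


def old_client_wants(path):
    # Simplified pre-change behaviour: only the exclude term.
    return exclude("archive/*", path)


def new_client_wants(path):
    # Simplified post-change behaviour: include=* restricts matching
    # to files that are actually in the work tree.
    return include("*", path) and exclude("archive/*", path)


# A key seen by sync --all --content that has no file in the work tree.
print(old_client_wants(None))   # the old expression wants it
print(new_client_wants(None))   # the new expression does not
```

Under these assumptions, the old expression accepts every key without an associated file, which is the flooding behaviour the commit message describes; adding `include=*` closes that gap while leaving ordinary work tree files unaffected.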
Joey Hess
a4955542a3 Fix incremental backup standard preferred content expression to match its documentation, which says it does not want files that have reached a backup repository.
Checked history and these have been out of sync from the very beginning!
2015-06-16 17:10:10 -04:00
anarcat
da60a29e56 sign and split out 2015-06-16 21:06:14 +00:00
anarcat
c508c3472a first python implementation of this 2015-06-16 21:03:48 +00:00
Joey Hess
8c46ea22c2 Added new "anything" preferred content expression, which matches all versions of all files. 2015-06-16 17:03:34 -04:00
Joey Hess
29c03145e6 sync: Add support for --all and --unused. 2015-06-16 16:50:03 -04:00
anarcat
f5d84ac62e Added a comment 2015-06-16 20:10:50 +00:00
Joey Hess
58e6f033b9 update 2015-06-16 16:04:13 -04:00
Joey Hess
99a1113461 switch code to using associated files 2015-06-16 15:07:03 -04:00
Joey Hess
32adb5f0e0 actually.. 2015-06-16 14:03:13 -04:00
Joey Hess
fbc06b3d1f Merge branch 'master' of ssh://git-annex.branchable.com 2015-06-16 13:50:48 -04:00
Joey Hess
67f7f1b1cb info: Added json output for "backend usage", "numcopies stats", "repositories containing these files", and "transfers in progress". 2015-06-16 13:50:28 -04:00
eigengrau
5e9684436e Added a comment 2015-06-16 13:20:07 +00:00
https://id.koumbit.net/anarcat
bc87ed040e neat checksumming api at s3 that could be leveraged 2015-06-16 00:50:06 +00:00
anarcat
911054dbb8 Added a comment 2015-06-15 20:05:13 +00:00
anarcat
4f427c64f9 Added a comment 2015-06-15 20:02:56 +00:00
anarcat
003e979576 Added a comment 2015-06-15 19:48:46 +00:00
Joey Hess
c96b333869 clarify 2015-06-15 15:30:00 -04:00
Joey Hess
f62138b9c5 add basic progress 2015-06-15 15:27:17 -04:00
anarcat
009e961eca workaround: restarting the assistant 2015-06-15 19:16:18 +00:00
anarcat
5a8de788b0 Added a comment 2015-06-15 19:15:12 +00:00
Joey Hess
18a3d1b100 followup 2015-06-15 15:00:02 -04:00
Joey Hess
08acf42ce3 Merge branch 'master' of ssh://git-annex.branchable.com 2015-06-15 14:50:14 -04:00
Joey Hess
1eb4b47c79 layout 2015-06-15 14:48:38 -04:00
anarcat
bd6a6ac2af weird s3 sync bug? 2015-06-15 18:46:05 +00:00
Joey Hess
297b118d3e close 2015-06-15 14:27:48 -04:00
Joey Hess
a82ffec3cb comment 2015-06-15 14:26:12 -04:00
Joey Hess
149f8ced6b comment 2015-06-15 14:12:31 -04:00
Joey Hess
15d11fc903 link to todo item about this 2015-06-15 13:57:23 -04:00
Joey Hess
abe1c7b0bb bug in old version of git-annex not current version 2015-06-15 13:55:47 -04:00
Joey Hess
687202ee65 comment 2015-06-15 13:32:19 -04:00
Joey Hess
1ec2c11536 Merge branch 'master' of ssh://git-annex.branchable.com 2015-06-15 13:27:33 -04:00
zsolist@20b8dad52ed42acde0810648144f7df87b29cd39
402dfa6583 2015-06-15 08:22:45 +00:00
https://me.yahoo.com/a/WioZezwAj_PPf7_qtC0oN9Pl5iUte78gVg--#97871
6e71e9e61c 2015-06-15 04:22:34 +00:00
Joey Hess
9b38c14165 debian/cabal-wrapper: Removed this hack which should not be needed anymore. 2015-06-14 14:43:55 -04:00
Joey Hess
a6c56fb459 improve url parsing more
Can now handle, e.g., "http://[::1]/download/cdrom-fontzip[foo]", where
the first [] needs to stay unescaped, but the rest have to be escaped.
2015-06-14 13:54:24 -04:00
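
A rough sketch of the escaping rule described in this commit, written in Python rather than taken from git-annex's Haskell code: brackets in the authority part (such as an IPv6 literal) are left alone, while brackets after the first path slash are percent-encoded. The function name `escape_path_brackets` is hypothetical, and the sketch deliberately handles only `[` and `]`, not other characters such as spaces.

```python
def escape_path_brackets(url):
    """Percent-encode [ and ] after the authority, leaving the host untouched."""
    scheme_end = url.find("://")
    if scheme_end == -1:
        return url  # not an absolute URL; leave it alone

    # The path starts at the first slash after "scheme://".
    path_start = url.find("/", scheme_end + 3)
    if path_start == -1:
        return url  # no path component, nothing to escape

    authority, rest = url[:path_start], url[path_start:]
    return authority + rest.replace("[", "%5B").replace("]", "%5D")


print(escape_path_brackets("http://[::1]/download/cdrom-fontzip[foo]"))
```

Splitting on the first slash after the scheme is enough for this illustration; a real parser would also need to consider user info, queries, and fragments.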
Joey Hess
829007d629 Improve url parsing to handle some urls containing illegal [] characters in their paths.
E.g., "https://archive.org/download/zoom-2/Zoom - Release 2 (1996)(Active Software)[!].iso"
2015-06-14 13:39:44 -04:00
Joey Hess
866982aa3a comment 2015-06-14 12:54:27 -04:00
Joey Hess
303d374605 resp 2015-06-14 12:52:53 -04:00
Joey Hess
d14c86acea Merge branch 'master' of ssh://git-annex.branchable.com 2015-06-14 12:48:29 -04:00
Joey Hess
0e8f23fa06 set LC_ALL in ikiwiki build to ensure deterministic build in other locales
smcv suggested using C.UTF-8, but I want this to work beyond Debian, so went
with C, which seems to work ok.
2015-06-14 10:51:48 -04:00
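
As a general illustration of pinning the locale for a build step (not the actual change made to the ikiwiki build), a wrapper can override `LC_ALL` in the child process environment. The helper `run_with_c_locale` is a hypothetical name, and the child command here is a stand-in that only echoes the variable back.

```python
import os
import subprocess
import sys


def run_with_c_locale(cmd):
    """Run cmd with LC_ALL=C so locale-dependent output does not vary by machine."""
    env = dict(os.environ, LC_ALL="C")
    return subprocess.run(
        cmd, env=env, check=True, capture_output=True, text=True
    )


# Stand-in for a build command: print the locale the child process sees.
result = run_with_c_locale(
    [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"]
)
print(result.stdout.strip())
```

`LC_ALL` overrides the other `LC_*` variables and `LANG`, which is why setting it alone is sufficient to fix sorting and formatting behaviour for the child process.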
tomekwi
83ddf94ac5 Added a comment: … 2015-06-13 13:07:49 +00:00