From 14213c0e42bf453acbdb0c838ec19ba7e9265d3f Mon Sep 17 00:00:00 2001
From: anarcat
Date: Sat, 1 Dec 2018 17:42:18 +0000
Subject: [PATCH] Added a comment: how about for regular key/value storage?

---
 ...mment_3_c60bbe9b280855a583c7c3e48e803760._comment | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 doc/design/assistant/deltas/comment_3_c60bbe9b280855a583c7c3e48e803760._comment

diff --git a/doc/design/assistant/deltas/comment_3_c60bbe9b280855a583c7c3e48e803760._comment b/doc/design/assistant/deltas/comment_3_c60bbe9b280855a583c7c3e48e803760._comment
new file mode 100644
index 0000000000..dc174163a8
--- /dev/null
+++ b/doc/design/assistant/deltas/comment_3_c60bbe9b280855a583c7c3e48e803760._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="anarcat"
+ avatar="http://cdn.libravatar.org/avatar/4ad594c1e13211c1ad9edb81ce5110b7"
+ subject="how about for regular key/value storage?"
+ date="2018-12-01T17:42:18Z"
+ content="""
+are there plans to have chunks stored in the regular backend storage?
+
+i'm curious because one of my use cases is archiving websites, where we end up with lots of WARC files. those files are basically a bunch of files from the website gzipp'd together in a stream, which means that multiple crawls of the same website (or actually, of different websites) have *lots* of redundant data (e.g. jQuery.js). storing those files in git-annex is not very efficient, because that data is duplicated all over the place.
+
+if the storage backend was chunked, there could be massive deduplication across those files... this is why i looked at the [[todo/borg_special_remote]]: i figured that i could at least deduplicate on the remote side, but it would still be nice to have this built-in! -- [[anarcat]]
+"""]]