From f19a45973aaad573e3e0abd6a817cb2f575ca2e7 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Thu, 31 Aug 2017 18:14:04 -0400
Subject: [PATCH 01/11] devblog

---
 doc/devblog/day_467__export_progress.mdwn | 11 +++++++++++
 1 file changed, 11 insertions(+)
 create mode 100644 doc/devblog/day_467__export_progress.mdwn

diff --git a/doc/devblog/day_467__export_progress.mdwn b/doc/devblog/day_467__export_progress.mdwn
new file mode 100644
index 0000000000..d5d32d5446
--- /dev/null
+++ b/doc/devblog/day_467__export_progress.mdwn
@@ -0,0 +1,11 @@
+Good progress on `git annex export` today. Changing the exported tree now
+works and is done efficiently. Resuming an export is working. Even
+detecting and resolving export conflicts should work (have not tested it).
+The necessary information about the export is recorded in the git-annex
+branch, including grafting in the exported tree there.
+
+There are some known problems when the tree that is exported contains
+multiple files with the same content. And git-annex is not yet able
+to download exported files from a special remote. Handling both of those
+needs way to get from keys to exported filenames. So, I plan to
+populate a sqlite database with that information next.
From 28635f01909b1b7b6620c28a4291175af2c459f0 Mon Sep 17 00:00:00 2001
From: vgp
Date: Fri, 1 Sep 2017 21:40:11 +0000
Subject: [PATCH 02/11] Added a comment

---
 .../comment_3_4788a41425d4cc65e9f529dbcdb2bf73._comment | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 doc/forum/huge_text_files___40__not_binary__41___-_compress/comment_3_4788a41425d4cc65e9f529dbcdb2bf73._comment

diff --git a/doc/forum/huge_text_files___40__not_binary__41___-_compress/comment_3_4788a41425d4cc65e9f529dbcdb2bf73._comment b/doc/forum/huge_text_files___40__not_binary__41___-_compress/comment_3_4788a41425d4cc65e9f529dbcdb2bf73._comment
new file mode 100644
index 0000000000..d070f67a33
--- /dev/null
+++ b/doc/forum/huge_text_files___40__not_binary__41___-_compress/comment_3_4788a41425d4cc65e9f529dbcdb2bf73._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="vgp"
+ avatar="http://cdn.libravatar.org/avatar/b332bfc1d3f49c196e1bff84b53d0f8b"
+ subject="comment 3"
+ date="2017-09-01T21:40:11Z"
+ content="""
+I've tried the \"directory\" special remote with encryption=shared. It works well and I got a total size of 3.5GB while the working tree .git/annex dir has 21GB :-). The problem is: the git server of my research lab gives me a disk quota of 10GB, however, I cannot access it directly to store these files using \"directory\" special remote. Is there a way to use compression (probably through encryption) with a normal git remote?
+"""]]
From fa4defc9d720076681178578f7f1b2edd2eefb3d Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Mon, 4 Sep 2017 17:02:30 -0400
Subject: [PATCH 03/11] devblog

---
 doc/devblog/day_467__firming_up_export.mdwn | 27 +++++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 doc/devblog/day_467__firming_up_export.mdwn

diff --git a/doc/devblog/day_467__firming_up_export.mdwn b/doc/devblog/day_467__firming_up_export.mdwn
new file mode 100644
index 0000000000..6b5e193ca0
--- /dev/null
+++ b/doc/devblog/day_467__firming_up_export.mdwn
@@ -0,0 +1,27 @@
+More work on `git annex export`. Made `initremote exporttree=yes` be
+required to enable exporting to a special remote. Added a sqlite database
+to keep track of what files have been exported. That let me fix the known
+problems with exporting multiple files that have the same content.
+
+The same database lets `git annex get` (etc) download content from exports.
+Since an export is not a key/value store, git-annex has to do more
+verification of content downloaded from an export. Some types of keys,
+that are not based on checksums (eg WORM and URL),
+cannot be downloaded from an export. And, git-annex will never trust
+an export to retain the content of a key, since some other tree could
+be exported over it at any time.
+
+With `git annex get` working from exports, it might be nice to also support
+`git annex copy --to export` for exporting specific files to them. However,
+that needs information that is not currently stored in the sqlite database
+until the export has already completed. One way it could work is for `git
+annex export --fast treeish --to export` to put all the filenames in the
+database but not export anything, and then `git annex copy --to export` (or
+even `git annex sync --content` to send the contents). I don't know if this
+complication is worth it.
+
+Otherwise, the export feature is fairly close to being complete now.
+Still need to make renames be handled efficiently, and add support for
+exporting to more special remotes.
+
+Today's work was supported by the NSF-funded DataLad project.
From b8b7a9a9021cb3eb263f3047f5f1c7c3080fd22f Mon Sep 17 00:00:00 2001
From: eacousineau
Date: Tue, 5 Sep 2017 01:22:19 +0000
Subject: [PATCH 04/11]

---
 ...nex_cat-file__34___type_command__63__.mdwn | 20 +++++++++++++++++++
 1 file changed, 20 insertions(+)
 create mode 100644 doc/forum/Is_there_a___34__git_annex_cat-file__34___type_command__63__.mdwn

diff --git a/doc/forum/Is_there_a___34__git_annex_cat-file__34___type_command__63__.mdwn b/doc/forum/Is_there_a___34__git_annex_cat-file__34___type_command__63__.mdwn
new file mode 100644
index 0000000000..898cb0a2ff
--- /dev/null
+++ b/doc/forum/Is_there_a___34__git_annex_cat-file__34___type_command__63__.mdwn
@@ -0,0 +1,20 @@
+Out of curiosity, is there an equivalent to `git cat-file` with `git annex`?
+
+The motivation is our usage of Bazel as a build system, which during test enforces hermiticity, and thus is very persnickity about modifying your workspace (e.g., the Git repository) while the test is being run, and usually isolates execution to a chroot'd sandbox of sorts.
+
+Ideally, the workflows I'd like are:
+
+A. Developer
+
+- 1. Clones repository.
+- 2. Inits `git annex`, and does `git annex get .` to fetch all required files.
+- 3. Runs `bazel test //repo:my_test`, which will symlink the existing large file into the sandbox, and run without a hitch.
+
+B. Tentative Contributor
+
+- 1. Clones repository. Pokes around.
+- 2. Runs `bazel test //repo:my_test`. Since the large file does not exist, under the hood `git annex cat-file` is called to directly add the file to sandbox (possibly caching it somewhere, such that `git annex get` will use the already fetch'd file).
+
+
+May I ask if this doable with simple visible commands?
+If not, is there a way to achieve this that is special remote-agnostic?
From 5e1595622521c839db785314ea47d0ebddfcfb49 Mon Sep 17 00:00:00 2001
From: EskildHustvedt
Date: Tue, 5 Sep 2017 09:16:26 +0000
Subject: [PATCH 05/11] Added a comment: Partial exports

---
 .../comment_1_382fac60766340481acbcb8c05b70f42._comment | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 doc/devblog/day_467__firming_up_export/comment_1_382fac60766340481acbcb8c05b70f42._comment

diff --git a/doc/devblog/day_467__firming_up_export/comment_1_382fac60766340481acbcb8c05b70f42._comment b/doc/devblog/day_467__firming_up_export/comment_1_382fac60766340481acbcb8c05b70f42._comment
new file mode 100644
index 0000000000..09277b2f2b
--- /dev/null
+++ b/doc/devblog/day_467__firming_up_export/comment_1_382fac60766340481acbcb8c05b70f42._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="EskildHustvedt"
+ avatar="http://cdn.libravatar.org/avatar/0be1310904ded29624b9edb4824d451b"
+ subject="Partial exports"
+ date="2017-09-05T09:16:26Z"
+ content="""
+For what it's worth, partial exports (being able to only copy certain files to an export) would be very useful for me. My main usecase is exporting to my android phone (which has an sshd in termux that I use) from my desktop. I've got some large repos where having it all on my phone isn't possible, but it would be very useful to use git-annex to upload partials (right now I'm just using plain-old-rsync for that).
+"""]]
From 70ecf52888f43660101a65f6981ec5c8f9d7878b Mon Sep 17 00:00:00 2001
From: EskildHustvedt
Date: Tue, 5 Sep 2017 09:16:59 +0000
Subject: [PATCH 06/11] Added a comment: Partial exports

---
 .../comment_2_fdd07a5bde18cac7c38393661369302c._comment | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment

diff --git a/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment b/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment
new file mode 100644
index 0000000000..6ab560f67e
--- /dev/null
+++ b/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="EskildHustvedt"
+ avatar="http://cdn.libravatar.org/avatar/0be1310904ded29624b9edb4824d451b"
+ subject="Partial exports"
+ date="2017-09-05T09:16:59Z"
+ content="""
+For what it's worth, partial exports (being able to only copy certain files to an export) would be very useful for me. My main usecase is exporting to my android phone (which has an sshd in termux that I use) from my desktop. I've got some large repos where having it all on my phone isn't possible, but it would be very useful to use git-annex to upload partials (right now I'm just using plain-old-rsync for that).
+"""]]
From 8755f320f512359d80fc0e37cc40575dab7891b6 Mon Sep 17 00:00:00 2001
From: EskildHustvedt
Date: Tue, 5 Sep 2017 09:17:44 +0000
Subject: [PATCH 07/11] removed

---
 .../comment_2_fdd07a5bde18cac7c38393661369302c._comment | 8 --------
 1 file changed, 8 deletions(-)
 delete mode 100644 doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment

diff --git a/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment b/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment
deleted file mode 100644
index 6ab560f67e..0000000000
--- a/doc/devblog/day_467__firming_up_export/comment_2_fdd07a5bde18cac7c38393661369302c._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="EskildHustvedt"
- avatar="http://cdn.libravatar.org/avatar/0be1310904ded29624b9edb4824d451b"
- subject="Partial exports"
- date="2017-09-05T09:16:59Z"
- content="""
-For what it's worth, partial exports (being able to only copy certain files to an export) would be very useful for me. My main usecase is exporting to my android phone (which has an sshd in termux that I use) from my desktop. I've got some large repos where having it all on my phone isn't possible, but it would be very useful to use git-annex to upload partials (right now I'm just using plain-old-rsync for that).
-"""]]
From 3e7d0e0de7aba8f7d1788c39bf1c87bf1d632ee8 Mon Sep 17 00:00:00 2001
From: yarikoptic
Date: Tue, 5 Sep 2017 17:00:38 +0000
Subject: [PATCH 08/11] Added datalad "super-dataset".

---
 doc/publicrepos.mdwn | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/publicrepos.mdwn b/doc/publicrepos.mdwn
index c3adf67391..cb08797a07 100644
--- a/doc/publicrepos.mdwn
+++ b/doc/publicrepos.mdwn
@@ -37,6 +37,11 @@ the public repositories that you can clone to try out git-annex.
 A slightly outdated mirror of http://ifarchive.org.
 Scripts should probably be written to update the archive regularly.
+* [datasets.datalad.org](http://datasets.datalad.org)
+ A large (over 10TB of data) collection of DataLad (git-annex) datasets, providing access primarily
+ to public neural data resources. Organized via git submodule mechanism. Although underlying
+ repositories are pure git/git-annex repositories, use of datalad tool is advised for more functionality
+ (search, recursive operation, etc). It is regularly updated and enriched.
 This is a wiki -- add your own public repository to the list! See [[tips/centralized_git_repository_tutorial]].
From 9a2e687b0db388fb1abb4c5a87cd73e20067076b Mon Sep 17 00:00:00 2001
From: karel-de-macil
Date: Wed, 6 Sep 2017 09:20:26 +0000
Subject: [PATCH 09/11]

---
 doc/forum/Is_there_a_way_to_unannex_some_file___63__.mdwn | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 doc/forum/Is_there_a_way_to_unannex_some_file___63__.mdwn

diff --git a/doc/forum/Is_there_a_way_to_unannex_some_file___63__.mdwn b/doc/forum/Is_there_a_way_to_unannex_some_file___63__.mdwn
new file mode 100644
index 0000000000..78aa6744a4
--- /dev/null
+++ b/doc/forum/Is_there_a_way_to_unannex_some_file___63__.mdwn
@@ -0,0 +1,3 @@
+Having by mistake annex a full repo i look for a way of unannex some file to make them
+managed by the "standard" git proccess again - mostly some source code file -
+Is there a way to do that ?
From fd8392b669723c7c69162f086c9ed0e9d4ef3f72 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Wed, 6 Sep 2017 11:23:04 -0400
Subject: [PATCH 10/11] update

---
 doc/thanks/list | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/thanks/list b/doc/thanks/list
index e56ecc917d..e586794f4e 100644
--- a/doc/thanks/list
+++ b/doc/thanks/list
@@ -74,3 +74,4 @@ Lukas Platz,
 Sergey Karpukhin,
 Silvio Ankermann,
 Paul Tötterman,
+Erik Bjäreholt,
From c1b9f718bc5d235f4ce176495572b88f57d54c37 Mon Sep 17 00:00:00 2001
From: Edward Betts
Date: Wed, 6 Sep 2017 11:58:17 +0100
Subject: [PATCH 11/11] move line break to fix broken link

---
 .../comment_1_cb87f7518da252b950d70c60352e848e._comment | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/todo/export/comment_1_cb87f7518da252b950d70c60352e848e._comment b/doc/todo/export/comment_1_cb87f7518da252b950d70c60352e848e._comment
index 9158d123c3..c07acc5caa 100644
--- a/doc/todo/export/comment_1_cb87f7518da252b950d70c60352e848e._comment
+++ b/doc/todo/export/comment_1_cb87f7518da252b950d70c60352e848e._comment
@@ -4,8 +4,8 @@
 subject="sounds like the dumb backend, except not dumb"
 date="2017-04-08T20:21:41Z"
 content="""
-This sounds a lot like what i was trying to do in [[todo/dumb, unsafe,
-human-readable_backend]], except done properly. :)
+This sounds a lot like what i was trying to do in
+[[todo/dumb, unsafe, human-readable_backend]], except done properly. :)

 I was wondering about that asymmetry recentrly, and it would seem like a good idea to fix this. the `--to remote` flag could especially be