diff --git a/doc/forum/correct_way_to_add_two_preexisting_datasets.mdwn b/doc/forum/correct_way_to_add_two_preexisting_datasets.mdwn
new file mode 100644
index 0000000000..35ed5b665a
--- /dev/null
+++ b/doc/forum/correct_way_to_add_two_preexisting_datasets.mdwn
@@ -0,0 +1,16 @@
+I've been synchronizing my data for a long time, mainly using rsync or unison. As a result I had two 3.5 GB datasets, set1 (USB drive, HFS+ partition) and set2 (HDD, ext4, Ubuntu 13.04 box), which differed only by 50 MB (new on set1). I double-checked this with diff -r before doing anything.
+I created a git annex repo in direct mode for set2 from the command line, and after that I let the assistant scan it.
+After that I created the repo for set1 and added it to the assistant. I think this is where I made my mistake.
+Instead of keeping them apart, I told the assistant to sync with set2.
+Why do I think this was a mistake? Because set2 was indexed and set1 was not, and I'm seeing a lot of file moving and copying, which in my humble opinion should not happen.
+What I expected was that only the difference would be transferred from set1 to set2.
+What it seems to be doing is moving away all the content in set1 and copying it back from set2. I think it will end correctly, but with a lot of unnecessary and risky operations.
+I think I should have added both datasets independently, let them be scanned, and only then connected them to each other.
+So, now the questions:
+1. Is that the correct way to proceed?
+2. What if I have two identical files with different modification times? I hope they are not synced, right?
+3. Is it possible to achieve this behaviour of copying only the 50 MB?
+
+Thanks in advance and keep up the good work.
+Best regards,
+ Juan