old version?
parent f2817f13ac
commit 0bb3a31a6e
1 changed file with 27 additions and 2 deletions
@@ -1,4 +1,29 @@
-It seems that git-annex copies every individual file in a separate transaction. This is quite costly for mass transfers: each file involves a separate rsync invocation and the creation of a new commit. Even with a meager thousand files or so in the annex, I have to wait for fifteen minutes to copy the contents to another disk, simply because every individual file involves some disk thrashing. Also, it seems suspicious that the git-annex branch would get a thousand commits of history from the simple procedure of copying everything to a new repository. Surely it would be better to first copy everything and then create only a single commit that registers the changes to the files' availability?
+It seems that git-annex copies every individual file in a separate
+transaction. This is quite costly for mass transfers: each file involves a
+separate rsync invocation and the creation of a new commit. Even with a
+meager thousand files or so in the annex, I have to wait for fifteen
+minutes to copy the contents to another disk, simply because every
+individual file involves some disk thrashing. Also, it seems suspicious
+that the git-annex branch would get a thousand commits of history from the
+simple procedure of copying everything to a new repository. Surely it would
+be better to first copy everything and then create only a single commit
+that registers the changes to the files' availability?
 
-(I'm also not quite clear on why rsync is being used when both repositories are local. It seems to be just overhead.)
+> git-annex is very careful to commit as infrequently as possible,
+> and the current version makes *1* commit after all the copies are
+> complete, even if it transferred a billion files. The only overhead
+> incurred for each file is writing a journal file.
+> You must have an old version.
+> --[[Joey]]
+
+(I'm also not quite clear on why rsync is being used when both repositories
+are local. It seems to be just overhead.)
 
+> Even when copying to another disk it's often on
+> some slow bus, and the file is by definition large. So it's
+> nice to support resumes of interrupted transfers of files.
+> Also because rsync has a handy progress display that is hard to get with cp.
+>
+> (However, if the copy is to another directory on the same disk, it does
+> use cp, and even supports really fast copies on COW filesystems.)
+> --[[Joey]]
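The choice described in the last reply (cp within one disk, rsync across disks) could be sketched roughly as follows. This is only an illustration, not git-annex's actual code: the helper names `same_filesystem` and `copy_command` are hypothetical, comparing `st_dev` is an assumed stand-in for "same disk" (it really detects "same filesystem"), and the exact flags git-annex passes are not stated in the thread. `cp --reflink=auto` is the GNU coreutils option for copy-on-write clones, and `--partial` / `--progress` are standard rsync options for resumable transfers and a progress display.

```python
import os
from typing import List


def same_filesystem(dir_a: str, dir_b: str) -> bool:
    """Return True when both directories report the same device ID.

    Assumption: matching st_dev is used here as a proxy for "same disk".
    """
    return os.stat(dir_a).st_dev == os.stat(dir_b).st_dev


def copy_command(src_file: str, dst_dir: str, same_fs: bool) -> List[str]:
    """Build (but do not run) a copy command for one large file.

    Hypothetical helper for illustration only.
    """
    if same_fs:
        # Same filesystem: plain cp; --reflink=auto asks GNU cp for a
        # copy-on-write clone where supported, else a normal copy.
        return ["cp", "--reflink=auto", src_file, dst_dir]
    # Different filesystem, possibly on a slow bus: rsync keeps partial
    # files so an interrupted transfer can resume, and shows progress.
    return ["rsync", "--partial", "--progress", src_file, dst_dir]
```

A caller would compute `same_fs = same_filesystem(os.path.dirname(src_file), dst_dir)` and pass the resulting list to `subprocess.run`.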
	 Joey Hess