I've been running some large transfers with the assistant, and looking at
ways to improve performance. (I also found and fixed a zombie process
leak.)

----

One thing I noticed is that the assistant pushes changes to the git-annex
location log quite frequently during a batch transfer. If the files being
transferred are reasonably sized, it'll be pushing once per file transfer.
It would be good to reduce the number of pushes, but the pushes are
important in some network topologies to inform other nodes
when a file gets close to them, so they can get the file too.

I need to see if I can find a smart way to avoid some of the pushes.
For example, if we've just downloaded a file, and are queuing uploads
of the file to a remote, we probably don't need to push the git-annex
branch to the remote.
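
As a rough illustration of that heuristic, here is a small Python sketch
(not the assistant's actual Haskell code). The `Transfer` record, the
`remotes_to_push` function, and the key and remote names are all
hypothetical; the only rule encoded is the one described above: after a
download, skip pushing to any remote that already has an upload of the same
file queued.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    key: str        # placeholder identifier for the file's content
    remote: str     # remote on the other end of the transfer
    direction: str  # "download" or "upload"


def remotes_to_push(finished, queued, remotes):
    """Return the remotes that should still get a git-annex branch push
    after the `finished` transfer completes (hypothetical helper)."""
    if finished.direction != "download":
        # Only the "just downloaded, about to upload" case is optimised.
        return list(remotes)
    # Remotes that are about to be sent this same file anyway.
    pending = {
        t.remote
        for t in queued
        if t.direction == "upload" and t.key == finished.key
    }
    return [r for r in remotes if r not in pending]


done = Transfer("KEY-1", "laptop", "download")
queue = [Transfer("KEY-1", "server", "upload")]
print(remotes_to_push(done, queue, ["laptop", "server", "backup"]))
# prints ['laptop', 'backup']
```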

----

Another performance problem is that having the webapp open while transfers
are running uses significant CPU just for the browser to update the progress
bar. Unsurprising, since the webapp is sending the browser a new `<div>`
each time. Updating the DOM from JavaScript instead would avoid that;
the webapp just needs to send the JavaScript either a full `<div>` or a
changed percentage and quantity complete to update a single progress bar.
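
To make the "full `<div>` or just the changed numbers" idea concrete, here
is a minimal server-side sketch in Python. It is only an assumption about
how such messages might look: the message shape, the `progress_message`
function, and the CSS class names are invented for illustration and are not
what the webapp actually sends.

```python
import html
import json


def progress_message(transfer_id, percent, done, known_ids):
    """Build one JSON message for the browser (hypothetical format).

    `known_ids` is the set of transfers the browser already has a <div>
    for. A new transfer gets the full markup once; after that only the
    changed percentage and quantity complete are sent.
    """
    if transfer_id not in known_ids:
        known_ids.add(transfer_id)
        safe_id = html.escape(transfer_id, quote=True)
        div = (
            f'<div id="{safe_id}" class="progress">'
            f'<div class="bar" style="width: {percent}%"></div>'
            f'<span class="done">{html.escape(done)}</span></div>'
        )
        return json.dumps({"type": "full", "id": transfer_id, "html": div})
    return json.dumps(
        {"type": "update", "id": transfer_id, "percent": percent, "done": done}
    )


seen = set()
print(progress_message("t1", 10, "1 MB", seen))  # full <div>, sent once
print(progress_message("t1", 20, "2 MB", seen))  # small update afterwards
```

On the browser side, a script would then insert the markup for a "full"
message and only adjust the bar width and text for an "update" message.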

I'd prefer to wait on doing that until I'm able to use Fay to generate
JavaScript from Haskell, because it would be much more pleasant. We'll see.

----

Another performance problem when performing lots of transfers, particularly
of small files, is that the assistant forks off a `git annex transferkey`
for each transfer, and that in turn has to start up several git commands.

Today I have been working to change that, so the assistant maintains a
pool of transfer processes, and dispatches each transfer it wants to make
to a process from the pool. I just got all that to build, although it's
untested so far, in the `transferpools` branch.
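
The general shape of such a pool can be sketched in Python as below. This
is not the code in the `transferpools` branch: the `TransferPool` class is
hypothetical, and the worker is a stand-in script that just echoes each
request, where a real worker would perform the transfer. The point is only
that the processes are started once and then reused, so the per-transfer
startup cost goes away.

```python
import queue
import subprocess
import sys

# Stand-in worker: reads one request per line and replies with one line.
WORKER_SRC = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print('done ' + line.strip(), flush=True)\n"
)


class TransferPool:
    def __init__(self, size):
        self.idle = queue.Queue()  # workers not currently in use
        self.procs = []
        for _ in range(size):
            p = subprocess.Popen(
                [sys.executable, "-c", WORKER_SRC],
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
            )
            self.procs.append(p)
            self.idle.put(p)

    def transfer(self, request):
        """Hand one request to an idle worker, blocking until one is free."""
        p = self.idle.get()
        try:
            p.stdin.write(request + "\n")
            p.stdin.flush()
            return p.stdout.readline().strip()
        finally:
            self.idle.put(p)  # return the worker to the pool for reuse

    def close(self):
        for p in self.procs:
            p.stdin.close()  # EOF tells the worker to exit
            p.wait()
            p.stdout.close()


pool = TransferPool(2)
print(pool.transfer("key1"))  # prints "done key1"
print(pool.transfer("key2"))  # prints "done key2"
pool.close()
```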