thk 2020-03-06 17:38:55 +00:00 committed by admin
parent c9ba6b0e20
commit 75d382a9f7

@ -0,0 +1,49 @@
### Please describe the problem.
I have a directory special remote with exporttree=yes (encryption=none) on a USB hard drive. Both `git annex sync --content` and `git annex export` write at only around 400 KiB/s, so exporting a 9 GB DVD ISO takes a whole night.
The drive is not blazing fast, but:
- `sync; dd if=/dev/zero of=tempfile bs=1M count=10; sync` gives around 10 MB/s (I don't recall the exact number)
- rsync (with --progress turned on) copies files at 2.35 MB/s
`mount` for this drive shows:
> /dev/sdc1 on /media/thk/thk-sg1 type ext4 (rw,nosuid,nodev,relatime,sync,stripe=8191,uhelper=udisks2)
I tried to mount the drive without sync but failed. Even with the udisks2 service stopped, I could not manually mount the drive without sync (or with async); it always ended up mounted with sync.
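To compare write speeds more systematically than with a single `dd` run, something like the following Python sketch could be used. It writes a fixed amount of data in chunks of a chosen size and includes the final fsync in the timing. The function name and parameters are made up for illustration, and whether the write chunk size plays any role in the slowdown reported here is only an assumption that such a measurement might confirm or rule out.

```python
import os
import time


def measure_write_throughput(path, total_bytes, chunk_size):
    """Write total_bytes of zeros to path in chunk_size pieces.

    Returns the observed throughput in bytes per second. The timing
    includes a final fsync, so data still sitting in the page cache
    does not inflate the result.
    """
    chunk = bytes(chunk_size)
    written = 0
    start = time.monotonic()
    # buffering=0 makes every write() below a separate system call,
    # so chunk_size really is the size of each write request.
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            written += f.write(chunk[:n])
        os.fsync(f.fileno())
    elapsed = time.monotonic() - start
    return written / max(elapsed, 1e-9)
```

Running it once with a small chunk size (say 32 KiB) and once with 1 MiB against a file on the mounted drive would show whether small writes are penalized on a filesystem mounted with `sync`.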
### What steps will reproduce the problem?
TODO(thk): try other drive and other laptop once the current transfer finishes...
### What version of git-annex are you using? On what operating system?
- git-annex version: 8.20200227-gf56dfe791
- Debian testing with Kernel 5.2.17
### Please provide any additional information below.
I have now learned that there is no Linux kernel primitive to copy a file, and that doing it well is actually a high art:
<http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/copy.c>
I was surprised to see the implementation of `meteredWrite` in *Utility/Metered.hs*. I had hoped there would be some Haskell standard library function for efficient file copying. I wonder how rsync implements its progress meter, and whether the progress meter is the reason rsync had a slower write speed than dd.
Maybe it would make sense to call out to the *cp* command and just issue a *stat()* every few seconds for the progress meter? This is what I do to monitor cp progress manually.
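A minimal sketch of that idea, in Python rather than Haskell and not taken from git-annex: it starts `cp` as a child process and polls the destination's size with `stat()` until `cp` exits. The function name, the `report` callback, and the default interval are assumptions chosen for illustration.

```python
import os
import subprocess
import time


def copy_with_progress(src, dst, interval=2.0, report=None):
    """Copy src to dst with the external cp command, polling dst's size.

    report(done, total) is called every `interval` seconds while cp
    is still running. Returns the final size of dst in bytes.
    """
    total = os.stat(src).st_size
    proc = subprocess.Popen(["cp", "--", src, dst])
    while proc.poll() is None:
        try:
            done = os.stat(dst).st_size
        except FileNotFoundError:
            done = 0  # cp has not created the destination yet
        if report is not None:
            report(done, total)
        time.sleep(interval)
    if proc.returncode != 0:
        raise RuntimeError(f"cp exited with status {proc.returncode}")
    return os.stat(dst).st_size
```

Note that the size reported by `stat()` counts bytes handed to the filesystem, not bytes that have necessarily reached the disk, so on a drive mounted without `sync` the meter could run ahead of the actual transfer.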
I have no clue, but maybe these could help for fast file copying in Haskell?
- <https://github.com/snoyberg/conduit>
- <https://wiki.haskell.org/Pipes>
- reddit: [What is your take on conduits, pipes, and streams?](https://www.reddit.com/r/haskell/comments/7w79q1/what_is_your_take_on_conduits_pipes_and_streams/)
### Have you had any luck using git-annex before?
Well, I'm coming back to git-annex after several years. So far it is better than I remembered:
- Tor support is great and solves the need for a central server
- I hope that the sqlite integration will now make large collections of files manageable
- Finally we have exporttree, yeah!