Commit graph

39210 commits

Author SHA1 Message Date
Joey Hess
530e96b80e
fix unannex data overwrite bug
unannex, uninit: When an annexed file is modified, don't overwrite the
modified version with an older version from the annex

This commit was sponsored by Mark Reidenbach on Patreon.
2021-02-22 13:35:00 -04:00
Joey Hess
224bc7579b
bug report 2021-02-22 13:03:22 -04:00
Joey Hess
62d5a73bdd
unannex, uninit: Avoid running git rm once per annexed file, for a large speedup. 2021-02-22 12:56:11 -04:00
Joey Hess
cddf2343b2
wording 2021-02-22 12:51:52 -04:00
Joey Hess
dd97017246
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-22 12:40:06 -04:00
Joey Hess
3ba349eec3
comment 2021-02-22 12:38:34 -04:00
Joey Hess
17a08f38e2
comment 2021-02-22 12:31:55 -04:00
EvanDeaubl
abb70e83dd Added a comment: One possible workaround 2021-02-22 16:23:41 +00:00
Joey Hess
266f779e84
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-22 12:13:20 -04:00
zsolt1
4381ad636e Added a comment 2021-02-21 18:11:36 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
1e4fac1046 Added a comment 2021-02-21 15:50:49 +00:00
Lukey
4f63f0d162 Added a comment 2021-02-20 21:35:10 +00:00
zsolt1
0f68eed33c Added a comment: reproduce 2021-02-20 19:06:45 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
26c19de0d9 Add workaround 2021-02-20 19:05:42 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
c37bfccb63 Initial report 2021-02-20 19:04:18 +00:00
zsolt1
c71f005889 2021-02-20 19:02:26 +00:00
Joey Hess
e5ca188544
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-19 19:43:21 -04:00
jwrauch
9dcb11fc23 Added a comment 2021-02-19 18:58:17 +00:00
yarikoptic
a876884987 initial observation about slow uninit 2021-02-19 17:08:39 +00:00
jwrauch
904f25c559 Added a comment 2021-02-19 16:19:53 +00:00
jwrauch
79fb2f499d Added a comment 2021-02-19 16:17:34 +00:00
jwrauch
dab459787f Added a comment 2021-02-19 16:10:03 +00:00
Lukey
fbbd1d7cf1 Added a comment 2021-02-19 15:43:48 +00:00
jwrauch
5e2f2f7d9b 2021-02-19 14:14:13 +00:00
Lukey
4fc0c58e24 Added a comment 2021-02-19 07:28:15 +00:00
Lukey
6708bff1a9 Added a comment 2021-02-19 07:16:27 +00:00
jodumont
a16d0a5f90 Added a comment: how to rename/remove the [here] ?? 2021-02-19 04:56:47 +00:00
jodumont
b0684e7567 Added a comment: Explanation for a noGUI usage 2021-02-19 04:39:35 +00:00
Joey Hess
06eab90d44
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-17 20:46:35 -04:00
georg.schnabel@bd6be2144f897f5caa0028e0dd1e0a65634add81
1845c4e1e3 Added a comment: import from special directory remote fails due to running out of memory 2021-02-17 14:29:53 +00:00
georg.schnabel@bd6be2144f897f5caa0028e0dd1e0a65634add81
8c2629cfd1 2021-02-17 13:58:00 +00:00
Joey Hess
381f203d1a
refactor
Avoiding using a callback simplifies this and should make it easier to
implement incremental checksumming, which will need to happen partly in
writeRetrievedContent and partly in retrieveChunks.
2021-02-16 16:03:28 -04:00
Joey Hess
48310f2d55
windows build fix from jwodder 2021-02-15 13:35:01 -04:00
Joey Hess
0cb2b1b126
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-15 13:31:55 -04:00
Joey Hess
664c15f20a
comment 2021-02-15 13:31:13 -04:00
Joey Hess
178dc5ea6c
perf note
Following up to f44d4704c6,
I tried making updateIncremental pure, avoiding the IORef overhead.
That did not improve speed though. It did complicate the interface since
thunks needed to be forced to avoid leaking memory. So am not going with
that change.

Looking at Crypto.Hash.hashUpdate, it copies a byte array on each call,
compared with hashlazy that only uses 1 copy for the whole bytestring.
That could well explain a lot of the overhead discussed in the
above-mentioned commit. Don't see any way to improve that while hashing
incrementally, except that using bigger chunks should reduce its overhead.
Since 10x larger chunks did not, I'm kind of puzzled whether it's really
what's affecting performance.
2021-02-15 12:36:45 -04:00
falsifian
4407ade4c3 Added a comment 2021-02-12 03:52:39 +00:00
jwodder
80ec124fb0 2021-02-11 16:06:00 +00:00
jwodder
69f2fb7a23 2021-02-11 16:04:20 +00:00
Joey Hess
cb7bb3e4b9
comment 2021-02-10 21:49:25 -04:00
Joey Hess
e3832af5d5
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-10 16:40:16 -04:00
Joey Hess
dc9376feeb
optimisation
IORef rather than MVar sped up benchmark mentioned in last commit to
13.0s.

This makes me wonder if changing the interface to not need the IORef
either would improve speed further.
2021-02-10 16:39:41 -04:00
Joey Hess
f44d4704c6
incremental checksum for local remotes
This benchmarks only slightly faster than the old git-annex. Eg, for a 1
GB file, 14.56s vs 15.57s. (On a RAM disk; there would certainly be
more of an effect if the file was written to disk and didn't stay in
cache.)

Commenting out the updateIncremental calls makes the same benchmark run in 6.31s.
It may be that overhead in the implementation, other than the actual
checksumming, is slowing it down. Eg, MVar access.

(I also tried using 10x larger chunks, which did not change the speed.)
2021-02-10 16:05:24 -04:00
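
The idea behind the commit above, checksumming content as it is copied rather than in a separate pass, can be sketched roughly as follows. This is an illustration only and not git-annex's Haskell implementation: the function name `copy_with_incremental_hash`, the 64 KiB default chunk size, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import io


def copy_with_incremental_hash(src, dst, chunk_size=64 * 1024):
    """Copy src to dst chunk by chunk, feeding each chunk to a running hash.

    Hypothetical helper for illustration; SHA-256 and the chunk size are
    assumptions, not values taken from git-annex.
    """
    hasher = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)      # the copy itself
        hasher.update(chunk)  # checksum updated in the same pass
    return hasher.hexdigest()


if __name__ == "__main__":
    data = b"example content " * 10000
    out = io.BytesIO()
    digest = copy_with_incremental_hash(io.BytesIO(data), out)
    # The incremental digest matches hashing the whole content at once.
    print(digest == hashlib.sha256(data).hexdigest())
```

The result is the same digest a one-shot hash would give; what changes is that the data is read only once, which is where a saving would be expected when the file does not stay in cache.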
Joey Hess
48f63c2798
stop using rsync in fileCopier
This is groundwork for calculating checksums while copying, rather than
in a separate pass, but that's not done yet. For now, avoid using rsync
(and cp on Windows), and instead read and write the file ourselves, with
resume handling.

Benchmarking vs old git-annex that used rsync, this is faster,
at least once the file size is larger than a couple of MB.
2021-02-10 14:44:35 -04:00
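
As a rough sketch of what "read and write the file ourselves, with resume handling" can look like, the example below continues a copy from whatever the destination already contains. It is not the fileCopier code: `copy_with_resume` is a hypothetical name, and the sketch assumes that any existing destination file is a correct prefix of the source, which real code would need to verify.

```python
import os


def copy_with_resume(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path, resuming after any bytes already in dst_path.

    Hypothetical helper for illustration. Assumes an existing destination
    file is a valid prefix of the source. Returns the resume offset used.
    """
    offset = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)  # skip the part that was already copied
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)  # append mode continues at the end of dst
    return offset
```

A natural next step, and the one the commit message describes as groundwork, would be to update a checksum inside the same loop so that no second pass over the file is needed.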
Joey Hess
c4c9b99e22
refactoring 2021-02-10 13:38:45 -04:00
Joey Hess
e24ddb8946
Bugfix: fsck --from an ssh remote did not actually check that the content on the remote is not corrupted
Changing to the P2P protocol broke this, because preseedTmp copies
the local copy of the object to the temp file, and then the P2P transfer
sees the right length file and uses it as-is.

The content was still verified when git-annex-shell is too old and rsync
is used, and also when the local repo does not have the object.
2021-02-10 13:29:12 -04:00
Joey Hess
6487a75d33
comment 2021-02-10 13:15:00 -04:00
Joey Hess
1c75364eac
fix missing call to check after hard linking
This could perhaps have caused a hard link to be made when the content
of the object was modified. I don't think that actually happened,
because the annexed file would have to be unlocked, with annex.thin, for
the object to get modified, and in that case, a hard link is not made.
However, to be sure, run the check.

Note that it seemed best to run the check only once, although the
current implementation is fast and safe to run repeatedly.
2021-02-10 13:07:38 -04:00
Joey Hess
f08d7688e9
Merge branch 'incrementalhash' 2021-02-10 12:42:17 -04:00
Joey Hess
4b63e932f3
incremental checksum on upload to ssh or p2p 2021-02-10 12:41:05 -04:00