Commit graph

41900 commits

Author SHA1 Message Date
Joey Hess
36f0bdcd57
add annex.alwayscompact
Added annex.alwayscompact setting which can be unset to speed up writes to
the git-annex branch in some cases.

Sponsored-by: Dartmouth College's DANDI project
2022-07-18 16:39:19 -04:00
Joey Hess
ccff639651
Merge branch 'master' into append 2022-07-18 14:17:15 -04:00
Joey Hess
de18d92de6
efficient but unsafe journal file append
This is only for checking performance, it's not safe.

Sponsored-by: Dartmouth College's DANDI project
2022-07-18 14:17:12 -04:00
Joey Hess
1c40b927aa
minor optimisation
Avoid re-writing the file when the journal directory did not
exist.
2022-07-18 13:50:35 -04:00
Joey Hess
2e6e9876e3
Revert "lock journal before reading journal files"
This reverts commit 47358a6f95.

This added overhead, and will not be needed, because appends are going
to have to be made atomic for other reasons than avoiding incomplete
reads of data being appended.

In particular, when git-annex is interrupted in the middle of an append,
it must not leave the file with a partially written line. So appending
has to somehow be made fully atomic.
2022-07-18 13:38:12 -04:00
Joey Hess
ce455223df
split out appending to journal from writing, high level only
Currently this is not an improvement, but it allows for optimising
appendJournalFile later. With an optimised appendJournalFile, this will
greatly speed up access patterns like git-annex addurl of a lot of urls
to the same key, where the log file can grow rather large. Appending
rather than re-writing the journal file for each line can save a lot of
disk writes.

It still has to read the current journal or branch file, to check
if it can append to it, and so when the journal file does not exist yet,
it can write the old content from the branch to it. Probably the re-reads
are better cached by the filesystem than repeated writes. (If the
re-reads turn out to keep performance bad, they could be eliminated, at
the cost of not being able to compact the log when replacing old
information in it. That could be enabled by a switch.)

While the immediate need is to affect addurl writes, it was implemented
at the level of presence logs, so will also perhaps speed up location logs.
The only added overhead is the call to isNewInfo, which only needs to
compare ByteStrings. Helping to balance that out, it avoids compactLog
when it's able to append.

Sponsored-by: Dartmouth College's DANDI project
2022-07-18 13:22:50 -04:00
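
The append-versus-rewrite trade-off described in the commit above can be sketched roughly as follows. This is a minimal illustration in Python, not git-annex's Haskell code: the line layout (`timestamp status who`) is simplified, and `can_append`, `compact` and `add_line` are hypothetical names standing in for the roles the message gives to isNewInfo, compactLog and appendJournalFile. The rule used here (append only when the new line is strictly newer than everything already logged) is an assumption chosen for the example.

```python
from typing import List, Tuple

Entry = Tuple[float, str, str]  # (timestamp, status, who); hypothetical simplified layout


def parse(line: str) -> Entry:
    """Split a log line of the assumed form 'timestamp status who'."""
    ts, status, who = line.split(" ", 2)
    return float(ts), status, who


def compact(lines: List[str]) -> List[str]:
    """Full-rewrite path: keep only the newest line for each `who`."""
    newest = {}
    for line in lines:
        ts, _, who = parse(line)
        if who not in newest or ts >= parse(newest[who])[0]:
            newest[who] = line
    return sorted(newest.values(), key=lambda l: parse(l)[0])


def can_append(existing: List[str], new_line: str) -> bool:
    """Assumed rule: appending is safe when the new line is newer than every logged line."""
    new_ts = parse(new_line)[0]
    return all(parse(l)[0] < new_ts for l in existing)


def add_line(existing: List[str], new_line: str) -> Tuple[List[str], int]:
    """Return the resulting log and how many lines had to be written to disk."""
    if can_append(existing, new_line):
        # Cheap path: one line written, but superseded lines stay in the file.
        return existing + [new_line], 1
    # Expensive path: the whole compacted file is written again.
    result = compact(existing + [new_line])
    return result, len(result)
```

With a log of many lines, the cheap path writes one line per update instead of the whole file, which is the saving the commit message is after. The cost is that the file is no longer compacted on every write.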
Joey Hess
2ce1eaf56a
Merge branch 'master' into append 2022-07-18 12:38:17 -04:00
Joey Hess
4b520e0683
increase cabal-version to work with recent cabal
It started complaining about custom setup needing too old a version of
cabal, a very confusing error message.

1.12 is the version of Cabal on the i386ancient builder.

Sponsored-by: Jack Hill on Patreon
2022-07-16 14:57:29 -04:00
yarikoptic
eee1169ad5 Added a comment 2022-07-15 19:33:50 +00:00
Joey Hess
ee8acd5b5d
Merge branch 'master' of ssh://git-annex.branchable.com 2022-07-15 15:07:05 -04:00
Joey Hess
8bc9381d8d
design work 2022-07-15 15:06:40 -04:00
Joey Hess
47358a6f95
lock journal before reading journal files
This is not currently necessary; journal files are updated atomically.

However, for faster appends to large journal files, locking on read will
be needed, because appends are not atomic.

Sponsored-by: Dartmouth College's DANDI project
2022-07-15 14:43:29 -04:00
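
The difference between an atomic update and a plain append, as discussed in the commit above, can be illustrated with a short Python sketch. It assumes the common write-to-a-temporary-file-then-rename technique; the source does not say that git-annex implements its atomic journal updates exactly this way, and both function names are hypothetical.

```python
import os
import tempfile


def write_atomically(path: str, content: bytes) -> None:
    """Replace `path` so that readers see either the old or the new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)  # same directory, so the rename stays on one filesystem
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def append_in_place(path: str, line: bytes) -> None:
    """Cheaper for large files, but a reader or a crash can observe a partly written line."""
    with open(path, "ab") as f:
        f.write(line)
```

The atomic version rewrites the whole file on every update, which is what makes it slow for large journal files. The append version writes only the new line, but it needs some other mechanism, such as locking, to keep readers from seeing incomplete data.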
Joey Hess
a2b1f369d1
disable journalIgnorable in enableInteractiveBranchAccess
Fix a reversion that prevented --batch commands (and the assistant)
from noticing data written to the journal by other commands.

I have not identified which commit broke this for sure,
but probably it was aeca7c2207

--batch commands that wrote to the journal avoided the problem since
journalIgnorable gets unset on write. It's a little bit surprising that
nobody noticed that query --batch commands did not see data written by
other commands.

Sponsored-by: Dartmouth College's DANDI project
2022-07-15 13:48:41 -04:00
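
The failure mode described in the commit above can be modelled with a small Python sketch. It is an assumption-laden illustration rather than git-annex's code: the class, its methods and the dictionary-based storage are hypothetical, and only the flag's behaviour (unset on write, disabled for interactive access) is taken from the message.

```python
class BranchReader:
    """Toy model of a process that reads a branch plus a journal of uncommitted writes."""

    def __init__(self, branch: dict, journal: dict):
        self.branch = branch      # committed content, keyed by file name
        self.journal = journal    # uncommitted content, shared with other processes
        self.journal_ignorable = False

    def commit_journal(self) -> None:
        """Fold the journal into the branch; afterwards the journal may be skipped."""
        self.branch.update(self.journal)
        self.journal.clear()
        self.journal_ignorable = True

    def write(self, name: str, content: str) -> None:
        """Writing to the journal unsets the flag, so this process sees its own writes."""
        self.journal[name] = content
        self.journal_ignorable = False

    def enable_interactive_access(self) -> None:
        """Long-running use: other processes may write, so never skip the journal."""
        self.journal_ignorable = False

    def read(self, name: str):
        if not self.journal_ignorable and name in self.journal:
            return self.journal[name]
        return self.branch.get(name)
```

In this model, a long-running reader that never writes keeps the flag set and so misses journal entries added by other processes, until interactive access clears the flag.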
Joey Hess
91abd872d3
complete a comment 2022-07-15 12:59:59 -04:00
nick.guenther@e418ed3c763dff37995c2ed5da4232a7c6cee0a9
4f66f036e6 2022-07-15 16:26:18 +00:00
nick.guenther@e418ed3c763dff37995c2ed5da4232a7c6cee0a9
bfcdf8374b Added a comment 2022-07-15 15:55:38 +00:00
Joey Hess
dc2de5784a
comment 2022-07-15 11:14:13 -04:00
Joey Hess
7c8c5ffe8e
Merge branch 'master' of ssh://git-annex.branchable.com 2022-07-15 11:10:44 -04:00
Joey Hess
f561602484
comment 2022-07-15 11:10:40 -04:00
oliv5
8c4ac8e63c 2022-07-15 06:33:25 +00:00
jkniiv
c2cdf0f61f Added a comment 2022-07-14 20:32:06 +00:00
Joey Hess
94b50c61b3
comment 2022-07-14 16:09:48 -04:00
yarikoptic
f2c30bcb07 Added a comment 2022-07-14 19:42:58 +00:00
Joey Hess
2e57da226c
comments 2022-07-14 15:08:01 -04:00
Joey Hess
093ad89ead
S3: Avoid writing or checking the uuid file in the S3 bucket when importtree=yes or exporttree=yes
It does not make sense for either; importing from an existing bucket should
not write to it. And the user may not have write access at all. And exporting to
a bucket should not write other files.

Also this prevents the uuid file being imported after being written.

Sponsored-by: Dartmouth College's DANDI project
2022-07-14 15:05:51 -04:00
Joey Hess
c3df38dd15
comment 2022-07-14 13:54:50 -04:00
Joey Hess
78da3e2783
close 2022-07-14 13:53:32 -04:00
Joey Hess
557542d621
comment 2022-07-14 13:51:59 -04:00
yarikoptic
06981c6c5a Added a comment 2022-07-14 17:00:40 +00:00
yarikoptic
3c948423a9 Added a comment 2022-07-14 16:50:15 +00:00
Joey Hess
ad467791c1
optimise journal writes to not mkdir journal directory when it already exists
Sponsored-by: Dartmouth College's DANDI project
2022-07-14 12:29:39 -04:00
Joey Hess
5e407304a2
comment with a question 2022-07-14 12:13:28 -04:00
yarikoptic
c4cca7e6c6 initial request for more efficient registerurl 2022-07-14 13:40:16 +00:00
yarikoptic
ba3c48a9e3 Added a comment 2022-07-13 22:05:46 +00:00
Joey Hess
50c2cac7e7
adb: Added configuration setting oldandroid=true
To avoid using find -printf, which was first supported in Android around
2019-2020.

Probing seems too fragile, and execing stat once per file is too slow to do
when there's a faster way available, which brought me to an option...

Sponsored-by: Brett Eisenberg on Patreon
2022-07-13 18:00:47 -04:00
Joey Hess
6c7550ba62
comment 2022-07-13 17:27:30 -04:00
Joey Hess
f358805fc2
comment 2022-07-13 17:26:10 -04:00
Joey Hess
58d163cfc0
Merge branch 'master' of ssh://git-annex.branchable.com 2022-07-13 17:17:44 -04:00
Joey Hess
fbc3c223a6
filter-process: Fix protocol for empty files
This caused git to complain that filter-process failed and kill it with
signal 15, because it wrote an extra flushPkt for an empty file, which
git did not expect, and so git saw an unexpected response to the next
request.

Luckily, filter-process is only used by default in v9 and up, and v8 is
still the default. Also, git had to be updating an empty file, followed
by another file, which is a fairly unlikely situation. And git restarts
filter-process after this happens and uses it to filter the rest of the
files. So this isn't a crippling bug.

Sponsored-by: Luke Shumaker on Patreon
2022-07-13 17:13:54 -04:00
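
For context on the flushPkt mentioned in the commit above: git's long-running filter protocol frames content in pkt-line format, where each packet carries a 4-digit hexadecimal length (including the 4 header bytes) and a run of packets ends with the flush packet `0000`. The Python sketch below shows only that framing, under the assumption that a content stream is terminated by exactly one flush packet. The function names are hypothetical, and the sketch does not reproduce where git-annex emitted the extra packet.

```python
MAX_PAYLOAD = 65516   # largest data payload of one pkt-line (65520 minus the 4-byte header)
FLUSH_PKT = b"0000"   # flush packet: terminates a run of pkt-lines


def pkt_line(payload: bytes) -> bytes:
    """Frame one non-empty payload as a pkt-line."""
    if not payload:
        raise ValueError("empty payloads are not sent as data packets")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % (len(payload) + 4) + payload


def encode_content(content: bytes) -> bytes:
    """Frame content as zero or more pkt-lines followed by exactly one flush packet."""
    out = bytearray()
    for i in range(0, len(content), MAX_PAYLOAD):
        out += pkt_line(content[i:i + MAX_PAYLOAD])
    out += FLUSH_PKT
    return bytes(out)
```

Under this framing an empty file is encoded as a single flush packet with no data packets before it. Any additional flush packet would be read by the peer as part of the next exchange.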
yarikoptic
5764450a80 Added a comment 2022-07-13 20:05:34 +00:00
yarikoptic
557341315a Added a comment 2022-07-13 19:27:06 +00:00
Joey Hess
1b680d330b
revert accidental change 2022-07-13 15:17:08 -04:00
yarikoptic
fa126d1ac9 Added a comment 2022-07-13 19:04:40 +00:00
Joey Hess
7daa51a380
response to misplaced bug report 2022-07-13 14:58:04 -04:00
Joey Hess
7c7b7ac9b9
followup 2022-07-13 14:53:46 -04:00
Joey Hess
68e9b7f987
comment 2022-07-13 13:44:43 -04:00
Joey Hess
bb68909cb5
Merge branch 'master' of ssh://git-annex.branchable.com 2022-07-13 13:09:15 -04:00
Joey Hess
afbfc106af
comment 2022-07-13 13:09:04 -04:00
yarikoptic
2d71b83e7f initial complaint/call for a more efficient journal? 2022-07-13 16:19:55 +00:00
Joey Hess
aaccada8fd
comment 2022-07-12 17:00:36 -04:00