Commit graph

39725 commits

Author SHA1 Message Date
Joey Hess
0547884eb2
importfeed: fix bug while also speeding up 12x!
* Fix bug that could make git-annex importfeed not see recently recorded
  state when configured with annex.alwayscommit=false.
* importfeed: Made "checking known urls" phase run 12 times faster.

The massive speedup is because it no longer queries for metadata
accompanying each url. Instead it processes the whole git-annex branch and
checks all metadata files for feed item ids, and uses any it finds.

This could result in a behavior change, in an unlikely situation: If a feed
id is recorded in a key's metadata, but the url gets removed, the old code
would not see that item id and would re-download it if it finds an url for
it in a feed, while the new code will see the item id. I don't think
the old behavior was intentional, and it may be that the new behavior is
better. Not gonna worry about this.
2021-04-23 12:36:56 -04:00
Joey Hess
b689f17062
refactoring 2021-04-23 11:44:10 -04:00
fooness
0eb0995a26 removed 2021-04-23 15:39:28 +00:00
Joey Hess
657d55c401
convert withKnownUrls to use overBranchFileContents
This only partly fixes importfeed to see journalled files, since it
separately cats metadata directly from the branch. Held off on a
changelog for a bug fix until that's dealt with.
2021-04-23 11:32:25 -04:00
Joey Hess
da0a696c96
Revert "reorder another test"
This reverts commit 3e63f00f63.
2021-04-23 01:01:29 -04:00
Joey Hess
bb4f2af602
windows needs to die 2021-04-23 00:38:18 -04:00
Joey Hess
e623b0fd87
close old bug 2021-04-23 00:25:22 -04:00
Joey Hess
37541c6d27
rename forum post that breaks windows checkout
at least sometimes, not on the autobuilder. I think git treats the ...
as a possible path traveral or something
2021-04-22 19:15:43 -04:00
Joey Hess
27a29c99fe
update 2021-04-22 12:52:32 -04:00
Joey Hess
3e63f00f63
reorder another test
continuing to try to narrow down cause of failure on windows
2021-04-22 10:03:35 -04:00
Joey Hess
7b465515e6
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-22 09:40:25 -04:00
Atemu
07a7d33364 Added a comment 2021-04-22 10:01:01 +00:00
Joey Hess
dc37a5d1eb
update 2021-04-21 23:42:00 -04:00
pat
0b51e86f4f Added a comment 2021-04-21 22:33:55 +00:00
Joey Hess
7482c17897
comment 2021-04-21 17:30:27 -04:00
Joey Hess
4af9b3381c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-21 17:19:25 -04:00
Joey Hess
7cb96bc3e3
alternative 2021-04-21 17:18:47 -04:00
pat
7c8bf24768 2021-04-21 21:04:27 +00:00
Joey Hess
affdffb323
docs 2021-04-21 17:01:03 -04:00
Joey Hess
c687eae80b
got private repos really working
This new TODO will need private indexes to resolve; until then the
private journal has to be checked when private UUIDs are known.
2021-04-21 16:26:23 -04:00
Joey Hess
d0c5f6d2f0
optimisation
Avoid trying to read private journal files when no private uuids are
known.
2021-04-21 16:02:56 -04:00
Joey Hess
24eeacdba8
adapt recent bug fixes to support private journal
At this point, private repos should mostly work, except for a few
commands that directly read from the git-annex branch and will not see
the private journal.

Private index not yet implemented.
2021-04-21 16:01:13 -04:00
Joey Hess
0bb57702e1
Merge branch 'master' into hiddenannex 2021-04-21 15:45:12 -04:00
Joey Hess
653b719472
fix --all to include not yet committed files from the journal
Fix bug caused by recent optimisations that could make git-annex not see
recently recorded status information when configured with
annex.alwayscommit=false.

This does mean that --all can end up processing the same key more than once,
but before the optimisations that introduced this bug, it used to also behave
that way. So I didn't try to fix that; it's an edge case and anyway git-annex
behaves well when run on the same key repeatedly.

I am not too happy with the use of a MVar to buffer the list of files in the
journal. I guess it doesn't defeat lazy streaming of the list, if that
list is actually generated lazily, and anyway the size of the journal is
normally capped and small, so if configs are changed to make it huge and
this code path fire, git-annex using enough memory to buffer it all is not a
large problem.
2021-04-21 15:40:32 -04:00
Joey Hess
74acf17a31
refactoring 2021-04-21 14:29:02 -04:00
Joey Hess
6eb3c0a6b4
fix branch precacheing bug by checking journal
Fix bug caused by recent optimisations that could make git-annex not see
recently recorded status information when configured with
annex.alwayscommit=false.

When not using --all, precaching only gets triggered when the
command actually needs location logs, and so there's no speed hit there.

This is a minor speed hit for --all, because it precaches even when the
location log is not actually going to be used, and so checking the journal
is not necessary. It would have been possible to defer checking the journal
until the cache gets used. But that would complicate the usual Branch.get
code path with two different kinds of caches, and the speed hit is really
minimal. A better way to speed up --all, later, would be to avoid
precaching at all when the location log is not going to be used.
2021-04-21 14:02:15 -04:00
Joey Hess
b470673e50
wip 2021-04-21 13:46:39 -04:00
Joey Hess
56478e99ac
bug report 2021-04-21 13:24:32 -04:00
Joey Hess
9b870e29fd
Merge branch 'master' into hiddenannex 2021-04-21 13:04:40 -04:00
Joey Hess
f058618074
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-21 13:02:47 -04:00
Joey Hess
39d94919cd
reorder tests debugging windows failure
This order will work just as well, so no need to revert this change
later.
2021-04-21 13:01:41 -04:00
Kyle Meyer
876b134b54 doc/git-annex.mdwn: Fix quoting of annex.supportunlocked 2021-04-21 12:36:34 -04:00
kyle
108dc11af3 Added a comment: re: clarifying unlocked files 2021-04-21 16:34:50 +00:00
kyle
54a0db2563 Added a comment: re: Are my unlocked, annexed files still safe? 2021-04-21 16:32:42 +00:00
Joey Hess
430c5a15f5
update 2021-04-21 12:18:04 -04:00
Joey Hess
4ef3fb0b59
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-21 12:15:57 -04:00
Joey Hess
85d75b8a18
comment 2021-04-21 12:15:42 -04:00
Ilya_Shlyakhter
d568c7c88b Added a comment: clarifying unlocked files 2021-04-21 16:08:05 +00:00
pat
8e3ee7b91d Added a comment: Are my unlocked, annexed files still safe? 2021-04-21 15:47:48 +00:00
Ilya_Shlyakhter
78f31022e3 Added a comment: auto-expire temp repos 2021-04-21 15:37:38 +00:00
Joey Hess
5dae95f95f
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-20 15:18:56 -04:00
Joey Hess
154fb46b24
update 2021-04-20 15:18:18 -04:00
Joey Hess
05989556a2
start implementing hidden git-annex repositories
This adds a separate journal, which does not currently get committed to
an index, but is planned to be committed to .git/annex/index-private.

Changes that are regarding a UUID that is private will get written to
this journal, and so will not be published into the git-annex branch.

All log writing should have been made to indicate the UUID it's
regarding, though I've not verified this yet.

Currently, no UUIDs are treated as private yet, a way to configure that
is needed.

The implementation is careful to not add any additional IO work when
privateUUIDsKnown is False. It will skip looking at the private journal
at all. So this should be free, or nearly so, unless the feature is
used. When it is used, all branch reads will be about twice as expensive.

It is very lucky -- or very prudent design -- that Annex.Branch.change
and maybeChange are the only ways to change a file on the branch,
and Annex.Branch.set is only internal use. That let Annex.Branch.get
always yield any private information that has been recorded, without
the risk that Annex.Branch.set might be called, with a non-private UUID,
and end up leaking the private information into the git-annex branch.

And, this relies on the way git-annex union merges the git-annex branch.
When reading a file, there can be a public and a private version, and
they are just concacenated together. That will be handled the same as if
there were two diverged git-annex branches that got union merged.
2021-04-20 15:04:53 -04:00
Atemu
f0e15c5a0d Added a comment 2021-04-20 18:30:39 +00:00
Atemu
2f05565db5 Added a comment 2021-04-20 18:05:27 +00:00
Joey Hess
b2222e4639
optimisation
Avoid unnecessary conversion to/from String.
2021-04-20 13:13:45 -04:00
Joey Hess
c30557594e
remove now redundant function 2021-04-20 12:42:57 -04:00
fooness
f71b0218aa 2021-04-20 16:38:33 +00:00
Joey Hess
752c389849
comment 2021-04-20 11:57:10 -04:00
Joey Hess
fee6504388
fix reversion in recent CoW changes
File handle accidentially left open is both a FD leak and causes the
haskell RTS to reject opening it again with "file is locked".
2021-04-20 11:41:43 -04:00