Commit graph

41079 commits

Author SHA1 Message Date
Joey Hess
b60a041e9e
comment 2021-12-31 11:17:26 -04:00
Joey Hess
df269b2f8f
Merge branch 'master' of ssh://git-annex.branchable.com 2021-12-31 11:04:52 -04:00
falsifian
e7c78d2910 2021-12-30 19:01:54 +00:00
Joey Hess
cf61f955ad
also run query tests in a readonly repo
Sponsored-by: Dartmouth College's Datalad project
2021-12-30 13:16:57 -04:00
Joey Hess
ff4486b91c
comments 2021-12-30 12:45:00 -04:00
Joey Hess
283ef4aae6
Merge branch 'master' of ssh://git-annex.branchable.com 2021-12-30 12:33:50 -04:00
Joey Hess
f8ebd0363b
complete the magic wormhole pairing appid transition
Started in 2017 in commit 3fe9d99f24.
Starting tomorrow, all versions of git-annex since then will provide
an appid, and so it will no longer be necessary to check the date.

Sponsored-by: Nicholas Golder-Manning on Patreon
2021-12-30 12:16:22 -04:00
jkniiv
465bf26000 Added a comment: very much appreciated 2021-12-29 17:20:30 +00:00
yarikoptic
34e2e37ffd Added a comment: thank you! 2021-12-29 02:00:48 +00:00
Joey Hess
5abd4cf275
close 2021-12-28 13:40:35 -04:00
Joey Hess
b1d719f9d2
handle transitions with read-only unmerged git-annex branches
Capstone to this feature. Any transitions that have been performed on an
unmerged remote ref but not on the local git-annex branch, or vice-versa,
have to be applied on the fly when reading files.

Sponsored-by: Dartmouth College's Datalad project
2021-12-28 13:23:32 -04:00
Joey Hess
1291a7d86c
Merge branch 'master' into readonly-annex-merge 2021-12-28 13:03:27 -04:00
Joey Hess
720baf820e
refactoring 2021-12-28 12:15:51 -04:00
Joey Hess
91317dd2bb
Merge branch 'master' of ssh://git-annex.branchable.com 2021-12-27 15:46:01 -04:00
Joey Hess
4257c23370
update on status 2021-12-27 15:45:22 -04:00
Joey Hess
058193adc6
prevent git-annex log with read-only unmerged git-annex branches
It would display incomplete information, which would differ from the
information displayed with write access. So refuse to display anything.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 15:44:15 -04:00
Joey Hess
23a485498f
handle Annex.Branch.files with read-only unmerged git-annex branches
It would be difficult to make Annex.Branch.files query the unmerged
git-annex branches. Might be possible, similar to what was discussed in
7f6b2ca49c, but again I decided to make it not do anything in that
situation to start with before adding such a complicated thing.

git-annex info uses it when getting info about a repository. The choices
were to make that fail with an error, or display the info it can, and
change the output slightly for the bits of info it cannot access. While
that is a behavior change, and I want to avoid any behavior changes due
to unmerged git-annex branches in a read-only repo, displaying a message
that is not a number seems unlikely to break anything that was consuming
a number, any worse than throwing an exception would. Probably.

Also git-annex unused --from origin is made to throw an error, but
it would fail later anyway when trying to write to the unused log files.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 15:28:31 -04:00
jasonb@ab4484d9961a46440958fa1a528e0fc435599057
fe26c9aa60 Added a comment 2021-12-27 18:51:14 +00:00
Joey Hess
7f6b2ca49c
handle overBranchFileContents with read-only unmerged git-annex branches
This makes --all error out in that situation. Which is better than
ignoring information from the branches.

To really handle the branches right, overBranchFileContents would need
to both query all the branches and union merge file contents
(or perhaps not provide any file content), as well as diffing between
branches to find files that are only present in the unmerged branches.
And also, it would need to handle transitions.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 14:30:51 -04:00
Joey Hess
d9d0fe5fa4
disable precaching git-annex branch when there are unmerged branches in a read-only repo
The way precaching works, it can't merge in information from those
branches efficiently, so just disable it and fall back to
Annex.Branch.get in order to get the correct information.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 14:08:50 -04:00
Joey Hess
6b7601c7f6
Merge branch 'master' into readonly-annex-merge 2021-12-27 13:46:03 -04:00
Joey Hess
38f7f36e9c
Merge remote-tracking branch 'origin/master' 2021-12-27 13:45:21 -04:00
Joey Hess
0c208e2cdb
comment 2021-12-27 13:44:49 -04:00
Joey Hess
1e09cf661e
remove git-annex branch ref from unmerged refs list
It's queried separately so it was causing extra work to include it.
2021-12-27 13:33:27 -04:00
Joey Hess
6d7ecd9e5d
merge git-annex branch in memory in read-only repository
Improved support for using git-annex in a read-only repository: git-annex
branch information from remotes that cannot be merged into the git-annex
branch will now not crash it, but will be merged in memory.

To avoid this making git-annex behave one way in a read-only repository,
and another way when it can write, it's important that Annex.Branch.get
return the same thing (modulo log file compaction) in both cases.

This manages that mostly. There are some exceptions:

- When there is a transition in one of the remote git-annex branches
  that has not yet been applied to the local or other git-annex branches.
  Transitions are not handled.
- `git-annex log` runs git log on the git-annex branch, and so
  it will not be able to show information coming from the other, not yet
  merged branches.
- Annex.Branch.files only looks at files in the git-annex branch and not
  unmerged branches. This affects git-annex info output.
- Annex.Branch.hs.overBranchFileContents ditto. Affects --all and
  also importfeed (but importfeed cannot work in a read-only repo
  anyway).
- CmdLine.Seek.seekFilteredKeys when precaching location logs.
  Note use of Annex.Branch.fullname
- Database.ContentIdentifier.needsUpdateFromLog and updateFromLog

These warts make this not suitable to be merged yet.

This readonly code path is more expensive, since it has to query several
branches. The value does get cached, but still large queries will be
slower in a read-only repository when there are unmerged git-annex
branches.

When annex.merge-annex-branches=false, updateTo skips doing anything,
and so the read-only repository code does not get triggered. So a user who
is bothered by the extra work can set that.

Other writes to the repository can still result in permissions errors.
This includes the initial creation of the git-annex branch, and of course
any writes to the git-annex branch.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 13:21:15 -04:00
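A minimal sketch of the in-memory merge idea from commit 6d7ecd9e5d, written in Python rather than git-annex's Haskell. It assumes a union merge that keeps every distinct line from each branch's copy of a file. The dict-based Branch type and the names union_merge and branch_get are hypothetical, chosen for illustration, and are not git-annex's API. Real log files carry timestamps and may be compacted, which this sketch ignores.

```python
from typing import Dict, Iterable, List

# Hypothetical stand-in for a branch's tree: file path -> log lines.
Branch = Dict[str, List[str]]


def union_merge(line_sets: Iterable[List[str]]) -> List[str]:
    """Return every distinct line, in first-seen order."""
    seen = set()
    merged: List[str] = []
    for lines in line_sets:
        for line in lines:
            if line not in seen:
                seen.add(line)
                merged.append(line)
    return merged


def branch_get(path: str, local: Branch, unmerged: List[Branch]) -> List[str]:
    """Read a file as if the unmerged branches had already been merged.

    Nothing is written anywhere; the merge happens only in memory, which
    is what a read-only repository needs.
    """
    return union_merge(b.get(path, []) for b in [local, *unmerged])


if __name__ == "__main__":
    local = {"abc.log": ["1 uuid-a", "0 uuid-b"]}
    remote = {"abc.log": ["0 uuid-b", "1 uuid-c"]}
    print(branch_get("abc.log", local, [remote]))
```

Every read consults all of the branches, which matches the commit's note that this code path is more expensive than reading a single, already merged branch.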
Joey Hess
ba3d89935b
status 2021-12-27 13:21:09 -04:00
Joey Hess
1363c89fd3
status 2021-12-26 14:33:32 -04:00
Joey Hess
da6aa6e944
retitle 2021-12-26 12:33:34 -04:00
Joey Hess
575cd71ce4
comment 2021-12-26 12:31:17 -04:00
Joey Hess
5ff55f622d
improve sync message in export edge case
sync: Better error message when unable to export to a remote because
remote.name.annex-tracking-branch is configured to a ref that does not
exist.

It does not suggest how to fix the problem because there are several
possible solutions: Change the git config to point to something that does
exist, git add some files, or put files on the special remote that will be
imported and so populate the ref.

I considered just silently not doing anything, which is what it does
when annex-tracking-branch = master and nothing has been committed to
master yet. But it seems better to be explicit about it, since this is a
fairly confusing situation to find yourself in.

Sponsored-By: Max Thoursie on Patreon
2021-12-23 14:45:01 -04:00
Joey Hess
1ca73107a3
comment 2021-12-23 14:03:04 -04:00
Joey Hess
6600cd2df3
response 2021-12-22 13:02:12 -04:00
tim@5431dd39464df207b7d46d3cf1bc74c82123ac68
139683a56d 2021-12-19 16:30:18 +00:00
jenkin.schibel@286264d9ceb79998aecff0d5d1a4ffe34f8b8421
aad516ee42 2021-12-17 17:45:38 +00:00
jasonb@ab4484d9961a46440958fa1a528e0fc435599057
9e402d21a3 Added a comment 2021-12-16 23:41:56 +00:00
manishofyore@b68d21cd485417e84ea87876a9064f82714a08a1
5098f970e2 Added a comment 2021-12-16 19:35:03 +00:00
Joey Hess
f566658b31
comment 2021-12-16 15:24:59 -04:00
manishofyore@b68d21cd485417e84ea87876a9064f82714a08a1
3ab86307bb Added a comment 2021-12-16 18:53:15 +00:00
Joey Hess
1d4e1c2f6d
comment 2021-12-16 10:54:49 -04:00
Joey Hess
a03e9107cb
wording 2021-12-14 13:53:36 -04:00
Joey Hess
681d8611be
fix flush order reversion
Commit c2e46f4707 caused the queue to possibly be flushed in the wrong
order when it contained a mix of different actions.
2021-12-14 13:51:00 -04:00
Joey Hess
8b3238cf42
Merge branch 'master' of ssh://git-annex.branchable.com 2021-12-14 13:27:11 -04:00
Joey Hess
c2e46f4707
improve git command queue flushing with time limit
So that eg, addurl of several large files that take time to download will
update the index for each file, rather than deferring the index updates to
the end.

In cases like an add of many smallish files, where a new file is being
added every few seconds, the queue will still build up a
lot of changes which are flushed at once, for best performance. Since
the default queue size is 10240, often it only gets flushed once at the
end, same as before. (Notice that updateQueue updated _lastchanged
when adding a new item to the queue without flushing it; that is
necessary to avoid it flushing the queue every 5 minutes in this case.)

But, when it takes more than 5 minutes to add a file, the overhead of
updating the index immediately is probably small, so do it after each
file. This avoids git-annex potentially taking a very very long time
indeed to stage newly added files, which can be annoying to the user who
would like to get on with doing something with the files it's already
added, eg using git mv to rename them to a better name.

This is only likely to cause a problem if it takes say, 30 seconds to
update the index; doing an extra 30 seconds of work after every 5
minute file add would be less optimal. Normally, updating the index takes
significantly less time than that. On a SSD with 100k files it takes
less than 1 second, and the index write time is bound by disk read and
write so is not too much worse on a hard drive. So I hope this will not
impact users, although if it does turn out to, the time limit could be
made configurable.

A perhaps better way to do it would be to have a background worker
thread that wakes up every 60 seconds or so and flushes the queue.
That is made somewhat difficult because the queue can contain Annex
actions and so this would add a new source of concurrency issues.
So I'm trying to avoid that approach if possible.

Sponsored-by: Erik Bjäreholt on Patreon
2021-12-14 12:23:19 -04:00
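The flushing rule described in commit c2e46f4707 can be sketched roughly as follows, in Python rather than git-annex's Haskell. This is an illustration under assumptions, not the actual implementation: the TimedQueue class, its parameters, and the injectable clock are hypothetical. Only the default queue size of 10240, the 5 minute limit, and the idea of updating a last-changed time on every add come from the commit message.

```python
import time
from typing import Callable, List


class TimedQueue:
    """Hypothetical queue that flushes when full or after a long gap."""

    def __init__(self, flush: Callable[[List[str]], None],
                 max_size: int = 10240, time_limit: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._flush = flush
        self._max_size = max_size
        self._time_limit = time_limit
        self._clock = clock
        self._items: List[str] = []
        self._last_changed = clock()

    def add(self, item: str) -> None:
        now = self._clock()
        self._items.append(item)
        # Flush when the queue is full, or when more than time_limit
        # seconds passed since the queue last changed.
        if (len(self._items) >= self._max_size
                or now - self._last_changed > self._time_limit):
            self.flush()
        # Updating the time on every add, even without a flush, keeps a
        # steady stream of quick adds from flushing every 5 minutes.
        self._last_changed = now

    def flush(self) -> None:
        if self._items:
            self._flush(self._items)
            self._items = []
        self._last_changed = self._clock()


if __name__ == "__main__":
    q = TimedQueue(lambda items: print("flushing", items), max_size=2)
    q.add("file1")
    q.add("file2")  # reaches max_size, so the queue is flushed here
```

With this rule, many quick adds batch up as before, while an add that arrives more than 5 minutes after the previous change is flushed immediately along with anything queued ahead of it.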
manishofyore@b68d21cd485417e84ea87876a9064f82714a08a1
0ba973463f Added a comment 2021-12-13 23:12:53 +00:00
Joey Hess
fe31951e5e
close 2021-12-13 13:13:54 -04:00
Joey Hess
22e805b9f2
clarify 2021-12-13 12:48:45 -04:00
Joey Hess
ca99d43a2a
comment 2021-12-13 12:47:52 -04:00
Joey Hess
50cfc4e71f
comment 2021-12-13 12:46:47 -04:00
Joey Hess
3e199a558d
comment 2021-12-13 12:38:27 -04:00
jasonb@ab4484d9961a46440958fa1a528e0fc435599057
14823f485d Added a comment 2021-12-12 02:49:28 +00:00