Commit graph

39871 commits

Author SHA1 Message Date
Joey Hess
63de81b52a
Merge branch 'master' into trackassociated 2021-05-24 16:27:24 -04:00
Joey Hess
2de49c186f
update 2021-05-24 16:27:07 -04:00
Joey Hess
44a0d21e57
Merge branch 'master' into trackassociated 2021-05-24 16:24:53 -04:00
Joey Hess
73e1507c72
fix deadlock
git-annex test hung, at varying points depending
on when git decided to run the smudge clean filter.

Recent changes to reconcileStaged caused a deadlock when git write-tree
for some reason decides to run the smudge clean filter, which tries
to open the keys db and blocks waiting for the lock file that its
grandparent has locked.

I don't know why git write-tree does that. It's supposed to only write a
tree from the index which needs no smudge/clean filtering.

I've verified that, in a situation where git write-tree runs the clean
filter, disabling the filter results in a tree being written that
contains the annex link, not, e.g., the worktree file content. So it seems
safe to disable the clean filter, but also this seems likely to be
working around a bug in git because it seems it is running the clean
filter in a situation where the object has already been cleaned.

Sponsored-by: Dartmouth College's Datalad project
2021-05-24 16:19:26 -04:00
Joey Hess
5d18994736
clearer language 2021-05-24 14:54:51 -04:00
Joey Hess
f46e4c9b7c
fix case where keys db was not initialized in time
When the keys db was opened for read and did not exist yet, it used to
skip creating it, and return mempty values. But that prevents
reconcileStaged from populating associated files information in time for
the read. This fixes the one remaining case I know of where
the fix in a56b151f90 didn't work.

Note that, when there is a permissions error, it still avoids creating
the db and returns mempty for all queries. This does mean that
reconcileStaged does not run and so it may want to drop files that it
should not. However, presumably a permissions error on the keys database
also means that the user does not have permission to delete annex
objects, so they won't be able to drop the files anyway.

Sponsored-by: Dartmouth College's Datalad project
2021-05-24 14:46:59 -04:00
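The open-for-read behavior described in f46e4c9b7c can be sketched roughly as follows. This is a minimal illustration in Python with sqlite3, not git-annex's actual Haskell code: the names `open_keys_db` and `associated_files`, the table layout, and the use of a missing directory to simulate an unopenable database are all assumptions made for the example.

```python
import os
import sqlite3
import tempfile
from typing import List, Optional


def open_keys_db(path: str) -> Optional[sqlite3.Connection]:
    """Open the db, creating it when missing; return None if it cannot be opened."""
    try:
        conn = sqlite3.connect(path)  # creates the file when it does not exist
        # Hypothetical schema: one row per (key, associated file) pair.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS associated (key TEXT, file TEXT)")
        return conn
    except (PermissionError, sqlite3.OperationalError):
        # Assumption: any failure to open is treated like a permissions error.
        return None


def associated_files(conn: Optional[sqlite3.Connection], key: str) -> List[str]:
    """Query the files associated with a key; empty when there is no usable db."""
    if conn is None:
        return []  # mirrors "returns mempty for all queries"
    rows = conn.execute(
        "SELECT file FROM associated WHERE key = ? ORDER BY file", (key,))
    return [file for (file,) in rows]


with tempfile.TemporaryDirectory() as tmp:
    # Missing db: it is created, so it can be populated before the read.
    conn = open_keys_db(os.path.join(tmp, "keys.db"))
    conn.execute("INSERT INTO associated VALUES ('KEY1', 'a.bin')")
    print(associated_files(conn, "KEY1"))
    conn.close()

    # Unopenable db (stand-in for a permissions error): queries come back empty.
    broken = open_keys_db(os.path.join(tmp, "missing", "keys.db"))
    print(associated_files(broken, "KEY1"))
```

The point of creating the database on a read is that the populating step (reconcileStaged in the commit message) gets a chance to run before the query is answered; only the error path keeps the old "empty result" behavior.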
Joey Hess
a56b151f90
fix longstanding indeterminate preferred content for duplicated file problem
* drop: When two files have the same content, and a preferred content
  expression matches one but not the other, do not drop the file.
* sync --content, assistant: Fix an edge case where a file that is not
  preferred content did not get dropped.

The sync --content edge case is that handleDropsFrom loaded associated files
and used them without verifying that the information from the database was
not stale.

It seemed best to avoid changing --want-drop's behavior; this way, when
debugging a preferred content expression with it, the files matched will
still reflect the expression. So added a note to the --want-drop documentation,
to make clear it may not behave identically to git-annex drop --auto.

While it would be possible to introspect the preferred content
expression to see if it matches on filenames, and only look up the
associated files when it does, it's generally fairly rare for 2 files to
have the same content, and the database lookup is already avoided when
there's only 1 file, so I did not implement that further optimisation.

Note that there are still some situations where the associated files
database does not get locked files recorded in it, which will prevent
this fix from working.

Sponsored-by: Dartmouth College's Datalad project
2021-05-24 14:07:05 -04:00
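The drop rule from a56b151f90 (content shared by several files is only dropped when no file using it is preferred content) can be illustrated with a small Python sketch. The function names and the glob used as a stand-in for a preferred content expression are hypothetical; git-annex's real matcher is considerably richer.

```python
from fnmatch import fnmatch
from typing import Callable, Iterable


def safe_to_drop(associated_files: Iterable[str],
                 wanted: Callable[[str], bool]) -> bool:
    """Return True only when no file sharing this content is wanted."""
    return not any(wanted(path) for path in associated_files)


# Hypothetical preferred content expression: keep everything under "keep/".
def wanted(path: str) -> bool:
    return fnmatch(path, "keep/*")


# Two files with identical content share one key, so both are consulted.
print(safe_to_drop(["keep/a.bin", "scratch/a-copy.bin"], wanted))  # False
# With a single file there is nothing else to check.
print(safe_to_drop(["scratch/a-copy.bin"], wanted))                # True
```

Consulting every associated file is what removes the indeterminacy: the answer no longer depends on which of the duplicate files happened to be examined.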
Joey Hess
78be7cf73f
remove warning about combining options
the option parser no longer allows combining --want-get/--want-drop with
options like --all
2021-05-24 13:53:28 -04:00
Joey Hess
d62d6e2fcf
note about a wart
All code that uses associated files already deals with this problem,
which used to be worse. Unfortunately I was not able to entirely
eliminate it, although it happens in fewer cases now.
2021-05-24 12:05:49 -04:00
Joey Hess
c1b5028211
update 2021-05-24 11:59:01 -04:00
Joey Hess
13423f337c
refactoring 2021-05-24 11:38:22 -04:00
Joey Hess
efae085272
fixed reconcileStaged crash when index is locked or in conflict
Eg, when git commit runs the smudge filter.

Commit 428c91606b introduced the crash,
as write-tree fails in those situations. Now it will work, and git-annex
always gets up-to-date information even in those situations. It does
need to do a bit more work, each time git-annex is run with the index
locked. Although if the index is unmodified from the last time
write-tree succeeded, that work is avoided.
2021-05-24 11:33:23 -04:00
Joey Hess
3698e804d4
Merge branch 'master' into trackassociated 2021-05-24 10:24:53 -04:00
Joey Hess
b81f5532c6
comment 2021-05-21 16:44:44 -04:00
Joey Hess
428c91606b
include locked files in the keys database associated files
Before, only unlocked files were included.

The initial scan now scans for locked as well as unlocked files. This
does mean it gets a little bit slower, although I optimised it as well
as I think it can be.

reconcileStaged changed to diff from the current index to the tree of
the previous index. This lets it handle deletions as well, removing
associated files for both locked and unlocked files, which did not
always happen before.

On upgrade, there will be no recorded previous tree, so it will diff
from the empty tree to current index, and so will fully populate the
associated files, as well as removing any stale associated files
that were present due to them not being removed before.

reconcileStaged now does a bit more work. Most of the time, this will
just be due to running more often, after some change is made to the
index, and since there will be few changes since the last time, it will
not be a noticeable overhead. What may turn out to be a noticeable
slowdown is changing to a branch: it then has to go through the diff
from the previous index to the new one, and if there are lots of
changes, that could take a long time. The same applies after adding or
deleting a lot of files, or moving a large subdirectory.

Command.Lock used removeAssociatedFile, but now that's wrong because a
newly locked file still needs to have its associated file tracked.

Command.Rekey used removeAssociatedFile when the file was unlocked.
It could remove it also when it's locked, but it is not really
necessary, because it changes the index, and so the next time git-annex
runs and accesses the keys db, reconcileStaged will run and update it.

There are probably several other places that use addAssociatedFile and
don't need to any more for similar reasons. But there's no harm in
keeping them, and it probably is a good idea to, if only to support
mixing this with older versions of git-annex.

However, mixing this and older versions does risk reconcileStaged not
running, if the older version already ran it on a given index state. So
it's not a good idea to mix versions. This problem could be dealt with
by changing the name of the gitAnnexKeysDbIndexCache, but that would
leave the old file dangling, or it would need to keep trying to remove
it.
2021-05-21 16:24:37 -04:00
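The diff-based update described in 428c91606b can be sketched as follows. This is a simplified Python model, not the real implementation: trees are represented as plain path-to-key dictionaries, and `reconcile` is a hypothetical name. It only shows how diffing the previously recorded tree against the current index handles additions and deletions, and why an empty previous tree populates everything.

```python
from typing import Dict, Set

Tree = Dict[str, str]  # path -> key (hypothetical stand-in for a git tree)


def reconcile(previous: Tree, current: Tree,
              associated: Dict[str, Set[str]]) -> None:
    """Update the key -> paths map in place from the diff previous -> current."""
    for path in previous.keys() | current.keys():
        old_key, new_key = previous.get(path), current.get(path)
        if old_key == new_key:
            continue  # unchanged entry, no work needed
        if old_key is not None:
            # Path was deleted or now points at different content.
            associated.get(old_key, set()).discard(path)
            if not associated.get(old_key):
                associated.pop(old_key, None)
        if new_key is not None:
            # Path was added or now points at this content.
            associated.setdefault(new_key, set()).add(path)


db: Dict[str, Set[str]] = {}
index_v1 = {"a.bin": "KEY1", "b.bin": "KEY1"}
reconcile({}, index_v1, db)        # no recorded tree yet: diff from the empty tree
index_v2 = {"a.bin": "KEY1", "c.bin": "KEY2"}
reconcile(index_v1, index_v2, db)  # later run: only b.bin and c.bin differ
print(db)
```

The cost of each run is proportional to the size of the diff, which matches the commit message's expectation that small index changes are cheap while a branch change or a large move can be slow.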
Joey Hess
df0b75cdc4
complications 2021-05-21 14:18:38 -04:00
Joey Hess
1d9bad51d2
plan for these 2021-05-21 13:50:26 -04:00
Joey Hess
f39b7c3663
comment 2021-05-21 12:39:35 -04:00
Joey Hess
d5e18c8710
comment 2021-05-21 12:26:00 -04:00
Joey Hess
a26e7d763d
comment 2021-05-21 12:07:21 -04:00
Joey Hess
442398e1e0
comment 2021-05-21 11:48:57 -04:00
Joey Hess
414dc39a12
comment 2021-05-21 11:31:38 -04:00
Joey Hess
9dbbecc8f4
Merge branch 'master' of ssh://git-annex.branchable.com 2021-05-21 11:28:17 -04:00
Joey Hess
5393c0ae58
reopen per comment 2021-05-21 11:27:13 -04:00
Joey Hess
b68a40fa88
todo 2021-05-20 11:18:46 -04:00
Nick_P
588f8461cb Added a comment 2021-05-20 10:43:14 +00:00
Atemu
0b2c17b49b Added a comment 2021-05-19 17:13:45 +00:00
Nick_P
bfede8f92d Added a comment 2021-05-19 16:59:21 +00:00
Atemu
e9e3cc015e Added a comment 2021-05-19 16:54:19 +00:00
Joey Hess
84366fa2d0
fix by improving docs 2021-05-19 11:13:53 -04:00
Joey Hess
64e26287dd
comment 2021-05-19 11:07:02 -04:00
Joey Hess
901f1fc74c
comment 2021-05-19 10:55:35 -04:00
Nick_P
bff290d864 Added a comment 2021-05-19 14:35:43 +00:00
Nick_P
f7c99032d7 Added a comment 2021-05-19 14:29:23 +00:00
strmd
84ceedb263 Added a comment 2021-05-19 07:00:58 +00:00
strmd
131208bb72 2021-05-19 06:56:34 +00:00
strmd
2ade910ca6 Added a comment 2021-05-19 06:33:03 +00:00
Nick_P
2dbe699b78 2021-05-18 17:35:33 +00:00
Atemu
94fb769e76 removed 2021-05-18 17:16:12 +00:00
Atemu
48314f625f Added a comment 2021-05-18 16:23:09 +00:00
Atemu
4a76ba8761 Forgot the assistant needed for repro 2021-05-18 16:16:56 +00:00
yarikoptic
674f33c139 todo for extra logging when content changed 2021-05-18 14:05:18 +00:00
Atemu
828a5922df Added a comment 2021-05-18 09:13:32 +00:00
Joey Hess
7d57866c3e
update for filter-branch 2021-05-17 15:03:47 -04:00
Joey Hess
c525d18cf7
filter-branch: New command, useful to produce a filtered version of the git-annex branch, eg when splitting a repository 2021-05-17 14:16:46 -04:00
Joey Hess
40f093775c
Merge branch 'filter-branch' 2021-05-17 14:16:13 -04:00
Joey Hess
24c7d9ba78
decided not to include export/import trees
They're only needed to cover a gc edge case, and it's better someone
gets caught by that edge case than that someone who does not know about
them ends up with a filtered git-annex branch that contains such a tree
when some of the files listed in it are ones they wanted to *remove*
from the repository.
2021-05-17 14:12:15 -04:00
Joey Hess
2420910ab8
include info for sameas repos
It's not currently possible to exclude a sameas repo using its
annex-config-uuid (Remote.nameToUUID rejects them).
Since there's no real documented way to learn those, this seems ok, at
least for now. Also it avoids the problem of someone excluding the
parent but including the sameas, which would probably make the sameas
repo not usable when using the filtered branch.
2021-05-17 14:04:14 -04:00
Joey Hess
984034f335
filter-branch working aside from some edge cases
Added a note to man page about what happens to information that is
recorded in the private journal. Since it uses Branch.get, that
information will be copied when options allow. It seemed better to allow
it and document it than not allow it, since the options allow excluding
repositories and so can be used to exclude private repos if desired.
2021-05-17 13:24:58 -04:00
Joey Hess
8b6dad11a2
add createMessage
init: When annex.commitmessage is set, use that message for the commit
that creates the git-annex branch.

This will be used by filter-branch too, and it seems to make sense to let
annex.commitmessage affect it.
2021-05-17 13:07:47 -04:00