close todo

Joey Hess 2021-03-05 14:46:09 -04:00
parent cdd512cd9f
commit e8065ee99d
GPG key ID: DB12DB0FF05F8F38
6 changed files with 97 additions and 0 deletions

@@ -8,6 +8,13 @@ git-annex (8.20210224) UNRELEASED; urgency=medium
* Fixed handling of --mimetype or --mimeencoding combined with
options like --all or --unused.
* Fix handling of --branch combined with --unlocked or --locked.
* When non-annexed files in a tree are exported to a special remote,
importing from the special remote keeps the files non-annexed,
as long as their content has not changed, rather than converting
them to annexed files.
(Such a conversion will still happen when importing from a remote
that an old git-annex previously exported such a tree to; export the
tree with the new git-annex before importing to avoid that.)
-- Joey Hess <id@joeyh.name> Wed, 24 Feb 2021 13:18:38 -0400

@@ -0,0 +1,12 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2021-03-05T18:35:54Z"
content="""
This bug has been fixed.
And next time you have a problem this blatant, with a test case and
everything, it's certainly a bug, so don't be afraid to file a bug report
rather than using the forum. It's harder to close forum posts than
bug reports!
"""]]

@@ -11,3 +11,5 @@ The importer could check for each file, if there's a corresponding file in
the branch it's generating the import for, if that file is annexed.
This corresponds to how git-annex add (and the smudge filter) handles these
files. But this might be slow when importing a large tree of files.
> [[fixed|done]] --[[Joey]]

@@ -0,0 +1,30 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2021-03-05T16:42:03Z"
content="""
> The importer could check for each file, if there's a corresponding file in the branch it's generating the import for, if that file is annexed.

Should it check the branch it's generating the import for though?
If the non-annexed file is "foo" and master is exported, then in master
that file is renamed to "bar", the import should not look at the new master
to see if the "foo" from the remote should be annexed. The correct tree
to consult would be the tree that was exported to the remote last.

It seems reasonable to look at the file in that exported tree to see if it was
non-annexed before, and if the ContentIdentifier is the same as what
was exported before, keep it non-annexed on import. If the ContentIdentifier
has changed, apply annex.largefiles to decide whether or not to annex it.

The export database stores information about that tree already,
but it does not keep track of whether a file was exported annexed or not.
So changing the database to include an indication of that, and using it
when importing, seems like a way to solve this problem without slowing
things down much.

*Alternatively*, the GitKey that git-annex uses for these files when
exporting is represented as a SHA1 key with no size field. That's unusual;
nothing else usually creates such a key (although some advanced users may,
for some reason). Just treating such keys as non-annexed files when
importing would be at least a band-aid, if not a real fix.
"""]]
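
The decision described in the comment above can be sketched roughly as follows. This is a minimal Python illustration, not git-annex's actual Haskell code: `ExportedFile`, `decide_import`, and the `matches_largefiles` callback are hypothetical names, the content identifier is modelled as an opaque string, and keeping unchanged annexed files annexed is an assumption, since the comment only covers the non-annexed case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ExportedFile:
    # Hypothetical record of what was last exported for one path.
    content_id: str  # stands in for a ContentIdentifier
    annexed: bool    # the extra indication the comment proposes to store


def decide_import(
    path: str,
    remote_content_id: str,
    last_export: Dict[str, ExportedFile],
    matches_largefiles: Callable[[str], bool],
) -> str:
    # Consult the tree that was last exported to the remote,
    # not the current state of the branch.
    previous: Optional[ExportedFile] = last_export.get(path)
    if previous is not None and previous.content_id == remote_content_id:
        # Unchanged content keeps the state it had when exported.
        return "annex" if previous.annexed else "git"
    # New or changed content: annex.largefiles decides.
    return "annex" if matches_largefiles(path) else "git"
```

Keying the lookup on the last exported tree is what makes the rename case work: a later rename of "foo" to "bar" in master does not change what was recorded for "foo" at export time.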

@@ -0,0 +1,14 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2021-03-05T17:31:32Z"
content="""
Wait... The import code has a separate "GIT" key type that it uses
internally once it's decided a file should be non-annexed. Currently
that never hits disk. Using that rather than a SHA1 key for the export
database could be a solution.
(Using that rather than "SHA1" for the keys would also avoid
the problem that the current GitKey hardcodes an assumption
that git uses SHA1.)
"""]]
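
To make the difference between the two representations concrete, here is a small Python sketch. It assumes the usual serialized key layout `BACKEND[-sSIZE]--NAME` and ignores other optional fields; `format_key`, `git_key_old`, and `git_key_new` are hypothetical helper names, not functions from git-annex.

```python
from typing import Optional


def format_key(backend: str, name: str, size: Optional[int] = None) -> str:
    # Assumed layout: BACKEND, an optional -sSIZE field, then "--" and the name.
    size_field = f"-s{size}" if size is not None else ""
    return f"{backend}{size_field}--{name}"


def git_key_old(git_sha: str) -> str:
    # Old representation: a SHA1 key with no size field.
    return format_key("SHA1", git_sha)


def git_key_new(git_sha: str) -> str:
    # Proposed representation: a dedicated "GIT" key type.
    return format_key("GIT", git_sha)


# The id git assigns to the empty blob.
EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
print(git_key_old(EMPTY_BLOB))
print(git_key_new(EMPTY_BLOB))
```

The "GIT" form says nothing about which hash git used to produce the name, which is the second benefit mentioned in the comment.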

@@ -0,0 +1,32 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2021-03-05T17:44:54Z"
content="""
In fact, a very simple patch that just makes a GitKey generate a
"GIT" key seems to have solved this problem! Files that were non-annexed
on export remain so on import, until they're changed, and then
annex.largefiles controls what happens.

Once non-annexed files have been exported using the new version, they'll
stay non-annexed on import. Even when an old version of git-annex is doing
the importing!

When an old git-annex did the export, and a new one imports, what happens is
the file gets imported as an annexed file. Exporting first with the new
version avoids that unwanted conversion.

Interestingly though, the annexed file when that conversion happens does
not use the SHA1 key from git, so its content can be retrieved. I'm not
quite sure how that problem was avoided in this case, but something avoided
the worst behavior.

It would be possible to special case the handling of SHA1 keys without a
size to make importing from an old export not do the conversion. But that
risks breakage for some user who is generating their own SHA1 keys and not
including a size in them. Or for some external special remote that supports
IMPORTKEY and generates SHA1 keys without a size. It seems better to avoid
that potential breakage of unrelated things, and keep the upgrade process
somewhat complicated when non-annexed files were exported before, than
to streamline the upgrade.
"""]]
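
The trade-off in the last paragraph of the comment can be illustrated with a short Python sketch. It assumes the serialized key layout `BACKEND[-sSIZE]--NAME`; `parse_key`, `is_non_annexed`, and `looks_like_old_git_key` are hypothetical names for illustration only, not git-annex functions.

```python
from typing import Optional, Tuple


def parse_key(key: str) -> Tuple[str, Optional[int], str]:
    # Split an assumed "BACKEND[-sSIZE]--NAME" key into its parts.
    fields, sep, name = key.partition("--")
    if not sep:
        raise ValueError(f"not a key: {key!r}")
    backend, *rest = fields.split("-")
    size = None
    for field in rest:
        if field.startswith("s") and field[1:].isdigit():
            size = int(field[1:])
    return backend, size, name


def is_non_annexed(key: str) -> bool:
    # Approach taken: only the dedicated "GIT" key type marks a
    # file that was non-annexed when it was exported.
    backend, _size, _name = parse_key(key)
    return backend == "GIT"


def looks_like_old_git_key(key: str) -> bool:
    # Rejected heuristic: treat any SHA1 key without a size as a git blob.
    backend, size, _name = parse_key(key)
    return backend == "SHA1" and size is None


old_export = "SHA1--e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
user_made = "SHA1--da39a3ee5e6b4b0d3255bfef95601890afd80709"
# The heuristic cannot tell these two apart, which is the breakage risk.
print(looks_like_old_git_key(old_export), looks_like_old_git_key(user_made))
```

Under these assumptions, a key left behind by an old export and a size-less SHA1 key made by a user or an external special remote both match the heuristic, while the "GIT" check matches neither.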