todo triage

Tagging todos that seem to have a plan ready as confirmed.

Also closed some old ones for various reasons, including several that
turn out to be addressed by newer features.

Also opened a new todo about git-annex-config needing clear criteria for
adding new configs to it.
Joey Hess 2022-04-04 15:22:49 -04:00
parent f51007d716
commit 77de20c925
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
27 changed files with 199 additions and 1 deletions

@ -20,3 +20,5 @@ Thanks a lot for all the work on git-annex, it's a really amazing project! The m
[1]: https://github.com/git/git/blob/v2.11.0/Documentation/gitattributes.txt#L384
[2]: https://github.com/git-lfs/git-lfs/pull/1382
> [[done]] --[[Joey]]

@ -1,2 +1,4 @@
Hello Joey,<br>
`git annex sync --content (--all)` checks if the content of a key exists locally by checking if the path exists instead of looking at the location log. I suspect that this has a large impact on performance if the cache is cold.
[[!tag moreinfo]]

@ -1 +1,3 @@
Would it be possible to create a package for Lacie's NacOS?
> [[closing|done]] --[[Joey]]

@ -1 +1,3 @@
I'd like to be able to set a fixed limit on how much storage can be uploaded to a special remote. A use case for this may be that I want to spend no more than Y dollars, on a storage service that charges $X per gigabyte. I would thus set a limit where an upload would be interrupted with a warning about the limit, and to continue I would need to use a --force option.
> dup of [[Specify_maximum_usable_space_per_remote]]; [[done]] --[[Joey]]

@ -1 +1,5 @@
It would be nice to be able to upload and download git history with special remotes. This could be a move towards full special remote syncing.
> I feel this is out of scope. git has its own interface to let a program
> be registered as performing a transport, to store a git repository
> anywhere. [[wontfix|done]] --[[Joey]]

@ -32,3 +32,5 @@ index 5745030..6a09c3a 100644
"""]]
> [[done]] --[[Joey]]

@ -1 +1,3 @@
Add an armel build like the i386ancient build.
> [[wontfix|done]] --[[Joey]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 7"""
date="2022-04-04T17:53:39Z"
content="""
[[need_a_clear_criteria_for_adding_git-annex-config_settings]] will need to
be resolved before I do anything about this.
"""]]

@ -1 +1,3 @@
Currently [[`git-annex-fsck`|git-annex-fsck]] gives a warning for all my files stored with MD5 keys that they can be upgraded to the more secure SHA256: `Can be upgraded to an improved key format. You can do so by running: git annex migrate`. In my case the key choice is deliberate, so it would be good if this warning could be disabled, to prevent it from drowning out more serious ones.
[[!tag moreinfo]]

@ -2,3 +2,4 @@ Should `annex.gitaddtoannex` and `annex.addsmallfiles` be [[`git-annex-config`|g
Also, [maybe](https://git-annex.branchable.com/todo/Avoid_lengthy___34__Scanning_for_unlocked_files_...__34__/#comment-85cb4d3eac345df7b08a31ea4bd810f6) make `annex.supportunlocked` a `git-annex-config` setting as well.
> [[rejected|done]] --[[Joey]]

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="joey"
subject="""comment 8"""
date="2022-04-04T17:54:33Z"
content="""
I have opened a new todo,
[[need_a_clear_criteria_for_adding_git-annex-config_settings]].
I am going to close this one, because I'm certainly not adding a bunch of
configs listed here without those criteria.
"""]]

@ -1,3 +1,5 @@
To help understand hard-to-replicate failures, add an option to always generate a debug log but to erase it as a final step if an operation succeeds. If an operation fails, keep the log and print a message pointing to it.
Maybe, save such logs somewhere under .git/annex and have a command to upload them (over https and encrypted with @joeyh's key?) to some server where they can be examined.
> [[rejected|done]] --[[Joey]]

@ -33,3 +33,10 @@ What about just making `git-annex sync --content` try to get the content of
all files before updating the work tree? (The assistant would need changes
too; it would need to queue all the downloads and trigger a work tree
update once all the downloads have been tried.)
> `git-annex adjust --hide-missing` implements this. The assistant
> does not support it yet, and there's a todo for that,
> [[todo/assistant_support_hide-missing]]
>
> So, closing this as there's a path forward in that other todo. [[done]]
> --[[Joey]]

@ -17,3 +17,11 @@ reference to that tree, until some other tree is exported that deletes that
file. So this is a not very likely, but possible, way for the git-annex
branch to still mention a dead key after --drop-dead. Could rewrite the
tree as well, but now it's getting complicated indeed.
> Let's leave out the idea that all references to the dead key get
> scrubbed from the branch. In any case a key is probably referred to in
> the master branch too.
>
> It's still useful to implement this, just to reduce branch size.
[[!tag confirmed]]

@ -10,3 +10,8 @@ which would make fetch and pull run `git-remote-annex`. Currently, special
remotes don't get a url configured. (`annex::uuid` was my first thought,
but `annex::foo` avoids repeating the remote's uuid and git-annex can
look up the uuid from the name) --[[Joey]]
> While this seems possible, I wonder if it's a good idea. It seems that,
> to justify the added code and new executable (or symlink to git-annex),
> there would need to be a real benefit. Is it enough benefit to unify
> import/export with pull/push? Is it really a benefit at all? --[[Joey]]

@ -74,3 +74,5 @@ It could be done as a Remote.Helper.SimpleImport that takes those
3 methods and translates them to the current interface.
Or by complicating Remote.Helper.ExportImport further..
--[[Joey]]
[[!tag confirmed]]

@ -1,4 +1,4 @@
Would it be hard to add a variation to checksumming [[backends]], that would change how the checksum is computed: instead of computing it on the whole file, it would first be computed on file chunks of given size, and then the final checksum computed on the concatenation of the chunk checksums? You'd add a new [[key field|internals/key_format]], say cNNNNN, specifying the chunking size (the last chunk might be shorter). Then (1) for large files, checksum computation could be parallelized (there could be a config option specifying the default chunk size for newly added files); (2) I often have large files on a remote, for which I have md5 for each chunk, but not for the full file; this would enable me to register the location of these files with git-annex without downloading them, while still using a checksum-based key.
> Closing, because [[external_backends]] is implemented, so you should be
-> able to roll your own backend for your use case here. --[[Joey]]
+> able to roll your own backend for your use case here. [[done]] --[[Joey]]
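
The chunked checksum requested above could be sketched roughly as follows. This is only an illustration, not git-annex's code or any existing backend: the function names are made up, SHA-256 is an arbitrary default, and "concatenation of the chunk checksums" is assumed to mean the raw digest bytes rather than their hex form.

```python
import hashlib


def chunk_digests(data, chunk_size, algo="sha256"):
    """Return the raw digest of each chunk (the last chunk may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        hashlib.new(algo, data[start:start + chunk_size]).digest()
        for start in range(0, len(data), chunk_size)
    ]


def combine_chunk_digests(digests, algo="sha256"):
    """Hash the concatenation of per-chunk digests into the final checksum."""
    outer = hashlib.new(algo)
    for digest in digests:
        outer.update(digest)
    return outer.hexdigest()


def chunked_checksum(data, chunk_size, algo="sha256"):
    """Checksum computed over chunk digests instead of the whole file."""
    return combine_chunk_digests(chunk_digests(data, chunk_size, algo), algo)
```

Because `combine_chunk_digests` needs only the per-chunk digests, the second use case (chunk checksums already known for a remote file) would not have to read the file at all, and the per-chunk hashing could be done in parallel.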

@ -0,0 +1,107 @@
There are often requests to add various git-annex gitconfig settings
to git-annex-config. Probably, if every such request were implemented
indiscriminately, almost all settings would end up added to it. But adding
settings to git-annex-config can be an imposition on users who don't want
to have to override unusual settings.
git's own gitconfigs cannot be set by git-annex-config, and git users do
not seem to be clamoring for ways to set gitconfigs across all clones of a
repo. Instead, git users probably use a variety of ways to manage the same
thing, all of which also work for git-annex configs too.
So, git-annex-config, though it started out for good reasons, risks
becoming a slippery slope toward an inconsistent mess. To avoid that,
clear criteria are needed for when it's appropriate to add a new setting
to it.
----
It's worth considering gitattributes, since they also set somewhat
repo-global configs. (Though less global since they can change in branch.)
git-annex uses gitattributes some too, though less so.
One good thing about gitattributes is that it applies the attribute to a
set of files, and so it only makes sense for things that are related
to individual files. So there is a gitattribute that controls how 3-way
merging of a file happens, but not a gitattribute that controls whether
git commits are gpg signed.
git-annex-config does not have such a scope limiter currently.
----
The settings that git-annex-config supports are, in the order they were
added:
* annex.autocommit
This was suggested because someone had a problem of cloning a repo
where annex.autocommit was usually set, but forgetting to set it,
resulting in an unwanted commit. This does not seem like a good
justification; couldn't someone run `git commit -a` accidentally
and have the same result?
* annex.synccontent
This was made a global because there is a hope for `git-annex sync
--content` to perhaps eventually become the default, and this lets
users get ahead of that. But that is not really a good justification
because if that behavior change did happen, there could be a transition
period where `git-annex sync` warned that its behavior was going to
change, which would give users an opportunity to choose the behavior
they want, and configure it locally.
* annex.securehashesonly
This is a global because a user who is relying on cryptographically
secure hashes for their security should not need to remember to set
the config in each new clone of the repo. Also, their collaborators
should not need to remember to set the config to avoid committing
things that do not use secure hashes, which would result in a mess
that would be painful to get out of. I do think this needs to be a
global.
* annex.resolvemerge
This is a global because, when git-annex's automatic merge conflict
resolution is not appropriate for a repository, it needs to be disabled
globally, since one can happen in any clone and would result in the wrong
thing being committed by git-annex.
* annex.largefiles
This is a global because it was already a (semi-)global in .gitattributes
files, but the syntax of those files made more complex expressions
hard to use in them. And so also putting the config here avoids that
problem and does not make it more global. This seems reasonable.
* annex.addunlocked
This would be suitable for a gitattribute, since it applies to an
individual file. But, like annex.largefiles, the syntax of .gitattributes
files makes more complex expressions a problem in them. So, it was added
to git-annex-config instead.
* annex.dotfiles
One reason this was made a global is probably that there was a large
amount of user complaint about git-annex add's handling of dotfiles,
with no one choice that would avoid it, but it did seem that each repo
probably had a choice that would satisfy the users of that repo.
Besides being sick of navigating that maze of complaints, the only
other justification for it being a global seems to be that setting
annex.dotfiles works with annex.largefiles to control which particular
dotfiles to add to the annex (when users for some reason care),
and annex.largefiles is a global.
* annex.synconlyannex
I don't see a justification for this being global.
At this point, we do seem to be a ways down a slippery slope. I started
pushing back at adding them in 2020, and so no more have been added.
--[[Joey]]
---
Looking at the settings that were added and why, here are some possible
criteria that could be extracted from that:
1. The config is for behavior that needs to happen in every clone of the
repository, to avoid situations where varying the config would lead
to difficult to resolve situations (annex.securehashesonly)
2. The config is something that would be suitable for .gitattributes,
but limitations of .gitattributes make it convenient to have another way
to set it globally (when not actually targeting specific files).
(annex.largefiles, annex.addunlocked, annex.dotfiles)
Things like annex.autocommit do not meet criterion #1, because it's
easy to fix up a git commit history to remove an unwanted commit.
Does annex.resolvemerge meet criterion #1? --[[Joey]]
[[!tag confirmed]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2022-04-04T17:59:39Z"
content="""
See prior discussion in this old todo:
<https://git-annex.branchable.com/todo/additional_git-annex-config_settings__63__/>
"""]]

@ -5,3 +5,5 @@ i have used this oneliner so far, but it's ugly and painful, especially since `g
git annex unused 2>&1 | grep '^ *[0-9][0-9]*' | sed 's/^ *[0-9][0-9]* *//' | xargs -I'{}' git log --oneline --stat -S'{}' -1
any way to do this more easily? --[[anarcat]]
> [[done]]; use `git-annex whereused --unused` --[[Joey]]
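
For anyone still scripting around the old one-liner, its grep/sed step (keep the numbered lines, strip the leading number) can be written more readably. This is a rough sketch only: `unused_keys` is a made-up helper, and `SAMPLE` is placeholder text invented for the illustration, not real `git annex unused` output.

```python
import re

# Same pattern as the grep/sed pair: optional spaces, a number,
# optional spaces, then the rest of the line (the key).
NUMBERED = re.compile(r"^ *[0-9]+ *(.*)$")


def unused_keys(output):
    """Return the text following the leading number on each numbered line."""
    keys = []
    for line in output.splitlines():
        match = NUMBERED.match(line)
        if match:
            keys.append(match.group(1))
    return keys


# Placeholder input (hypothetical, not captured from git-annex).
SAMPLE = """\
some header line
    1   PLACEHOLDER-KEY-ONE
    2   PLACEHOLDER-KEY-TWO
"""

print(unused_keys(SAMPLE))
```

Each extracted key would then be passed to `git log -S` as in the original pipeline; `git-annex whereused --unused` avoids the parsing entirely.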

@ -5,3 +5,5 @@ KDE-connect provides an easy to set up method of creating a transport tunnel b
Apparently, it can even be used to connect multiple desktops with another but I haven't tested that. This could be an alternative (or perhaps even replacement?) to git-annex' current pairing mechanism.
KDE-connect offers remote command execution and file sharing, so it should be possible to use it for git-annex' purposes.
> [[rejected|done]] --[[Joey]]

@ -12,3 +12,5 @@ accomplish this, but it's a bit cumbersome and also not a very precise
way to specify the amount I want to copy/move/get. The last example
would also be a useful command to limit the traffic when I'm connecting
via mobile get as much as possible, but don't blow the mobile quota.
> [[done]] as --size-limit --[[Joey]]

@ -1 +1,5 @@
As suggested during the first Gitify BoF during DebConf13: Adding a way to have on-demand dropping of content in a given remote would allow a user to quickly free up disk space on demand while still heeding numcopies etc.
> [[done]] as --size-limit. This does affect dropping, so eg
> `git-annex drop --from foo --size-limit=10gb` frees up 10gb from remote
> foo. --[[Joey]]

@ -1,3 +1,5 @@
A big part of my online use is done via a low-speed connection over my mobile phone, this is limited to 16KB/sec because I always use up my 500MB quota the very first day of the month. `;-/` So when I need to download big files, I first download them to my online server, then transfer the files to my laptop with git-annex. If I'm connected via GSM, this occupies all the bandwidth and everything else moves like a heavily sedated slug. So if I want to work via VNC or SSH, I have to terminate ongoing transfers with Ctrl-C and then hopefully remember to restart it when I work locally. I know git-annex is robust enough to handle this gracefully, but it would be really nice to have a continuous connection going on in the background, limited to a value I choose.
rsync(1) has a `--bwlimit` (bandwidth limit) where you can specify max download/upload speed in kilobytes/sec. It would be great if a similar option was integrated into git-annex. Thanks in advance.
> [[done]] generally as annex.bwlimit and related options. --[[Joey]]

@ -1 +1,4 @@
The `annex.largefiles` feature is very nice to mix annexed files with normal git managed files. I'd like to be able to configure this setting on the webapp and that the configuration directive would be synchronized across all remotes.
> annex.largefiles can be configured in .gitattributes or by `git-annex
> config` and will sync across remotes either way. [[done]]

@ -3,3 +3,6 @@ I understand that for backwards compatibility the non-bare remotes use the old "
> If this option existed then every clone of a repository would need to set
> it, or files would be hashed into the wrong location and would appear not
> visible. Sounds like a bug magnet to me; not attractive. --[[Joey]]
> > Actually, `git-annex init -c annex.tune.objecthashlower=true`
> > does just this. So, [[done]] --[[Joey]]

@ -5,3 +5,6 @@ This would work somewhat similar to looping over a directory and adding file://
A use case is importing optical media (read-only), whilst keeping that media as a remote, and being able to calculate checksums directly without moving any files around.
For single files, it would also be interesting if addurl had a "--localchecksum" option that would only work for file:// urls, and make it checksum files directly from their source location?)
> A directory special remote with importtree=yes and `git-annex import --no-content`
> can be used to do this. [[done]] --[[Joey]]