Merge branch 'master' into proxy
commit 5aaa285083
7 changed files with 100 additions and 1 deletions
35
doc/bugs/git_annex_unannex_-_some_files_still_symlinked.mdwn
Normal file
@ -0,0 +1,35 @@
### Please describe the problem.

1. Some files remain symlinked after an aborted `git annex add` followed by a completed `git annex unannex`.
2. These files are present in `.git/annex/objects`, but `git annex unused` does not find them. Running `git annex whereused --key=SHA256E...` returns nothing.

To restore the files and remove them from the git-annex objects folder, manual workarounds or hacks are needed, such as adding the file again with `git annex add` and then trying to remove it again.

### What steps will reproduce the problem?

1. Run `git annex add` and abort the operation midway (this was on a directory with a large number of files, ~3K, running with 12 jobs).
2. Run `git annex unannex` until done.
3. Find that some of the added files were restored, while others are still symlinks but are not tracked by git-annex.


### What version of git-annex are you using? On what operating system?

Debian Bookworm / git-annex version: 10.20240227-1

### Please provide any additional information below.

Similar report from another user here:
https://git-annex.branchable.com/forum/File_still_symlinked_after_git_annex_unannex/

[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log


# End of transcript or log.
"""]]

### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)


Yes, using it extensively for a few years with terabytes of data
@ -0,0 +1,22 @@
[[!comment format=mdwn
username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
nickname="ruslan"
avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
subject="comment 1"
date="2024-06-05T17:34:32Z"
content="""
The solution of running `git annex add` is also described at the link below:

https://git-annex.branchable.com/forum/git_annex_add_crash_and_subsequent_recovery/#comment-4f5af644597a055624009c5bbb9aca3f

---

So I need to find files that are symlinks into the git-annex objects folder and run `git annex add` / `git annex unused` on them. I can handle that with a script, though it would be nice to have a built-in method.

---

Additional notes:

1. There should be a way to find files that were added to the git-annex objects folder but are not tracked by git-annex. Is this something that can be done with existing commands?
2. It would be desirable to have a way to abort `git annex add` gracefully on long-running jobs. Is there a way to do that now? It looks like Ctrl-C resulted in a broken state. Would Ctrl-Z work better?
"""]]
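
The script mentioned in the comment above might look roughly like the following. This is only a sketch, not a built-in git-annex feature: the function names `list_annex_symlinks` and `list_untracked_annex_symlinks` are made up for illustration, and it assumes that the leftover files are ordinary symlinks whose target contains `.git/annex/objects/`, that they never made it into the git index, and that no file name contains a newline.

```shell
# Print every symlink under DIR (default: current directory) whose target
# points into .git/annex/objects. The .git directory itself is skipped.
list_annex_symlinks() {
    dir="${1:-.}"
    find "$dir" -path "$dir/.git" -prune -o -type l -print |
    while IFS= read -r link; do
        case "$(readlink "$link")" in
            *.git/annex/objects/*) printf '%s\n' "$link" ;;
        esac
    done
}

# Of those symlinks, print the ones git does not have in its index.
# Assumption: "not tracked" means "absent from the index".
# Run this from inside the repository.
list_untracked_annex_symlinks() {
    list_annex_symlinks "${1:-.}" |
    while IFS= read -r link; do
        git ls-files --error-unmatch -- "$link" >/dev/null 2>&1 ||
            printf '%s\n' "$link"
    done
}
```

The resulting list could then be reviewed by hand and passed to `git annex add`, followed by `git annex unannex`, as the linked forum thread describes.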
@ -0,0 +1,3 @@
As I understand it, there is currently no way to track metadata for directories with `git annex metadata` (it only works for files). Is that indeed the case?

One workaround I'm looking at is to add a metadata placeholder file inside each directory. As I understand it, each directory would need such a file with some unique content (perhaps a UUID); otherwise the placeholder files in different directories would share a key and their metadata would collide. Are there alternatives or better solutions for tracking dataset metadata (groups of files in a folder)?
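
The placeholder workaround described above could be sketched as follows. The file name `_dirmeta` and the function `make_dir_placeholder` are hypothetical, and the sketch assumes a default configuration in which `git annex add` sends the file to the annex rather than to git (for example, no `annex.largefiles` rule that excludes small files). A non-dotfile name is used because git-annex treats dotfiles specially when adding.

```shell
# Create DIR/_dirmeta with unique content if it does not exist yet,
# and print its path. Unique content gives each placeholder its own key,
# so metadata set on one directory's placeholder does not leak to another.
make_dir_placeholder() {
    dir="$1"
    placeholder="$dir/_dirmeta"   # hypothetical file name
    if [ ! -e "$placeholder" ]; then
        if command -v uuidgen >/dev/null 2>&1; then
            uuidgen > "$placeholder"
        else
            # Fallback when uuidgen is unavailable: time, PID and path.
            printf '%s-%s-%s\n' "$(date +%s)" "$$" "$dir" > "$placeholder"
        fi
    fi
    printf '%s\n' "$placeholder"
}

# Possible usage inside a git-annex repository (the directory name and
# the field "project" are examples only):
#   p=$(make_dir_placeholder datasets/run1)
#   git annex add "$p"
#   git annex metadata "$p" --set project=demo
#   git annex metadata "$p" --get project
```

Because the function leaves an existing placeholder untouched, it can be re-run without changing the file's content or its key.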
@ -0,0 +1,8 @@
[[!comment format=mdwn
username="nobodyinperson"
avatar="http://cdn.libravatar.org/avatar/736a41cd4988ede057bae805d000f4f5"
subject="comment 1"
date="2024-06-06T09:09:03Z"
content="""
You are absolutely right. You might be interested in [DataLad](https://datalad.org), which provides a lot of convenience around git-annex, has the concept of datasets (git submodules) and also an extended approach to metadata.
"""]]
@ -0,0 +1,15 @@
[[!comment format=mdwn
username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
nickname="ruslan"
avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
subject="comment 2"
date="2024-06-06T11:23:34Z"
content="""
Thank you for the heads up!

I've actually looked into DataLad, and have been using git-annex with submodules.

The problem I found with submodules is that they require a lot of additional steps for adding/moving/deleting/syncing them. It is a very manual process, with a lot of complexity and some rough edge cases. They also interfere, I believe, with some git-annex functionality such as metadata-driven views. So I'm using submodules very sparingly, only when I really need them.

As for DataLad, it looks like a mature and well-supported project; I would love to see more feedback/reviews on it.
"""]]
@ -34,7 +34,13 @@ For June's work on [[design/passthrough_proxy]], implementation plan:
 1. Add `git-annex updateproxy` command and remote.name.annex-proxy
 configuration. (done)

-2. Test implementation of remote instantiation for proxies.
+2. Remote instantiation for proxies almost works, but fails at:
+"git-annex: cannot determine uuid for origin-foo"
+
+getRepoUUID does not look at the Repo's UUID setting, but reads it
+from git-config. It's not set there for a proxied remote.
+
+So: Add annex-uuid parsing to RemoteConfig.

 3. Implement proxying in git-annex-shell.

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd"
nickname="ruslan"
avatar="http://cdn.libravatar.org/avatar/37d3c852372d96daa8a99629755ed1f9"
subject="comment 1"
date="2024-06-05T16:53:50Z"
content="""
Yes, limiting it to a single file would be sufficient for the use case I encountered, and would keep it simple from a usage / user interface standpoint, IMHO.
Looking forward to this!
"""]]