Added a comment: Weird behavior of git archive in combination with largefiles configuration
parent ff601ddc45
commit 804ea016d0
1 changed file with 97 additions and 0 deletions
@@ -0,0 +1,97 @@
[[!comment format=mdwn
username="arnaud.legrand@e79f5d4cff79116f56388885021e8507bef18e12"
nickname="arnaud.legrand"
avatar="http://cdn.libravatar.org/avatar/143239914c3e3c1a374a7c244b56d73e"
subject="Weird behavior of git archive in combination with largefiles configuration"
date="2023-02-17T09:46:34Z"
content="""
Hi,

I'm preparing a lecture on how git-annex can help with research data
management and, while playing with `git-annex unannex`, I stumbled on a
strange behavior that I can neither understand nor properly work around.
When preparing a public archive, it may make sense to include
**some** annexed files in the archive while keeping
the symlinks for others (e.g., because they are already available from
somewhere else). This is why I do not want to rely on the
`git-annex export` mechanism, which would replace the symlinks of all
annexed files by their content.

Instead, I `unannex` some of my files but, surprisingly, depending on the
git-annex configuration, their content may not end up in the archive
produced by `git archive`. Here is a minimal working example.

``` shell
# Start from a fresh scratch repository
DIR=/tmp/test
chmod -Rf u+w $DIR; rm -rf $DIR ; mkdir -p $DIR; cd $DIR
git init
git annex init
git config --local annex.largefiles 'largerthan=100kb and include=data/*'

echo \"Hello\" > README
git add README

# Create a 1 MiB file and add it to the annex
mkdir data/
dd if=/dev/zero of=data/foo.dat bs=1M count=1 2>/dev/null
git annex add data/foo.dat
git commit -m \"Initial commit\"

# Uncommenting the next line makes the archive contain the real content
## git config --local annex.largefiles ''
git annex unannex data/foo.dat && git add data/foo.dat && git commit -m \"Unannexing\"
git archive --format=tar.gz --prefix nobel_project/ -o ../archive.tgz HEAD

# Unpack the archive and inspect the file sizes
tar zxf ../archive.tgz
tree -s nobel_project/
```

``` example

Initialized empty Git repository in /tmp/test/.git/
init ok
(recording state in git...)
add data/foo.dat
31.98 KiB 14 MiB/s 0s100% 1 MiB 137 MiB/s 0s ok
(recording state in git...)
[master (root-commit) 8fbb907] Initial commit
2 files changed, 2 insertions(+)
create mode 100644 README
create mode 120000 data/foo.dat
unannex data/foo.dat ok
(recording state in git...)
[master da73fb0] Unannexing
1 file changed, 1 insertion(+), 1 deletion(-)
)
100644
nobel_project/
├── [ 4096] data
│ └── [ 102] foo.dat
└── [ 6] README

1 directory, 2 files
```

As the output shows, `foo.dat` is only 102 bytes, whereas it
should be 1 MiB. Instead, the content of `foo.dat` is:

``` shell
cat nobel_project/data/foo.dat
```

``` example
/annex/objects/SHA256E-s1048576--30e14955ebf1352266dc2ff8067e68104607e750abb9d3b36582b8af909fcb58.dat
```

But if I remove the `annex.largefiles` configuration (either upfront or
right before calling `unannex`), everything works as expected, i.e., my
archive contains the content of the formerly annexed file.

Is this expected behavior? This is the kind of operation I typically
do in a branch that I erase afterward, but removing the setting
(temporarily) messes up my local git configuration, which I don't like,
so I'm looking for a better workaround.
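
One candidate for such a workaround is git's per-command `-c` override,
which leaves the stored configuration untouched. The sketch below uses
plain git only, and shows that an override is visible to a single
invocation while the local setting stays as it was. The scratch path
`/tmp/override-demo` is a hypothetical name. Whether applying the same
override to the `git add` step keeps the real content in the archive is an
assumption that is not tested here.

```shell
# Scratch repository; the path is a hypothetical example.
DEMO=/tmp/override-demo
rm -rf $DEMO && mkdir -p $DEMO && cd $DEMO
git init -q

# Stored setting, same expression as in the example above.
git config --local annex.largefiles 'largerthan=100kb and include=data/*'

# A -c override applies to this single git invocation only.
git -c annex.largefiles=nothing config --get annex.largefiles

# The stored value is unchanged afterwards.
git config --local --get annex.largefiles

# Assumption (untested here): the same override on the add step, e.g.
#   git -c annex.largefiles=nothing add data/foo.dat
# might keep the file out of the annex for that one command.
```

If the override behaves as hoped, nothing would need to be restored
afterwards, since the local configuration is never modified.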

Thanks for your amazing work,

Arnaud

"""]]