Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2021-11-09 15:53:13 -04:00
commit 886d60dabe
GPG key ID: DB12DB0FF05F8F38
3 changed files with 161 additions and 0 deletions


@@ -0,0 +1,116 @@
### Please describe the problem.
I tried using GitLab as a `--type git-lfs` special remote. I followed the steps in https://git-annex.branchable.com/tips/storing_data_in_git-lfs/ for both GitHub and GitLab.
While it worked as described for GitHub, I get an error message when trying to push data to GitLab.
### What steps will reproduce the problem?
[[!format sh """
# basic repository setup
mkdir annex
cd annex
git init
git annex init
# create a random binary file
head -c 100 /dev/urandom >random.bin
git annex add .
git commit -a -m added
# set GitHub as a remote (repo exists and is public: https://github.com/iimog/annex)
git remote add github git@github.com:iimog/annex
git push --all github
git annex initremote lfs type=git-lfs encryption=none url=git@github.com:iimog/annex
git annex copy * --to lfs
#copy random.bin (to lfs...)
#ok
#(recording state in git...)
# set GitLab as a remote (repo exists and is public: https://gitlab.com/iimog/annex)
git remote add gitlab git@gitlab.com:iimog/annex
git push --all gitlab
git annex initremote lfs-gitlab type=git-lfs encryption=none url=git@gitlab.com:iimog/annex
#
# Unable to parse git config from gitlab
#
# Remote gitlab does not have git-annex installed; setting annex-ignore
#
# This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote gitlab
#initremote lfs-gitlab ok
#(recording state in git...)
git annex copy * --to lfs-gitlab
#copy random.bin (to lfs-gitlab...)
#<long error message>
#--> The error in short is a "Bad Request" (details see below)
# Try again after explicitly calling enableremote
git annex enableremote gitlab
#enableremote gitlab ok
git annex copy * --to lfs-gitlab
#copy random.bin (to lfs-gitlab...)
#<long error message>
#--> This still results in the same "Bad Request" error
"""]]
### What version of git-annex are you using? On what operating system?
git-annex version: 8.20210903-ga4d179c
Ubuntu 20.04.3 LTS
### Please provide any additional information below.
I initially discovered this issue while using DataLad and reported the issue [here](https://github.com/datalad/datalad/issues/6126).
As suspected by @yarikoptic, the issue also exists when using git-annex directly, not only through DataLad.
On a self-hosted GitLab instance the following error was found in the server log:
[[!format sh """
2021/10/28 21:07:13 [error] 7412#0: *1869253 client sent invalid chunked body while sending request to upstream, client: <...>, server: <...>, request: "PUT /iimog/my-cool-ds2.git/gitlab-lfs/objects/92f711c1799c5aa936c2be297171d08afc19ba97899b57a1f67f2e70486395aa/80547 HTTP/1.1", upstream: "http://unix/var/opt/gitlab/gitlab-workhorse/sockets/socket:/iimog/my-cool-ds2.git/gitlab-lfs/objects/92f711c1799c5aa936c2be297171d08afc19ba97899b57a1f67f2e70486395aa/80547", host: "<...>"
"""]]
This is the full error message I got when calling `git annex copy * --to lfs-gitlab`:
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
$ git annex copy * --to lfs-gitlab
copy random.bin (to lfs-gitlab...)
HttpExceptionRequest Request {
host = "gitlab.com"
port = 443
secure = True
requestHeaders = [("Authorization","<REDACTED>"),("Content-Type","application/octet-stream"),("Transfer-Encoding","chunked"),("User-Agent","git-annex/8.20210903-ga4d179c")]
path = "/iimog/annex.git/gitlab-lfs/objects/04b38782b7f5990e090577047fb54b52e9771ba57c1c36e21f4702049adfcfe3/100"
queryString = ""
method = "PUT"
proxy = Nothing
rawBody = False
redirectCount = 10
responseTimeout = ResponseTimeoutDefault
requestVersion = HTTP/1.1
}
(StatusCodeException (Response {responseStatus = Status {statusCode = 400, statusMessage = "Bad Request"}, responseVersion = HTTP/1.1, responseHeaders = [("Server","cloudflare"),("Date","Tue, 09 Nov 2021 15:03:11 GMT"),("Content-Type","text/html"),("Content-Length","155"),("Connection","close"),("CF-RAY","6ab7ed2afb9fdfcf-FRA")], responseBody = (), responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}) "<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body>\r\n<center><h1>400 Bad Request</h1></center>\r\n<hr><center>cloudflare</center>\r\n</body>\r\n</html>\r\n")
HttpExceptionRequest Request {
host = "gitlab.com"
port = 443
secure = True
requestHeaders = [("Authorization","<REDACTED>"),("Content-Type","application/octet-stream"),("Transfer-Encoding","chunked"),("User-Agent","git-annex/8.20210903-ga4d179c")]
path = "/iimog/annex.git/gitlab-lfs/objects/04b38782b7f5990e090577047fb54b52e9771ba57c1c36e21f4702049adfcfe3/100"
queryString = ""
method = "PUT"
proxy = Nothing
rawBody = False
redirectCount = 10
responseTimeout = ResponseTimeoutDefault
requestVersion = HTTP/1.1
}
(StatusCodeException (Response {responseStatus = Status {statusCode = 400, statusMessage = "Bad Request"}, responseVersion = HTTP/1.1, responseHeaders = [("Server","cloudflare"),("Date","Tue, 09 Nov 2021 15:03:12 GMT"),("Content-Type","text/html"),("Content-Length","155"),("Connection","close"),("CF-RAY","6ab7ed2c9a4ad6b5-FRA")], responseBody = (), responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}) "<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body>\r\n<center><h1>400 Bad Request</h1></center>\r\n<hr><center>cloudflare</center>\r\n</body>\r\n</html>\r\n")
failed
copy: 1 failed
# End of transcript or log.
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes, I'm using DataLad for some of my projects and I'm really impressed by how it uses git-annex to solve many of the tasks that I previously struggled with in pure git.


@@ -0,0 +1,12 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="clarification re: smudge filter and annex.supportunlocked"
date="2021-11-09T18:28:09Z"
content="""
> If you have non-annexed files in git, it does run the filter on those

Not when `annex.supportunlocked=false`, right? (Of course, that's not the default setting, so the smudge filter changes will still help by default.)
"""]]


@@ -0,0 +1,33 @@
[[!comment format=mdwn
username="jkniiv"
avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
subject="comment 1"
date="2021-11-09T08:48:33Z"
content="""
My first impression of commit [[!commit a0758bdd1002e798f62353efa725ac2972589b96]] with the cost model
is quite positive as I'm the one with multigigabyte annexed files in his otherwise rather small (by number of files)
repo and thus I'm affected by the limitations of the filter-process method which pipes all the content of annexed
files from git to git-annex. Compared to commit [[!commit 837025b14f523f9180f82d0cced1e53a8a9b94de]], which frankly
for me was unusable in this particular repo with `filter.annex.process` set, the new version behaves rather nicely
in that a simple test of `time git checkout git-annex` followed by
`time git checkout 'adjusted/master(hidemissing-unlocked)'` turns out to be faster than using an unoptimised version
(=8.20211028) without the long-running `filter-process` functionality. Obviously, it's only the first stage,
i.e. checking out the git-annex branch, that becomes faster by over 50 percent, but I'll take any improvement
in my daily git-annex operations. :)
The timings I got are as follows.
* `git checkout git-annex`
  - unoptimised 8.20211028 / w/o `filter-process`: 103s
  - commit 837025b14 / w/ `filter-process` enabled: 36s
  - commit 9d3ce224e / w/ `filter-process` enabled: 37s
* `git checkout 'adjusted/master(hidemissing-unlocked)'`
  - unoptimised 8.20211028 / w/o `filter-process`: 49s
  - commit 837025b14 / w/ `filter-process` enabled: 57 minutes (I had dropped a few files, in reality this would've taken even longer)
  - commit 9d3ce224e / w/ `filter-process` enabled: 43s
This repo is on Windows (with annex.thin set) and locally has only 13 annexed files on this very drive but the files
cover some 870 gigabytes worth of system backup images so individual files are definitely on the larger side for
git-annex.
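
For anyone who wants the relative changes rather than the raw seconds, here is a rough sketch of the arithmetic. The numbers are copied from the timings above; `pct_change` is only an illustrative helper name, not anything that ships with git-annex.

```sh
#!/bin/sh
# Percentage of wall-clock time saved going from a baseline to a new timing.
# pct_change is a hypothetical helper for this note only.
pct_change() {
    # $1 = baseline seconds, $2 = new seconds
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (old - new) / old * 100 }'
}

# unoptimised 8.20211028 vs commit 9d3ce224e, timings from the list above
echo "git checkout git-annex: $(pct_change 103 37)% less time"          # 64% less time
echo "git checkout adjusted branch: $(pct_change 49 43)% less time"     # 12% less time
```

These are single runs on one machine, so the percentages are only a rough indication.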
"""]]