Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2016-04-02 16:48:03 -04:00
commit c7b06b879b
5 changed files with 179 additions and 1 deletions

@@ -0,0 +1,69 @@
### Please describe the problem.
When the assistant starts it takes several hours to do the startup scan, even when there are no files to add.
The repo contains many small files, and it is configured via gitattributes to add the smaller ones directly to git. In particular, 91949 files are added to the git repo and 1029 are annexed.
This is my gitattributes:
    * annex.largefiles=(largerthan=500kb)
annex.addunlocked is set to true.
### What steps will reproduce the problem?
Create a repo with ~90000 files smaller than 500k and ~1000 files larger (in my case ranging from 500k to 32M). Set addunlocked to true and annex.largefiles to largerthan=500kb. Start the assistant and let it finish adding the files. Restart the assistant.
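For anyone trying to reproduce this, the file layout described above could be generated with something like the following. It is only a sketch: the function name, directory layout and default sizes are made up for illustration, and it only creates the files. Initialising the repo, setting annex.addunlocked and gitattributes, and starting the assistant are still done by hand as described in the steps. The 600 KiB default for large files is chosen only so that it is clearly above the 500kb threshold.

```python
import os
import tempfile


def make_test_tree(root, n_small, n_large, small_size=1024,
                   large_size=600 * 1024, per_dir=1000):
    """Create n_small files of small_size bytes and n_large files of
    large_size bytes under root, at most per_dir files per directory.

    Returns a dict with the number of files created of each kind.
    """
    created = {"small": 0, "large": 0}
    for kind, count, size in (("small", n_small, small_size),
                              ("large", n_large, large_size)):
        for i in range(count):
            subdir = os.path.join(root, kind, "%04d" % (i // per_dir))
            os.makedirs(subdir, exist_ok=True)
            path = os.path.join(subdir, "%s-%06d.dat" % (kind, i))
            with open(path, "wb") as f:
                # Random bytes, so that files do not share content.
                f.write(os.urandom(size))
            created[kind] += 1
    return created


if __name__ == "__main__":
    # Tiny counts for a quick dry run; the report above used roughly
    # 90000 small and 1000 large files.
    with tempfile.TemporaryDirectory() as tmp:
        print(make_test_tree(tmp, n_small=20, n_large=2))
```

Run against the working tree of a fresh repository, with the counts raised to about 90000 and 1000, this should give a tree comparable to the one in the report.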
### What version of git-annex are you using? On what operating system?
git-annex version: 6.20160318
build flags: Assistant Webapp Pairing Testsuite S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify XMPP ConcurrentOutput TorrentParser MagicMime Feeds Quvi
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt S3 bup directory rsync web bittorrent webdav tahoe glacier ddar hook external
local repository version: 6
I'm running it on Arch Linux (packaged version)
### Please provide any additional information below.
[[!format sh """
[2016-03-29 22:08:26.356586] main: starting assistant version 6.20160318
No known network monitor available through dbus; falling back to polling
(scanning...) [2016-03-29 22:08:41.426049] Watcher: Performing startup scan
[2016-03-29 23:05:40.533113] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 00:10:07.085051] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 01:23:29.784236] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 02:43:02.048312] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 03:37:53.273057] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 04:04:56.875573] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 04:31:14.370618] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 04:56:12.467889] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 05:21:09.021728] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 05:43:11.111616] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 06:14:38.096425] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 06:49:54.730879] Committer: Committing changes to git
(recording state in git...)
[2016-03-30 07:26:47.721929] Committer: Committing changes to git
(recording state in git...)
# End of transcript or log.
"""]]
At this point I stopped the assistant, which was still doing the startup scan.
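To see how long the scan spends between commits, the timestamps in a daemon.log like the one above can be turned into intervals. This is a rough sketch; `log_intervals` is a helper name invented here, and it assumes every timestamp has the bracketed form shown in the log.

```python
import re
import sys
from datetime import datetime, timedelta

# Matches the bracketed timestamps seen in the log above,
# e.g. "[2016-03-29 22:08:41.426049]".
STAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")


def log_intervals(lines):
    """Return (stamps, gaps): the parsed timestamps in order, and the
    timedelta between each consecutive pair of timestamped lines."""
    stamps = []
    for line in lines:
        m = STAMP.search(line)
        if m:
            stamps.append(
                datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return stamps, gaps


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python intervals.py .git/annex/daemon.log
    with open(sys.argv[1]) as f:
        stamps, gaps = log_intervals(f)
    for stamp, gap in zip(stamps[1:], gaps):
        print(stamp, "after", gap)
    print("total:", sum(gaps, timedelta()))
```

The per-gap output makes it easy to check whether the commits arrive at a steady pace or slow down as the scan proceeds.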
### Have you had any luck using git-annex before?
Sure!

@@ -0,0 +1,7 @@
[[!comment format=mdwn
username="zarel"
subject="comment 1"
date="2016-04-02T20:22:23Z"
content="""
I suppose the problem is strictly assistant-related, since both \"git annex status\" and \"git status\" report the correct status when new files are present. They take a couple of seconds on the first run and a fraction of a second on subsequent calls.
"""]]

@@ -0,0 +1,92 @@
### Please describe the problem.
If a file is executable, in a newly cloned repository it still contains only its annex pointer (the path ending in the SHA256E key) instead of the real content. Neither 'git annex sync --content' nor 'git annex get' brings the content back.
The only way to bring the file back is to remove it and do a 'git checkout' or 'git reset HEAD --hard'.
If the file is not an executable (a tarball, for example), it works as expected.
If, instead of cloning, I create a new repo and then manually add a remote, it also works as expected.
### What steps will reproduce the problem?
See log below.
### What version of git-annex are you using? On what operating system?
6.20160318-gd594fc0 on Ubuntu 15.10
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
vagrant@vm:/tmp/$ cd annex/
vagrant@vm:/tmp/annex/$ mkdir repo1
vagrant@vm:/tmp/annex/$ cd repo1/
vagrant@vm:/tmp/annex/repo1/$ git init
Initialized empty Git repository in /tmp/annex/repo1/.git/
vagrant@vm:/tmp/annex/repo1/$ git annex init --version=6
init ok
(recording state in git...)
vagrant@vm:/tmp/annex/repo1/$ cp /bin/ls .
/bin/ls -> ./ls
vagrant@vm:/tmp/annex/repo1/$ git add ls
vagrant@vm:/tmp/annex/repo1/$ git ci -am 'added ls binary'
(recording state in git...)
[master (root-commit) 7889519] added ls binary
1 file changed, 1 insertion(+)
create mode 100755 ls
vagrant@vm:/tmp/annex/repo1/$ ls -l
total 116
-rwxr-xr-x 1 vagrant vagrant 118272 Apr 1 12:56 ls
vagrant@vm:/tmp/annex/repo1/$ cd ..
vagrant@vm:/tmp/annex/$ git clone repo1 repo2
Cloning into 'repo2'...
done.
vagrant@vm:/tmp/annex/$ cd repo2
vagrant@vm:/tmp/annex/repo2/$ git annex init --version=6
init (merging origin/git-annex into git-annex...)
(recording state in git...)
(scanning for unlocked files...)
ok
(recording state in git...)
vagrant@vm:/tmp/annex/repo2/$ ls -l
total 4
-rwxrwxr-x 1 vagrant vagrant 97 Apr 1 12:57 ls
vagrant@vm:/tmp/annex/repo2/$ git annex sync --content
commit ok
pull origin
ok
get ls (from origin...) (checksum...) ok
pull origin
ok
(recording state in git...)
push origin
Counting objects: 11, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (9/9), done.
Writing objects: 100% (11/11), 1.10 KiB | 0 bytes/s, done.
Total 11 (delta 1), reused 0 (delta 0)
To /tmp/annex/repo1
* [new branch] git-annex -> synced/git-annex
* [new branch] master -> synced/master
ok
vagrant@vm:/tmp/annex/repo2/$ ls -l
total 4
-rwxrwxr-x 1 vagrant vagrant 97 Apr 1 12:57 ls
vagrant@vm:/tmp/annex/repo2/$ cat ls
/annex/objects/SHA256E-s118272--0b786b336b0391b56dabb7b078a23ec4295115628cfd4b635f4d8ae5ae0cfafc
vagrant@vm:/tmp/annex/repo2/$ git annex get ls
vagrant@vm:/tmp/annex/repo2/$ cat ls
/annex/objects/SHA256E-s118272--0b786b336b0391b56dabb7b078a23ec4295115628cfd4b635f4d8ae5ae0cfafc
vagrant@vm:/tmp/annex/repo2/$ rm ls
vagrant@vm:/tmp/annex/repo2/$ git checkout ls
vagrant@vm:/tmp/annex/repo2/$ ls -l
total 116
-rwxrwxr-x 1 vagrant vagrant 118272 Apr 1 12:59 ls
# End of transcript or log.
"""]]
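To check a clone for other files stuck in this state, one could look for regular files whose content starts with `/annex/objects/`, as `cat ls` shows in the transcript above. The following is only a heuristic sketch based on that transcript: the function names and the size cutoff are assumptions, and the prefix test is not git-annex's own definition of a pointer file.

```python
import os

# Prefix seen in the transcript above when running `cat ls`.
POINTER_PREFIX = b"/annex/objects/"
# Assumption: pointer files are tiny (the one above is 97 bytes),
# so anything bigger than this is treated as real content.
MAX_POINTER_SIZE = 32 * 1024


def looks_like_pointer(path):
    """Heuristic: a small regular file starting with the pointer prefix."""
    if os.path.getsize(path) > MAX_POINTER_SIZE:
        return False
    with open(path, "rb") as f:
        return f.read(len(POINTER_PREFIX)) == POINTER_PREFIX


def unpopulated_files(root):
    """List regular files under root (skipping .git) that still look
    like pointers rather than real content."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the git dir
        for name in filenames:
            path = os.path.join(dirpath, name)
            if (os.path.isfile(path) and not os.path.islink(path)
                    and looks_like_pointer(path)):
                hits.append(path)
    return sorted(hits)
```

Calling `unpopulated_files("/tmp/annex/repo2")` after the `git annex get ls` step of the transcript would be expected to list `ls`, assuming the file is in the state that `cat ls` shows there.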
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

@@ -18,7 +18,7 @@ In repositories with annex.version 5 or earlier, unlocking a file is local
to the repository, and is temporary. With version 6, unlocking a file
changes how it is stored in the git repository (from a symlink to a pointer
file), so you can commit it like any other change. Also in version 6, you
-can use `git add` to add a fie to the annex in unlocked form. This allows
+can use `git add` to add a file to the annex in unlocked form. This allows
workflows where a file starts out unlocked, is modified as necessary, and
is locked once it reaches its final version.

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4"
subject="cache?"
date="2016-04-01T19:56:43Z"
content="""
I was about to whine in a separate TODO but then remembered that the issue is not new...
I wondered: since the size report depends on what is or is not present locally, and all of that relates directly to the state of the git-annex branch, could annex perhaps cache the collected information, associated with the given git-annex / current branch treeishes? Then subsequent invocations would be fast.
In my case I want to list information on multiple annexes, e.g. those in the current directory. If each one takes 3-4 seconds, 30 of them take minutes. With caching, at least subsequent runs should be much faster (when nothing has changed, which I think would be the frequent case).
"""]]
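The caching idea in the comment above could look roughly like this from the outside. It is a sketch of the suggestion only, not of anything git-annex does: `cached_info`, `repo_state_key`, the JSON cache file and the `compute` callback are all hypothetical, and the assumption that the git-annex branch plus the tree of HEAD fully determine the result is exactly what would need checking.

```python
import json
import subprocess


def repo_state_key(repo):
    """Build a cache key from the git-annex branch commit and the tree
    of HEAD (assumed here to be all the expensive result depends on)."""
    def rev(ref):
        return subprocess.check_output(
            ["git", "-C", repo, "rev-parse", "--verify", ref],
            text=True).strip()
    return rev("refs/heads/git-annex") + ":" + rev("HEAD^{tree}")


def cached_info(repo, compute, cache_file, key_func=repo_state_key):
    """Return compute(repo), reusing the stored value while the key
    produced by key_func(repo) is unchanged."""
    key = key_func(repo)
    try:
        with open(cache_file) as f:
            entry = json.load(f)
        if entry.get("key") == key:
            return entry["value"]  # nothing changed: cheap path
    except (OSError, ValueError):
        pass  # missing or unreadable cache: fall through and recompute
    value = compute(repo)  # the slow, 3-4 second step
    with open(cache_file, "w") as f:
        json.dump({"key": key, "value": value}, f)
    return value
```

Here `compute` stands for whatever slow query is being wrapped, and its result has to be JSON-serialisable. With 30 annexes, an unchanged repository would then cost two `git rev-parse` calls each instead of the full computation.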