Merge branch 'master' into database
commit 823cc9b800
20 changed files with 318 additions and 3 deletions
@ -0,0 +1,11 @
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2015-02-17T21:54:33Z"
content="""
Since the two repos' git-annex branches have diverged, you need to run `git
annex merge` to merge them before you can push that branch.

Of course, `git annex sync` handles all that for you. It can be used
against a bare repository as well as a non-bare.
"""]]

@ -0,0 +1,88 @
[[!comment format=c
username="https://www.google.com/accounts/o8/id?id=AItOawlmLuHhscJsoAqb9q0N3LdtHum6LjY1LK4"
nickname="Markus"
subject="comment 7"
date="2015-02-17T14:43:02Z"
content="""
ssh -t makes no difference, the strace output:
it's completely repetitive, only the futex and mmap calls are at random positions (mmap probably leads to the enormous memory consumption)

rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
clock_gettime(0x2 /* CLOCK_??? */, {31, 737743240}) = 0
clock_gettime(CLOCK_MONOTONIC, {365100, 810332327}) = 0
clock_gettime(0x3 /* CLOCK_??? */, {31, 737155560}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x2b32fb1c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x2b32fb18, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
futex(0x2b32fb48, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x41981d0, FUTEX_WAKE_PRIVATE, 1) = 1
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
clock_gettime(0x2 /* CLOCK_??? */, {31, 851239760}) = 0
clock_gettime(CLOCK_MONOTONIC, {365100, 933314386}) = 0
clock_gettime(0x3 /* CLOCK_??? */, {31, 850549960}) = 0
mmap2(0x30b00000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x30b00000
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [HUP ILL TRAP KILL USR1 USR2 CHLD TSTP TTIN URG XFSZ VTALRM IO PWR])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
clock_gettime(0x2 /* CLOCK_??? */, {56, 575838240}) = 0
clock_gettime(CLOCK_MONOTONIC, {365125, 751101804}) = 0
clock_gettime(0x3 /* CLOCK_??? */, {56, 574935120}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [ILL FPE KILL SEGV USR2 PIPE TERM STOP TSTP URG XCPU XFSZ VTALRM])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [QUIT ABRT BUS PIPE TERM CONT STOP URG IO PWR])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [QUIT ABRT BUS PIPE TERM CONT STOP URG IO PWR])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [QUIT ABRT BUS PIPE TERM CONT STOP URG IO PWR])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
sigreturn() = ? (mask now [QUIT ABRT BUS PIPE TERM CONT STOP URG IO PWR])
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
"""]]

@ -0,0 +1,7 @
[[!comment format=mdwn
username="sairon"
subject="comment 2"
date="2015-02-17T15:04:55Z"
content="""
looks like it was the assistant
"""]]

@ -0,0 +1,9 @
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2015-02-17T21:39:23Z"
content="""
Re finding repos, if the assistant is configured to automatically
start managing the repo at boot/login, the repo will be
listed in ~/.config/git-annex/autostart
"""]]
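
The lookup described in the comment above could be sketched as follows. This is only an illustration: the `autostart_repos` helper is hypothetical, and the one-repository-path-per-line layout of the autostart file is an assumption rather than something the comment states.

```python
from pathlib import Path


def autostart_repos(autostart_file=None):
    """Return the repository paths listed in the assistant's autostart file.

    Hypothetical helper. Assumes the file holds one repository path per line.
    """
    if autostart_file is None:
        autostart_file = Path.home() / ".config" / "git-annex" / "autostart"
    path = Path(autostart_file)
    if not path.exists():
        # No autostart file: the assistant is not set to manage any repo.
        return []
    # Skip blank lines and strip surrounding whitespace from each entry.
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]
```

Calling `autostart_repos()` with no argument reads the default location named in the comment; passing a path makes it easy to try against a copy of the file.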
@ -6,7 +6,7 @@ locally paired systems, and remote servers with rsync.
Help me prioritize my work: What special remote would you most like
to use with the git-annex assistant?

-[[!poll open=yes 18 "Amazon S3 (done)" 12 "Amazon Glacier (done)" 10 "Box.com (done)" 74 "My phone (or MP3 player)" 25 "Tahoe-LAFS" 13 "OpenStack SWIFT" 36 "Google Drive"]]
+[[!poll open=yes 18 "Amazon S3 (done)" 12 "Amazon Glacier (done)" 10 "Box.com (done)" 74 "My phone (or MP3 player)" 25 "Tahoe-LAFS" 14 "OpenStack SWIFT" 36 "Google Drive"]]

This poll is ordered with the options I consider easiest to build
listed first. Mostly because git-annex already supports them and they

@ -1,3 +1,5 @@
+[[!meta title="day 254 sqlite for incremental fsck"]]
+
Yesterday I did a little more investigation of key/value stores.
I'd love a pure haskell key/value store that didn't buffer everything in
memory, and that allowed concurrent readers, and was ACID, and production

@ -0,0 +1,7 @
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2015-02-17T20:15:36Z"
content="""
@anarcat, see [[design/caching_database]] for my thinking on that.
"""]]

34
doc/devblog/day_255__sqlite_concurrent_writers_problem.mdwn
Normal file
@ -0,0 +1,34 @
Worked today on making incremental fsck's use of sqlite be safe with
multiple concurrent fsck processes.

The first problem was that having `fsck --incremental` running and starting a
new `fsck --incremental` caused it to crash. And with good reason: since
starting a new incremental fsck deletes the old database, the old process
was left writing to a database that had been deleted and recreated out from
underneath it. Fixed with some locking.
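
The post does not say what kind of locking was used. One possible scheme, sketched here purely as an assumption and not as git-annex's actual implementation, is for every process that uses the database to hold a shared lock on a lock file, and for a process that wants to delete and recreate the database to require an exclusive lock first. The function names and the lock file are hypothetical; the sketch relies on `flock` and so only applies to Unix-like systems.

```python
import fcntl
import os


def open_lock(lock_path):
    """Open (creating if needed) the lock file and return its descriptor."""
    return os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)


def lock_for_use(fd):
    """Take a shared lock; held by each fsck process while it uses the database."""
    fcntl.flock(fd, fcntl.LOCK_SH)


def try_lock_for_recreate(fd):
    """Try to take an exclusive lock before deleting the database.

    Returns False, without blocking, when another open descriptor
    still holds a lock on the same file.
    """
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False
```

With this arrangement a new incremental fsck that fails to get the exclusive lock knows an older one is still writing, and can refuse to delete the database instead of pulling it out from under that process.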

Next problem is harder. Sqlite doesn't support multiple concurrent writers
at all. One of them will fail to write. It's not even possible to have two
processes building up separate transactions at the same time. Before using
sqlite, incremental fsck could work perfectly well with multiple fsck
processes running concurrently. I'd like to keep that working.

My partial solution, so far, is to make git-annex buffer writes, and every
so often send them all to sqlite at once, in a transaction. So most of the
time, nothing is writing to the database. (And if it gets unlucky and
a write fails due to a collision with another writer, it can just wait and
retry the write later.) This lets multiple processes write to the database
successfully.
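
The buffering idea can be illustrated with a small sketch. git-annex itself is written in Haskell, so this Python version is only an analogy: the `BufferedFsckLog` class, the `fscked` table, and the batch size and retry settings are all made up for the example.

```python
import sqlite3
import time


class BufferedFsckLog:
    """Queue checked keys in memory and flush them in one short transaction."""

    def __init__(self, db_path, flush_every=1000, retries=10, retry_delay=0.5):
        self.db_path = db_path
        self.flush_every = flush_every
        self.retries = retries
        self.retry_delay = retry_delay
        self.pending = []
        conn = sqlite3.connect(self.db_path)
        with conn:
            conn.execute("CREATE TABLE IF NOT EXISTS fscked (key TEXT PRIMARY KEY)")
        conn.close()

    def record(self, key):
        """Remember a checked key; only touch the database once per batch."""
        self.pending.append((key,))
        if len(self.pending) >= self.flush_every:
            self.flush()

    def flush(self):
        """Write all pending keys at once, retrying if the write fails."""
        if not self.pending:
            return True
        for _ in range(self.retries):
            # timeout=0 makes a locked database fail at once, so the
            # wait-and-retry policy lives in this loop.
            conn = sqlite3.connect(self.db_path, timeout=0)
            try:
                with conn:  # commits on success, rolls back on error
                    conn.executemany(
                        "INSERT OR IGNORE INTO fscked (key) VALUES (?)",
                        self.pending,
                    )
                self.pending.clear()
                return True
            except sqlite3.OperationalError:
                # Most likely another writer holds the lock; wait and retry.
                time.sleep(self.retry_delay)
            finally:
                conn.close()
        return False  # keys stay queued for a later flush
```

Because each process only holds the write lock for the duration of one batch insert, collisions are rare, and a failed flush loses nothing since the keys remain in `pending`.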

But, for the purposes of concurrent, incremental fsck, it's not ideal.
Each process doesn't immediately learn of files that another process has
checked. So they'll tend to do redundant work. The only way I can see to
improve this is to use some other mechanism for short-term IPC between the
fsck processes.

----

Also, I made `git annex fsck --from remote --incremental` use a different
database per remote. This is a real improvement over the sticky bits;
multiple incremental fscks can be in progress at once,
checking different remotes.

@ -106,11 +106,14 @@ with appropriate handling of the direct mode files.

## undoing changes in direct mode

-There is also the `undo` command to do the equivalent of the above revert in a simpler way. Say you made a change in direct mode, the assistant dutifully committed it and you realise your mistake, you can try:
+There is also the `undo` command to do the equivalent of the above revert
+in a simpler way. Say you made a change in direct mode, the assistant
+dutifully committed it and you realise your mistake, you can try:

git annex undo file

-to revert the last change to `file`. Note that you can use the `--depth` flag to revert earlier versions of the file.
+to revert the last change to `file`. Note that you can use the `--depth`
+flag to revert earlier versions of the file.

## forcing git to use the work tree in direct mode

@ -0,0 +1,7 @
[[!comment format=mdwn
username="https://id.koumbit.net/anarcat"
subject="comment 16"
date="2015-02-17T05:22:00Z"
content="""
i believe this is [answered here](https://git-annex.branchable.com/todo/windows_support/#comment-e72601243c643d7821e68d3a04489fcb). TL;DR: basically NTFS + symlink works in Linux, but not in Windows/Cygwin, which git-annex seems to be using. YMMV.
"""]]

@ -0,0 +1,12 @
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawnPgn611P6ym5yyL0BS8rUzO0_ZKRldMt0"
nickname="Samuel"
subject="Resetting to the git-annex branch"
date="2015-02-17T09:21:12Z"
content="""
Well, it appears you explicitly asked for resetting to the git-annex branch with the following command
git reset --hard git-annex
To go back to the master branch, containing the symlinks, just do
git checkout master

"""]]

@ -0,0 +1,24 @
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2015-02-17T21:31:42Z"
content="""
There is never a reason to run "git reset --hard git-annex"! For that matter,
don't mess with the git-annex branch if you have not read and understood
the [[internals]] documentation. Even if you have, it's entirely the wrong
thing to be messing with in this situation. It has nothing at all to do
with your problem, except that after running that completely random reset
command, you now have two problems.

The right answer to your interrupted add is something like:

* `git reset --hard master`
* Or, run the `git-annex add` command again and let it resume
* Or, run `git commit` to commit any changes the add made,
  followed by `git annex unannex` to back out adding those files.

Or, if this is an entirely new git repo that you have
never committed to before
(my guess based on the "bad default revision 'HEAD'"),
just `rm -rf .git` and start over.
"""]]

@ -0,0 +1,37 @
[[!comment format=mdwn
username="https://id.koumbit.net/anarcat"
subject="the actual process i use"
date="2015-02-17T00:58:38Z"
content="""
So it seems i am able to forget all of this within a matter of a few days, and since this is so error-prone, here goes a more detailed explanation.

What I do is:

<pre>
git clone repo repo.test
cd repo.test
git annex indirect # be safe! this may take a while, but it's necessary!
git tag bak # keep track of a good working state
git log --stat --stat-count=3 # find the commits we want to trash
git tag firstbad badbeef1 # the first commit we want to kill
git tag keep dada1234 # the first commit we want to keep
git rebase -p --onto firstbad^ keep # drop everything between firstbad (inclusive) and keep (exclusive)
git diff --stat keep # make sure this did what we expected
git branch -D annex/direct/master synced/master # destroy these old branches that still have refs to the old commits
</pre>

Then for each repo:

<pre>
cd repo
git tag bak
git fetch origin # sync the master branch in
git remote prune origin # make sure the dropped branches are gone
git annex indirect # be safe
git reset --hard origin/master
git branch -D synced/master annex/direct/master
git diff --stat bak # should change
</pre>

It would be useful to have that transition propagate properly everywhere so I don't have to do this in every repo, but at least the above should work fairly reliably.
"""]]

@ -0,0 +1,13 @
[[!comment format=mdwn
username="joey"
subject="""comment 9"""
date="2015-02-17T21:43:16Z"
content="""
It's entirely expected and normal for git-annex to update the UUID
of a remote with `url = somepath` when it notices that the repo at
`somepath` has changed.

This is what you want to happen. If git-annex didn't notice and react to
the UUID change, its location tracking information (for UUID A) would be
inconsistent with the actual status of the repo (using UUID B).
"""]]

@ -0,0 +1,18 @
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2015-02-17T21:46:01Z"
content="""
Yes, that's the same, except lookupkey only operates on files that are
checked into git.

(Also, lookupkey will work in a direct mode repo, while such a repo
may not have a symlink to examine.)

25ms doesn't seem bad for a "whole runtime" to fire up. :) I think most of
the overhead probably involves reading the git config and running
git-ls-files.

Note that lookupkey can be passed a whole set of files, so you could avoid
the startup overhead that way too.
"""]]
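
The batching suggestion in the comment above might look like the sketch below, which pays the startup cost once for a whole list of files. The `lookup_keys` wrapper is hypothetical, and it assumes that `git annex lookupkey` prints one key per line in the same order as its arguments; the sketch refuses to guess when the output does not line up.

```python
import subprocess


def lookup_keys(files, run=subprocess.run):
    """Map each file to its key using a single `git annex lookupkey` process.

    `run` is injectable so the function can be tried without a repository.
    """
    files = list(files)
    if not files:
        return {}
    proc = run(["git", "annex", "lookupkey", *files], capture_output=True, text=True)
    keys = proc.stdout.splitlines()
    # Assumption: one output line per input file, in argument order.
    if proc.returncode != 0 or len(keys) != len(files):
        raise RuntimeError("lookupkey did not return one key per file")
    return dict(zip(files, keys))
```

Raising on a mismatch is deliberate: pairing keys with the wrong files would be worse than falling back to one call per file.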
@ -0,0 +1,11 @
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2015-02-17T21:50:11Z"
content="""
And yes, it's fine to bypass git-annex when querying git.

Or even when manipulating the git-annex branch, so long as you either
delete or update .git/annex/index. git-annex is not intended to be magical,
see [[internals]].
"""]]

@ -0,0 +1,15 @
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2015-02-17T21:41:27Z"
content="""
I would not recommend running the assistant as root. Any security issue
would escalate to root access; any bug could result in some root level
damage to the system.

Of course, I don't know of any such security issues or bugs. If I did, I'd
be fixing them.

On my system, /usr/local is managed by group staff. It seems much safer to
make the assistant be run by some non-root user who is in the staff group.
"""]]

@ -0,0 +1,7 @
[[!comment format=mdwn
username="joey"
subject="""re: why md5sum?"""
date="2015-02-17T21:51:59Z"
content="""
Not all types of keys contain hashes.
"""]]

@ -0,0 +1,7 @
[[!comment format=mdwn
username="https://id.koumbit.net/anarcat"
subject="document in the manpage?"
date="2015-02-17T05:28:33Z"
content="""
the manpage makes a passing reference to \"groups\", but nowhere in the manpage is there a reference to this page, which i had to find through google. maybe this should be in the manpage?
"""]]

3
doc/todo/wishlist:_global_progress_status.mdwn
Normal file
@ -0,0 +1,3 @
similar to [[do_not_bug_me_about_intermediate_files]] - i feel that massive `git annex get` operations should have better progress information than the current individual `rsync --progress` bits. i wonder if this couldn't be accomplished with `rsync --info=PROGRESS2`, which gives overall rsync progress, combined with copying multiple files at once with rsync (which would have the side-effect of speeding up `git annex get` for large number of small files).

once this is done, it could be sent back to the webapp UI to give the user a global sense of the overall sync progress (as opposed to per-file progress). --[[anarcat]]
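
As a rough illustration of the difference between per-file and global progress, the sketch below folds many per-file byte counts into one overall fraction. The `overall_progress` function and its input shape are hypothetical; nothing here reflects how git-annex or the webapp actually report progress.

```python
def overall_progress(transfers):
    """Return overall completion in [0, 1] across many file transfers.

    `transfers` is an iterable of (bytes_done, bytes_total) pairs,
    one pair per file in the whole operation.
    """
    done = 0
    total = 0
    for bytes_done, bytes_total in transfers:
        # Never count more than a file's full size as done.
        done += min(bytes_done, bytes_total)
        total += bytes_total
    # An operation with nothing to transfer is treated as complete.
    return done / total if total else 1.0
```

Weighting by bytes rather than by file count keeps the figure meaningful when a sync mixes a few large files with many small ones.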