Commit graph

781 commits

Author SHA1 Message Date
Joey Hess
38a23928e9
temporarily remove cached keys database connection
The problem is that shutdown is not always called, particularly in the test
suite. So, a database connection would be opened, possibly some changes
queued, and then not shut down.

One way this can happen is when using Annex.eval or Annex.run with a new
state. A better fix might be to make both of them call Keys.shutdown
(and be sure to do it even if the annex action threw an error).

Complication: Sometimes they're run reusing an existing state, so shutting
down a database connection could cause problems for other users of that
same state. I think this would need a MVar holding the database handle,
so it could be emptied once shut down, and another user of the database
connection could then start up a new one if it got shut down. But, what if
2 threads were concurrently using the same database handle and one shut it
down while the other was writing to it? Urgh.

Might have to go that route eventually to get the database access to run
fast enough. For now, a quick fix to get the test suite happier, at the
expense of speed.
2015-12-16 14:05:26 -04:00
Joey Hess
7d0e79b9e1
Use git-annex init --version=6 to get v6 for now
Not ready to make it default because of the direct mode upgrade needing to
all happen at once.
2015-12-15 17:17:13 -04:00
Joey Hess
f9d077186a
implemented upgrade of direct mode repo to v6 2015-12-15 16:00:26 -04:00
Joey Hess
cdd27b8920
reorg 2015-12-15 15:34:28 -04:00
Joey Hess
2bc920e266
update inode cache to cover file even when nothing needs to be done to linkAnnex
This covers the case where multiple files have the same content and are
added with git add. Previously only the one that was linked to the annex
got its inode cached; now both are.
2015-12-15 13:02:33 -04:00
Joey Hess
1dad3af3fc
checked getKeysPresent; it's ok for v6 unlocked files
When a v6 unlocked files is removed from the work tree,
unused doesn't show it. When it gets removed from the index,
unused does show it. This is the same as a locked file.
2015-12-11 16:12:42 -04:00
Joey Hess
7790e059b2
finish v6 git-annex lock
This was a doozy!
2015-12-11 15:28:34 -04:00
Joey Hess
50e83b606c
only make 1 hardlink max between pointer file and annex object
If multiple files point to the same annex object, the user may want to
modify them independently, so don't use a hard link.

Also, check diskreserve when copying.
2015-12-11 14:00:21 -04:00
Joey Hess
c608a752a5
Merge branch 'master' into smudge 2015-12-11 13:50:31 -04:00
Joey Hess
abd66c7089
fsck: Failed to honor annex.diskreserve when checking a remote. 2015-12-11 13:50:27 -04:00
Joey Hess
c910b4e255
wip 2015-12-11 10:42:18 -04:00
Joey Hess
9dffd3d255
add generalized linkAnnex' 2015-12-10 16:08:19 -04:00
Joey Hess
06a8256bf6
always format pointer file with a trailing newline
Before the smudge filter added a trailing newline, but other things that
wrote formatPointer to a file did not.

also some new pointer staging code to use later
2015-12-10 16:06:58 -04:00
Joey Hess
f80a3d8cd0
check InodeCache in inAnnex et al
This avoids querying the database when the content file doen't exist
(or otherwise fails the provided check). However, it does add overhead of
querying the database, and will certianly impact performance.
2015-12-10 14:51:04 -04:00
Joey Hess
2b8f6b8b2f
check inode cache in prepSendAnnex
This does mean one query of the database every time an object is sent.
May impact performance.
2015-12-10 14:50:52 -04:00
Joey Hess
3b2a7f216d
move 2015-12-10 14:20:38 -04:00
Joey Hess
3719d1b390
make clear when code is using deprecated direct mode files 2015-12-09 19:43:15 -04:00
Joey Hess
aa88851ec1
reorder 2015-12-09 19:38:37 -04:00
Joey Hess
ce73a96e4e
use InodeCache when dropping a key to see if a pointer file can be safely reset
The Keys database can hold multiple inode caches for a given key. One for
the annex object, and one for each pointer file, which may not be hard
linked to it.

Inode caches for a key are recorded when its content is added to the annex,
but only if it has known pointer files. This is to avoid the overhead of
maintaining the database when not needed.

When the smudge filter outputs a file's content, the inode cache is not
updated, because git's smudge interface doesn't let us write the file. So,
dropping will fall back to doing an expensive verification then. Ideally,
git's interface would be improved, and then the inode cache could be
updated then too.
2015-12-09 17:54:54 -04:00
Joey Hess
5e8c628d2e
add inode cache to the db
Renamed the db to keys, since it is various info about a Keys.

Dropping a key will update its pointer files, as long as their content can
be verified to be unmodified. This falls back to checksum verification, but
I want it to use an InodeCache of the key, for speed. But, I have not made
anything populate that cache yet.
2015-12-09 17:00:37 -04:00
Joey Hess
3311c48631
move InodeSentinal from direct mode code to its own module
Will be used outside of direct mode for v6 unlocked files, and is already
used outside of direct mode when adding files to annex.
2015-12-09 15:52:11 -04:00
Joey Hess
8a818088a3
link/copy pointer files to object content when it's added 2015-12-09 15:27:29 -04:00
Joey Hess
751120c171
avoid pre-commit hook messing up new-style unlocked files in v6 repo 2015-12-09 15:18:54 -04:00
Joey Hess
78a6b8ce05
refactor and improve pointer file handling code 2015-12-09 14:27:43 -04:00
Joey Hess
712c9fc590
require "annex/objects/" before key in pointer files
This removes ambiguity, because while someone might have "WORM--foo" in a
file that's not intended to be a git-annex pointer file,
"annex/objects/WORM--foo" is less likely.

Also, 664cc987e8 had a caveat about symlink
targets being parsed as pointer files, and now the same parser is used for
both.

I did not include any hash directories before the key in the pointer file,
as they're not needed. However, if they were included, the parser would
still work ok.
2015-12-07 15:45:08 -04:00
Joey Hess
664cc987e8
support pointer files
Backend.lookupFile is changed to always fall back to catKey when
operating on a file that's not a symlink.

catKey is changed to understand pointer files, as well as annex symlinks.

Before, catKey needed a file mode witness, to be sure it was looking at a
symlink. That was complicated stuff. Now, it doesn't actually care if a
file in git is a symlink or not; in either case asking git for the content
of the file will get the pointer to the key.

This does mean that git-annex will treat a link
foo -> WORM--bar as a git-annex file, and also treats
a regular file containing annex/objects/WORM--bar as a git-annex file.

Calling catKey could make git-annex commands need to do more work than
before. This would especially be the case if a repo contained many regular
files, and only a few annexed files, as now git-annex will need to ask
git about the contents of the regular files.
2015-12-07 15:35:36 -04:00
Joey Hess
62a2fba1cd
Merge branch 'master' into smudge 2015-12-07 12:29:34 -04:00
Joey Hess
2936153fc4
fix temp filename
Was not putting it inside the temp dir, but next to it!

This was just wrong, and it led to a longer filename that desired being
used, leading to some bug reports.
2015-12-06 16:54:01 -04:00
Joey Hess
6e71094e7d
avoid too long temp dir template
The filename might be at or close to the filename length limit, so using it
as the template for the temp dir would then fail.
2015-12-06 16:42:40 -04:00
Joey Hess
e7f75b079d
don't let git-annex direct be run in a v6 repo 2015-12-04 16:33:09 -04:00
Joey Hess
ccc49861ca
add v6; keep v5 working for now and manual upgrade
Since all places where a repo is used in direct mode need to have git-annex
upgraded before the repo can safely be converted to v6, the upgrade needs
to be manual for now.

I suppose that at some point I'll want to drop all the direct mode support
code. At that point, will stop supporting v5, and will need to auto-upgrade
any remaining v5 repos. If possible, I'd like to carry the direct mode
support for say, a year or so, to give people plenty of time to upgrade and
avoid disruption.
2015-12-04 16:14:48 -04:00
Joey Hess
34ead644d9
auto-configure filter.annex.smudge and clean on init 2015-12-04 16:14:11 -04:00
Joey Hess
983c1894eb
avoid unnecessary reading of git-annex branch data when matching on annex.largefiles
This makes git annex clean not look at the git-annex branch at all,
and so speeds it up by 50% or more.
2015-12-04 15:06:41 -04:00
Joey Hess
99b2a524a0
clean filter should update location log when adding new content to annex 2015-12-04 14:20:32 -04:00
Joey Hess
2c6454a2e2
basic clean filter working 2015-12-04 13:39:14 -04:00
Joey Hess
0d432dd1a4
annex object file mode for core.sharedRepository
When core.sharedRepository is set, annex object files are not made mode
444, since that prevents a user other than the file owner from locking
them. Instead, a mode such as 664 is used in this case.
2015-11-18 15:45:32 -04:00
Joey Hess
3449c0e8ec
avoid spawning file size polling thread when not in -J mode 2015-11-16 21:21:58 -04:00
Joey Hess
e97fce35a6
Display progress meter in -J mode when downloading from the web.
Including in addurl, and get --from web, but also in S3 and External
special remotes when a web url is known for content in those remotes.
2015-11-16 21:00:54 -04:00
Joey Hess
262c37c16e
add missing checkSaneLock wrapper for pidlocks 2015-11-16 15:35:41 -04:00
Joey Hess
bb86eebfbd
init: Automatically enable annex.pidlock when necessary. 2015-11-13 13:35:29 -04:00
Joey Hess
aaf1ef268d
convert from Utility.LockPool to Annex.LockPool everywhere 2015-11-12 18:13:37 -04:00
Joey Hess
aa4192aea6
pid locking configuration and abstraction layer for git-annex
(not actually used anywhere yet)
2015-11-12 17:50:34 -04:00
Joey Hess
7c741302cc
assistant: Pass ssh-options through 3 more git pull/push calls that were missed before.
It was used for regular pull, but not for regular push, tagged push, or the
fallback fetching.
2015-11-10 16:52:30 -04:00
Joey Hess
7938b87864
add: Fix error recovery rollback to not move the injested file content out of the annex back to the file, because other files may point to that same content. Instead, copy the injected file content out to recover.
That was not a data loss, but it came close!
2015-11-06 15:28:20 -04:00
Joey Hess
51e60259e1
fix replaceFile makeAnnexLink race
replaceFile created a temp file, which was guaranteed to not overlap with
another temp file. However, makeAnnexLink then deleted that file, in
preparation for making the symlink in its place. This caused a race, since
some other replaceFile could create a temp file, using the same name!

I was able to reproduce the race easily running git-annex add -J10 in a
directory with 100 files (all with different contents). Some files would
get ingested into the annex, but their annex links would fail to be added.

There could be other situations where this same problem could occur.
Perhaps when the assistant is adding a file, if the user manually also ran
git-annex add. Perhaps in cases not involving adding a file.

The new replaceFile makes a temprary directory, which is guaranteed to be
unique, and doesn't make a temp file in there. makeAnnexLink can thus
create the symlink without problem and the race is avoided.

Audited all calls to replaceFile to make sure that the old behavior of
providing an empty temp file was not relied on.

The general problem of asking for a temp file and deleting it as part of
the process of using it could reach beyond replaceFile. Did some quick
audits and didn't find other cases of it. Probably only symlink creation
stuff would tend to make that mistake, mostly.
2015-11-06 15:08:19 -04:00
Joey Hess
31472161e4
merge git command queue when joining with concurrent thread 2015-11-05 18:21:48 -04:00
Joey Hess
a4dd8503b8
add regions to concurrent output
still no progress displays when getting files etc, but a big improvement
2015-11-04 14:52:07 -04:00
Joey Hess
640dba43b6
enableremote: List uuids and descriptions of remotes that can be enabled, and accept either the uuid or the description in leu if the name. 2015-10-26 14:55:40 -04:00
Joey Hess
806819be57
Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set.
Instead, only display transport error if the configlist output doesn't
include an annex.uuid line, even an empty one.

A recent change made git-annex init try to get all the remote uuids, and so
the transport error would be displayed by it. It was also displayed when
eg, copying files to a remote that had no uuid yet.
2015-10-15 15:36:54 -04:00
Joey Hess
3879f6e6be
do tmp dir cleanup in error case too 2015-10-15 14:27:14 -04:00