Commit graph

761 commits

Author SHA1 Message Date
Joey Hess
4b819bee2b
avoid confusing git with a modified ctime in clean filter
Linking the file to the tmp dir was not necessary in the clean
filter, and it caused the ctime to change, which caused git to think
the file was changed. This caused git status to get slow as it kept
re-cleaning unchanged files.
2016-01-07 17:48:04 -04:00
Joey Hess
3b960d1422
migrate and rekey v6 unlocked file support 2016-01-07 15:14:15 -04:00
Joey Hess
0b59fb423e
migrate: Copy over metadata to new key. 2016-01-07 14:21:12 -04:00
Joey Hess
b3d60ca285
use TopFilePath for associated files
Fixes several bugs with updates of pointer files. When eg, running
git annex drop --from localremote
it was updating the pointer file in the local repository, not the remote.
Also, fixes drop ../foo when run in a subdir, and probably lots of other
problems. Test suite drops from ~30 to 11 failures now.

TopFilePath is used to force thinking about what the filepath is relative
to.

The data stored in the sqlite db is still just a plain string, and
TopFilePath is a newtype, so there's no overhead involved in using it in
DataBase.Keys.
2016-01-05 17:22:19 -04:00
Joey Hess
f36f24197a
scan for unlocked files on init/upgrade of v6 repo 2016-01-01 15:09:42 -04:00
Joey Hess
a2c056df65
convert isPointerFile from Annex to IO 2016-01-01 13:22:38 -04:00
Joey Hess
829ae91009
fix failing git-annex unused test case in v6
WorkTree.lookupFile was finding a key for a file that's deleted from the
work tree, which is different than the v5 behavior (though perhaps the same
as the direct mode behavior). Fix by checking that the work tree file exists
before catting its key.

Hopefully this won't slow down much, probably the catKey is much more expensive.
I can't see any way to optimise this, except perhaps to make Command.Unused
check if work tree files exist before/after calling lookupFile. But,
it seems better to make lookupFile really only find keys for worktree files;
that's what it's intended to do.
2015-12-30 14:23:31 -04:00
Joey Hess
5057fffccd
flush queue before cleaning cruft
Else, queued file stages won't have reached the index, and it won't find
everthing.

This evidently fixes a reversion in my work today, although I don't see how
I broke it. It didn't use to flush the queue first, before, and worked
somehow.

Test suite for v5 is back to 100% green now.
2015-12-29 17:35:57 -04:00
Joey Hess
f3be28eedc
test suite noticed a direct mode reversion 2015-12-29 17:12:57 -04:00
Joey Hess
10ecc43790
rename 2015-12-29 17:02:14 -04:00
Joey Hess
996ae9b172
don't disable smudge filter while merging
The smudge filter does need to be run, because if the key is in the local
annex already (due to renaming, or a copy of a file added, or a new file
added and its content has already arrived), git merge smudges the file and
this should provide its content.

This does probably mean that in merge conflict resolution, git smudges the
existing file, re-copying all its content to it, and then the file is
deleted. So, not efficient.
2015-12-29 16:36:21 -04:00
Joey Hess
24bbaa2346
avoid renaming file when auto-resolving conflict in annex pointer
This is a behavior change for merge conflicts between locked files
that both pointed to the same key, in different ways.
Before, the conflict was resolved, but the file was renamed to .variant.
This was unnecessary, because there was only one variant.

Of course, this also handles conflicts between unlocked and locked, or even
two unlocked files with different pointer contents.
2015-12-29 16:35:34 -04:00
Joey Hess
2e9341a47d
fix inode cache consistency bug when a merge unlocks a present file
Since the file was present and locked, its annex object was not in the
inode cache. So, despite not needing to update the annex object when the
clean filter is run on the content by git merge, it does need to record the
inode cache of the annex object. Otherwise, the annex object will be
assumed to be bad, since its inode is not cached.
2015-12-29 16:26:27 -04:00
Joey Hess
b6b34f4916
automatic conflict resolution for v6 unlocked files
Several tricky parts:

* When the conflict is just between the same key being locked and unlocked,
  the unlocked version wins, and the file is not renamed in this case.

* Need to update associated file map when conflict resolution renames
  an unlocked file.

* git merge runs the smudge filter on the conflicting file, and actually
  overwrites the file with the same content it had before, and so
  invalidates its inode cache. This makes it difficult to know when it's
  safe to remove such files as conflict cruft, without going so far as to
  compare their entire contents.

  Dealt with this by preventing the smudge filter from populating the file
  when a merge is run. However, that also prevents the smudge filter being
  run for non-conflicting files, so eg moving a file won't put its new
  content into place.

* Ideally, if a merge or a merge conflict resolution renames an unlocked
  file, the file in the work tree can just be moved, rather than copying
  the content to a new worktree file.

  This is attempted to be done in merge conflict resolution, but
  due to git merge's behavior of running smudge filters, what actually
  seems to happen is the old worktree file with the content is deleted and
  rewritten as a pointer file, so doesn't get reused.

So, this is probably not as efficient as it optimally could be.
If that becomes a problem, could look into running the merge in a separate
worktree and updating the real worktree more efficiently, similarly to the
direct mode merge. However, the direct mode merge had a lot of bugs, and
I'd rather not use that more error-prone method unless really needed.
2015-12-29 15:41:09 -04:00
Joey Hess
645833774d
fix windows build 2015-12-28 12:44:04 -04:00
Joey Hess
121f5d5b0c
annex.thin
Decided it's too scary to make v6 unlocked files have 1 copy by default,
but that should be available to those who need it. This is consistent with
git-annex not dropping unused content without --force, etc.

* Added annex.thin setting, which makes unlocked files in v6 repositories
  be hard linked to their content, instead of a copy. This saves disk
  space but means any modification of an unlocked file will lose the local
  (and possibly only) copy of the old version.
* Enable annex.thin by default on upgrade from direct mode to v6, since
  direct mode made the same tradeoff.
* fix: Adjusts unlocked files as configured by annex.thin.
2015-12-27 15:59:59 -04:00
Joey Hess
54f87ef95f
get associated files from Keys database 2015-12-26 15:09:53 -04:00
Joey Hess
7593917147
cleanup 2015-12-26 15:09:47 -04:00
Joey Hess
289a3592c3
support v6 unlocked files
This optimisation was not necessary, and didn't work for v6 unlocked files.
Typically only a small number of files will be changed by a commit, so just
catKey them all.
2015-12-26 15:04:26 -04:00
Joey Hess
60c36ef6ba
make views work with v6 unlocked files
Have to only use the view index in one place; lookupFile was failing for
unlocked files because it was run using the view index, which was empty.
2015-12-26 14:52:58 -04:00
Joey Hess
49fca49991
remove dead code 2015-12-26 14:45:07 -04:00
Joey Hess
f324ad24c1
improve comment 2015-12-26 13:47:36 -04:00
Joey Hess
0c03629173
clean up cruft in assistant fast rename code path 2015-12-22 18:03:47 -04:00
Joey Hess
d8a8c77a8f
move cleanOldKey into ingest 2015-12-22 16:55:49 -04:00
Joey Hess
cfaac52b88
populate unlocked files with newly available content when ingesting
This can happen when ingesting a new file in either locked or unlocked
mode, when some unlocked files in the repo use the same key, and the
content was not locally available before.
2015-12-22 16:22:28 -04:00
Joey Hess
4f60234690
finish v6 support for assistant
Seems to basically work now!
2015-12-22 15:23:27 -04:00
Joey Hess
4392140946
make linkAnnex detect when the file changes as it's being copied/linked in
This fixes a race where the modified file ended up in annex/objects, and
the InodeCache stored in the database was for the modified version, so
git-annex didn't know it had gotten modified.

The race could occur when the smudge filter was running; now it gets the
InodeCache before generating the Key, which avoids the race.
2015-12-22 15:20:03 -04:00
Joey Hess
8e9608d7f0
refactoring
no behavior changes
2015-12-22 13:42:58 -04:00
Joey Hess
ca2c977704
wip v6 support for assistant
Files are not yet added to v6 repos in unlocked mode.
2015-12-21 18:41:15 -04:00
Joey Hess
35f6a78b66
fix reversion in v5 git-annex add of unlocked file
In v5, lookupFile is supposed to only look at symlinks on disk (except when
in direct mode).

Note that v6 also has a bug when a locked file's symlink is deleted and is
replaced with a new file. It sees that a link is staged and gets that
key.
2015-12-16 14:27:12 -04:00
Joey Hess
38a23928e9
temporarily remove cached keys database connection
The problem is that shutdown is not always called, particularly in the test
suite. So, a database connection would be opened, possibly some changes
queued, and then not shut down.

One way this can happen is when using Annex.eval or Annex.run with a new
state. A better fix might be to make both of them call Keys.shutdown
(and be sure to do it even if the annex action threw an error).

Complication: Sometimes they're run reusing an existing state, so shutting
down a database connection could cause problems for other users of that
same state. I think this would need a MVar holding the database handle,
so it could be emptied once shut down, and another user of the database
connection could then start up a new one if it got shut down. But, what if
2 threads were concurrently using the same database handle and one shut it
down while the other was writing to it? Urgh.

Might have to go that route eventually to get the database access to run
fast enough. For now, a quick fix to get the test suite happier, at the
expense of speed.
2015-12-16 14:05:26 -04:00
Joey Hess
7d0e79b9e1
Use git-annex init --version=6 to get v6 for now
Not ready to make it default because of the direct mode upgrade needing to
all happen at once.
2015-12-15 17:17:13 -04:00
Joey Hess
f9d077186a
implemented upgrade of direct mode repo to v6 2015-12-15 16:00:26 -04:00
Joey Hess
cdd27b8920
reorg 2015-12-15 15:34:28 -04:00
Joey Hess
2bc920e266
update inode cache to cover file even when nothing needs to be done to linkAnnex
This covers the case where multiple files have the same content and are
added with git add. Previously only the one that was linked to the annex
got its inode cached; now both are.
2015-12-15 13:02:33 -04:00
Joey Hess
1dad3af3fc
checked getKeysPresent; it's ok for v6 unlocked files
When a v6 unlocked files is removed from the work tree,
unused doesn't show it. When it gets removed from the index,
unused does show it. This is the same as a locked file.
2015-12-11 16:12:42 -04:00
Joey Hess
7790e059b2
finish v6 git-annex lock
This was a doozy!
2015-12-11 15:28:34 -04:00
Joey Hess
50e83b606c
only make 1 hardlink max between pointer file and annex object
If multiple files point to the same annex object, the user may want to
modify them independently, so don't use a hard link.

Also, check diskreserve when copying.
2015-12-11 14:00:21 -04:00
Joey Hess
c608a752a5
Merge branch 'master' into smudge 2015-12-11 13:50:31 -04:00
Joey Hess
abd66c7089
fsck: Failed to honor annex.diskreserve when checking a remote. 2015-12-11 13:50:27 -04:00
Joey Hess
c910b4e255
wip 2015-12-11 10:42:18 -04:00
Joey Hess
9dffd3d255
add generalized linkAnnex' 2015-12-10 16:08:19 -04:00
Joey Hess
06a8256bf6
always format pointer file with a trailing newline
Before the smudge filter added a trailing newline, but other things that
wrote formatPointer to a file did not.

also some new pointer staging code to use later
2015-12-10 16:06:58 -04:00
Joey Hess
f80a3d8cd0
check InodeCache in inAnnex et al
This avoids querying the database when the content file doen't exist
(or otherwise fails the provided check). However, it does add overhead of
querying the database, and will certianly impact performance.
2015-12-10 14:51:04 -04:00
Joey Hess
2b8f6b8b2f
check inode cache in prepSendAnnex
This does mean one query of the database every time an object is sent.
May impact performance.
2015-12-10 14:50:52 -04:00
Joey Hess
3b2a7f216d
move 2015-12-10 14:20:38 -04:00
Joey Hess
3719d1b390
make clear when code is using deprecated direct mode files 2015-12-09 19:43:15 -04:00
Joey Hess
aa88851ec1
reorder 2015-12-09 19:38:37 -04:00
Joey Hess
ce73a96e4e
use InodeCache when dropping a key to see if a pointer file can be safely reset
The Keys database can hold multiple inode caches for a given key. One for
the annex object, and one for each pointer file, which may not be hard
linked to it.

Inode caches for a key are recorded when its content is added to the annex,
but only if it has known pointer files. This is to avoid the overhead of
maintaining the database when not needed.

When the smudge filter outputs a file's content, the inode cache is not
updated, because git's smudge interface doesn't let us write the file. So,
dropping will fall back to doing an expensive verification then. Ideally,
git's interface would be improved, and then the inode cache could be
updated then too.
2015-12-09 17:54:54 -04:00
Joey Hess
5e8c628d2e
add inode cache to the db
Renamed the db to keys, since it is various info about a Keys.

Dropping a key will update its pointer files, as long as their content can
be verified to be unmodified. This falls back to checksum verification, but
I want it to use an InodeCache of the key, for speed. But, I have not made
anything populate that cache yet.
2015-12-09 17:00:37 -04:00