Commit graph

1670 commits

Author SHA1 Message Date
Joey Hess
885e54df0a
fsck: Populate unlocked files in v6 repositories whose content is present in annex/objects but didn't reach the work tree.
This also handles fixing up after cf260d9a15
2016-02-14 17:27:50 -04:00
Joey Hess
675321264f
fsck: Detect and fix missing associated file mappings in v6 repositories.
This also handles fixing up after the bad data written by
cf260d9a15.
2016-02-14 17:09:54 -04:00
Joey Hess
74bbdfa888
files with only 1 linkCount may still be unlocked
When on crippled filesystem, or without annex.thin set.
2016-02-14 17:04:09 -04:00
Joey Hess
5b51db7645
clean up 2016-02-14 16:52:43 -04:00
Joey Hess
d3130930db
avoid --batch crashing if a remote fails to be accessed 2016-02-12 16:48:03 -04:00
Joey Hess
cc4d3e3d45
checkpresentkey: Allow to be run without an explicit remote and add --batch
* checkpresentkey: Allow to be run without an explicit remote.
* checkpresentkey: Added --batch.
2016-02-12 16:43:51 -04:00
Joey Hess
23cc315c38
matchexpression: Added --largefiles option to parse an annex.largefiles expression. 2016-02-03 16:58:36 -04:00
Joey Hess
403b56fb91
Limit annex.largefiles parsing to the subset of preferred content expressions that make sense in its context.
So, not "standard" or "lackingcopies", etc.
2016-02-03 15:04:42 -04:00
Joey Hess
d37fe6a547
annex.largefiles can be configured in .gitattributes too
This is particulary useful for v6 repositories, since the .gitattributes
configuration will apply in all clones of the repository.
2016-02-02 15:18:17 -04:00
Joey Hess
9a4af2324e
Fix reversion in lookupkey, contentlocation, and examinekey which caused them to sometimes output side messages. 2016-01-29 13:20:24 -04:00
Joey Hess
7c1df36d63
annex.addsmallfiles: New option controlling what is done when adding files not matching annex.largefiles. 2016-01-28 14:04:32 -04:00
Gabor Greif
7f13e8fba6
Ord constraint redundant 2016-01-28 12:34:07 -04:00
Joey Hess
039e83ed5d
Fix nasty reversion in the last release that broke sync --content's handling of many preferred content expressions.
The type checker should have noticed this, but the changes to mapM
that make it accept any Traversable hid the fact that it was not being
passed a list at all. Thus, what should have returned an empty list most
of the time instead returned [""] which was treated as the name of the
associated file, with disasterout consequences.

When I have time, I should add a test case checking what sync --content
drops. I should also consider replacing mapM with one re-specialized to
lists.
2016-01-26 14:28:43 -04:00
Joey Hess
f051b51645
remove 3 build flags
* Removed the webapp-secure build flag, rolling it into the webapp build
  flag.
* Removed the quvi and tahoe build flags, which only adds aeson to
  the core dependencies.
* Removed the feed build flag, which only adds feed to the core
  dependencies.

Build flags have cost in both code complexity and also make Setup configure
have to work harder to find a usable set of build flags when some
dependencies are missing.
2016-01-26 08:14:57 -04:00
Joey Hess
d3ba9fe5c8
matchexpression: New plumbing command to check if a preferred content expression matches some data. 2016-01-25 16:16:18 -04:00
Joey Hess
737e45156e
remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Joey Hess
70b8cad9c8
make noMessages disable closing of json object in --json mode
This allows things like Command.Find to use noMessages and generate their
own complete json objects. Previouly, Command.Find managed that only via a
hack, which wasn't compatable with batch mode.

Only Command.Find, Command.Smudge, and Commange.Status use noMessages
currently, and none except for Command.Find are impacted by this change.

Fixes find --json --batch output
2016-01-20 14:10:13 -04:00
Joey Hess
7aac76d40e
remove unused imports 2016-01-20 13:43:58 -04:00
Joey Hess
f1daa83118
remove no longer needed noMessages
All three of these are using batch mode to drive their processing, so
there is no automatic output, and noMessages is no longer needed.
2016-01-20 13:25:22 -04:00
Joey Hess
6d79f9e755
find --batch 2016-01-20 13:04:07 -04:00
Joey Hess
1bd4809bd2
remove excess space 2016-01-20 12:51:22 -04:00
Joey Hess
9b9b5a30e1
whereis --batch 2016-01-20 12:46:00 -04:00
Joey Hess
7d0ece86f6
add --batch 2016-01-19 17:48:42 -04:00
Joey Hess
1f3358512a
refactor 2016-01-19 15:55:32 -04:00
Joey Hess
68edd308af
registerurl: Check if a remote claims the url, same as addurl does. 2016-01-19 15:46:32 -04:00
Joey Hess
80d5feefc7
addurl --json: Include field for added key
(unless the file was added directly to git due to annex.largefiles configuration.)

(Also done by add --json and import --json)
2016-01-19 12:01:00 -04:00
Joey Hess
7b8e79c0f0
add, import: Support --json output.
Include added key in output.
2016-01-19 11:56:38 -04:00
Joey Hess
aa35f5cdf7
info: Support --batch mode. 2016-01-15 15:56:47 -04:00
Joey Hess
b9f921248e
convert existing non-annexed file to non-exception 2016-01-15 14:34:33 -04:00
Joey Hess
b26ce646e4
whereis --json: Urls are now listed inside the remote that claims them, rather than all together at the end. 2016-01-15 14:16:48 -04:00
Joey Hess
1d1cb16fe0
addurl: Refuse to overwrite any existing, non-annexed file. 2016-01-13 15:09:47 -04:00
Joey Hess
1d5b70db9c
addurl: Support --json, particularly useful in --batch mode. 2016-01-13 14:25:30 -04:00
Joey Hess
423fffcd41
change keys database to use IKey type with more efficient serialization
This breaks any existing keys database!

IKey serializes more efficiently than SKey, although this limits the
use of its Read/Show instances.

This makes the keys database use less disk space, and so should be a win.

Updated benchmark:

benchmarking keys database/getAssociatedFiles from 1000 (hit)
time                 64.04 μs   (63.95 μs .. 64.13 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 64.02 μs   (63.96 μs .. 64.08 μs)
std dev              218.2 ns   (172.5 ns .. 299.3 ns)

benchmarking keys database/getAssociatedFiles from 1000 (miss)
time                 52.53 μs   (52.18 μs .. 53.21 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 52.31 μs   (52.18 μs .. 52.91 μs)
std dev              734.6 ns   (206.2 ns .. 1.623 μs)

benchmarking keys database/getAssociatedKey from 1000 (hit)
time                 64.60 μs   (64.46 μs .. 64.77 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 64.74 μs   (64.57 μs .. 65.20 μs)
std dev              900.2 ns   (389.7 ns .. 1.733 μs)

benchmarking keys database/getAssociatedKey from 1000 (miss)
time                 52.46 μs   (52.29 μs .. 52.68 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 52.63 μs   (52.35 μs .. 53.37 μs)
std dev              1.362 μs   (562.7 ns .. 2.608 μs)
variance introduced by outliers: 24% (moderately inflated)

benchmarking keys database/addAssociatedFile to 1000 (old)
time                 487.3 μs   (484.7 μs .. 490.1 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 490.9 μs   (487.8 μs .. 496.5 μs)
std dev              13.95 μs   (6.841 μs .. 22.03 μs)
variance introduced by outliers: 20% (moderately inflated)

benchmarking keys database/addAssociatedFile to 1000 (new)
time                 6.633 ms   (5.741 ms .. 7.751 ms)
                     0.905 R²   (0.850 R² .. 0.965 R²)
mean                 8.252 ms   (7.803 ms .. 8.602 ms)
std dev              1.126 ms   (900.3 μs .. 1.430 ms)
variance introduced by outliers: 72% (severely inflated)

benchmarking keys database/getAssociatedFiles from 10000 (hit)
time                 65.36 μs   (64.71 μs .. 66.37 μs)
                     0.998 R²   (0.995 R² .. 1.000 R²)
mean                 65.28 μs   (64.72 μs .. 66.45 μs)
std dev              2.576 μs   (920.8 ns .. 4.122 μs)
variance introduced by outliers: 42% (moderately inflated)

benchmarking keys database/getAssociatedFiles from 10000 (miss)
time                 52.34 μs   (52.25 μs .. 52.45 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 52.49 μs   (52.42 μs .. 52.59 μs)
std dev              255.4 ns   (205.8 ns .. 312.9 ns)

benchmarking keys database/getAssociatedKey from 10000 (hit)
time                 64.76 μs   (64.67 μs .. 64.84 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 64.67 μs   (64.62 μs .. 64.72 μs)
std dev              177.3 ns   (148.1 ns .. 217.1 ns)

benchmarking keys database/getAssociatedKey from 10000 (miss)
time                 52.75 μs   (52.66 μs .. 52.82 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 52.69 μs   (52.63 μs .. 52.75 μs)
std dev              210.6 ns   (173.7 ns .. 265.9 ns)

benchmarking keys database/addAssociatedFile to 10000 (old)
time                 489.7 μs   (488.7 μs .. 490.7 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 490.4 μs   (489.6 μs .. 492.2 μs)
std dev              3.990 μs   (2.435 μs .. 7.604 μs)

benchmarking keys database/addAssociatedFile to 10000 (new)
time                 9.994 ms   (9.186 ms .. 10.74 ms)
                     0.959 R²   (0.928 R² .. 0.979 R²)
mean                 9.906 ms   (9.343 ms .. 10.40 ms)
std dev              1.384 ms   (1.051 ms .. 2.100 ms)
variance introduced by outliers: 69% (severely inflated)
2016-01-12 14:01:50 -04:00
Joey Hess
015a5e485f
add benchmarks of adding an associated file
benchmarking keys database/addAssociatedFile to 1000 (old)
time                 516.1 μs   (514.7 μs .. 517.4 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 514.0 μs   (512.1 μs .. 515.2 μs)
std dev              4.740 μs   (2.972 μs .. 7.068 μs)

benchmarking keys database/addAssociatedFile to 1000 (new)
time                 5.750 ms   (4.857 ms .. 6.885 ms)
                     0.815 R²   (0.698 R² .. 0.904 R²)
mean                 7.858 ms   (7.311 ms .. 8.421 ms)
std dev              1.684 ms   (1.383 ms .. 2.027 ms)
variance introduced by outliers: 88% (severely inflated)

benchmarking keys database/addAssociatedFile to 10000 (old)
time                 515.7 μs   (514.8 μs .. 516.5 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 515.4 μs   (513.7 μs .. 516.6 μs)
std dev              4.824 μs   (2.957 μs .. 7.167 μs)

benchmarking keys database/addAssociatedFile to 10000 (new)
time                 8.934 ms   (7.779 ms .. 10.05 ms)
                     0.868 R²   (0.751 R² .. 0.934 R²)
mean                 11.51 ms   (10.66 ms .. 12.26 ms)
std dev              2.174 ms   (1.816 ms .. 2.747 ms)
variance introduced by outliers: 82% (severely inflated)
2016-01-12 13:22:31 -04:00
Joey Hess
647b2f33af
refactor 2016-01-12 13:15:15 -04:00
Joey Hess
f9c5aa84e0
add database benchmark
The benchmark shows that the database access is quite fast indeed!
And, it scales linearly to the number of keys, with one exception,
getAssociatedKey.

Based on this benchmark, I don't think I need worry about optimising
for cases where all files are locked and the database is mostly empty.
In those cases, database access will be misses, and according to this
benchmark, should add only 50 milliseconds to runtime.

(NB: There may be some overhead to getting the database opened and locking
the handle that this benchmark doesn't see.)

joey@darkstar:~/src/git-annex>./git-annex benchmark
setting up database with 1000
setting up database with 10000
benchmarking keys database/getAssociatedFiles from 1000 (hit)
time                 62.77 μs   (62.70 μs .. 62.85 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 62.81 μs   (62.76 μs .. 62.88 μs)
std dev              201.6 ns   (157.5 ns .. 259.5 ns)

benchmarking keys database/getAssociatedFiles from 1000 (miss)
time                 50.02 μs   (49.97 μs .. 50.07 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 50.09 μs   (50.04 μs .. 50.17 μs)
std dev              206.7 ns   (133.8 ns .. 295.3 ns)

benchmarking keys database/getAssociatedKey from 1000 (hit)
time                 211.2 μs   (210.5 μs .. 212.3 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 211.0 μs   (210.7 μs .. 212.0 μs)
std dev              1.685 μs   (334.4 ns .. 3.517 μs)

benchmarking keys database/getAssociatedKey from 1000 (miss)
time                 173.5 μs   (172.7 μs .. 174.2 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 173.7 μs   (173.0 μs .. 175.5 μs)
std dev              3.833 μs   (1.858 μs .. 6.617 μs)
variance introduced by outliers: 16% (moderately inflated)

benchmarking keys database/getAssociatedFiles from 10000 (hit)
time                 64.01 μs   (63.84 μs .. 64.18 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 64.85 μs   (64.34 μs .. 66.02 μs)
std dev              2.433 μs   (547.6 ns .. 4.652 μs)
variance introduced by outliers: 40% (moderately inflated)

benchmarking keys database/getAssociatedFiles from 10000 (miss)
time                 50.33 μs   (50.28 μs .. 50.39 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 50.32 μs   (50.26 μs .. 50.38 μs)
std dev              202.7 ns   (167.6 ns .. 252.0 ns)

benchmarking keys database/getAssociatedKey from 10000 (hit)
time                 1.142 ms   (1.139 ms .. 1.146 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.142 ms   (1.140 ms .. 1.144 ms)
std dev              7.142 μs   (4.994 μs .. 10.98 μs)

benchmarking keys database/getAssociatedKey from 10000 (miss)
time                 1.094 ms   (1.092 ms .. 1.096 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 1.095 ms   (1.095 ms .. 1.097 ms)
std dev              4.277 μs   (2.591 μs .. 7.228 μs)
2016-01-12 13:07:03 -04:00
Joey Hess
d6fe7fdd7d
rekey: No longer copies over urls from the old to the new key.
It makes sense for migrate to do that, but not for this low-level (and
little used) plumbing command to.
2016-01-07 18:06:20 -04:00
Joey Hess
4b819bee2b
avoid confusing git with a modified ctime in clean filter
Linking the file to the tmp dir was not necessary in the clean
filter, and it caused the ctime to change, which caused git to think
the file was changed. This caused git status to get slow as it kept
re-cleaning unchanged files.
2016-01-07 17:48:04 -04:00
Joey Hess
3b960d1422
migrate and rekey v6 unlocked file support 2016-01-07 15:14:15 -04:00
Joey Hess
0b59fb423e
migrate: Copy over metadata to new key. 2016-01-07 14:21:12 -04:00
Joey Hess
66f3fb1ce2
unused: deal with v6 unlocked file that is implicitly ingested by git diff etc 2016-01-06 22:11:21 -04:00
Joey Hess
2e071a09b7
cleanup 2016-01-06 20:41:25 -04:00
Joey Hess
3320870bad
optimise
03cb2c8ece put a cat-file into the fast
bloomfilter generation path. Instead, add another bloom filter which diffs
from the work tree to the index.

Also, pull the sha of the changed object out of the diffs, and cat that
object directly, rather than indirecting through the filename.

Finally, removed some hacks that are unncessary thanks to the worktree to
index diff.
2016-01-06 20:38:02 -04:00
Joey Hess
b26776d92f
fix parsing of v6 unlocked file
The newline broke this ad-hoc parser; use the normal one.
2016-01-06 17:46:46 -04:00
Joey Hess
03cb2c8ece
unused: Bug fix when a new file was added to the annex, and then removed (but not git rmed). git still has the add staged in this case, so the content should not be unused and was wrongly treated as such.
So, we need to look at both the file on disk to see if it's a annex link,
and the file in the index too. lookupFile doesn't look in the index if the file
is not present on disk.
2016-01-06 16:49:41 -04:00
Joey Hess
0c1cc7789f
fix test failure locking an unlocked not present file
In v5, that was not possible, but it is in v6, and so the test was failing.

Investigating, it turns out that locking was copying the pointer file
content to the annex object despite the content not being present. So,
add a check to prevent that.
2016-01-06 16:01:52 -04:00
Joey Hess
b96cfdc094
whereis --json: Make url list be included in machine-parseable form. 2016-01-06 12:33:32 -04:00
Joey Hess
b3d60ca285
use TopFilePath for associated files
Fixes several bugs with updates of pointer files. When eg, running
git annex drop --from localremote
it was updating the pointer file in the local repository, not the remote.
Also, fixes drop ../foo when run in a subdir, and probably lots of other
problems. Test suite drops from ~30 to 11 failures now.

TopFilePath is used to force thinking about what the filepath is relative
to.

The data stored in the sqlite db is still just a plain string, and
TopFilePath is a newtype, so there's no overhead involved in using it in
DataBase.Keys.
2016-01-05 17:22:19 -04:00
Joey Hess
121659576b
info --json: Improve json for "backend usage", using a nested object with fields for each backend instead of the previous weird nested lists. This may break existing parsers of this json output, if there were any. 2016-01-01 16:33:05 -04:00
Joey Hess
09a2fcb643
info: Fix "backend usage" numbers, which were counting present keys twice.
Let's just count the referenced keys for that, and not present keys at all.
2016-01-01 16:13:16 -04:00
Joey Hess
43b1333216
switch to using main ingest code
Fixes at least one bug, in populating existing worktree files that use the
same key that's ingested.
2016-01-01 14:16:40 -04:00
Joey Hess
a2c056df65
convert isPointerFile from Annex to IO 2016-01-01 13:22:38 -04:00
Joey Hess
996ae9b172
don't disable smudge filter while merging
The smudge filter does need to be run, because if the key is in the local
annex already (due to renaming, or a copy of a file added, or a new file
added and its content has already arrived), git merge smudges the file and
this should provide its content.

This does probably mean that in merge conflict resolution, git smudges the
existing file, re-copying all its content to it, and then the file is
deleted. So, not efficient.
2015-12-29 16:36:21 -04:00
Joey Hess
b6b34f4916
automatic conflict resolution for v6 unlocked files
Several tricky parts:

* When the conflict is just between the same key being locked and unlocked,
  the unlocked version wins, and the file is not renamed in this case.

* Need to update associated file map when conflict resolution renames
  an unlocked file.

* git merge runs the smudge filter on the conflicting file, and actually
  overwrites the file with the same content it had before, and so
  invalidates its inode cache. This makes it difficult to know when it's
  safe to remove such files as conflict cruft, without going so far as to
  compare their entire contents.

  Dealt with this by preventing the smudge filter from populating the file
  when a merge is run. However, that also prevents the smudge filter being
  run for non-conflicting files, so eg moving a file won't put its new
  content into place.

* Ideally, if a merge or a merge conflict resolution renames an unlocked
  file, the file in the work tree can just be moved, rather than copying
  the content to a new worktree file.

  This is attempted to be done in merge conflict resolution, but
  due to git merge's behavior of running smudge filters, what actually
  seems to happen is the old worktree file with the content is deleted and
  rewritten as a pointer file, so doesn't get reused.

So, this is probably not as efficient as it optimally could be.
If that becomes a problem, could look into running the merge in a separate
worktree and updating the real worktree more efficiently, similarly to the
direct mode merge. However, the direct mode merge had a lot of bugs, and
I'd rather not use that more error-prone method unless really needed.
2015-12-29 15:41:09 -04:00
Joey Hess
1499b9b79d
fix file perms after breaking hard link 2015-12-27 16:12:48 -04:00
Joey Hess
121f5d5b0c
annex.thin
Decided it's too scary to make v6 unlocked files have 1 copy by default,
but that should be available to those who need it. This is consistent with
git-annex not dropping unused content without --force, etc.

* Added annex.thin setting, which makes unlocked files in v6 repositories
  be hard linked to their content, instead of a copy. This saves disk
  space but means any modification of an unlocked file will lose the local
  (and possibly only) copy of the old version.
* Enable annex.thin by default on upgrade from direct mode to v6, since
  direct mode made the same tradeoff.
* fix: Adjusts unlocked files as configured by annex.thin.
2015-12-27 15:59:59 -04:00
Joey Hess
f776ac0a11
add unlocked flag for git-annex-shell recvkey
The direct flag is also set when sending unlocked content, to support old
versions of git-annex-shell. At some point, the direct flag will be
removed, and only the unlocked flag will be used.
2015-12-26 13:59:27 -04:00
Joey Hess
87f0708f88
persistent-sqlite is now a hard build dependency, since v6 repository mode needs it. 2015-12-26 13:00:52 -04:00
Joey Hess
7c02f070b1
lost some bookkeeping info
I forgot to convert this to use Annex.Ingest, todo later.
2015-12-24 13:15:26 -04:00
Joey Hess
39048e4568
Merge branch 'master' into smudge 2015-12-22 18:10:40 -04:00
Joey Hess
d8a8c77a8f
move cleanOldKey into ingest 2015-12-22 16:55:49 -04:00
Joey Hess
4f60234690
finish v6 support for assistant
Seems to basically work now!
2015-12-22 15:23:27 -04:00
Joey Hess
4392140946
make linkAnnex detect when the file changes as it's being copied/linked in
This fixes a race where the modified file ended up in annex/objects, and
the InodeCache stored in the database was for the modified version, so
git-annex didn't know it had gotten modified.

The race could occur when the smudge filter was running; now it gets the
InodeCache before generating the Key, which avoids the race.
2015-12-22 15:20:03 -04:00
Joey Hess
8e9608d7f0
refactoring
no behavior changes
2015-12-22 13:42:58 -04:00
Joey Hess
2dce8081a6
addurl: Added --with-files option. 2015-12-22 12:20:39 -04:00
Joey Hess
03f2ae0423
refactor 2015-12-22 11:58:59 -04:00
Joey Hess
4cf9efb51a
remove (v6) associated file in unannex 2015-12-21 18:00:48 -04:00
Joey Hess
d82b110da8
Merge branch 'master' into smudge 2015-12-21 17:12:46 -04:00
Joey Hess
a8b398c1fa
addurl: Added --batch option. 2015-12-21 12:57:13 -04:00
Joey Hess
35827e2705
status: On crippled filesystems, was displaying M for all annexed files that were present. Probably caused by a change to what git status displays in this situation. Fixed by treating files git thinks are modified the same as typechanged files. 2015-12-19 13:36:40 -04:00
Joey Hess
6b717032c5
v6: fix locking modified file when the content is not present 2015-12-16 15:35:42 -04:00
Joey Hess
2d343224dc
fix add of file that was locked but has been replaced by a new, unlocked file (v6) 2015-12-16 14:53:41 -04:00
Joey Hess
7d0e79b9e1
Use git-annex init --version=6 to get v6 for now
Not ready to make it default because of the direct mode upgrade needing to
all happen at once.
2015-12-15 17:17:13 -04:00
Joey Hess
b9588fe69e
in v6 mode, unannex does not interact badly with pre-commit hook
So can be used in a tree with staged changes, no problems. Much nicer.
2015-12-15 16:18:39 -04:00
Joey Hess
99f1d7991d
recent fsck changes caused ugly message when object was not present 2015-12-15 16:10:48 -04:00
Joey Hess
cdd27b8920
reorg 2015-12-15 15:34:28 -04:00
Joey Hess
0ddcaae9c1
changes for v6 broke fsck in direct mode 2015-12-15 14:27:20 -04:00
Joey Hess
8a660a7b14
add: In v6 mode, acts on modified files.
Same as was done in direct mode, except in v6 mode add always adds files
locked, so
2015-12-15 14:17:00 -04:00
Joey Hess
d245a80518
avoid pre-commit check having to do with v5 unlocked files when in v6 mode 2015-12-15 14:09:36 -04:00
Joey Hess
a983a3a7a2
rename stuff for v5 unlocked files to indicate it's old 2015-12-15 14:08:07 -04:00
Joey Hess
a4a813fb07
add: no need to make pass for old unlocked files in v6 2015-12-15 14:03:25 -04:00
Joey Hess
71e2050f8f
have clean filter check if the filename was already in use by an old key
The annex object for it may have been modified due to hard link, and
that should be cleaned up when the new version is added. If another
associated file has the old key's content, that's linked into the annex
object. Otherwise, update location log to reflect that content has been
lost.
2015-12-15 13:06:52 -04:00
Joey Hess
42caf42857
avoid smudge filter returning invalid content
1. git add file
2. git commit
3. modify file
4. git commit
5. git reset HEAD^

Before this fix, that resulted in git saying the file was modified. And
indeed, it didn't have the content it should in the just checked out ref,
because step 3 modified the object file for the old key.
2015-12-11 18:01:50 -04:00
Joey Hess
e7183d83d3
fsck for v6 unlocked files
This only adds 1 stat to each file fscked for locked files, so
added overhead is minimal.

For unlocked files it has to access the database to see if a file
is modified.
2015-12-11 16:07:54 -04:00
Joey Hess
7790e059b2
finish v6 git-annex lock
This was a doozy!
2015-12-11 15:28:34 -04:00
Joey Hess
50e83b606c
only make 1 hardlink max between pointer file and annex object
If multiple files point to the same annex object, the user may want to
modify them independently, so don't use a hard link.

Also, check diskreserve when copying.
2015-12-11 14:00:21 -04:00
Joey Hess
c608a752a5
Merge branch 'master' into smudge 2015-12-11 13:50:31 -04:00
Joey Hess
abd66c7089
fsck: Failed to honor annex.diskreserve when checking a remote. 2015-12-11 13:50:27 -04:00
Joey Hess
c910b4e255
wip 2015-12-11 10:42:18 -04:00
Joey Hess
e2c8dc6778
v6 git-annex unlock
Note that the implementation uses replaceFile, so that the actual
replacement of the work tree file is atomic. This seems a good property to
have!

It would be possible for unlock in v6 mode to be run on files that do not
have their content present. However, that would be a behavior change from
before, and I don't see any immediate need to support it, so I didn't
implement it.
2015-12-10 16:12:48 -04:00
Joey Hess
06a8256bf6
always format pointer file with a trailing newline
Before the smudge filter added a trailing newline, but other things that
wrote formatPointer to a file did not.

also some new pointer staging code to use later
2015-12-10 16:06:58 -04:00
Joey Hess
ce73a96e4e
use InodeCache when dropping a key to see if a pointer file can be safely reset
The Keys database can hold multiple inode caches for a given key. One for
the annex object, and one for each pointer file, which may not be hard
linked to it.

Inode caches for a key are recorded when its content is added to the annex,
but only if it has known pointer files. This is to avoid the overhead of
maintaining the database when not needed.

When the smudge filter outputs a file's content, the inode cache is not
updated, because git's smudge interface doesn't let us write the file. So,
dropping will fall back to doing an expensive verification then. Ideally,
git's interface would be improved, and then the inode cache could be
updated then too.
2015-12-09 17:54:54 -04:00
Joey Hess
5e8c628d2e
add inode cache to the db
Renamed the db to keys, since it is various info about a Keys.

Dropping a key will update its pointer files, as long as their content can
be verified to be unmodified. This falls back to checksum verification, but
I want it to use an InodeCache of the key, for speed. But, I have not made
anything populate that cache yet.
2015-12-09 17:00:37 -04:00
Joey Hess
3311c48631
move InodeSentinal from direct mode code to its own module
Will be used outside of direct mode for v6 unlocked files, and is already
used outside of direct mode when adding files to annex.
2015-12-09 15:52:11 -04:00
Joey Hess
ba39f993f5
avoid clean filter trying to annex a pointer file 2015-12-09 15:24:32 -04:00
Joey Hess
751120c171
avoid pre-commit hook messing up new-style unlocked files in v6 repo 2015-12-09 15:18:54 -04:00
Joey Hess
05b598a057
stash DbHandle in Annex state 2015-12-09 14:55:47 -04:00
Joey Hess
78a6b8ce05
refactor and improve pointer file handling code 2015-12-09 14:27:43 -04:00
Joey Hess
712c9fc590
require "annex/objects/" before key in pointer files
This removes ambiguity, because while someone might have "WORM--foo" in a
file that's not intended to be a git-annex pointer file,
"annex/objects/WORM--foo" is less likely.

Also, 664cc987e8 had a caveat about symlink
targets being parsed as pointer files, and now the same parser is used for
both.

I did not include any hash directories before the key in the pointer file,
as they're not needed. However, if they were included, the parser would
still work ok.
2015-12-07 15:45:08 -04:00
Joey Hess
664cc987e8
support pointer files
Backend.lookupFile is changed to always fall back to catKey when
operating on a file that's not a symlink.

catKey is changed to understand pointer files, as well as annex symlinks.

Before, catKey needed a file mode witness, to be sure it was looking at a
symlink. That was complicated stuff. Now, it doesn't actually care if a
file in git is a symlink or not; in either case asking git for the content
of the file will get the pointer to the key.

This does mean that git-annex will treat a link
foo -> WORM--bar as a git-annex file, and also treats
a regular file containing annex/objects/WORM--bar as a git-annex file.

Calling catKey could make git-annex commands need to do more work than
before. This would especially be the case if a repo contained many regular
files, and only a few annexed files, as now git-annex will need to ask
git about the contents of the regular files.
2015-12-07 15:35:36 -04:00