Commit graph

34172 commits

Author SHA1 Message Date
Joey Hess
e412129523
concurrency and status messages when downloading from import 2019-03-08 12:33:44 -04:00
Joey Hess
ee5f1422df
remove debug print 2019-03-07 16:08:58 -04:00
Joey Hess
e3a704224f
fix export db locking deadlock 2019-03-07 16:06:02 -04:00
Joey Hess
7e35c81ada
locking problem 2019-03-07 15:22:23 -04:00
Joey Hess
4efd431136
remove obsolete TODO
updateExportDb runs addExportedLocation
2019-03-07 15:11:24 -04:00
Joey Hess
9a72785307
fixes to export db lookup when accessing importtree=yes
Now in a fresh clone with an importtree=yes remote enabled,
git annex fsck --from the remote works.
2019-03-07 14:10:56 -04:00
Joey Hess
93025dd59f
add missing locking of ContentIdentifier database when writing
This is not super efficient; it would be better to lock the database
once and build up a queue of changes and flush once.

But, storeExportWithContentIdentifier is likely going to be the really
expensive part, so let's do the simple thing and only optimise later if
needed.
2019-03-07 13:32:33 -04:00
Joey Hess
3f449f845e
update 2019-03-07 13:28:18 -04:00
Joey Hess
50797ee2c5
remove obsolete comment 2019-03-07 13:02:46 -04:00
Joey Hess
71fec9060c
move 2019-03-07 12:56:40 -04:00
Joey Hess
68d1661251
cross-repo import now working correctly 2019-03-07 12:31:35 -04:00
Joey Hess
ee251b2e2e
implement updating the ContentIdentifier db with info from the git-annex branch
untested

This won't be super slow, but it does need to diff two likely large
trees, and since the git-annex branch rarely sits still, it will most
likely be run at the beginning of every import.

A possible speed improvement would be to only run this when the database
did not contain a ContentIdentifier. But that would only speed up
imports when there is no new version of a file on the special remote,
at most renames of existing files being imported.

A better speed improvement would be to record something in the git-annex
branch that indicates when an import has been run, and only do the diff
if the git-annex branch has record of a newer import than we've seen
before. Then, it would only run when there is in fact new
ContentIdentifier information available from a remote. Certainly doable,
but didn't want to complicate things yet.
2019-03-06 18:04:30 -04:00
Joey Hess
12e4906657
refactor
locationLogFileKey had an out of date list of toplevel log files to skip
over, and was only not broken because the other toplevel log files don't
look like keys. Fixed that too.
2019-03-06 17:54:29 -04:00
Joey Hess
dec30d2b14
updates
Note that I tried an evil remote that lists ImportLocations with
../../../ in them and indeed this resulted in git blowing up and the
import failing, and not writing outside the repo.
2019-03-06 17:07:36 -04:00
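The commit above reports that git itself rejected ImportLocations containing ../../../. As an illustration only, the kind of check that catches such paths before they reach git might look like the sketch below. The function name stays_inside_repo is hypothetical and is not part of git-annex; the sketch assumes ImportLocations are POSIX-style relative paths.

```python
import posixpath


def stays_inside_repo(import_location: str) -> bool:
    """Return True if a relative POSIX path cannot escape the repo root.

    Hypothetical helper for illustration; not git-annex's actual code.
    """
    # Absolute paths point outside the repository by definition.
    if posixpath.isabs(import_location):
        return False
    # normpath collapses "a/../b" to "b" but keeps leading ".." components,
    # so any path that climbs above the root still starts with "..".
    normalized = posixpath.normpath(import_location)
    return normalized != ".." and not normalized.startswith("../")
```

For example, stays_inside_repo("../../../etc/passwd") is False, while stays_inside_repo("a/b/../c") is True.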
Joey Hess
8e9713b769
add export+import test case 2019-03-06 16:49:33 -04:00
Joey Hess
3b412aaae0
simplify Applicative instance 2019-03-06 16:44:17 -04:00
Joey Hess
1b3d04979e
speed up slow quickcheck test
Only test parsing of ContentIdentifier lists, not the whole log.
2019-03-06 16:43:41 -04:00
Joey Hess
c6c5f6336b
avoid whitespace in Arbitrary UUID and empty UUID 2019-03-06 15:44:27 -04:00
Joey Hess
6ef38df881
fix another parser bug 2019-03-06 14:51:31 -04:00
Joey Hess
b3d30e7d70
remove unnecessary locking of ContentIdentifier db
Remote.Helper.ExportImport only reads from it, and locking is only
needed when writing.
2019-03-06 14:36:57 -04:00
Joey Hess
c0bd202147
fix failing test case
An empty list of [ContentIdentifier] serialized to the same thing
as a single ContentIdentifier "". Avoid this ambiguity by requiring the
list be non-empty.
2019-03-06 14:27:15 -04:00
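The ambiguity described in the commit above can be reproduced with any delimiter-joined encoding. The sketch below is a minimal illustration, not the git-annex log format: the ":" delimiter and the function names are assumptions made for the example.

```python
from typing import List

DELIM = ":"  # assumed delimiter, for illustration only


def serialize(ids: List[str]) -> str:
    """Naive encoding: join the identifiers with a delimiter."""
    return DELIM.join(ids)


def serialize_nonempty(ids: List[str]) -> str:
    """Same encoding, but refuse the empty list to remove the ambiguity."""
    if not ids:
        raise ValueError("list of content identifiers must be non-empty")
    return DELIM.join(ids)


# Both the empty list and a list holding one empty identifier
# encode to the empty string, so a parser cannot tell them apart.
assert serialize([]) == serialize([""]) == ""
```

Requiring a non-empty list means the empty string can only ever decode to a single empty identifier.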
Joey Hess
be6085cfe5
fix option parser
Alternative doesn't combine the subparsers the way I wanted.
Unfortunately this new parser has suboptimal usage because everything is
all jumbled together.
2019-03-06 13:10:29 -04:00
Joey Hess
b0fe4916b7
forgot to change list delimiter in parser 2019-03-06 11:59:42 -04:00
Joey Hess
f957f64278
add todo 2019-03-06 11:24:06 -04:00
Joey Hess
f85f06aae3
change to more efficient IKey 2019-03-06 11:14:33 -04:00
Joey Hess
0db393d82f
add bug 2019-03-05 17:19:26 -04:00
Joey Hess
b23c301820
fix false positive from checkPresentExportWithContentIdentifierM when file does not exist 2019-03-05 17:04:00 -04:00
Joey Hess
5767b1b00d
avoid updating tracking branch when transfer to export throws exception 2019-03-05 16:51:13 -04:00
Joey Hess
dc278c059c
fix STM crash
git-annex: thread blocked indefinitely in an STM transaction
failed

git-annex: sqlite query crashed
CallStack (from HasCallStack):
  error, called at ./Database/Handle.hs:98:42 in main:Database.Handle
failed

This needs further investigation.
2019-03-05 16:37:40 -04:00
Joey Hess
46d33e804a
added checkPresentExportWithContentIdentifier
Ugh, don't like needing to add this, but I can't see a way around it.
2019-03-05 16:03:03 -04:00
Joey Hess
3c652e1499
limit to requested remote 2019-03-05 15:56:28 -04:00
Joey Hess
354aafce1a
refactor database handle code
Use same, simpler method to make only one thread open the export db as
is used for the ContentIdentifier db.

And, always update the export db once before using.
2019-03-05 15:42:39 -04:00
Joey Hess
fd2a1aaa17
avoid using renameExport on import remotes 2019-03-05 14:57:48 -04:00
Joey Hess
9df9a3f82b
more todo 2019-03-05 14:55:22 -04:00
Joey Hess
f7be2e9d37
document how conflicts are handled with imports 2019-03-05 14:55:06 -04:00
Joey Hess
bec66258a8
minor 2019-03-05 14:50:39 -04:00
Joey Hess
8c54604e67
import+export from directory special remote fully working
Had to add two more API calls to override export APIs that are not safe
for use in combination with import.

It's unfortunate that removeExportDirectory is documented to be allowed
to remove non-empty directories. I'm not entirely sure why it's that
way, my best guess is it was intended to make it easy to implement with
just rm -rf.
2019-03-05 14:20:14 -04:00
Joey Hess
554b7b7f3e
fix todo 2019-03-04 18:20:12 -04:00
Joey Hess
bc509143e5
avoid opening export db until needed
Before, it was opened when constructing the export Remote, even if it
never got used.
2019-03-04 18:11:32 -04:00
Joey Hess
cd3a2b023a
initial try at using storeExportWithContentIdentifier
Untested, and I'm not sure about the locking of the ContentIdentifier db.
2019-03-04 17:50:41 -04:00
Joey Hess
b67fa2180e
add getExportedKey
Not optimised because that would need transition code to be written for
existing export databases.
2019-03-04 17:27:10 -04:00
Joey Hess
138d07eb97
add getContentIdentifiers
Changed the database schema for this, with a new index.
2019-03-04 16:48:07 -04:00
Joey Hess
00722ba1f8
lock before writing to the ContentIdentifier db 2019-03-04 16:47:30 -04:00
Joey Hess
aaacf431d8
handle importtree=yes config
For now, it's only allowed when exporttree=yes is also set.
That simplified the implementation, but could later be changed if
there's a remote that makes sense to be an import but not an export.
However, it may work just as well to make a remote be readonly to
prevent export to it while still allowing import.
2019-03-04 16:07:35 -04:00
Joey Hess
5f17a9cc50
docs for importtree config 2019-03-04 15:39:19 -04:00
Joey Hess
88ccfaa78c
storeExportWithContentIdentifierM for directory special remote
Not sure if my reasoning about the races really holds.

It would certainly be possible to better guard against races by using
Linux-specific renameat2 with RENAME_EXCHANGE or RENAME_NOREPLACE.

Or by using link and relying on it not overwriting existing files -- but
that would need a filesystem that supports hard links, and the directory
special remote can be used on filesystems that don't.
2019-03-04 14:46:25 -04:00
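The link-based alternative mentioned in the commit above relies on the fact that creating a hard link fails when the destination already exists. A minimal sketch of that idea follows; it is not what the directory special remote does (the commit explains why it was not chosen), and the function name store_without_overwrite is hypothetical.

```python
import os


def store_without_overwrite(tmp_path: str, dest_path: str) -> bool:
    """Move tmp_path to dest_path only if dest_path does not exist yet.

    Illustrative sketch; needs a filesystem that supports hard links.
    """
    try:
        # link() never replaces an existing file; it fails instead,
        # so a concurrent writer's file at dest_path is left untouched.
        os.link(tmp_path, dest_path)
    except FileExistsError:
        return False
    # The content is now reachable at dest_path; drop the temporary name.
    os.unlink(tmp_path)
    return True
```

When the function returns False, the temporary file is left in place so the caller can decide whether to retry or clean up.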
Joey Hess
1ec9e1494c
use relatedTemplate in viaTmp 2019-03-04 14:12:00 -04:00
Joey Hess
3cd19fb4d0
use InodeCache to avoid races in import from directory special remote
This does not avoid all possible races, but it does avoid all likely
ones, and is demonstrably better than git's own handling of races
where files get modified at the same time as it's updating the working
tree.

The main thing this won't detect is not the unlikely race where part
of a file gets changed while it's being copied and then the file is
restored to its original condition before the modification check.
No, it's more likely that the limitations of checking inode, size,
and mtime won't detect certain modifications, involving e.g. mmapped
files.
2019-03-04 13:57:23 -04:00
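The check described in the commit above compares a file's inode, size, and mtime before and after copying it. The sketch below illustrates that idea under stated assumptions; it is not git-annex's InodeCache implementation, and the names InodeSnapshot, snapshot, and copy_if_unmodified are hypothetical.

```python
import os
import shutil
from typing import NamedTuple


class InodeSnapshot(NamedTuple):
    """The three fields the commit message says are compared."""
    inode: int
    size: int
    mtime_ns: int


def snapshot(path: str) -> InodeSnapshot:
    st = os.stat(path)
    return InodeSnapshot(st.st_ino, st.st_size, st.st_mtime_ns)


def copy_if_unmodified(src: str, dest: str) -> bool:
    """Copy src to dest, discarding the copy if src changed meanwhile."""
    before = snapshot(src)
    shutil.copyfile(src, dest)
    after = snapshot(src)
    if before != after:
        # The source was replaced, resized, or touched during the copy,
        # so the copied content cannot be trusted.
        os.remove(dest)
        return False
    return True
```

As the commit message notes, a comparison like this cannot see a change that leaves all three fields intact.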
Joey Hess
51fc969b66
notes 2019-03-01 16:44:34 -04:00
Joey Hess
18d7a1dbbb
make export and sync update special remote tracking branch
The branch is only updated once the export is 100% complete. This way,
if an export is started but interrupted and so the remote does not yet
contain some of the files, an import will make a commit on the old
branch, and so won't delete the missing files.
2019-03-01 16:35:48 -04:00