Commit graph

1616 commits

Author SHA1 Message Date
Joey Hess
1499b9b79d
fix file perms after breaking hard link 2015-12-27 16:12:48 -04:00
Joey Hess
121f5d5b0c
annex.thin
Decided it's too scary to make v6 unlocked files have 1 copy by default,
but that should be available to those who need it. This is consistent with
git-annex not dropping unused content without --force, etc.

* Added annex.thin setting, which makes unlocked files in v6 repositories
  be hard linked to their content, instead of a copy. This saves disk
  space but means any modification of an unlocked file will lose the local
  (and possibly only) copy of the old version.
* Enable annex.thin by default on upgrade from direct mode to v6, since
  direct mode made the same tradeoff.
* fix: Adjusts unlocked files as configured by annex.thin.
2015-12-27 15:59:59 -04:00
Joey Hess
f776ac0a11
add unlocked flag for git-annex-shell recvkey
The direct flag is also set when sending unlocked content, to support old
versions of git-annex-shell. At some point, the direct flag will be
removed, and only the unlocked flag will be used.
2015-12-26 13:59:27 -04:00
Joey Hess
87f0708f88
persistent-sqlite is now a hard build dependency, since v6 repository mode needs it. 2015-12-26 13:00:52 -04:00
Joey Hess
7c02f070b1
lost some bookkeeping info
I forgot to convert this to use Annex.Ingest, todo later.
2015-12-24 13:15:26 -04:00
Joey Hess
39048e4568
Merge branch 'master' into smudge 2015-12-22 18:10:40 -04:00
Joey Hess
d8a8c77a8f
move cleanOldKey into ingest 2015-12-22 16:55:49 -04:00
Joey Hess
4f60234690
finish v6 support for assistant
Seems to basically work now!
2015-12-22 15:23:27 -04:00
Joey Hess
4392140946
make linkAnnex detect when the file changes as it's being copied/linked in
This fixes a race where the modified file ended up in annex/objects, and
the InodeCache stored in the database was for the modified version, so
git-annex didn't know it had gotten modified.

The race could occur when the smudge filter was running; now it gets the
InodeCache before generating the Key, which avoids the race.
2015-12-22 15:20:03 -04:00
Joey Hess
8e9608d7f0
refactoring
no behavior changes
2015-12-22 13:42:58 -04:00
Joey Hess
2dce8081a6
addurl: Added --with-files option. 2015-12-22 12:20:39 -04:00
Joey Hess
03f2ae0423
refactor 2015-12-22 11:58:59 -04:00
Joey Hess
4cf9efb51a
remove (v6) associated file in unannex 2015-12-21 18:00:48 -04:00
Joey Hess
d82b110da8
Merge branch 'master' into smudge 2015-12-21 17:12:46 -04:00
Joey Hess
a8b398c1fa
addurl: Added --batch option. 2015-12-21 12:57:13 -04:00
Joey Hess
35827e2705
status: On crippled filesystems, was displaying M for all annexed files that were present. Probably caused by a change to what git status displays in this situation. Fixed by treating files git thinks are modified the same as typechanged files. 2015-12-19 13:36:40 -04:00
Joey Hess
6b717032c5
v6: fix locking modified file when the content is not present 2015-12-16 15:35:42 -04:00
Joey Hess
2d343224dc
fix add of file that was locked but has been replaced by a new, unlocked file (v6) 2015-12-16 14:53:41 -04:00
Joey Hess
7d0e79b9e1
Use git-annex init --version=6 to get v6 for now
Not ready to make it default because of the direct mode upgrade needing to
all happen at once.
2015-12-15 17:17:13 -04:00
Joey Hess
b9588fe69e
in v6 mode, unannex does not interact badly with pre-commit hook
So can be used in a tree with staged changes, no problems. Much nicer.
2015-12-15 16:18:39 -04:00
Joey Hess
99f1d7991d
recent fsck changes caused ugly message when object was not present 2015-12-15 16:10:48 -04:00
Joey Hess
cdd27b8920
reorg 2015-12-15 15:34:28 -04:00
Joey Hess
0ddcaae9c1
changes for v6 broke fsck in direct mode 2015-12-15 14:27:20 -04:00
Joey Hess
8a660a7b14
add: In v6 mode, acts on modified files.
Same as was done in direct mode, except in v6 mode add always adds files
locked, so
2015-12-15 14:17:00 -04:00
Joey Hess
d245a80518
avoid pre-commit check having to do with v5 unlocked files when in v6 mode 2015-12-15 14:09:36 -04:00
Joey Hess
a983a3a7a2
rename stuff for v5 unlocked files to indicate it's old 2015-12-15 14:08:07 -04:00
Joey Hess
a4a813fb07
add: no need to make pass for old unlocked files in v6 2015-12-15 14:03:25 -04:00
Joey Hess
71e2050f8f
have clean filter check if the filename was already in use by an old key
The annex object for it may have been modified due to hard link, and
that should be cleaned up when the new version is added. If another
associated file has the old key's content, that's linked into the annex
object. Otherwise, update location log to reflect that content has been
lost.
2015-12-15 13:06:52 -04:00
Joey Hess
42caf42857
avoid smudge filter returning invalid content
1. git add file
2. git commit
3. modify file
4. git commit
5. git reset HEAD^

Before this fix, that resulted in git saying the file was modified. And
indeed, it didn't have the content it should in the just checked out ref,
because step 3 modified the object file for the old key.
2015-12-11 18:01:50 -04:00
Joey Hess
e7183d83d3
fsck for v6 unlocked files
This only adds 1 stat to each file fscked for locked files, so
added overhead is minimal.

For unlocked files it has to access the database to see if a file
is modified.
2015-12-11 16:07:54 -04:00
Joey Hess
7790e059b2
finish v6 git-annex lock
This was a doozy!
2015-12-11 15:28:34 -04:00
Joey Hess
50e83b606c
only make 1 hardlink max between pointer file and annex object
If multiple files point to the same annex object, the user may want to
modify them independently, so don't use a hard link.

Also, check diskreserve when copying.
2015-12-11 14:00:21 -04:00
Joey Hess
c608a752a5
Merge branch 'master' into smudge 2015-12-11 13:50:31 -04:00
Joey Hess
abd66c7089
fsck: Failed to honor annex.diskreserve when checking a remote. 2015-12-11 13:50:27 -04:00
Joey Hess
c910b4e255
wip 2015-12-11 10:42:18 -04:00
Joey Hess
e2c8dc6778
v6 git-annex unlock
Note that the implementation uses replaceFile, so that the actual
replacement of the work tree file is atomic. This seems a good property to
have!

It would be possible for unlock in v6 mode to be run on files that do not
have their content present. However, that would be a behavior change from
before, and I don't see any immediate need to support it, so I didn't
implement it.
2015-12-10 16:12:48 -04:00
Joey Hess
06a8256bf6
always format pointer file with a trailing newline
Before the smudge filter added a trailing newline, but other things that
wrote formatPointer to a file did not.

also some new pointer staging code to use later
2015-12-10 16:06:58 -04:00
Joey Hess
ce73a96e4e
use InodeCache when dropping a key to see if a pointer file can be safely reset
The Keys database can hold multiple inode caches for a given key. One for
the annex object, and one for each pointer file, which may not be hard
linked to it.

Inode caches for a key are recorded when its content is added to the annex,
but only if it has known pointer files. This is to avoid the overhead of
maintaining the database when not needed.

When the smudge filter outputs a file's content, the inode cache is not
updated, because git's smudge interface doesn't let us write the file. So,
dropping will fall back to doing an expensive verification then. Ideally,
git's interface would be improved, and then the inode cache could be
updated then too.
2015-12-09 17:54:54 -04:00
Joey Hess
5e8c628d2e
add inode cache to the db
Renamed the db to keys, since it is various info about a Keys.

Dropping a key will update its pointer files, as long as their content can
be verified to be unmodified. This falls back to checksum verification, but
I want it to use an InodeCache of the key, for speed. But, I have not made
anything populate that cache yet.
2015-12-09 17:00:37 -04:00
Joey Hess
3311c48631
move InodeSentinal from direct mode code to its own module
Will be used outside of direct mode for v6 unlocked files, and is already
used outside of direct mode when adding files to annex.
2015-12-09 15:52:11 -04:00
Joey Hess
ba39f993f5
avoid clean filter trying to annex a pointer file 2015-12-09 15:24:32 -04:00
Joey Hess
751120c171
avoid pre-commit hook messing up new-style unlocked files in v6 repo 2015-12-09 15:18:54 -04:00
Joey Hess
05b598a057
stash DbHandle in Annex state 2015-12-09 14:55:47 -04:00
Joey Hess
78a6b8ce05
refactor and improve pointer file handling code 2015-12-09 14:27:43 -04:00
Joey Hess
712c9fc590
require "annex/objects/" before key in pointer files
This removes ambiguity, because while someone might have "WORM--foo" in a
file that's not intended to be a git-annex pointer file,
"annex/objects/WORM--foo" is less likely.

Also, 664cc987e8 had a caveat about symlink
targets being parsed as pointer files, and now the same parser is used for
both.

I did not include any hash directories before the key in the pointer file,
as they're not needed. However, if they were included, the parser would
still work ok.
2015-12-07 15:45:08 -04:00
Joey Hess
664cc987e8
support pointer files
Backend.lookupFile is changed to always fall back to catKey when
operating on a file that's not a symlink.

catKey is changed to understand pointer files, as well as annex symlinks.

Before, catKey needed a file mode witness, to be sure it was looking at a
symlink. That was complicated stuff. Now, it doesn't actually care if a
file in git is a symlink or not; in either case asking git for the content
of the file will get the pointer to the key.

This does mean that git-annex will treat a link
foo -> WORM--bar as a git-annex file, and also treats
a regular file containing annex/objects/WORM--bar as a git-annex file.

Calling catKey could make git-annex commands need to do more work than
before. This would especially be the case if a repo contained many regular
files, and only a few annexed files, as now git-annex will need to ask
git about the contents of the regular files.
2015-12-07 15:35:36 -04:00
Joey Hess
2cbcb4f1a8
update associated files database on smudge and clean 2015-12-07 14:41:22 -04:00
Joey Hess
fb6ebdaae7
refactor 2015-12-04 17:18:26 -04:00
Joey Hess
e8ca01cbc0
comments 2015-12-04 16:46:00 -04:00
Joey Hess
e7f75b079d
don't let git-annex direct be run in a v6 repo 2015-12-04 16:33:09 -04:00
Joey Hess
ccc49861ca
add v6; keep v5 working for now and manual upgrade
Since all places where a repo is used in direct mode need to have git-annex
upgraded before the repo can safely be converted to v6, the upgrade needs
to be manual for now.

I suppose that at some point I'll want to drop all the direct mode support
code. At that point, will stop supporting v5, and will need to auto-upgrade
any remaining v5 repos. If possible, I'd like to carry the direct mode
support for say, a year or so, to give people plenty of time to upgrade and
avoid disruption.
2015-12-04 16:14:48 -04:00
Joey Hess
723e4e31a1
merge clean into smudge command
The git filter config can be used to map the single git-annex command to
the 2 actions, and this avoids "git annex clean" being used for this thing,
it might have a better use for that name later.
2015-12-04 15:32:47 -04:00
Joey Hess
99b2a524a0
clean filter should update location log when adding new content to annex 2015-12-04 14:20:32 -04:00
Joey Hess
ad06f8ceed
avoid commit and messages for smudge filter 2015-12-04 14:20:22 -04:00
Joey Hess
fdfda7b7bb
annex.largefiles support for clean filter 2015-12-04 14:10:18 -04:00
Joey Hess
d349693269
smudge filter working 2015-12-04 14:03:10 -04:00
Joey Hess
2c6454a2e2
basic clean filter working 2015-12-04 13:39:14 -04:00
Joey Hess
20ca89dfa3
skeleton smudge/clean filters 2015-12-04 13:03:39 -04:00
Joey Hess
37a5e2d419
dropunused: Make more robust when trying to drop an object that has already been dropped.
Before it crashed trying to lock the not-present content and prevented
dropping anything else. Instead, succeed.
2015-12-03 15:58:00 -04:00
Joey Hess
f16e235983
addurl, importfeed: Changed to honor annex.largefiles settings, when the content of the url is downloaded. (Not when using --fast or --relaxed.)
importfeed just calls addurl functions, so inherits this from it.

Note that addurl still generates a temp file, and uses that key to download
the file. It just adds it to the work tree at the end when the file is small.
2015-12-02 15:12:33 -04:00
Joey Hess
dc8099872a
import: Changed to honor annex.largefiles settings. 2015-12-02 14:49:03 -04:00
Joey Hess
c2674308c0
map: Improve display of git remotes with non-ssh urls, including http and gcrypt. 2015-11-18 15:08:55 -04:00
Joey Hess
cecf3894ff
note where map is left in --fast mode 2015-11-18 14:17:52 -04:00
Joey Hess
e97fce35a6
Display progress meter in -J mode when downloading from the web.
Including in addurl, and get --from web, but also in S3 and External
special remotes when a web url is known for content in those remotes.
2015-11-16 21:00:54 -04:00
Joey Hess
4b02af57b6
display a message in the unlikely scenario of fsking a dead repository 2015-11-10 14:44:58 -04:00
Joey Hess
cd7929034a
fsck: When fscking a dead repo, avoid incorrect "fixing location log" message.
keyLocations doesn't return locations in dead repos, but if we're fscking a
dead repo, we want to look at what locations are actually logged for it.
2015-11-10 13:59:04 -04:00
Joey Hess
53db9d0b5c
work around git check-ignore --batch bad exit status bug, and bring back import -J 2015-11-06 15:39:51 -04:00
Joey Hess
7938b87864
add: Fix error recovery rollback to not move the injested file content out of the annex back to the file, because other files may point to that same content. Instead, copy the injected file content out to recover.
That was not a data loss, but it came close!
2015-11-06 15:28:20 -04:00
Joey Hess
8ea594f565
missed adding allowConcurrentOutput here 2015-11-06 13:41:26 -04:00
Joey Hess
362ab39aad
import -J fails at the end, disable util it can be fixed 2015-11-05 18:48:46 -04:00
Joey Hess
7dc90f2225
import: Avoid very ugly error messages when the directory files are imported to is not a directort, but perhaps an annexed file. 2015-11-05 18:46:05 -04:00
Joey Hess
5db7d435e7
-J for add/addurl/import 2015-11-05 18:24:15 -04:00
Joey Hess
c4d45ef83d
drop -Jn 2015-11-04 17:13:20 -04:00
Joey Hess
3d0f41518d
parallel fsck (yes, these changes are all it takes now!) 2015-11-04 16:28:14 -04:00
Joey Hess
c0c595345c
arrange for regional output manager to run when -J is enabled
Commands that want to use it have to run their seek action inside
allowConcurrentOutput. Which seems reasonable; perhaps some future command
will want to support the -J flag but not use regions.

The region state moved from Annex to MessageState. This makes sense
organizationally, and note that some uses of onLocal use a different Annex
state, but pass the MessageState into it, which is what is needed.
2015-11-04 16:22:43 -04:00
Joey Hess
640dba43b6
enableremote: List uuids and descriptions of remotes that can be enabled, and accept either the uuid or the description in leu if the name. 2015-10-26 14:55:40 -04:00
Joey Hess
1f65de4085
improve layout and comment 2015-10-15 15:10:14 -04:00
Joey Hess
fa9333e99f
use action, not sideAction
sideAction is for things not generally related to the current action being
performed. And, it adds a newline after the side action. This was not the
right thing to use for stuff like "checksum", where doing a checksum is
part of the git annex get process, and indeed we want it to display
"(checksum...) ok"
2015-10-11 13:29:44 -04:00
Joey Hess
3b89d5a20c
implement lockContent for ssh remotes 2015-10-09 16:55:41 -04:00
Joey Hess
e392ec112f
also generate a drop safety proof for move --from remote 2015-10-09 16:16:03 -04:00
Joey Hess
b944da832b
tests and verified that the bug is fixed, in all the cases I identified 2015-10-09 15:59:42 -04:00
Joey Hess
6a72045707
fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal 2015-10-09 15:48:02 -04:00
Joey Hess
b021321aae
rename constructor 2015-10-09 15:01:33 -04:00
Joey Hess
45e1a7c361
verify local copy of content with locking 2015-10-09 14:57:32 -04:00
Joey Hess
a5e74e9e64
display drop safety proofs in debug mode 2015-10-09 13:47:19 -04:00
Joey Hess
cf79dffa4c
improve drop proof code 2015-10-09 11:09:46 -04:00
Joey Hess
c75c79864d
support invalidating existing VerifiedCopys 2015-10-08 17:58:32 -04:00
Joey Hess
90f7c4b6a2
add VerifiedCopy data type
There should be no behavior changes in this commit, it just adds a more
expressive data type and adjusts code that had been passing around a [UUID]
or sometimes a Maybe Remote to instead use [VerifiedCopy].

Although, since some functions were taking two different [UUID] lists,
there's some potential for me to have gotten it horribly wrong.
2015-10-08 16:55:11 -04:00
Joey Hess
b1abe59193
add removeKey action to Remote
Not implemented for any remotes yet; probably the git remote is the only
one that will ever implement it.
2015-10-08 15:01:38 -04:00
Joey Hess
5240a9f315
git-annex-shell: Added lockcontent command, to prevent dropping of key's content. 2015-10-08 14:47:46 -04:00
Joey Hess
4d50958ed7
add lockContentShared
Also, rename lockContent to lockContentExclusive

inAnnexSafe should perhaps be eliminated, and instead use
`lockContentShared inAnnex`. However, I'm waiting on that, as there are
only 2 call sites for inAnnexSafe and it's fiddly.
2015-10-08 14:29:35 -04:00
Joey Hess
1ac79e6fe5 copy --auto was checking the wrong repo's preferred content. (--from was checking what --to should, and vice-versa.) Fixed this bug, which was introduced in version 5.20150727. 2015-10-06 17:29:44 -04:00
Joey Hess
60d382a840 avoid using print action, which is reserved for debugging 2015-10-06 15:26:42 -04:00
Joey Hess
2def1d0a23 other 80% of avoding verification when hard linking to objects in shared repo
In c6632ee5c8, it actually only handled
uploading objects to a shared repository. To avoid verification when
downloading objects from a shared repository, was a lot harder.

On the plus side, if the process of downloading a file from a remote
is able to verify its content on the side, the remote can indicate this
now, and avoid the extra post-download verification.

As of yet, I don't have any remotes (except Git) using this ability.
Some more work would be needed to support it in special remotes.

It would make sense for tahoe to implicitly verify things downloaded from it;
as long as you trust your tahoe server (which typically runs locally),
there's cryptographic integrity. OTOH, despite bup being based on shas,
a bup repo under an attacker's control could have the git ref used for an
object changed, and so a bup repo shouldn't implicitly verify. Indeed,
tahoe seems unique in being trustworthy enough to implicitly verify.
2015-10-02 14:35:12 -04:00
Joey Hess
2fb3722ce9 Do verification of checksums of annex objects downloaded from remotes.
* When annex objects are received into git repositories, their checksums are
  verified then too.
* To get the old, faster, behavior of not verifying checksums, set
  annex.verify=false, or remote.<name>.annex-verify=false.
* setkey, rekey: These commands also now verify that the provided file
  matches the key, unless annex.verify=false.
* reinject: Already verified content; this can now be disabled by
  setting annex.verify=false.

recvkey and reinject already did verification, so removed now duplicate
code from them. fsck still does its own verification, which is ok since it
does not use getViaTmp, so verification doesn't happen twice when using fsck
--from.
2015-10-01 15:56:39 -04:00
Joey Hess
b72d3fbeba rename function 2015-10-01 14:18:57 -04:00
Joey Hess
cad3349001 rename fsckKey to verifyKeyContent
No behavior changes.
2015-10-01 13:29:17 -04:00
Joey Hess
f2b6ebd502 status: Show added but not yet committed files.
Seems easy, but git ls-files can't list the right subset of files.
So, I wrote a whole new parser for git status output, and converted the
status command to use that.

There are a few other small behavior changes. The order changed. Unlocked
files show as T. In indirect mode, deleted files were not shown before, and
that's fixed. Regular files checked directly into git and modified
were not shown before, and are now.
2015-09-22 17:32:28 -04:00
Joey Hess
178826c4cb cleanup 2015-09-22 15:55:31 -04:00
Joey Hess
9e48c04d15 info: Don't allow use in a non-git-annex repository, since it uses the git-annex branch and would create it if it were missing.
I made the change to allow in 2014 without any rationalle or associated
request that I can find.
2015-09-16 12:25:43 -04:00