git-annex

Author	SHA1	Message	Date
Joey Hess	04d4830ac3	add catCommit	2016-02-25 15:34:46 -04:00
Joey Hess	be2e9427ad	refactor	2016-02-25 13:46:31 -04:00
Joey Hess	a5bf674bec	Avoid crashing when built with MagicMime support, but when the magic database cannot be loaded.	2016-02-23 14:39:56 -04:00
Joey Hess	b0081598c7	Fix memory leak in last release, which affected commands like git-annex status when a large non-annexed file is present in the work tree. The whole file was strictly read, and so buffered in memory, and remained buffered for some time when running git-annex status.	2016-02-19 14:45:26 -04:00
Joey Hess	3fba4f83ed	fix windows build	2016-02-16 16:15:32 -04:00
Joey Hess	aa569500d5	fix numerous problems with test suite on crippled filesystems etc	2016-02-16 15:30:59 -04:00
Joey Hess	15148ee9eb	annex.addunlocked * add, addurl, import, importfeed: When in a v6 repository on a crippled filesystem, add files unlocked. * annex.addunlocked: New configuration setting, makes files always be added unlocked. (v6 only)	2016-02-16 14:43:43 -04:00
Joey Hess	adc27f081a	escape slashes in annex pointer files The problem with having the slashes unescaped is, it broke parsing, since the parser takes the filename to get the part containing the key. That particularly affected URL keys. This makes the format be the same as symlinks point to, which keeps things simple. Existing pointer files will continue to work ok.	2016-02-16 14:10:08 -04:00
Joey Hess	7899f7248a	force strict file read Avoid possibly having the file open still when it gets deleted. Needed on Windows, particularly.	2016-02-15 16:47:34 -04:00
Joey Hess	4d89a1ffd1	allow \r in pointer files git-annex doesn't write \r, but it can be present due to line ending conversions or perhaps user edits.	2016-02-15 16:37:40 -04:00
Joey Hess	f9d79d194b	Windows: Fix v6 unlocked files to actually work. Pointer files were not being treated as annex content, so "git annex get" didn't replace them with the object.	2016-02-15 16:12:18 -04:00
Joey Hess	2e3b5e645f	When initializing a v6 repo on a crippled filesystem, don't force it into direct mode.	2016-02-15 15:41:49 -04:00
Joey Hess	540a0343ba	more windows build fix	2016-02-15 15:03:44 -04:00
Joey Hess	f55c576923	fix windows build	2016-02-15 14:58:45 -04:00
Joey Hess	40207b26ea	move old ghc compat code into separate module; eliminate WITH_CLIBS This avoids hsc2hs being run except when building for the old version of ghc. Should speed up builds.	2016-02-15 11:47:33 -04:00
Joey Hess	0983f136b8	create directory for transfer lock file, and catch perm error Before, the call to mkProgressUpdater created the directory as a side-effect, but since that ignored failure to create it, this led to a "does not exist" exception when the transfer lock file was created, rather than a permissions error. So, make sure the directory exists before trying to lock the file in it. When a PermissionDenied exception is caught, skip making the transfer lock. This lets downloads from readonly remotes happen. If an upload is being tried, and the lock file can't be written due to permissions, then probably the actual transfer will fail for the same reason, so I think it's ok that it continues w/o taking the lock in that case.	2016-02-12 14:11:25 -04:00
Joey Hess	17c97434f2	init: Fix bugs in submodule .git symlink fixup, that occurred when initializing in a subdirectory of a submodule and a submodule of a submodule.	2016-02-08 15:41:27 -04:00
Joey Hess	23cc315c38	matchexpression: Added --largefiles option to parse an annex.largefiles expression.	2016-02-03 16:58:36 -04:00
Joey Hess	5127cb59cc	annex.largefiles: Add support for mimetype=text/* etc, when git-annex is linked with libmagic.	2016-02-03 16:29:34 -04:00
Joey Hess	403b56fb91	Limit annex.largefiles parsing to the subset of preferred content expressions that make sense in its context. So, not "standard" or "lackingcopies", etc.	2016-02-03 15:04:42 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	5d9c7a1164	refactor	2016-02-03 13:08:15 -04:00
Joey Hess	aded00c5f0	avoid unnecessary building of a one-off Map A case lookup should be more efficient.	2016-02-03 12:59:28 -04:00
Joey Hess	d37fe6a547	annex.largefiles can be configured in .gitattributes too This is particularly useful for v6 repositories, since the .gitattributes configuration will apply in all clones of the repository.	2016-02-02 15:18:17 -04:00
Joey Hess	e8fc2ff27c	add "nothing" to preferred content DSL Same as "not anything"; will be particularly useful in annex.largefiles gitattributes.	2016-02-02 14:42:13 -04:00
Gabor Greif	daf8aa76fe	Unneeded constraint	2016-01-28 12:34:07 -04:00
Gabor Greif	50e4ec36c7	Another redundant constraint	2016-01-28 12:34:07 -04:00
Joey Hess	710d44a16e	add the known associated file to the list of others	2016-01-26 14:48:19 -04:00
Joey Hess	039e83ed5d	Fix nasty reversion in the last release that broke sync --content's handling of many preferred content expressions. The type checker should have noticed this, but the changes to mapM that make it accept any Traversable hid the fact that it was not being passed a list at all. Thus, what should have returned an empty list most of the time instead returned [""] which was treated as the name of the associated file, with disastrous consequences. When I have time, I should add a test case checking what sync --content drops. I should also consider replacing mapM with one re-specialized to lists.	2016-01-26 14:28:43 -04:00
Joey Hess	23ff58cd4f	optimise getUUID This avoids a Map lookup each time it's called, instead the GitConfig field lazily looks it up once and then caches.	2016-01-20 16:55:06 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	b52cf5697b	immediate queue flushing when annex.queuesize=1 Previously, it only flushed when the queue got larger than 1. Also, make the queue auto-flush when items are added, rather than needing to be flushed as a separate step. This simplifies the code and makes it more efficient too, as it avoids needing to read the queue out of the state to check if it should be flushed.	2016-01-13 14:55:01 -04:00
Joey Hess	bafcbe95c3	fix one more test failure with v6 unlocked file merge conflict resolution	2016-01-08 15:23:15 -04:00
Joey Hess	51bc32e21e	better fix for slash in view metadata The homomorphs are back, just encoded such that it doesn't crash in LANG=C However, I noticed a bug in the old escaping; [pseudoSlash] was escaped the same as ['/','/']. Fixed by using '%' to escape pseudoSlash. Which requires doubling '%' to escape it, but that's already done in the escaping of worktree filenames in a view, so is probably ok.	2016-01-08 13:55:35 -04:00
Joey Hess	42619e2231	view: Avoid using cute unicode homomorphs for '/' and '\' and instead use ugly escaping, as the unicode method doesn't work on non-unicode supporting systems.	2016-01-08 12:45:32 -04:00
Joey Hess	4b819bee2b	avoid confusing git with a modified ctime in clean filter Linking the file to the tmp dir was not necessary in the clean filter, and it caused the ctime to change, which caused git to think the file was changed. This caused git status to get slow as it kept re-cleaning unchanged files.	2016-01-07 17:48:04 -04:00
Joey Hess	3b960d1422	migrate and rekey v6 unlocked file support	2016-01-07 15:14:15 -04:00
Joey Hess	0b59fb423e	migrate: Copy over metadata to new key.	2016-01-07 14:21:12 -04:00
Joey Hess	b3d60ca285	use TopFilePath for associated files Fixes several bugs with updates of pointer files. When eg, running git annex drop --from localremote it was updating the pointer file in the local repository, not the remote. Also, fixes drop ../foo when run in a subdir, and probably lots of other problems. Test suite drops from ~30 to 11 failures now. TopFilePath is used to force thinking about what the filepath is relative to. The data stored in the sqlite db is still just a plain string, and TopFilePath is a newtype, so there's no overhead involved in using it in DataBase.Keys.	2016-01-05 17:22:19 -04:00
Joey Hess	f36f24197a	scan for unlocked files on init/upgrade of v6 repo	2016-01-01 15:09:42 -04:00
Joey Hess	a2c056df65	convert isPointerFile from Annex to IO	2016-01-01 13:22:38 -04:00
Joey Hess	829ae91009	fix failing git-annex unused test case in v6 WorkTree.lookupFile was finding a key for a file that's deleted from the work tree, which is different than the v5 behavior (though perhaps the same as the direct mode behavior). Fix by checking that the work tree file exists before catting its key. Hopefully this won't slow down much, probably the catKey is much more expensive. I can't see any way to optimise this, except perhaps to make Command.Unused check if work tree files exist before/after calling lookupFile. But, it seems better to make lookupFile really only find keys for worktree files; that's what it's intended to do.	2015-12-30 14:23:31 -04:00
Joey Hess	5057fffccd	flush queue before cleaning cruft Else, queued file stages won't have reached the index, and it won't find everything. This evidently fixes a reversion in my work today, although I don't see how I broke it. It didn't use to flush the queue first, before, and worked somehow. Test suite for v5 is back to 100% green now.	2015-12-29 17:35:57 -04:00
Joey Hess	f3be28eedc	test suite noticed a direct mode reversion	2015-12-29 17:12:57 -04:00
Joey Hess	10ecc43790	rename	2015-12-29 17:02:14 -04:00
Joey Hess	996ae9b172	don't disable smudge filter while merging The smudge filter does need to be run, because if the key is in the local annex already (due to renaming, or a copy of a file added, or a new file added and its content has already arrived), git merge smudges the file and this should provide its content. This does probably mean that in merge conflict resolution, git smudges the existing file, re-copying all its content to it, and then the file is deleted. So, not efficient.	2015-12-29 16:36:21 -04:00
Joey Hess	24bbaa2346	avoid renaming file when auto-resolving conflict in annex pointer This is a behavior change for merge conflicts between locked files that both pointed to the same key, in different ways. Before, the conflict was resolved, but the file was renamed to .variant. This was unnecessary, because there was only one variant. Of course, this also handles conflicts between unlocked and locked, or even two unlocked files with different pointer contents.	2015-12-29 16:35:34 -04:00
Joey Hess	2e9341a47d	fix inode cache consistency bug when a merge unlocks a present file Since the file was present and locked, its annex object was not in the inode cache. So, despite not needing to update the annex object when the clean filter is run on the content by git merge, it does need to record the inode cache of the annex object. Otherwise, the annex object will be assumed to be bad, since its inode is not cached.	2015-12-29 16:26:27 -04:00
Joey Hess	b6b34f4916	automatic conflict resolution for v6 unlocked files Several tricky parts: * When the conflict is just between the same key being locked and unlocked, the unlocked version wins, and the file is not renamed in this case. * Need to update associated file map when conflict resolution renames an unlocked file. * git merge runs the smudge filter on the conflicting file, and actually overwrites the file with the same content it had before, and so invalidates its inode cache. This makes it difficult to know when it's safe to remove such files as conflict cruft, without going so far as to compare their entire contents. Dealt with this by preventing the smudge filter from populating the file when a merge is run. However, that also prevents the smudge filter being run for non-conflicting files, so eg moving a file won't put its new content into place. * Ideally, if a merge or a merge conflict resolution renames an unlocked file, the file in the work tree can just be moved, rather than copying the content to a new worktree file. This is attempted to be done in merge conflict resolution, but due to git merge's behavior of running smudge filters, what actually seems to happen is the old worktree file with the content is deleted and rewritten as a pointer file, so doesn't get reused. So, this is probably not as efficient as it optimally could be. If that becomes a problem, could look into running the merge in a separate worktree and updating the real worktree more efficiently, similarly to the direct mode merge. However, the direct mode merge had a lot of bugs, and I'd rather not use that more error-prone method unless really needed.	2015-12-29 15:41:09 -04:00
Joey Hess	645833774d	fix windows build	2015-12-28 12:44:04 -04:00
Joey Hess	121f5d5b0c	annex.thin Decided it's too scary to make v6 unlocked files have 1 copy by default, but that should be available to those who need it. This is consistent with git-annex not dropping unused content without --force, etc. * Added annex.thin setting, which makes unlocked files in v6 repositories be hard linked to their content, instead of a copy. This saves disk space but means any modification of an unlocked file will lose the local (and possibly only) copy of the old version. * Enable annex.thin by default on upgrade from direct mode to v6, since direct mode made the same tradeoff. * fix: Adjusts unlocked files as configured by annex.thin.	2015-12-27 15:59:59 -04:00
Joey Hess	54f87ef95f	get associated files from Keys database	2015-12-26 15:09:53 -04:00
Joey Hess	7593917147	cleanup	2015-12-26 15:09:47 -04:00
Joey Hess	289a3592c3	support v6 unlocked files This optimisation was not necessary, and didn't work for v6 unlocked files. Typically only a small number of files will be changed by a commit, so just catKey them all.	2015-12-26 15:04:26 -04:00
Joey Hess	60c36ef6ba	make views work with v6 unlocked files Have to only use the view index in one place; lookupFile was failing for unlocked files because it was run using the view index, which was empty.	2015-12-26 14:52:58 -04:00
Joey Hess	49fca49991	remove dead code	2015-12-26 14:45:07 -04:00
Joey Hess	f324ad24c1	improve comment	2015-12-26 13:47:36 -04:00
Joey Hess	0c03629173	clean up cruft in assistant fast rename code path	2015-12-22 18:03:47 -04:00
Joey Hess	d8a8c77a8f	move cleanOldKey into ingest	2015-12-22 16:55:49 -04:00
Joey Hess	cfaac52b88	populate unlocked files with newly available content when ingesting This can happen when ingesting a new file in either locked or unlocked mode, when some unlocked files in the repo use the same key, and the content was not locally available before.	2015-12-22 16:22:28 -04:00
Joey Hess	4f60234690	finish v6 support for assistant Seems to basically work now!	2015-12-22 15:23:27 -04:00
Joey Hess	4392140946	make linkAnnex detect when the file changes as it's being copied/linked in This fixes a race where the modified file ended up in annex/objects, and the InodeCache stored in the database was for the modified version, so git-annex didn't know it had gotten modified. The race could occur when the smudge filter was running; now it gets the InodeCache before generating the Key, which avoids the race.	2015-12-22 15:20:03 -04:00
Joey Hess	8e9608d7f0	refactoring no behavior changes	2015-12-22 13:42:58 -04:00
Joey Hess	ca2c977704	wip v6 support for assistant Files are not yet added to v6 repos in unlocked mode.	2015-12-21 18:41:15 -04:00
Joey Hess	35f6a78b66	fix reversion in v5 git-annex add of unlocked file In v5, lookupFile is supposed to only look at symlinks on disk (except when in direct mode). Note that v6 also has a bug when a locked file's symlink is deleted and is replaced with a new file. It sees that a link is staged and gets that key.	2015-12-16 14:27:12 -04:00
Joey Hess	38a23928e9	temporarily remove cached keys database connection The problem is that shutdown is not always called, particularly in the test suite. So, a database connection would be opened, possibly some changes queued, and then not shut down. One way this can happen is when using Annex.eval or Annex.run with a new state. A better fix might be to make both of them call Keys.shutdown (and be sure to do it even if the annex action threw an error). Complication: Sometimes they're run reusing an existing state, so shutting down a database connection could cause problems for other users of that same state. I think this would need a MVar holding the database handle, so it could be emptied once shut down, and another user of the database connection could then start up a new one if it got shut down. But, what if 2 threads were concurrently using the same database handle and one shut it down while the other was writing to it? Urgh. Might have to go that route eventually to get the database access to run fast enough. For now, a quick fix to get the test suite happier, at the expense of speed.	2015-12-16 14:05:26 -04:00
Joey Hess	7d0e79b9e1	Use git-annex init --version=6 to get v6 for now Not ready to make it default because of the direct mode upgrade needing to all happen at once.	2015-12-15 17:17:13 -04:00
Joey Hess	f9d077186a	implemented upgrade of direct mode repo to v6	2015-12-15 16:00:26 -04:00
Joey Hess	cdd27b8920	reorg	2015-12-15 15:34:28 -04:00
Joey Hess	2bc920e266	update inode cache to cover file even when nothing needs to be done to linkAnnex This covers the case where multiple files have the same content and are added with git add. Previously only the one that was linked to the annex got its inode cached; now both are.	2015-12-15 13:02:33 -04:00
Joey Hess	1dad3af3fc	checked getKeysPresent; it's ok for v6 unlocked files When a v6 unlocked file is removed from the work tree, unused doesn't show it. When it gets removed from the index, unused does show it. This is the same as a locked file.	2015-12-11 16:12:42 -04:00
Joey Hess	7790e059b2	finish v6 git-annex lock This was a doozy!	2015-12-11 15:28:34 -04:00
Joey Hess	50e83b606c	only make 1 hardlink max between pointer file and annex object If multiple files point to the same annex object, the user may want to modify them independently, so don't use a hard link. Also, check diskreserve when copying.	2015-12-11 14:00:21 -04:00
Joey Hess	c608a752a5	Merge branch 'master' into smudge	2015-12-11 13:50:31 -04:00
Joey Hess	abd66c7089	fsck: Failed to honor annex.diskreserve when checking a remote.	2015-12-11 13:50:27 -04:00
Joey Hess	c910b4e255	wip	2015-12-11 10:42:18 -04:00
Joey Hess	9dffd3d255	add generalized linkAnnex'	2015-12-10 16:08:19 -04:00
Joey Hess	06a8256bf6	always format pointer file with a trailing newline Before the smudge filter added a trailing newline, but other things that wrote formatPointer to a file did not. also some new pointer staging code to use later	2015-12-10 16:06:58 -04:00
Joey Hess	f80a3d8cd0	check InodeCache in inAnnex et al This avoids querying the database when the content file doesn't exist (or otherwise fails the provided check). However, it does add overhead of querying the database, and will certainly impact performance.	2015-12-10 14:51:04 -04:00
Joey Hess	2b8f6b8b2f	check inode cache in prepSendAnnex This does mean one query of the database every time an object is sent. May impact performance.	2015-12-10 14:50:52 -04:00
Joey Hess	3b2a7f216d	move	2015-12-10 14:20:38 -04:00
Joey Hess	3719d1b390	make clear when code is using deprecated direct mode files	2015-12-09 19:43:15 -04:00
Joey Hess	aa88851ec1	reorder	2015-12-09 19:38:37 -04:00
Joey Hess	ce73a96e4e	use InodeCache when dropping a key to see if a pointer file can be safely reset The Keys database can hold multiple inode caches for a given key. One for the annex object, and one for each pointer file, which may not be hard linked to it. Inode caches for a key are recorded when its content is added to the annex, but only if it has known pointer files. This is to avoid the overhead of maintaining the database when not needed. When the smudge filter outputs a file's content, the inode cache is not updated, because git's smudge interface doesn't let us write the file. So, dropping will fall back to doing an expensive verification then. Ideally, git's interface would be improved, and then the inode cache could be updated then too.	2015-12-09 17:54:54 -04:00
Joey Hess	5e8c628d2e	add inode cache to the db Renamed the db to keys, since it is various info about a Keys. Dropping a key will update its pointer files, as long as their content can be verified to be unmodified. This falls back to checksum verification, but I want it to use an InodeCache of the key, for speed. But, I have not made anything populate that cache yet.	2015-12-09 17:00:37 -04:00
Joey Hess	3311c48631	move InodeSentinal from direct mode code to its own module Will be used outside of direct mode for v6 unlocked files, and is already used outside of direct mode when adding files to annex.	2015-12-09 15:52:11 -04:00
Joey Hess	8a818088a3	link/copy pointer files to object content when it's added	2015-12-09 15:27:29 -04:00
Joey Hess	751120c171	avoid pre-commit hook messing up new-style unlocked files in v6 repo	2015-12-09 15:18:54 -04:00
Joey Hess	78a6b8ce05	refactor and improve pointer file handling code	2015-12-09 14:27:43 -04:00
Joey Hess	712c9fc590	require "annex/objects/" before key in pointer files This removes ambiguity, because while someone might have "WORM--foo" in a file that's not intended to be a git-annex pointer file, "annex/objects/WORM--foo" is less likely. Also, `664cc987e8` had a caveat about symlink targets being parsed as pointer files, and now the same parser is used for both. I did not include any hash directories before the key in the pointer file, as they're not needed. However, if they were included, the parser would still work ok.	2015-12-07 15:45:08 -04:00
Joey Hess	664cc987e8	support pointer files Backend.lookupFile is changed to always fall back to catKey when operating on a file that's not a symlink. catKey is changed to understand pointer files, as well as annex symlinks. Before, catKey needed a file mode witness, to be sure it was looking at a symlink. That was complicated stuff. Now, it doesn't actually care if a file in git is a symlink or not; in either case asking git for the content of the file will get the pointer to the key. This does mean that git-annex will treat a link foo -> WORM--bar as a git-annex file, and also treats a regular file containing annex/objects/WORM--bar as a git-annex file. Calling catKey could make git-annex commands need to do more work than before. This would especially be the case if a repo contained many regular files, and only a few annexed files, as now git-annex will need to ask git about the contents of the regular files.	2015-12-07 15:35:36 -04:00
Joey Hess	62a2fba1cd	Merge branch 'master' into smudge	2015-12-07 12:29:34 -04:00
Joey Hess	2936153fc4	fix temp filename Was not putting it inside the temp dir, but next to it! This was just wrong, and it led to a longer filename than desired being used, leading to some bug reports.	2015-12-06 16:54:01 -04:00
Joey Hess	6e71094e7d	avoid too long temp dir template The filename might be at or close to the filename length limit, so using it as the template for the temp dir would then fail.	2015-12-06 16:42:40 -04:00
Joey Hess	e7f75b079d	don't let git-annex direct be run in a v6 repo	2015-12-04 16:33:09 -04:00
Joey Hess	ccc49861ca	add v6; keep v5 working for now and manual upgrade Since all places where a repo is used in direct mode need to have git-annex upgraded before the repo can safely be converted to v6, the upgrade needs to be manual for now. I suppose that at some point I'll want to drop all the direct mode support code. At that point, will stop supporting v5, and will need to auto-upgrade any remaining v5 repos. If possible, I'd like to carry the direct mode support for say, a year or so, to give people plenty of time to upgrade and avoid disruption.	2015-12-04 16:14:48 -04:00
Joey Hess	34ead644d9	auto-configure filter.annex.smudge and clean on init	2015-12-04 16:14:11 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	99b2a524a0	clean filter should update location log when adding new content to annex	2015-12-04 14:20:32 -04:00
Joey Hess	2c6454a2e2	basic clean filter working	2015-12-04 13:39:14 -04:00
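
The pointer file commits above (712c9fc590, 06a8256bf6, 4d89a1ffd1, adc27f081a) describe a format in which the key is preceded by "annex/objects/", the file always ends with a newline, a stray \r is tolerated, and the parser takes the filename part of the path to find the key. A minimal Python sketch of that behavior, under those stated assumptions, might look like the following. It is not git-annex's Haskell implementation; the names OBJECT_DIR, format_pointer, and parse_pointer are hypothetical, and key escaping is left to the caller.

```python
import posixpath
from typing import Optional

# Prefix required before the key, per commit 712c9fc590.
OBJECT_DIR = "annex/objects/"


def format_pointer(escaped_key: str) -> str:
    """Build pointer file content, always ending in a newline (06a8256bf6).

    The caller is assumed to have already escaped any slashes in the key.
    """
    return OBJECT_DIR + escaped_key + "\n"


def parse_pointer(content: str) -> Optional[str]:
    """Return the key named by a pointer file, or None if it is not one."""
    # Only the first line matters; drop a \r left by line-ending conversion.
    first_line = content.split("\n", 1)[0].rstrip("\r")
    if OBJECT_DIR not in first_line:
        return None
    # Take the last path component, so extra directories before the key
    # (such as hash directories) do not break parsing.
    key = posixpath.basename(first_line)
    return key or None
```

For example, parse_pointer(format_pointer("WORM--foo")) gives back "WORM--foo", while a file containing only "WORM--foo" is not treated as a pointer file. Taking the last path component is also why an unescaped slash inside a key would break parsing, which is the problem adc27f081a addresses.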

846 commits