git-annex

Author	SHA1	Message	Date
Joey Hess	af457d73a4	followup	2016-01-13 15:12:48 -04:00
Joey Hess	ecd0684bfc	avoid hard linking object from other repository when annex.thin is set This is simpler and less expensive than checking if the src file has a link count >= 2, and also is unlocked.	2016-01-13 14:19:31 -04:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	50fad302a1	Added a comment: another side-effect of largefiles not being supported for my usecase	2016-01-12 17:20:08 +00:00
Joey Hess	f9c5aa84e0	add database benchmark The benchmark shows that the database access is quite fast indeed! And, it scales linearly to the number of keys, with one exception, getAssociatedKey. Based on this benchmark, I don't think I need worry about optimising for cases where all files are locked and the database is mostly empty. In those cases, database access will be misses, and according to this benchmark, should add only 50 milliseconds to runtime. (NB: There may be some overhead to getting the database opened and locking the handle that this benchmark doesn't see.) joey@darkstar:~/src/git-annex>./git-annex benchmark setting up database with 1000 setting up database with 10000 benchmarking keys database/getAssociatedFiles from 1000 (hit) time 62.77 μs (62.70 μs .. 62.85 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 62.81 μs (62.76 μs .. 62.88 μs) std dev 201.6 ns (157.5 ns .. 259.5 ns) benchmarking keys database/getAssociatedFiles from 1000 (miss) time 50.02 μs (49.97 μs .. 50.07 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 50.09 μs (50.04 μs .. 50.17 μs) std dev 206.7 ns (133.8 ns .. 295.3 ns) benchmarking keys database/getAssociatedKey from 1000 (hit) time 211.2 μs (210.5 μs .. 212.3 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 211.0 μs (210.7 μs .. 212.0 μs) std dev 1.685 μs (334.4 ns .. 3.517 μs) benchmarking keys database/getAssociatedKey from 1000 (miss) time 173.5 μs (172.7 μs .. 174.2 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 173.7 μs (173.0 μs .. 175.5 μs) std dev 3.833 μs (1.858 μs .. 6.617 μs) variance introduced by outliers: 16% (moderately inflated) benchmarking keys database/getAssociatedFiles from 10000 (hit) time 64.01 μs (63.84 μs .. 64.18 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 64.85 μs (64.34 μs .. 66.02 μs) std dev 2.433 μs (547.6 ns .. 4.652 μs) variance introduced by outliers: 40% (moderately inflated) benchmarking keys database/getAssociatedFiles from 10000 (miss) time 50.33 μs (50.28 μs .. 50.39 μs) 1.000 R² (1.000 R² .. 1.000 R²) mean 50.32 μs (50.26 μs .. 50.38 μs) std dev 202.7 ns (167.6 ns .. 252.0 ns) benchmarking keys database/getAssociatedKey from 10000 (hit) time 1.142 ms (1.139 ms .. 1.146 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.142 ms (1.140 ms .. 1.144 ms) std dev 7.142 μs (4.994 μs .. 10.98 μs) benchmarking keys database/getAssociatedKey from 10000 (miss) time 1.094 ms (1.092 ms .. 1.096 ms) 1.000 R² (1.000 R² .. 1.000 R²) mean 1.095 ms (1.095 ms .. 1.097 ms) std dev 4.277 μs (2.591 μs .. 7.228 μs)	2016-01-12 13:07:03 -04:00
Joey Hess	55ad30d1d9	update	2016-01-08 16:30:31 -04:00
Joey Hess	c96fb11a96	devblog	2016-01-07 18:03:06 -04:00
Joey Hess	722f56a99d	update	2016-01-07 15:47:19 -04:00
Joey Hess	d667a68b7e	test: Added --keep-failures option.	2016-01-06 13:44:12 -04:00
Joey Hess	50d25c186d	update	2016-01-05 17:41:46 -04:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	1d6d7a6e16		2016-01-05 02:20:14 +00:00
Joey Hess	b1373f0d15	update	2016-01-01 16:02:11 -04:00
Joey Hess	f36f24197a	scan for unlocked files on init/upgrade of v6 repo	2016-01-01 15:09:42 -04:00
Joey Hess	b03a24dc10	update	2016-01-01 14:26:09 -04:00
Joey Hess	4be2b57606	add test: conflict resolution (mixed locked and unlocked file)	2015-12-30 16:36:39 -04:00
Joey Hess	0b8bba8031	test suite 100% pass in v6, finally! Set annex.largefiles when adding the conflicting non-annexed file, otherwise it would be added as an annexed file.	2015-12-30 15:12:45 -04:00
Joey Hess	7fd9fa3d72	update	2015-12-29 17:47:49 -04:00
Joey Hess	6aa19b7184	Merge branch 'master' of ssh://git-annex.branchable.com	2015-12-29 17:40:49 -04:00
umeboshi	813262448c	Added a comment: Link to TH	2015-12-29 20:36:42 +00:00
Joey Hess	b6b34f4916	automatic conflict resolution for v6 unlocked files Several tricky parts: * When the conflict is just between the same key being locked and unlocked, the unlocked version wins, and the file is not renamed in this case. * Need to update associated file map when conflict resolution renames an unlocked file. * git merge runs the smudge filter on the conflicting file, and actually overwrites the file with the same content it had before, and so invalidates its inode cache. This makes it difficult to know when it's safe to remove such files as conflict cruft, without going so far as to compare their entire contents. Dealt with this by preventing the smudge filter from populating the file when a merge is run. However, that also prevents the smudge filter being run for non-conflicting files, so eg moving a file won't put its new content into place. * Ideally, if a merge or a merge conflict resolution renames an unlocked file, the file in the work tree can just be moved, rather than copying the content to a new worktree file. This is attempted to be done in merge conflict resolution, but due to git merge's behavior of running smudge filters, what actually seems to happen is the old worktree file with the content is deleted and rewritten as a pointer file, so doesn't get reused. So, this is probably not as efficient as it optimally could be. If that becomes a problem, could look into running the merge in a separate worktree and updating the real worktree more efficiently, similarly to the direct mode merge. However, the direct mode merge had a lot of bugs, and I'd rather not use that more error-prone method unless really needed.	2015-12-29 15:41:09 -04:00
Joey Hess	121f5d5b0c	annex.thin Decided it's too scary to make v6 unlocked files have 1 copy by default, but that should be available to those who need it. This is consistent with git-annex not dropping unused content without --force, etc. * Added annex.thin setting, which makes unlocked files in v6 repositories be hard linked to their content, instead of a copy. This saves disk space but means any modification of an unlocked file will lose the local (and possibly only) copy of the old version. * Enable annex.thin by default on upgrade from direct mode to v6, since direct mode made the same tradeoff. * fix: Adjusts unlocked files as configured by annex.thin.	2015-12-27 15:59:59 -04:00
Joey Hess	025f284ac1	reorg	2015-12-26 15:15:02 -04:00
Joey Hess	fcb013044b	update	2015-12-26 15:13:05 -04:00
Joey Hess	4224fae71f	optimise read and write for Keys database (untested) Writes are optimised by queueing up multiple writes when possible. The queue is flushed after the Annex monad action finishes. That makes it happen on program termination, and also whenever a nested Annex monad action finishes. Reads are optimised by checking once (per AnnexState) if the database exists. If the database doesn't exist yet, all reads return mempty. Reads also cause queued writes to be flushed, so reads will always be consistent with writes (as long as they're made inside the same Annex monad). A future optimisation path would be to determine when that's not necessary, which is probably most of the time, and avoid flushing unnecessarily. Design notes for this commit: - separate reads from writes - reuse a handle which is left open until program exit or until the MVar goes out of scope (and autoclosed then) - writes are queued - queue is flushed periodically - immediate queue flush before any read - auto-flush queue when database handle is garbage collected - flush queue on exit from Annex monad (Note that this may happen repeatedly for a single database connection; or a connection may be reused for multiple Annex monad actions, possibly even concurrent ones.) - if database does not exist (or is empty) the handle is not opened by reads; reads instead return empty results - writes open the handle if it was not open previously	2015-12-23 19:18:52 -04:00
Joey Hess	b3690c4499	update	2015-12-22 18:19:32 -04:00
Joey Hess	ca2c977704	wip v6 support for assistant Files are not yet added to v6 repos in unlocked mode.	2015-12-21 18:41:15 -04:00
Joey Hess	fbf6c25de5	interaction with shared clones	2015-12-17 18:46:52 -04:00
Joey Hess	e55ac3d383	update	2015-12-16 17:04:31 -04:00
Joey Hess	e61f3d1752	update todo list	2015-12-16 16:02:21 -04:00
Joey Hess	7800125783	starting to work on test suite for v6	2015-12-15 17:19:26 -04:00
Joey Hess	db8b32254c	update todo list	2015-12-15 16:07:02 -04:00
Joey Hess	f9d077186a	implemented upgrade of direct mode repo to v6	2015-12-15 16:00:26 -04:00
Joey Hess	71e2050f8f	have clean filter check if the filename was already in use by an old key The annex object for it may have been modified due to hard link, and that should be cleaned up when the new version is added. If another associated file has the old key's content, that's linked into the annex object. Otherwise, update location log to reflect that content has been lost.	2015-12-15 13:06:52 -04:00
Joey Hess	9fcc5046b3	todo	2015-12-15 12:38:32 -04:00
Joey Hess	cc2d78870c	update	2015-12-11 16:22:40 -04:00
Joey Hess	1dad3af3fc	checked getKeysPresent; it's ok for v6 unlocked files When a v6 unlocked file is removed from the work tree, unused doesn't show it. When it gets removed from the index, unused does show it. This is the same as a locked file.	2015-12-11 16:12:42 -04:00
Joey Hess	e7183d83d3	fsck for v6 unlocked files This only adds 1 stat to each file fscked for locked files, so added overhead is minimal. For unlocked files it has to access the database to see if a file is modified.	2015-12-11 16:07:54 -04:00
Joey Hess	7790e059b2	finish v6 git-annex lock This was a doozy!	2015-12-11 15:28:34 -04:00
Joey Hess	50e83b606c	only make 1 hardlink max between pointer file and annex object If multiple files point to the same annex object, the user may want to modify them independently, so don't use a hard link. Also, check diskreserve when copying.	2015-12-11 14:00:21 -04:00
Joey Hess	c910b4e255	wip	2015-12-11 10:42:18 -04:00
Joey Hess	e2c8dc6778	v6 git-annex unlock Note that the implementation uses replaceFile, so that the actual replacement of the work tree file is atomic. This seems a good property to have! It would be possible for unlock in v6 mode to be run on files that do not have their content present. However, that would be a behavior change from before, and I don't see any immediate need to support it, so I didn't implement it.	2015-12-10 16:12:48 -04:00
Joey Hess	9dffd3d255	add generalized linkAnnex'	2015-12-10 16:08:19 -04:00
Joey Hess	108f711d37	todo	2015-12-10 14:54:03 -04:00
Joey Hess	f80a3d8cd0	check InodeCache in inAnnex et al This avoids querying the database when the content file doesn't exist (or otherwise fails the provided check). However, it does add overhead of querying the database, and will certainly impact performance.	2015-12-10 14:51:04 -04:00
Joey Hess	2b8f6b8b2f	check inode cache in prepSendAnnex This does mean one query of the database every time an object is sent. May impact performance.	2015-12-10 14:50:52 -04:00
Joey Hess	ce73a96e4e	use InodeCache when dropping a key to see if a pointer file can be safely reset The Keys database can hold multiple inode caches for a given key. One for the annex object, and one for each pointer file, which may not be hard linked to it. Inode caches for a key are recorded when its content is added to the annex, but only if it has known pointer files. This is to avoid the overhead of maintaining the database when not needed. When the smudge filter outputs a file's content, the inode cache is not updated, because git's smudge interface doesn't let us write the file. So, dropping will fall back to doing an expensive verification then. Ideally, git's interface would be improved, and then the inode cache could be updated then too.	2015-12-09 17:54:54 -04:00
Joey Hess	3311c48631	move InodeSentinal from direct mode code to its own module Will be used outside of direct mode for v6 unlocked files, and is already used outside of direct mode when adding files to annex.	2015-12-09 15:52:11 -04:00
Joey Hess	8a818088a3	link/copy pointer files to object content when it's added	2015-12-09 15:27:29 -04:00
Joey Hess	37c9026c6e	todo	2015-12-08 13:07:45 -04:00
Joey Hess	9923b8dc77	long walk led to long list of things to do	2015-12-07 17:24:16 -04:00
Joey Hess	712c9fc590	require "annex/objects/" before key in pointer files This removes ambiguity, because while someone might have "WORM--foo" in a file that's not intended to be a git-annex pointer file, "annex/objects/WORM--foo" is less likely. Also, `664cc987e8` had a caveat about symlink targets being parsed as pointer files, and now the same parser is used for both. I did not include any hash directories before the key in the pointer file, as they're not needed. However, if they were included, the parser would still work ok.	2015-12-07 15:45:08 -04:00
Joey Hess	2cbcb4f1a8	update associated files database on smudge and clean	2015-12-07 14:41:22 -04:00
Joey Hess	2fe21d47c5	init: Configure .git/info/attributes to use git-annex as a smudge filter. Note that this changes the default behavior of git add in a newly initialized repository; it will add files to the annex. Don't like that this could break workflows, but it's necessary in order for any pointer files in the repo to be handled by git-annex.	2015-12-04 17:57:15 -04:00
Joey Hess	e7f75b079d	don't let git-annex direct be run in a v6 repo	2015-12-04 16:33:09 -04:00
Joey Hess	ccc49861ca	add v6; keep v5 working for now and manual upgrade Since all places where a repo is used in direct mode need to have git-annex upgraded before the repo can safely be converted to v6, the upgrade needs to be manual for now. I suppose that at some point I'll want to drop all the direct mode support code. At that point, will stop supporting v5, and will need to auto-upgrade any remaining v5 repos. If possible, I'd like to carry the direct mode support for say, a year or so, to give people plenty of time to upgrade and avoid disruption.	2015-12-04 16:14:48 -04:00
Joey Hess	20ca89dfa3	skeleton smudge/clean filters	2015-12-04 13:03:39 -04:00
Joey Hess	f16e235983	addurl, importfeed: Changed to honor annex.largefiles settings, when the content of the url is downloaded. (Not when using --fast or --relaxed.) importfeed just calls addurl functions, so inherits this from it. Note that addurl still generates a temp file, and uses that key to download the file. It just adds it to the work tree at the end when the file is small.	2015-12-02 15:12:33 -04:00
Joey Hess	382f8a790a	fix name of comment	2015-12-02 12:06:04 -04:00
Joey Hess	6cd222fbe8	remove redundant and unnecessary todo Mostly because of the --	2015-12-02 12:00:41 -04:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	7b7a9e2468		2015-11-30 18:49:49 +00:00
Joey Hess	3f63666727	file map analysis	2015-11-24 11:39:47 -04:00
Joey Hess	cf0130894e	notes on merge	2015-11-23 18:10:50 -04:00
Joey Hess	fe55caa2ae	upgrading	2015-11-23 17:57:47 -04:00
Joey Hess	33fb0de1a3	smudge design	2015-11-23 16:53:05 -04:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	472df9c9b5	added [[!meta author=yoh]]	2015-11-10 19:25:28 +00:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	ee69735b7f	Added a comment	2015-11-10 19:24:59 +00:00
Joey Hess	952d9e42dc	hmm	2015-11-10 15:12:12 -04:00
Joey Hess	78b63888a6	close	2015-11-06 13:52:47 -04:00
Joey Hess	c6fc0945f3	update	2015-11-04 17:03:32 -04:00
https://me.yahoo.com/a/EbvxpTI_xP9Aod7Mg4cwGhgjrCrdM5s-#7c0f4	bf0b84a86d		2015-11-02 20:59:38 +00:00
Joey Hess	e806e62fa3	document default --autostart --startdelay=5 and comment	2015-11-02 11:18:44 -04:00
https://id.koumbit.net/anarcat	3d0d832b6c	add three more alternatives...	2015-10-30 15:42:35 +00:00
https://id.koumbit.net/anarcat	ad87b9c708	Added a comment: re tox	2015-10-30 15:37:09 +00:00
Gastlag	c07dd514e0	Added a comment: Is xmpp the problem ?	2015-10-30 10:42:06 +00:00
parhuzamos	0ec35e8b45	User "bence" was using Google OpenID which is not supported anymore. Found this moved page, edited it to get notified if anything happens.	2015-10-28 21:01:29 +00:00
Jonan	cf3d3037b6		2015-10-28 11:10:44 +00:00
Jonan	c38b28f874		2015-10-28 11:06:30 +00:00
Joey Hess	6e7eddb5d6	comment	2015-10-26 15:33:32 -04:00
anarcat	48c78c9b2a	trick question	2015-10-20 04:44:39 +00:00
Joey Hess	f9adb905fc	Avoid unnecessary write to the location log when a file is unlocked and then added back with unchanged content. Implemented with no additional overhead of compares etc. This is safe to do for presence logs because of their locality of change; a given repo's presence logs are only ever changed in that repo, or in a repo that has just been actively changing the content of that repo. So, we don't need to worry about a split-brain situation where there'd be disagreement about the location of a key in a repo. And so, it's ok to not update the timestamp when that's the only change that would be made due to logging presence info.	2015-10-12 14:46:47 -04:00
Joey Hess	82ba8c9a6a	comment	2015-10-12 13:29:00 -04:00
tribut	53d3b5a197		2015-10-11 17:33:20 +00:00
Joey Hess	9bcc32de3b	Merge branch 'master' of ssh://git-annex.branchable.com	2015-10-01 16:17:52 -04:00
Joey Hess	2fb3722ce9	Do verification of checksums of annex objects downloaded from remotes. * When annex objects are received into git repositories, their checksums are verified then too. * To get the old, faster, behavior of not verifying checksums, set annex.verify=false, or remote.<name>.annex-verify=false. * setkey, rekey: These commands also now verify that the provided file matches the key, unless annex.verify=false. * reinject: Already verified content; this can now be disabled by setting annex.verify=false. recvkey and reinject already did verification, so removed now duplicate code from them. fsck still does its own verification, which is ok since it does not use getViaTmp, so verification doesn't happen twice when using fsck --from.	2015-10-01 15:56:39 -04:00
dxld@02c834b220f9ffc0410d37263aa29d9373cc455b	9825b4cb15	Added a comment: Fully p2p alternative to XMPP	2015-10-01 17:22:44 +00:00
Joey Hess	0c3a3c5187	comment	2015-10-01 11:57:59 -04:00
Joey Hess	4aa055cb39	Merge branch 'master' of ssh://git-annex.branchable.com	2015-09-29 11:20:00 -04:00
kalle@bdf75651b439b088e51f28f10f5a46ffcd2a704d	2ed24b88ae	Added a comment: importfeed template	2015-09-28 19:52:16 +00:00
graboluk@f6de53961ab0f884e203f602f65eb5cdc0fb7513	31597d6676	Added a comment: timestamps are wrong as of 5.20150731	2015-09-26 18:31:46 +00:00
Joey Hess	209b8bbfbb	Merge branch 'master' of ssh://git-annex.branchable.com	2015-09-26 08:57:53 -04:00
Joey Hess	7f102cf43d	add	2015-09-26 07:23:08 -04:00
fastguy	8156862566	Added a comment: Any updates?	2015-09-25 19:35:18 +00:00
Joey Hess	f2b6ebd502	status: Show added but not yet committed files. Seems easy, but git ls-files can't list the right subset of files. So, I wrote a whole new parser for git status output, and converted the status command to use that. There are a few other small behavior changes. The order changed. Unlocked files show as T. In indirect mode, deleted files were not shown before, and that's fixed. Regular files checked directly into git and modified were not shown before, and are now.	2015-09-22 17:32:28 -04:00
Joey Hess	6885fe3c38	close, already implemented via a different todo	2015-09-22 15:46:42 -04:00
Joey Hess	c9acb6b89d	new todo	2015-09-22 15:31:08 -04:00
Joey Hess	9fa60b676c	close	2015-09-22 15:23:23 -04:00
Joey Hess	89238e9595	juggle dup bugs	2015-09-17 17:08:39 -04:00
Joey Hess	9cfb96c53d	Special remotes configured with autoenable=true will be automatically enabled when git-annex init is run.	2015-09-14 14:49:48 -04:00
Joey Hess	ffa8221517	annex.hardlink extended to also try to use hard links when copying from the repository to a remote. Also, it used to only check that one of the repos was not in direct mode; now when either repo is direct mode, annex.hardlink won't have an effect.	2015-09-14 12:13:38 -04:00
https://id.koumbit.net/anarcat	8e30053d5b	toc	2015-09-12 22:18:40 +00:00
Joey Hess	eab8c512d8	cleanup	2015-09-11 13:21:58 -04:00
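
Commit 712c9fc590 above says a pointer file must carry "annex/objects/" before the key, and that hash directories in front of the key would not confuse the parser. The following is only a rough Python sketch of that rule, not git-annex's actual Haskell parser. The function name parse_pointer, the choice to look only at the first line, and the example symlink-style path are assumptions made for illustration.

```python
from typing import Optional

# Prefix named in commit 712c9fc590; everything else here is illustrative.
POINTER_PREFIX = "annex/objects/"


def parse_pointer(content: str) -> Optional[str]:
    """Return the key named by pointer-file content, or None if it is not a pointer.

    Assumption: only the first line matters, and the key is the last path
    component after the prefix, so hash directories before the key are skipped.
    """
    lines = content.splitlines()
    if not lines:
        return None
    first = lines[0].strip()
    idx = first.find(POINTER_PREFIX)
    if idx == -1:
        # A bare "WORM--foo" is ambiguous, so it is rejected.
        return None
    rest = first[idx + len(POINTER_PREFIX):]
    key = rest.rstrip("/").split("/")[-1]
    return key or None


if __name__ == "__main__":
    print(parse_pointer("WORM--foo"))
    print(parse_pointer("annex/objects/WORM--foo\n"))
    print(parse_pointer("../.git/annex/objects/Xx/Yy/WORM--foo/WORM--foo"))
```

Using one function for both pointer files and symlink targets mirrors the commit's remark that the same parser now handles both.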

1465 commits
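
The design notes in commit 4224fae71f (writes are queued, and any read flushes the queue first so reads stay consistent with writes) can be pictured with a small Python sketch. It uses an in-memory dict as a stand-in for the Keys database. The class QueuedKeysDb and its method names are hypothetical and are not taken from git-annex's code.

```python
from typing import Dict, List, Set, Tuple


class QueuedKeysDb:
    """Toy model of a write queue that is flushed before every read."""

    def __init__(self) -> None:
        self._store: Dict[str, Set[str]] = {}    # stand-in for the database
        self._queue: List[Tuple[str, str]] = []  # pending (key, file) writes
        self.flush_count = 0                     # number of batched commits

    def add_associated_file(self, key: str, path: str) -> None:
        # Writes are only queued; nothing touches the store yet.
        self._queue.append((key, path))

    def flush(self) -> None:
        # Apply all queued writes as one batch; a no-op when the queue is empty.
        if not self._queue:
            return
        for key, path in self._queue:
            self._store.setdefault(key, set()).add(path)
        self._queue.clear()
        self.flush_count += 1

    def get_associated_files(self, key: str) -> Set[str]:
        # Flush before reading so the read sees earlier queued writes.
        self.flush()
        return set(self._store.get(key, set()))


if __name__ == "__main__":
    db = QueuedKeysDb()
    db.add_associated_file("k1", "a.txt")
    db.add_associated_file("k1", "b.txt")
    print(sorted(db.get_associated_files("k1")), db.flush_count)
```

The commit also mentions flushing when the Annex monad action exits. In this sketch that would correspond to calling flush() explicitly at the end of a unit of work.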