git-annex

Author	SHA1	Message	Date
Joey Hess	a1730cd6af	adieu, MissingH Removed dependency on MissingH, instead depending on the split library. After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing. Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size. This commit was sponsored by Shane-o on Patreon.	2017-05-16 01:03:52 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	f617988a29	Make import --deduplicate and --skip-duplicates only hash once, not twice import: --deduplicate and --skip-duplicates were implemented inefficiently; they unnecessarily hashed each file twice. They have been improved to only hash once. The new approach is to lock down (minimally) and hash files, and then reuse that information when importing them. This was rather tricky, especially in detecting changes to files while they are being imported. The output of import changed slightly. While before it silently skipped over files with eg --skip-duplicates, now it shows each file as it starts to act on it. Since every file is hashed first thing, it would otherwise not be clear what file import is chewing on. (Actually, it wasn't clear before when any of the duplicates switches were used.) This commit was sponsored by Alexander Thompson on Patreon.	2017-02-09 15:32:22 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	b7c8bf5274	Preserve execute bits of unlocked files in v6 mode. When annex.thin is set, adding an object will add the execute bits to the work tree file, and this does mean that the annex object file ends up executable. This doesn't add any complexity that wasn't already present, because git annex add of an executable file has always ingested it so that the annex object ends up executable. But, since an annex object file can be executable or not, when populating an unlocked file from one, the executable bit is always added or removed to match the mode of the pointer file.	2016-04-14 14:47:08 -04:00
Joey Hess	7c20bf6e7a	make sync aware of adjusted branches So, it will pull and push the original branch, not the adjusted one. And, for merging, it will use updateAdjustedBranch (not implemented yet). Note that remaining uses of Git.Branch.current need to be checked too; for things that should act on the original branch, and not the adjusted branch.	2016-02-29 15:23:08 -04:00
Joey Hess	4b819bee2b	avoid confusing git with a modified ctime in clean filter Linking the file to the tmp dir was not necessary in the clean filter, and it caused the ctime to change, which caused git to think the file was changed. This caused git status to get slow as it kept re-cleaning unchanged files.	2016-01-07 17:48:04 -04:00
Joey Hess	7c5c7bb04a	fix OSX build	2015-12-28 13:28:21 -04:00
Joey Hess	c4152654d2	combine PendingAddChanges for the same file into one In v6 unlocked mode, this fixes a problem that was making eg, echo > file cause the assistant to copy the file to the annex object, instead of hard linking it. That's because 2 change events were seen (one for opening the file and one for closing) and, when they were processed together, the file was locked down twice. That meant it had multiple hard links, which prevented linkAnnex from hard linking it. There might be scenarios where multiple events come in, but staggered such that a file gets locked down repeatedly, and it would still be copied to the annex object in that case.	2015-12-22 17:52:39 -04:00
Joey Hess	cfaac52b88	populate unlocked files with newly available content when ingesting This can happen when ingesting a new file in either locked or unlocked mode, when some unlocked files in the repo use the same key, and the content was not locally available before.	2015-12-22 16:22:28 -04:00
Joey Hess	4f60234690	finish v6 support for assistant Seems to basically work now!	2015-12-22 15:23:27 -04:00
Joey Hess	8e9608d7f0	refactoring no behavior changes	2015-12-22 13:42:58 -04:00
Joey Hess	ca2c977704	wip v6 support for assistant Files are not yet added to v6 repos in unlocked mode.	2015-12-21 18:41:15 -04:00
Joey Hess	3311c48631	move InodeSentinal from direct mode code to its own module Will be used outside of direct mode for v6 unlocked files, and is already used outside of direct mode when adding files to annex.	2015-12-09 15:52:11 -04:00
Joey Hess	90f7c4b6a2	add VerifiedCopy data type There should be no behavior changes in this commit, it just adds a more expressive data type and adjusts code that had been passing around a [UUID] or sometimes a Maybe Remote to instead use [VerifiedCopy]. Although, since some functions were taking two different [UUID] lists, there's some potential for me to have gotten it horribly wrong.	2015-10-08 16:55:11 -04:00
Joey Hess	addc82dab7	removed all uses of undefined from code base It's a code smell, can lead to hard to diagnose error messages.	2015-04-19 00:38:29 -04:00
Joey Hess	7b32e7acb5	make segmentXargs preserve order	2015-04-02 00:31:36 -04:00
Joey Hess	2b7f3ee3f2	assistant: Committing a whole lot of files at once could overflow command-line length limits and cause the commit to fail. This only happened when using the assistant in an indirect mode repository.	2015-03-26 14:02:35 -04:00
Joey Hess	32b3bed086	add a comment	2015-03-26 13:43:30 -04:00
Joey Hess	f5b830e07c	sync, assistant: Include repository name in head branch commit message. Note that while the assistant detects changes made to remote names, I left the commit message fixed rather than calculating it after every commit. It doesn't seem worth the CPU to do the latter.	2015-02-11 13:34:05 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatible ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only finds stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	d41849bc23	support commit.gpgsign Support users who have set commit.gpgsign, by disabling gpg signatures for git-annex branch commits and commits made by the assistant. The thinking here is that a user sets commit.gpgsign intending the commits that they manually initiate to be gpg signed. But not commits made in the background, whether by a daemon or implicitly to the git-annex branch. gpg signing those would be at best a waste of CPU and at worst would fail, or flood the user with gpg passphrase prompts, or put their signature on changes they did not directly do. See Debian bug #753720. Also makes all commits done by git-annex go through a few central control points, to make such changes easier in future. Also disables commit.gpgsign in the test suite. This commit was sponsored by Antoine Boegli.	2014-07-04 11:53:51 -04:00
Joey Hess	501cc8623a	assistant: Fix one-way assistant->assistant sync in direct mode. When in direct mode, update the master branch after committing to the annex/direct/master branch. Also, update the synced/master branch. This fixes a topology A->B where both A and B are in direct mode and running the assistant, and a change is made to B. Before this fix, A pulled the changes from B, but since they were only on the annex/direct/master branch, it did not merge them. Note that I considered making the assistant merge the remotes/B/annex/direct/master, but decided to keep it simple and only merge the sync branches as before.	2014-06-16 11:32:13 -04:00
Joey Hess	e4d7e2ebde	fix for Windows file timestamp timezone madness On Windows, changing the time zone causes the apparent mtime of files to change. This confuses git-annex, which naturally thinks this means the files have actually been modified (since THAT'S WHAT A MTIME IS FOR, BILL <sheesh>). Work around this stupidity, by using the inode sentinal file to detect if the timezone has changed, and calculate a TSDelta, which will be applied when generating InodeCaches. This should add no overhead at all on unix. Indeed, I sped up a few things slightly in the refactoring. Seems to basically work! But it has a big known problem: If the timezone changes while the assistant (or a long-running command) runs, it won't notice, since it only checks the inode cache once, and so will use the old delta for all new inode caches it generates for new files it's added. Which will result in them seeming changed the next time it runs. This commit was sponsored by Vincent Demeester.	2014-06-12 13:42:21 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctmp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	964a181026	try to drop unused object if it does not need to be transferred anywhere	2014-01-23 16:51:16 -04:00
Joey Hess	b7e3fe2ebd	flip for clarity	2013-12-16 16:24:57 -04:00
Joey Hess	58c7b0a56d	assistant: Always batch changes found in startup scan. Batch detection is heuristic, so can sometimes fail. I observed one such failure while starting up in a repository with 87000 files. After the first several batches of ~5000 files, it fell out of batch mode, and never re-entered it, and so made many more commits of a few files at a time than necessary. So, let's always use batch mode when in the startup scan. This avoids the heuristic there, at least. There is clearly also room to improve the heuristic. Possibly 10 files is too high a bar to be found during a commit, on a system that can commit quickly.	2013-12-16 16:16:19 -04:00
Joey Hess	9d323a98e2	avoid trying to use lsof when it's not in path and --forced	2013-12-04 17:39:44 -04:00
Joey Hess	03932212ec	Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree. The assistant's commit code also always avoids git commit, for simplicity. Indirect mode sync still does a git commit -a to catch unstaged changes. Note that this means that direct mode sync no longer runs the pre-commit hook or any other hooks git commit might call. The git annex pre-commit hook action for direct mode is however explicitly run. (The assistant already ran git commit with hooks disabled, so no change there.)	2013-12-01 13:59:45 -04:00
Joey Hess	3ac9c4e672	hlint	2013-10-02 22:59:07 -04:00
Joey Hess	98fc7e8a19	add, import, assistant: Better preserve the mtime of symlinks, when adding content that gets deduplicated. Note that this turned out to remove a syscall, not add any expense. Otherwise, I would not have done it.	2013-09-25 16:07:11 -04:00
Joey Hess	672cfc3923	better git version checking	2013-08-02 18:32:26 -04:00
Joey Hess	869c638b82	assistant: Fix bug that caused it to stall when adding a very large number of files at once (around 5 thousand). This bug was introduced in `82a6db8fe8`, which improved handling of adding very large numbers of files by ensuring that a minimum number of max size commits (5000 files each) were done. I accidentally made it wait for another change to appear after such a max size commit, even if a lot of queued changes were already accumulated. That resulted in a stall when it got to the end. Now fixed to not wait any longer than necessary to ensure the watcher has had time to wake back up after the max size commit. This commit was sponsored by Michael Linksvayer. Thanks!	2013-07-27 17:42:18 -04:00
Joey Hess	ec4d974dcf	assistant: Fix deadlock that could occur when adding a lot of files at once in indirect mode. This is a laziness problem. Despite the bang pattern on newfiles, the list was not being fully evaluated before cleanup was called. Moving cleanup out to after the list is actually used fixes this. More evidence that I should be using ResourceT or pipes, if any was needed.	2013-07-26 18:42:22 -04:00
Joey Hess	dba1e29949	webapp: Better display of added files.	2013-07-10 15:37:40 -04:00
Joey Hess	82a6db8fe8	committer tweak to wait for Watcher to resume after a max-size commit Without this, a very large batch add has commits of sizes approx 5000, 2500, 1250, etc down to 10, and then starts over at 5000. This fixes it so it's 5000+ every time.	2013-04-25 00:48:09 -04:00
Joey Hess	ebee93a837	get rid of need to run pre-commit hook when assistant commits in direct mode That hook updates associated file bookkeeping info for direct mode. But, everything already called addAssociatedFile when adding/changing a file. It only needed to also call removeAssociatedFile when deleting a file, or a directory. This should make bulk adds faster, by some possibly significant amount. Bulk removals may be a little slower, since it has to use catKeyFile now on each removed file, but will still be faster than adds.	2013-04-24 18:04:59 -04:00
Joey Hess	cd7055631f	batch commit every 5 thousand changes, not 10 thousand There's a tradeoff between making less frequent commits, and needing to use memory to store all the changes that are coming in. At 10 thousand, it needs 150 mb of memory. 5 thousand drops that down to 90 mb or so. This also turns out to have significant impact on total run time. I benchmarked 10k changes taking 27 minutes. But two 5k batches took only 21 minutes.	2013-04-24 16:40:35 -04:00
Joey Hess	bda237f14a	convert PendingAddChange back to Change when an add fails If an add failed, we should lose the KeySource, since it, presumably, differs due to a change that was made to the file. (The locked down file is already deleted.)	2013-04-24 16:29:25 -04:00
Joey Hess	a929e6641a	show one alert when bulk adding files Turns out that a lot of the time spent in a bulk add was just updating the add alert to rotate through each file that was added. Showing one alert makes for a significant speedup. Also, when the webapp is open, this makes it take quite a lot less cpu during bulk adds. Also, it lets the user know when a bulk add happened, which is sorta nice..	2013-04-24 13:04:46 -04:00
Joey Hess	ca72b1ac7b	assistant: when an add fails, requeue it for later See analysis in bug report for one way this could happen.	2013-04-23 18:23:04 -04:00
Joey Hess	090a69f00c	assistant: Work around misfeature in git 1.8.2 that makes `git commit --allow-empty -m ""` run an editor. See http://git-annex.branchable.com/bugs/assistant_hangs_during_commit/	2013-04-18 16:27:17 -04:00
Joey Hess	602baae12e	Bugfix: Direct mode no longer repeatedly checksums duplicated files. Fixed by storing a list of cached inodes for a key, instead of just one. Backwards compatibility note: An old git-annex version will fail to parse an inode cache file that has been written by a new version, and has multiple items. It will succeed if there is just one. So old git-annexes will have even worse behavior when there are duplicated files, if that is possible. I don't think it will be a problem. (Famous last words.) Also, note that it doesn't expire old and unused inode caches for a key. It would be possible to add this if needed; just look through the associated files for a key and if there are more cached inodes, throw out any not corresponding to associated files. Unless a file is being copied repeatedly and the old copy deleted, this lack of expiry should not be a problem.	2013-04-06 16:07:25 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	35a0ae334c	assistant: Fix OSX bug that prevented committing changed files to a repository when in indirect mode.	2013-03-17 17:01:43 -04:00
Joey Hess	393340dc3b	better handling of batch renames Rather than wait a full second, which may be longer than needed, or too short to get all the rename events, we start a mode where we wait 1/10th of a second, and if there are Changes received, wait again. Basically we're back in batch mode when this happens.	2013-03-11 15:46:09 -04:00

111 commits