git-annex

Author	SHA1	Message	Date
Joey Hess	96c055eda2	migrate: WORM keys containing spaces will be migrated to not contain spaces anymore To work around the problem that the external special remote protocol does not support keys containing spaces. This commit was sponsored by Denis Dzyubenko on Patreon.	2017-08-17 15:09:38 -04:00
Joey Hess	51801cff6a	Prevent spaces from being embedded in the name of new WORM keys, as handling spaces in keys would complicate things like the external special remote protocol.	2017-08-17 14:46:33 -04:00
Joey Hess	d39c120afa	add annex-ignore-command and annex-sync-command configs Added remote configuration settings annex-ignore-command and annex-sync-command, which are dynamic equivalents of the annex-ignore and annex-sync configurations. For this I needed a new DynamicConfig infrastructure. Its implementation should be as fast as before when there is no dynamic config, and it caches so shell commands are only run once. Note that annex-ignore-command exits nonzero when the remote should be ignored. While that may seem backwards, it allows using the same command for it as for annex-sync-command when you want to disable both. This commit was sponsored by Trenton Cronholm on Patreon.	2017-08-17 13:54:14 -04:00
Joey Hess	0b307f43e1	avoid accidental Show of VectorClock Removed its Show instance.	2017-08-14 14:51:54 -04:00
Joey Hess	2cecc8d2a3	Added GIT_ANNEX_VECTOR_CLOCK environment variable Can be used to override the default timestamps used in log files in the git-annex branch. This is a dangerous environment variable; use with caution. Note that this only affects writing to the logs on the git-annex branch. It is not used for metadata in git commits (other env vars can be set for that). There are many other places where timestamps are still used, that don't get committed to git, but do touch disk. Including regular timestamps of files, and timestamps embedded in some files in .git/annex/, including the last fsck timestamp and timestamps in transfer log files. A good way to find such things in git-annex is to grep for getPOSIXTime and getCurrentTime, although some of the results are of course false positives that never hit disk (unless git-annex gets swapped out..) So this commit does NOT necessarily make git-annex comply with some HIPAA privacy regulations; it's up to the user to determine if they can use it in a way compliant with such regulations. Benchmarking: It takes 0.00114 milliseconds to call getEnv "GIT_ANNEX_VECTOR_CLOCK" when that env var is not set. So, 100 thousand log files can be written with an added overhead of only 0.114 seconds. That should be by far swamped by the actual overhead of writing the log files and making the commit containing them. This commit was supported by the NSF-funded DataLad project.	2017-08-14 14:19:58 -04:00
Joey Hess	e23839acf3	Avoid error about git-annex-shell not being found when syncing with -J with a git remote where git-annex-shell is not installed. This commit was sponsored by andrea rota.	2017-06-06 12:57:27 -04:00
Joey Hess	94351daba6	configuration to disable automatic merge conflict resolution * Added annex.resolvemerge configuration, which can be set to false to disable the usual automatic merge conflict resolution done by git-annex sync and the assistant. * sync: Added --no-resolvemerge option. Note that disabling merge conflict resolution is probably not a good idea in a direct mode repo or adjusted branch. Since updates to both are done outside the usual work tree, if it fails the tree is not left in a conflicted state, and it would be hard to manually resolve the conflict. Still, made annex.resolvemerge be supported in those cases for consistency. This commit was sponsored by Riku Voipio.	2017-06-01 12:51:01 -04:00
Joey Hess	7db37ddde0	Fix transfer log file locking problem when running concurrent transfers. orElse is great, but was not the right thing to use here because waitTakeLock could retry for other reasons than the lock being held, which made tryTakeLock fail when it shouldn't. Instead, move the code to tryTakeLock and implement waitTakeLock using tryTakeLock and retry. (Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a lock handle.) This commit was supported by the NSF-funded DataLad project.	2017-05-25 17:40:23 -04:00
Joey Hess	1d45e47e3f	clear regions before ssh prompt When built with concurrent-output 1.9, ssh password prompts will no longer interfere with the -J display. To avoid flicker, only done when ssh actually does need to prompt; ssh is first run in batch mode and if that succeeds the connection is up and no need to clear regions. This commit was supported by the NSF-funded DataLad project.	2017-05-16 15:50:11 -04:00
Joey Hess	89f9be3230	workaround is in place (and remove debug print)	2017-05-16 14:36:54 -04:00
Joey Hess	9bcaef1ec4	Work around bug in git 2.13.0 involving GIT_COMMON_DIR that broke merging changes into adjusted branches. Might want to remove this when it gets fixed, in case adjusted branches are used in a repo with a great many refs, which would become unnecessarily slow. This commit was supported by the NSF-funded DataLad project.	2017-05-16 14:35:37 -04:00
Joey Hess	a1730cd6af	adieu, MissingH Removed dependency on MissingH, instead depending on the split library. After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing. Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size. This commit was sponsored by Shane-o on Patreon.	2017-05-16 01:03:52 -04:00
Joey Hess	6dd806f1ad	stop using MissingH for MD5 Cryptonite is faster and allocates less, and I want to get rid of MissingH use. Note that the new dependency on memory is free; it's a dependency of cryptonite. This commit was supported by the NSF-funded DataLad project.	2017-05-15 21:36:03 -04:00
Joey Hess	18b9a4b802	remove absNormPathUnix again Moving toward dropping MissingH dep. I think I've addressed the problem identified earlier in `09a66f702d`. On Windows, absPathFrom "/tmp/repo/xxx" "y/bar" would be "/tmp/repo/xxx\\y/bar", which then confuses relPathDirToFile. Fixed by converting to unix (git) style paths. Also, relPathDirToFile was splitting only on \\ on windows and not / which broke the example in `09a66f702d` of relPathDirToFile (absPathFrom "/tmp/repo/xxx" "y/bar") "/tmp/repo/.git/annex/objects/xxx" Now, on windows, that will yield "..\\..\\..\\.git/annex/objects/xxx" which once converted to unix style paths is what we want.	2017-05-15 21:35:35 -04:00
Joey Hess	2c6cfbe503	also serialize ssh password prompting when json or quiet output is enabled	2017-05-13 13:13:13 -04:00
Joey Hess	3f4b671486	fix sshCleanup race using STM	2017-05-11 18:29:51 -04:00
Joey Hess	6992fe133b	Ssh password prompting improved when using -J When ssh connection caching is enabled (and when GIT_ANNEX_USE_GIT_SSH is not set), only one ssh password prompt will be made per host, and only one ssh password prompt will be made at a time. This also fixes a race in prepSocket's stale ssh connection stopping when run with -J. It was possible for one thread to start a cached ssh connection, and another thread to immediately stop it, resulting in excess connections being made. This commit was supported by the NSF-funded DataLad project.	2017-05-11 17:36:03 -04:00
Joey Hess	a6416ba232	improve comment	2017-05-11 14:37:24 -04:00
Joey Hess	cfa6932dcc	fix build with old ghc	2017-05-10 14:39:15 -04:00
Joey Hess	76c63a4a66	avoiding depending on latest version of process except on Windows	2017-04-10 12:14:24 -04:00
Joey Hess	b6f26bac86	Disable git-annex's support for GIT_SSH and GIT_SSH_COMMAND, unless GIT_ANNEX_USE_GIT_SSH=1 is also set in the environment. This is necessary because as feared, the extra -n parameter that git-annex passes breaks uses of these environment variables that expect exactly the parameters that git passes. For example, see https://github.com/datalad/datalad/issues/1456 It would of course be possible to pre-close stdin before running ssh so not needing the -n, and I think that would not even break ssh's password caching. But it would probably involve a lot of work, possibly would need to deal with some layering violations, and would be error-prone. The really clean fix would be to make all the ssh stuff return a CreateProcess, which could have the handle closed when appropriate, but that would be a large reworking of the code base. This commit was supported by the NSF-funded DataLad project.	2017-04-07 11:35:27 -04:00
Joey Hess	c3970f6c1a	multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting. This commit was supported by the NSF-funded DataLad project.	2017-03-30 19:35:30 -04:00
Joey Hess	6af15d0ec9	rest of fix for GIT_SSH_COMMAND -n parameter `c8a6be7eef` was incomplete	2017-03-20 23:35:29 -04:00
Joey Hess	faecd73f32	Support GIT_SSH and GIT_SSH_COMMAND They are handled close to the same as they are by git. However, unlike git, git-annex sometimes needs to pass the -n parameter when using these. So, this has the potential for breaking some setup, and perhaps there ought to be an ANNEX_USE_GIT_SSH=1 needed to use these. But I'd rather avoid that if possible, so let's see if anyone complains. Almost all places where "ssh" was run have been changed to support the env vars. Anything still calling sshOptions does not support them. In particular, rsync special remotes don't. Seems that annex-rsync-transport already gives sufficient control there. (Fixed in passing: Remote.Helper.Ssh.toRepo used to extract remoteAnnexSshOptions and pass them to sshOptions, which was redundant since sshOptions also extracts those.) This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-03-17 16:20:37 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	0534152685	get -J: Improve distribution of jobs among remotes when there are more jobs than remotes. It was distributing jobs to remotes that were not being used by any other job. But, suppose that there are only 2 remotes, and -J10. In such a case, the first 2 downloads would be distributed among the 2 remotes, but the other 8 would all go to remote #1. Improved by keeping a counter of how many jobs are assigned to a remote, and prefer remotes with fewer jobs. Note use of Data.Map.Strict to avoid blowing up space. I kept the bang-patterns as-is, although probably not needed with Data.Map.Strict. This commit was sponsored by Jack Hill on Patreon.	2017-03-08 14:49:30 -04:00
Joey Hess	7a32e08c4a	fix bug introduced in `07f1e638ee` Just totally wrong logic, oops. Caught by test suite.	2017-02-28 13:24:26 -04:00
Joey Hess	e53070c1ff	inheritable annex.securehashesonly * init: When annex.securehashesonly has been set with git-annex config, copy that value to the annex.securehashesonly git config. * config --set: As well as setting value in git-annex branch, set local gitconfig. This is needed especially for annex.securehashesonly, which is read only from local gitconfig and not the git-annex branch. doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn has the rationale for doing it this way. There's no perfect solution; this seems to be the least-bad one. This commit was supported by the NSF-funded DataLad project.	2017-02-27 16:08:23 -04:00
Joey Hess	c33363dfa7	early cancelation of transfer that annex.securehashesonly prohibits This avoids sending all the data to a remote, only to have it reject it because it has annex.securehashesonly set. It assumes that local and remote will have the same annex.securehashesonly setting in most cases. If a remote does not have that set, and local does, the remote won't get some content it would otherwise accept. Also avoids downloading data that will not be added to the local object store due to annex.securehashesonly. Note that, while encrypted special remotes use a GPGHMAC key variety, which is not collision resistant, Transfers are not used for such keys, so this check is avoided. Which is what we want, so encrypted special remotes still work. This commit was sponsored by Ewen McNeill.	2017-02-27 15:21:24 -04:00
Joey Hess	49114cf4ea	securehash matching Added --securehash option to match files using a secure hash function, and corresponding securehash preferred content expression. This commit was sponsored by Ethan Aubin.	2017-02-27 15:02:44 -04:00
Joey Hess	07f1e638ee	annex.securehashesonly Cryptographically secure hashes can be forced to be used in a repository, by setting annex.securehashesonly. This does not prevent the git repository from containing files with insecure hashes, but it does prevent the content of such files from being pulled into .git/annex/objects from another repository. We want to make sure that at no point does git-annex accept content into .git/annex/objects that is hashed with an insecure key. Here's how it was done: * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be written to it normally * So every place that writes content must call, thawContent or modifyContent. We can audit for these, and be sure we've considered all cases. * The main functions are moveAnnex, and linkToAnnex; these were made to check annex.securehashesonly, and are the main security boundary for annex.securehashesonly. * Most other calls to modifyContent deal with other files in the KEY directory (inode cache etc). The other ones that mess with the content are: - Annex.Direct.toDirectGen, in which content already in the annex directory is moved to the direct mode file, so not relevant. - fix and lock, which don't add new content - Command.ReKey.linkKey, which manually unlocks it to make a copy. * All other calls to thawContent appear safe. Made moveAnnex return a Bool, so checked all callsites and made them deal with a failure in appropriate ways. linkToAnnex simply returns LinkAnnexFailed; all callsites already deal with it failing in appropriate ways. This commit was sponsored by Riku Voipio.	2017-02-27 13:33:59 -04:00
Joey Hess	9c4650358c	add KeyVariety type Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all. Benchmarks ran in my big repo: old git-annex info: real 0m3.338s user 0m3.124s sys 0m0.244s new git-annex info: real 0m3.216s user 0m3.024s sys 0m0.220s new git-annex find: real 0m7.138s user 0m6.924s sys 0m0.252s old git-annex find: real 0m7.433s user 0m7.240s sys 0m0.232s Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that. This commit was supported by the NSF-funded DataLad project.	2017-02-24 15:16:56 -04:00
Joey Hess	ca0daa8bb8	factor non-type stuff out of Key	2017-02-24 13:42:30 -04:00
Joey Hess	35915a30d5	mention GIT_SSH_COMMAND	2017-02-20 12:58:08 -04:00
Joey Hess	e6857e75a6	sync hack to make updateInstead work on eg FAT sync: When syncing with a local repository located on a crippled filesystem, run the post-receive hook there, since it wouldn't get run otherwise. This makes pushing to repos on FAT-formatted removable drives update them when receive.denyCurrentBranch=updateInstead. Made Remote.Git export onLocal, which was cleaned up to not have so many caveats about its use. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-02-17 15:21:52 -04:00
Joey Hess	00464fbed7	have onLocal stop any coprocesses, not only cat-file I have not seen any other coprocesses being started, but let's avoid problems if any do for whatever reason.	2017-02-17 14:30:18 -04:00
Joey Hess	d074532aff	post-receive hook to make updateInstead work in direct mode and adjusted branches * Added post-receive hook, which makes updateInstead work with direct mode and adjusted branches. * init: Set up the post-receive hook. This commit was sponsored by Fernando Jimenez on Patreon.	2017-02-17 14:04:43 -04:00
Joey Hess	f07af03018	Run ssh with -n whenever input is not being piped into it ... to avoid it consuming stdin that it shouldn't. This fixes git-annex-checkpresentkey --batch remote, which didn't output results for all keys passed into it. Other git-annex commands that communicate with a remote over ssh may also have been consuming stdin that they shouldn't have, which could have impacted using them in eg, shell scripts. For example, a shell script reading files from stdin and passing them to git annex drop would be impacted by this bug, whenever git annex drop ran git-annex-shell checkpresent, it would consume part/all of the stdin that the shell script was supposed to consume. Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which is used throughout git-annex to run ssh (in order for ssh connection caching to work). Every call site was checked to see if it used CreatePipe for stdin, and if not was marked NoConsumeStdin.	2017-02-15 15:08:46 -04:00
Edward Betts	0750913136	correct spelling mistakes	2017-02-12 17:30:23 -04:00
Joey Hess	f617988a29	Make import --deduplicate and --skip-duplicates only hash once, not twice import: --deduplicate and --skip-duplicates were implemented inefficiently; they unnecessarily hashed each file twice. They have been improved to only hash once. The new approach is to lock down (minimally) and hash files, and then reuse that information when importing them. This was rather tricky, especially in detecting changes to files while they are being imported. The output of import changed slightly. While before it silently skipped over files with eg --skip-duplicates, now it shows each file as it starts to act on it. Since every file is hashed first thing, it would otherwise not be clear what file import is chewing on. (Actually, it wasn't clear before when any of the duplicates switches were used.) This commit was sponsored by Alexander Thompson on Patreon.	2017-02-09 15:32:22 -04:00
Joey Hess	5c804cf42e	add SetupStage parameter to RemoteType.setup Most remotes have an idempotent setup that can be reused for enableremote, but in a few cases, it needs to tell which, and whether a UUID was provided to setup was used. This is groundwork for making initremote be able to provide a UUID. It should not change any behavior. Note that it would be nice to make the UUID always be provided to setup, and make setup not need to generate and return a UUID. What prevented this simplification is Remote.Git.gitSetup, which needs to reuse the UUID of the git remote when setting it up, and so has to return that UUID. This commit was sponsored by Thom May on Patreon.	2017-02-07 14:55:58 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	339464e847	config: New command for storing configuration in the git-annex branch. Any config names can be set using this; git-annex commands will only look at specific ones that make sense and are worth the overhead of querying the branch. This might also be useful for storing whatever other config-type stuff the user might want to shove into the git-annex branch. This commit was sponsored by Jochen Bartl on Patreon.	2017-01-30 16:46:38 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00
Joey Hess	48d9624a2d	Revert ServerAliveInterval Revert ServerAliveInterval change in 6.20161111, which caused problems with too many old versions of ssh and unusual ssh configurations. It should have not been needed anyway since ssh is supposed to have TCPKeepAlive enabled by default.	2016-12-13 12:12:38 -04:00
Joey Hess	9dd510bf29	make tor hidden service work when directory watching is not available Avoid crashing when built w/o inotify..	2016-12-09 16:40:47 -04:00
Joey Hess	e152c322f8	refactor ref change watching Added change notification to P2P protocol. Switched to a TBChan so that a single long-running thread can be started, and serve perhaps intermittent requests for change notifications, without buffering all changes in memory. The P2P runner currently starts up a new thread each time it waits for a change, but that should allow later reusing a thread. Although each connection from a peer will still need a new watcher thread to run. The dependency on stm-chans is more or less free; some stuff in yesod uses it, so it was already indirectly pulled in when building with the webapp. This commit was sponsored by Francois Marier on Patreon.	2016-12-09 15:01:09 -04:00
Joey Hess	38516b2fca	update progress logs in remotedaemon send/receive	2016-12-08 19:56:02 -04:00
Joey Hess	a8c868c2e1	plumb associated files through P2P protocol for updating transfer logs ReadContent can't update the log, since it reads lazily. This part of the P2P monad will need to be rethought. Associated files are heavily sanitized when received from a peer; they could be an exploit vector. This commit was sponsored by Jochen Bartl on Patreon.	2016-12-02 16:42:54 -04:00
Joey Hess	bfc8305814	implement p2p command	2016-11-30 14:35:24 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	7ed96a2405	Make .git/annex/ssh.config file work with versions of ssh older than 7.3, which don't support Include. When used with an older version of ssh, any ServerAliveInterval in ~/.ssh/config will be overridden by .git/annex/ssh.config. This commit was sponsored by Josh Taylor on Patreon.	2016-11-07 10:32:57 -04:00
Joey Hess	0ae08947ac	Run ssh with ServerAliveInterval 60 So that stalled transfers will be noticed within about 3 minutes, even if TCPKeepAlive is disabled or doesn't work. Rather than setting with -o, use -F with another config file, so that any settings in ~/.ssh/config or /etc/ssh/ssh_config overrides this.	2016-10-26 16:41:34 -04:00
Joey Hess	1a8ba7eab4	Improve ssh socket cleanup code to skip over the cruft that NFS sometimes puts in a directory when a file is being deleted.	2016-10-26 13:16:41 -04:00
Joey Hess	8e22114735	upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present. Closes https://github.com/datalad/datalad/issues/1020 The use of runWriter in scanUnlockedFiles broke due to this change; it failed with blocked indefinitely in mvar, because the database write handle was taken while linkFromAnnex needed to also write to it (to update the inode cache). So, switched to using a separate runWriter for each call to addAssociatedFileFast. A little less efficient, but not greatly; the writes should all still be cached.	2016-10-17 15:19:47 -04:00
Joey Hess	148bd0dbfd	refactor	2016-10-17 14:58:33 -04:00
Joey Hess	ee309d6941	lock: Fix edge cases where data loss could occur in v6 mode. In the case where the pointer file is in place, and not the content of the object, lock's performNew was called with filemodified=True, which caused it to try to repopulate the object from an unmodified associated file, of which there were none. So, the content of the object got thrown away incorrectly. This was the cause (although not the root cause) of data loss in https://github.com/datalad/datalad/issues/1020 The same problem could also occur when the work tree file is modified, but the object is not, and lock is called with --force. Added a test case for this, since it's exercising the same code path and is easier to set up than the problem above. Note that this only occurred when the keys database did not have an inode cache recorded for the annex object. Normally, the annex object would be in there, but there are of course circumstances where the inode cache is out of sync with reality, since it's only a cache. Fixed by checking if the object is unmodified; if so we don't need to try to repopulate it. This does add an additional checksum to the unlock path, but it's already checksumming the worktree file in another case, so it doesn't slow it down overall. Further investigation found a similar problem occurred when smudge --clean is called on a file and the inode cache is not populated. cleanOldKeys deleted the unmodified old object file in this case. This was also fixed by checking if the object is unmodified. In general, use of getInodeCaches and sameInodeCache is potentially dangerous if the inode cache has not gotten populated for some reason. Better to use isUnmodified. I briefly audited other places that check the inode cache, and did not see any immediate problems, but it would be easy to miss this kind of problem.	2016-10-17 13:58:43 -04:00
Joey Hess	933bc5c917	Support using v3 repositories without upgrading them to v5. An easy change now that supportedVersions is a list. Since v3 and v5 are identical other than version number, just add v3 to the list. This commit was sponsored by andrea rota.	2016-10-05 16:53:09 -04:00
Joey Hess	f867fc157f	When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5. Fixes a bug introduced with v6 mode that I didn't notice until now. Probably not many v3 repos left out there, and upgrading them to v6 mode is not disastrous, only a little premature. This commit was sponsored by Riku Voipio	2016-10-05 16:23:09 -04:00
Joey Hess	34530e59d9	Avoid using a lot of memory when large objects are present in the git repository .. and have to be checked to see if they are a pointer to an annexed file. Cases where such memory use could occur included, but were not limited to: - git commit -a of a large unlocked file (in v5 mode) - git-annex adjust when a large file was checked into git directly Generally, any use of catKey was a potential problem. Fix by using git cat-file --batch-check to check size before catting. This adds another git batch process, which is included in the CatFileHandle for simplicity. There could be performance impact, anywhere catKey is used. Particularly likely to affect adjusted branch generation speed, and operations on unlocked files in v6 mode. Hopefully since the --batch-check and --batch read the same data, disk buffering will avoid most overhead. Leaving only the overhead of talking to the process over the pipe and whatever computation --batch-check needs to do. This commit was sponsored by Bruno BEAUFILS on Patreon.	2016-10-05 15:24:13 -04:00
Joey Hess	1cd02762bf	Optimisations to git-annex branch query and setting, avoiding repeated copies of the environment. Speeds up commands like "git-annex find --in remote" by over 50%. Profiling showed that adjustGitEnv was 21% of the time and 37% of the allocations of that command. It copied the environment each time with getEnvironment. The only repeated use of adjustGitEnv is in withIndexFile, which tends to be run at least once per file. So, it was optimised by keeping a cache of the environment, which can be reused. There could be other better ways to optimise this. Maybe get the whole environment once at startup. But, then it would have to be serialized back out each time running a child process, so I doubt that would be a net win. It might be better to cache a version of the environment that is pre-modified to use .git-annex/index. But, profiling doesn't show that modifying the environment is taking any significant time.	2016-09-29 13:36:48 -04:00
Joey Hess	35446d3c3a	followup	2016-09-29 11:33:42 -04:00
Joey Hess	8794dcf27b	Optimisations to time it takes git-annex to walk working tree and find files to work on. Sped up by around 18%. key2file and file2key were top cost centers according to profiling. The repeated use of replace was not efficient. This new approach is quite a lot more efficient. This commit was sponsored by Denis Dzyubenko on Patreon.	2016-09-26 16:48:57 -04:00
Joey Hess	a569f195b7	fix bugs in handling of deep branches with sync and adjusted branches * sync: Previously, when run in a branch with a slash in its name, such as "foo/bar", the sync branch was "synced/bar". That conflicted with the sync branch used for branch "bar", so has been changed to "synced/foo/bar". * adjust: Previously, when adjusting a branch with a slash in its name, such as "foo/bar", the adjusted branch was "adjusted/bar(unlocked)". That conflicted with the adjusted branch used for branch "bar", so has been changed to "adjusted/foo/bar(unlocked)" * Also, running sync in an adjusted branch did not correctly sync changes back to the parent branch when it had a slash in its name. This bug has been fixed. Eliminate use of Git.Ref.under and Git.Ref.basename; using Git.Ref.underBase and Git.Ref.base make everything handle deep branches correctly. Probably no one was adjusting deep branches, and v6 is still experimental anyway, so I'm not going to worry about the mess that was left by that bug. In the case of git-annex sync, using a fixed git-annex with an old unfixed one will mean they use different sync branches for a deep branch, and so they may stop syncing until the old one is upgraded. However, that's only a problem when syncing between repositories without going via a central bare repository. Added a warning about this to the CHANGELOG, but it's probably not going to affect many people at all. This commit was sponsored by Riku Voipio.	2016-09-21 15:23:47 -04:00
Joey Hess	d4fbc3b460	make --json-progress work for url downloads	2016-09-09 16:15:39 -04:00
Joey Hess	8ef494a833	disentangle concurrency and message type This makes -Jn work with --json and --quiet, where before setting -Jn disabled those options. Concurrent json output is currently a mess though since threads output chunks over top of one-another.	2016-09-09 12:57:42 -04:00
Joey Hess	31289da691	get -J: Download different files from different remotes when the remotes have the same costs. Only done in -J mode because only if there's concurrency can downloading from two remotes be faster. Without concurrency, it's likely the case that sequential downloads from the same remote are faster than switching back and forth between two remotes. There is some hairy MVar code here, but basically it just keeps the activeremotes MVar full except when deciding which remote to assign to a thread. Also affects gets by sync --content -J This commit was sponsored by Jochen Bartl.	2016-09-06 12:45:21 -04:00
Joey Hess	10ddf2c3bd	remove TransferObserver unused after last commit	2016-08-03 13:46:20 -04:00
Joey Hess	f461bcae4b	Re-enable accumulating transfer failure log files for command-line actions This was disabled in commit `61ccf95004`, because only the assistant used them, and they were clutter. But, now --failed also uses them. Remove the failure log files after successful transfers. Should avoid most of the clutter problems. Commit `61ccf95004` mentions a subtle behavior change, which has now been reverted: There is one behavior change from this. If glacier is being used, and a manual git annex get --from glacier fails because the file isn't available yet, the assistant will no longer later see that failed transfer file and retry the get.	2016-08-03 13:41:07 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	bf3327ff25	Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys.	2016-07-27 10:46:25 -04:00
Joey Hess	e5225f08fc	When built with uuid-1.3.12, generate more random UUIDs than before Use nextRandom to generate the random UUID, rather than using randomIO. This gets fixes for the following two bugs in the uuid library. However, this did not impact git-annex much, so a hard dependency has not been added on uuid-1.3.12. https://github.com/aslatter/uuid/issues/15 "v4 UUIDs are not that random" This doesn't greatly affect git-annex, because even with only 2^64 possible UUIDs, the chance that two git-annex repositories that are clones of the same git repo get the same UUID is minuscule. And, git-annex generates only one UUID per run, so predicting subsequent UUIDs is not a problem. https://github.com/aslatter/uuid/issues/16 "Remove Random instance for UUID, or mark it as deprecated" git-annex was using that instance; let's stop before it gets deprecated or removed.	2016-07-27 07:46:08 -04:00
Joey Hess	d13194b230	--branch, stage 2 Show branch:file that is being operated on. I had to make ActionItem a type and not a type class because withKeyOptions' passed two different types of values when using the type class, and I could not get the type checker to accept that.	2016-07-20 15:23:43 -04:00
Joey Hess	2619019630	Avoid any access to keys database in v5 mode repositories, which are not supposed to use that database.	2016-07-19 12:12:19 -04:00
Joey Hess	154c939830	Speed up startup time by caching the refs that have been merged into the git-annex branch. This can speed up git-annex commands by as much as a second, depending on the number of remotes.	2016-07-17 12:24:34 -04:00
Joey Hess	cbe3813005	handle SomeAsyncException same as AsyncException This new class was added to base a while ago; I don't know what uses it, but it's intended to be an async exception, so make sure we don't catch it.	2016-06-20 10:31:47 -04:00
Joey Hess	142710d1b4	fix build on windows	2016-06-13 14:54:34 -04:00
Joey Hess	bfd00a0f8c	v6: Fix bad merge in an adjusted branch that resulted in an empty tree.	2016-06-13 14:18:22 -04:00
Joey Hess	b6b5a11601	Make git clean filter preserve the backend that was used for a file.	2016-06-09 15:17:08 -04:00
Joey Hess	0249f3aff5	Fix bug in initialization of clone from a repo with an adjusted branch that had not been synced back to master. This bug caused broken tree objects to get built by a later git annex sync. This is a somewhat unlikely but not impossible situation, and the test suite's union_merge_regression test tickled it when it was run on FAT.	2016-06-09 14:11:00 -04:00
Joey Hess	8e4cbefbc6	also avoid crashing in most circumstances if unable to determine the username Mostly the username is only used for the git committer or other display purposes, and we can just fall back to a dummy value in these cases. The only remaining place where an error is thrown is when starting local pairing, which needs the username to be known.	2016-06-08 15:04:15 -04:00
Joey Hess	9569d6be63	Fix bad automatic merge conflict resolution between an annexed file and a directory with the same name when in an adjusted branch. When running in an overlay work tree, all unchanged files show as deleted, so this code that stages deletions should not run.	2016-06-07 12:53:35 -04:00
Joey Hess	8148ee3d4b	withAltRepo needs a separate queue of changes The queue could potentially contain changes from before withAltRepo, and get flushed inside the call, which would apply the changes to the modified repo. Or, changes could be queued in withAltRepo that were intended to affect the modified repo, but don't get flushed until later. I don't know of any cases where either happens, but better safe than sorry. Note that this affects withIndexFile, which is used in git-annex branch updates. So, it potentially makes things slower. Should not be by much; the overhead consists only of querying the current queue a couple of times, and potentially flushing changes queued within withAltRepo earlier, that could have maybe been bundled with other later changes. Notice in particular that the existing queue is not flushed when calling withAltRepo. So eg when git annex add needs to stage files in the index, it will still bundle them together efficiently.	2016-06-03 13:57:00 -04:00
Joey Hess	907fc62f2c	Fix initialization of a bare clone of a repo that has an adjusted branch checked out.	2016-06-02 17:02:38 -04:00
Joey Hess	26887745a0	refactor isBareRepo	2016-06-02 16:59:47 -04:00
Joey Hess	3b97c09cde	better avoid switching to direct mode in clone of adjusted branch repo	2016-06-02 16:10:30 -04:00
Joey Hess	69bf128f76	avoid switching to direct mode in clone of adjusted branch repo	2016-06-02 15:36:52 -04:00
Joey Hess	72f0d3d384	Automatically enable v6 mode when initializing in a clone from a repo that has an adjusted branch checked out. The clone also has the adjusted branch checked out, so it needs to be initialized to a version that supports that.	2016-06-02 15:34:30 -04:00
Joey Hess	fbf5045d4f	sync --content: Fix bug that caused transfers of files to be made to a git remote that does not have a UUID. This particularly impacted clones from gcrypt repositories. Added guard in Annex.Transfer to prevent this problem at a deeper level. I'm unhappy with NoUUID, but having Maybe UUID instead wouldn't help either if nothing checked that there was a UUID. Since there legitimately need to be Remotes that do not have a UUID, I can't see a way to fix it at the type level, short of making there be two separate types of Remotes.	2016-06-02 13:50:43 -04:00
Yaroslav Halchenko	64e844e1fe	minor typo fixes throughout problematic flexibility	2016-06-02 11:22:18 -04:00
Joey Hess	714750e593	include 3 in upgradableVersions Does not change behavior, only git annex version output	2016-05-24 17:13:19 -04:00
Joey Hess	91df4c6b53	Pass the various gnupg-options configs to gpg in several cases where they were not before. Removed the instance LensGpgEncParams RemoteConfig because it encouraged code that does not take the RemoteGitConfig into account. RemoteType's setup was changed to take a RemoteGitConfig, although the only place that is able to provide a non-empty one is enableremote, when it's changing an existing remote. This led to several follow-on changes, and got RemoteGitConfig plumbed through.	2016-05-23 17:03:20 -04:00
Joey Hess	80b86ff78d	fix recent test suite reversion git annex adjust --force will overwrite any current adjusted branch. I didn't document this because for the user, deleting the branch is just as good.	2016-05-23 11:23:30 -04:00
Joey Hess	097605e2e9	git's handling of relative GIT_INDEX_FILE is more insane than I thought; always make absolute This is actually worse than I thought; when git is being run with a detached work tree, GIT_INDEX_FILE is treated as a path relative to CWD, instead of the normal behavior of relative to the top of the work tree. This seems to make it basically impossible for any program that wants to use GIT_INDEX_FILE to use anything other than an absolute path to it; there are too many configurations to keep straight that can change how git interprets what should be a simple relative path to a file. (I have complained to the git developers.)	2016-05-22 15:02:55 -04:00
Joey Hess	823c28d2dc	nub transitionList to avoid ugly message after repeated transitions, and avoid redundant work for repeated ForgetDeadRemotes transitions	2016-05-18 12:26:38 -04:00
Joey Hess	766728c8cf	unify handling of unusual GIT_INDEX_FILE relative path This is probably a git bug that stuck in its interface.	2016-05-17 14:42:06 -04:00
Joey Hess	b4ab1fb093	Fix crash when entering/changing view in a subdirectory of a repo that has a dotfile in its root.	2016-05-17 13:49:10 -04:00
Joey Hess	e91037a38b	use indexEnv	2016-05-17 13:38:04 -04:00
Joey Hess	93c03b5dd5	Work around git bug in handling of relative path to GIT_INDEX_FILE when in a subdirectory of the repository. This affected git annex view. It turns out that some other places that use GIT_INDEX_FILE were already working around the bug. I removed the workaround from Annex.Branch since the new workaround will do.	2016-05-17 13:29:51 -04:00
Joey Hess	d56175164b	avoid checking locations in regular repo In commit `2d00523609` I accidentally made gitAnnexLocation do more work, checking content locations, when used in a regular repo.	2016-05-16 17:19:07 -04:00
Joey Hess	eda5d9cc74	adjust: Add --fix adjustment, which is useful when the git directory is in a nonstandard place.	2016-05-16 17:18:33 -04:00
Joey Hess	4efc26ca6c	move keys db closure to AutoMerge This makes git-annex sync also do it, which makes sure that the keys db info is fresh when doing a sync --content.	2016-05-16 15:11:14 -04:00
Joey Hess	9f05be393e	adjust: If the adjusted branch already exists, avoid overwriting it, since it might contain changes that have not yet been propagated to the original branch. Could not think of a foolproof way to detect if the old adjusted branch was just behind the current branch. It's possible that the user amended the adjusting commit at the head of the adjusted branch, for example. I decided to bail in this situation, instead of just entering the old branch, so that if git annex adjust succeeds the user is always in a current adjusted branch, not some old and out of date one. What could perhaps be done is enter the old branch and then update it. But that seems too magical; the user may have rebased master or something or may not want to propagate the changes from the old branch. Best to error out.	2016-05-13 14:04:22 -04:00
Joey Hess	2d00523609	In the unusual configuration where annex.crippledfilesystem=true but core.symlinks=true, store object contents in mixed case hash directories so that symlinks will point to them. Contents are searched for in both locations, same as before, so this does not add any overhead.	2016-05-10 15:00:22 -04:00
Joey Hess	8a81ddb448	improve comment	2016-05-10 14:42:57 -04:00
Joey Hess	c456833179	Windows: Fix an over-long temp directory name.	2016-05-06 12:49:41 -04:00
Joey Hess	6cf9dbb564	fix build warning on windows	2016-05-05 15:48:58 -04:00
Joey Hess	a9e8cf42d6	more windows path fixes normalize filepaths in the map because it may be constructed with windows-style paths and then queried for git-style	2016-05-04 13:00:02 -04:00
Joey Hess	b22409db38	avoid warnings about not exported System.Directory.isSymbolicLink	2016-04-28 15:18:11 -04:00
Joey Hess	5fe450514b	Fix build with directory-1.2.6.2. It started exporting an isSymbolicLink which supports windows. But, git-annex does not use symlinks on windows yet and this conflicts with the function by the same name from unix-compat, so hide it.	2016-04-28 13:18:44 -04:00
Joey Hess	46e3319995	assistant: Deal with upcoming git's refusal to merge unrelated histories by default git 2.8.1 (or perhaps 2.9.0) is going to prevent git merge from merging in unrelated branches. Since the webapp's pairing etc features often combine together repositories with unrelated histories, work around this behavior change by setting GIT_MERGE_ALLOW_UNRELATED_HISTORIES when the assistant merges. Note though that this is not done for git annex sync's merges, so it will follow git's default or configured behavior.	2016-04-22 14:26:44 -04:00
Joey Hess	0273cd5005	adjusted branches need git 2.2.0 or newer When git-annex is used with a git version older than 2.2.0, disable support for adjusted branches, since GIT_COMMON_DIR is needed to update them and was first added in that version of git.	2016-04-22 12:29:32 -04:00
Joey Hess	b56218f0c2	Fix bug that prevented annex.sshcaching=false configuration from taking effect when on a crippled filesystem. Thanks, divergentdave.	2016-04-20 14:43:43 -04:00
Joey Hess	9d952fe9d1	reinject: When src file's content cannot be verified, leave it alone, instead of deleting it.	2016-04-20 13:21:56 -04:00
Joey Hess	bd516af734	fsck: Warn when core.sharedRepository is set and an annex object file's write bit is not set and cannot be set due to the file being owned by a different user. Made all Annex.Perms file mode changing functions ignore errors when core.sharedRepository is set, because the file might be owned by someone else. I don't fancy getting bug reports about crashes due to set modes in this configuration, which is a very foot-shooty configuration in the first place. The fsck warning is necessary because old repos kept files mode 444, which doesn't allow locking them, and so if the mode remains 444 due to the file being owned by someone else, the user should be told about it.	2016-04-14 15:36:53 -04:00
Joey Hess	b7c8bf5274	Preserve execute bits of unlocked files in v6 mode. When annex.thin is set, adding an object will add the execute bits to the work tree file, and this does mean that the annex object file ends up executable. This doesn't add any complexity that wasn't already present, because git annex add of an executable file has always ingested it so that the annex object ends up executable. But, since an annex object file can be executable or not, when populating an unlocked file from one, the executable bit is always added or removed to match the mode of the pointer file.	2016-04-14 14:47:08 -04:00
Joey Hess	5e190913a4	add AdjBranch newtype; some simplifications	2016-04-09 15:10:26 -04:00
Joey Hess	b5be04027c	change name of basis branch Making the name look too much like the adjusted branch was ambiguous.	2016-04-09 14:17:20 -04:00
Joey Hess	7d28110c68	fix master push overwrite race when updating adjusted branch, by maintaining basis ref	2016-04-09 14:12:25 -04:00
Joey Hess	cf06dac2b8	hard links on windows * annex.thin and annex.hardlink are now supported on Windows. * unannex --fast now makes hard links on Windows.	2016-04-08 15:25:32 -04:00
Joey Hess	251405eca2	avoid withWorkTreeRelated affecting annex symlink calculation	2016-04-08 14:24:00 -04:00
Joey Hess	6549049142	fix commit tree after merge into adjusted branch	2016-04-06 19:22:15 -04:00
Joey Hess	887ef93a7f	run out of tree merge with --no-ff This is how direct mode does it too, and somehow, for reasons that currently escape me, this makes git merge not care if it's run with an empty work tree.	2016-04-06 18:40:28 -04:00
Joey Hess	60bdffe43e	fix auto merge conflict resolution when doing out of tree merge for adjusted branch	2016-04-06 17:32:04 -04:00
Joey Hess	b9e4e2ba84	new method for merging changes into adjusted branch that avoids unnecessary merge conflicts Still needs work when there are actual merge conflicts.	2016-04-06 15:36:18 -04:00
Joey Hess	2046502407	v6: Close pointer file handles more quickly, to avoid problems on Windows. Was using L.readFile, so the Handle would remain open until the garbage collector got around to it. Changed to explicit open and close, so we know it's always closed when the function returns.	2016-04-04 15:42:33 -04:00
Joey Hess	f78cbd9f0d	todo	2016-04-04 14:38:29 -04:00
Joey Hess	7c7f3a0f76	deal with cloning a repo that has an adjusted branch checked out	2016-04-04 13:51:42 -04:00
Joey Hess	7ba836eec3	make way for git checkout output	2016-04-04 13:25:30 -04:00
Joey Hess	c3e0859846	Upgrading a direct mode repository to v6 has changed to enter an adjusted unlocked branch. This makes the direct mode to v6 upgrade able to be performed in one clone of a repository without affecting other clones, which can continue using v5 and direct mode.	2016-04-04 13:17:24 -04:00
Joey Hess	12ddb6e8b2	fixed merging of changes from adjusted branch + a remote	2016-03-31 18:54:35 -04:00
Joey Hess	860602a1e6	made some progress on syncing adjusted branches, but still buggy	2016-03-31 14:56:10 -04:00
Joey Hess	a585731935	add reflog messages	2016-03-31 12:27:48 -04:00
Joey Hess	02ce75c87d	clean up handling of commit lock Closing the lock manually caused a later exception when the bracket tried to close it again.	2016-03-31 12:04:05 -04:00
Joey Hess	8a69298bf2	init: Automatically enter the adjusted unlocked branch when in a v6 repo on a filesystem not supporting symlinks.	2016-03-29 13:54:42 -04:00
Joey Hess	42b7ccc89f	git annex add in adjusted unlocked branch Cached the current branch lookup just because it seems unnecessary overhead to run an extra git command per add to query the current branch.	2016-03-29 13:26:06 -04:00
Joey Hess	5e1d7bbc00	limit git annex adjust to v6 mode doesn't work in v5	2016-03-29 12:05:02 -04:00
Joey Hess	1df62b43d1	remove hashPointerFile' no longer needed now that hashPointerFile uses a long-running git hash-object handle	2016-03-29 11:15:21 -04:00
Joey Hess	70e8d6860e	Merge branch 'master' into adjustedbranch	2016-03-29 11:07:40 -04:00
Joey Hess	a2b668a8f6	reuse annex's HashObjectHandle	2016-03-14 16:29:59 -04:00
Joey Hess	2d234de781	Sped up git-annex merge by using git hash-object --batch. This does mean that it has to write out temp files containing updated objects for the merge. So may use more disk space, and disk IO, but that should generally win out over needing to launch N separate git hash-object processes.	2016-03-14 16:23:22 -04:00
Joey Hess	00d9da3534	use hash-object --batch Handle was plumbed through, but not used.	2016-03-14 16:12:55 -04:00
Joey Hess	88a4a6f396	Sped up git-annex add in direct mode and v6 by using git hash-object --batch. Speeds up hashSymlink and hashPointerFile.	2016-03-14 15:58:46 -04:00
Joey Hess	f2772f469a	followup	2016-03-14 15:54:46 -04:00
Joey Hess	1df49506c4	Correct git-annex info to include unlocked files in v6 repository. An unlocked present file does not have a pointer file in the worktree, so info skipped counting it. It may be that unused was also affected by the problem, but it seemed not to be in my tests. I think because of the use of the associatedFilesFilter. This fix slows down both info and unused a little bit, since they have to query the contents of files from git, but only when handling unlocked files.	2016-03-14 13:14:01 -04:00
Joey Hess	41b7c5f6aa	implement another adjustment -- easy to do now!	2016-03-11 19:54:10 -04:00
Joey Hess	a85196bd4e	simplify adjustment reversal	2016-03-11 19:41:11 -04:00
Joey Hess	ba1ef156a2	fix deletion of files in adjustTree	2016-03-11 16:30:06 -04:00
Joey Hess	b9184f69a7	improve propagation of commits from adjusted branches Only reverse adjust the changes in the commit, which means that adjustments do not need to be generally cleanly reversible. For example, an adjustment can unlock all locked files, but does not need to worry about files that were originally unlocked when reversing, because it will only ever be run on files that have been changed. So, it's ok if it locks all files when reversed, or even leaves all files as-is when reversed.	2016-03-11 16:05:06 -04:00
Joey Hess	97e97dccda	Merge branch 'master' into adjustedbranch	2016-03-11 12:21:26 -04:00
Joey Hess	4b3355cf3c	refactor	2016-03-09 13:43:22 -04:00
Joey Hess	9039bdb4ea	Always try to thaw content, even when annex.crippledfilesystem is set.	2016-03-09 13:33:13 -04:00
Joey Hess	be80c29dbc	Merge branch 'no-cbits'	2016-03-05 11:22:32 -04:00
Joey Hess	5e3f707c34	rebase on top of updated original branch	2016-03-03 17:00:48 -04:00
Joey Hess	ac08f6580e	fix abs filepath generation	2016-03-03 16:47:51 -04:00
Joey Hess	40509e20e5	change name of adjusted branches to eg adjusted/master(unlocked) Using adjusted/unlocked/master made lots of git stuff dealing with "master" complain that it was ambiguous. This new approach is more like view branch names, and shows the adjustment right there in the branch display even if only the basename of the branch is shown.	2016-03-03 16:38:56 -04:00
Joey Hess	cf24e9b892	working toward adjusted commit propagation	2016-03-03 16:19:09 -04:00
Joey Hess	7811556a5b	Merge branch 'master' into adjustedbranch	2016-03-03 15:40:25 -04:00
Joey Hess	91f37673df	can't checkout adjusted branch while index is still locked There's a race here, but entering an adjusted branch for the first time is not something to do when a commit is being made at the same time. Although, may want to prevent the assistant from committing while entering the adjusted branch.	2016-03-03 14:20:39 -04:00
Joey Hess	6024108ab2	push original branch, not adjusted branch	2016-03-03 14:13:54 -04:00
Joey Hess	ef1abda78b	lock index while making index-less commits Avoids race with another git commit at the same time adjusted branch is being updated.	2016-03-03 12:55:00 -04:00
Joey Hess	84de8bd2d0	clarify	2016-03-01 16:22:47 -04:00
Joey Hess	3e91cd13ba	Fix data loss that can occur when annex.pidlock is set in a repository.	2016-03-01 12:12:57 -04:00
Joey Hess	a97a9aaaee	remove debug	2016-02-29 17:36:20 -04:00
Joey Hess	70e78cc53e	update keys database when adjusting branches	2016-02-29 17:27:19 -04:00
Joey Hess	d7bd4d971d	implement updateAdjustedBranch	2016-02-29 17:16:56 -04:00
Joey Hess	048d513233	make assistant aware of adjusted branches when merging	2016-02-29 15:57:47 -04:00
Joey Hess	7c20bf6e7a	make sync aware of adjusted branches So, it will pull and push the original branch, not the adjusted one. And, for merging, it will use updateAdjustedBranch (not implemented yet). Note that remaining uses of Git.Branch.current need to be checked too; for things that should act on the original branch, and not the adjusted branch.	2016-02-29 15:23:08 -04:00
Joey Hess	9e1ebc2336	include adjustment in the adjusted branch name Allows it to be recovered easily.	2016-02-29 15:04:58 -04:00
Joey Hess	3b4557c754	Merge branch 'master' into adjustedbranch	2016-02-29 14:05:10 -04:00
Joey Hess	e520366c4d	metadata: Added -r to remove all current values of a field.	2016-02-29 13:00:46 -04:00
Joey Hess	b946ca44c3	Support --metadata field<number, --metadata field>number etc to match ranges of numeric values. Similarly (well, for free), support preferred content expressions like metadata=field<number and metadata=field>number	2016-02-27 10:55:02 -04:00
Joey Hess	471a211d21	Include magic database in the linux and OSX standalone builds.	2016-02-26 11:54:15 -04:00
Joey Hess	0a1b02ce04	adjusted branches, proof of concept "git annex adjust" may be a temporary interface, but works for a proof of concept. It is pretty fast at creating the adjusted branch. The main overhead is injecting pointer files. It might be worth optimising that by reusing the symlink target as the pointer file content. When I tried to do that, the problem was that the clean filter doesn't use that same format, and so git thought files had changed. Could be dealt with, perhaps make the clean filter use symlink format for pointer files when on an adjusted branch? But the real overhead is in checking out the branch, when git runs the smudge filter once per file. That is perhaps too slow to be usable, although it may only affect initial checkout of the branch, and not updates. TBD.	2016-02-25 16:23:24 -04:00
Joey Hess	4712882776	add hashPointerFile'	2016-02-25 16:10:54 -04:00
Joey Hess	04d4830ac3	add catCommit	2016-02-25 15:34:46 -04:00
Joey Hess	be2e9427ad	refactor	2016-02-25 13:46:31 -04:00
Joey Hess	a5bf674bec	Avoid crashing when built with MagicMime support, but when the magic database cannot be loaded.	2016-02-23 14:39:56 -04:00
Joey Hess	b0081598c7	Fix memory leak in last release, which affected commands like git-annex status when a large non-annexed file is present in the work tree. The whole file was strictly read, and so buffered in memory, and remained buffered for some time when running git-annex status.	2016-02-19 14:45:26 -04:00
Joey Hess	3fba4f83ed	fix windows build	2016-02-16 16:15:32 -04:00
Joey Hess	aa569500d5	fix numerous problems with test suite on crippled filesystems etc	2016-02-16 15:30:59 -04:00
Joey Hess	15148ee9eb	annex.addunlocked * add, addurl, import, importfeed: When in a v6 repository on a crippled filesystem, add files unlocked. * annex.addunlocked: New configuration setting, makes files always be added unlocked. (v6 only)	2016-02-16 14:43:43 -04:00
Joey Hess	adc27f081a	escape slashes in annex pointer files The problem with having the slashes unescaped is, it broke parsing, since the parser takes the filename to get the part containing the key. That particularly affected URL keys. This makes the format be the same as symlinks point to, which keeps things simple. Existing pointer files will continue to work ok.	2016-02-16 14:10:08 -04:00
Joey Hess	7899f7248a	force strict file read Avoid possibly having the file open still when it gets deleted. Needed on Windows, particularly.	2016-02-15 16:47:34 -04:00
Joey Hess	4d89a1ffd1	allow \r in pointer files git-annex doesn't write \r, but it can be present due to line ending conversions or perhaps user edits.	2016-02-15 16:37:40 -04:00
Joey Hess	f9d79d194b	Windows: Fix v6 unlocked files to actually work. Pointer files were not being treated as annex content, so "git annex get" didn't replace them with the object.	2016-02-15 16:12:18 -04:00
Joey Hess	2e3b5e645f	When initializing a v6 repo on a crippled filesystem, don't force it into direct mode.	2016-02-15 15:41:49 -04:00
Joey Hess	540a0343ba	more windows build fix	2016-02-15 15:03:44 -04:00
Joey Hess	f55c576923	fix windows build	2016-02-15 14:58:45 -04:00
Joey Hess	40207b26ea	move old ghc compat code into separate module; eliminate WITH_CLIBS This avoids hsc2hs being run except when building for the old version of ghc. Should speed up builds.	2016-02-15 11:47:33 -04:00
Joey Hess	0983f136b8	create directory for transfer lock file, and catch perm error Before, the call to mkProgressUpdater created the directory as a side-effect, but since that ignored failure to create it, this led to a "does not exist" exception when the transfer lock file was created, rather than a permissions error. So, make sure the directory exists before trying to lock the file in it. When a PermissionDenied exception is caught, skip making the transfer lock. This lets downloads from readonly remotes happen. If an upload is being tried, and the lock file can't be written due to permissions, then probably the actual transfer will fail for the same reason, so I think it's ok that it continues w/o taking the lock in that case.	2016-02-12 14:11:25 -04:00
Joey Hess	17c97434f2	init: Fix bugs in submodule .git symlink fixup, that occurred when initializing in a subdirectory of a submodule and a submodule of a submodule.	2016-02-08 15:41:27 -04:00
Joey Hess	23cc315c38	matchexpression: Added --largefiles option to parse an annex.largefiles expression.	2016-02-03 16:58:36 -04:00
Joey Hess	5127cb59cc	annex.largefiles: Add support for mimetype=text/* etc, when git-annex is linked with libmagic.	2016-02-03 16:29:34 -04:00
Joey Hess	403b56fb91	Limit annex.largefiles parsing to the subset of preferred content expressions that make sense in its context. So, not "standard" or "lackingcopies", etc.	2016-02-03 15:04:42 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	5d9c7a1164	refactor	2016-02-03 13:08:15 -04:00
Joey Hess	aded00c5f0	avoid unnecessary building of a one-off Map A case lookup should be more efficient.	2016-02-03 12:59:28 -04:00
Joey Hess	d37fe6a547	annex.largefiles can be configured in .gitattributes too This is particularly useful for v6 repositories, since the .gitattributes configuration will apply in all clones of the repository.	2016-02-02 15:18:17 -04:00
Joey Hess	e8fc2ff27c	add "nothing" to preferred content DSL Same as "not anything"; will be particularly useful in annex.largefiles gitattributes.	2016-02-02 14:42:13 -04:00
Gabor Greif	daf8aa76fe	Unneeded constraint	2016-01-28 12:34:07 -04:00
Gabor Greif	50e4ec36c7	Another redundant constraint	2016-01-28 12:34:07 -04:00
Joey Hess	710d44a16e	add the known associated file to the list of others	2016-01-26 14:48:19 -04:00
Joey Hess	039e83ed5d	Fix nasty reversion in the last release that broke sync --content's handling of many preferred content expressions. The type checker should have noticed this, but the changes to mapM that make it accept any Traversable hid the fact that it was not being passed a list at all. Thus, what should have returned an empty list most of the time instead returned [""] which was treated as the name of the associated file, with disastrous consequences. When I have time, I should add a test case checking what sync --content drops. I should also consider replacing mapM with one re-specialized to lists.	2016-01-26 14:28:43 -04:00
Joey Hess	23ff58cd4f	optimise getUUID This avoids a Map lookup each time it's called, instead the GitConfig field lazily looks it up once and then caches.	2016-01-20 16:55:06 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	b52cf5697b	immediate queue flushing when annex.queuesize=1 Previously, it only flushed when the queue got larger than 1. Also, make the queue auto-flush when items are added, rather than needing to be flushed as a separate step. This simplifies the code and make it more efficient too, as it avoids needing to read the queue out of the state to check if it should be flushed.	2016-01-13 14:55:01 -04:00
Joey Hess	bafcbe95c3	fix one more test failure with v6 unlocked file merge conflict resolution	2016-01-08 15:23:15 -04:00
Joey Hess	51bc32e21e	better fix for slash in view metadata The homoglyphs are back, just encoded such that it doesn't crash in LANG=C However, I noticed a bug in the old escaping; [pseudoSlash] was escaped the same as ['/','/']. Fixed by using '%' to escape pseudoSlash. Which requires doubling '%' to escape it, but that's already done in the escaping of worktree filenames in a view, so is probably ok.	2016-01-08 13:55:35 -04:00
Joey Hess	42619e2231	view: Avoid using cute unicode homoglyphs for '/' and '\' and instead use ugly escaping, as the unicode method doesn't work on non-unicode supporting systems.	2016-01-08 12:45:32 -04:00
Joey Hess	4b819bee2b	avoid confusing git with a modified ctime in clean filter Linking the file to the tmp dir was not necessary in the clean filter, and it caused the ctime to change, which caused git to think the file was changed. This caused git status to get slow as it kept re-cleaning unchanged files.	2016-01-07 17:48:04 -04:00
Joey Hess	3b960d1422	migrate and rekey v6 unlocked file support	2016-01-07 15:14:15 -04:00
Joey Hess	0b59fb423e	migrate: Copy over metadata to new key.	2016-01-07 14:21:12 -04:00
Joey Hess	b3d60ca285	use TopFilePath for associated files Fixes several bugs with updates of pointer files. When eg, running git annex drop --from localremote it was updating the pointer file in the local repository, not the remote. Also, fixes drop ../foo when run in a subdir, and probably lots of other problems. Test suite drops from ~30 to 11 failures now. TopFilePath is used to force thinking about what the filepath is relative to. The data stored in the sqlite db is still just a plain string, and TopFilePath is a newtype, so there's no overhead involved in using it in DataBase.Keys.	2016-01-05 17:22:19 -04:00
Joey Hess	f36f24197a	scan for unlocked files on init/upgrade of v6 repo	2016-01-01 15:09:42 -04:00
Joey Hess	a2c056df65	convert isPointerFile from Annex to IO	2016-01-01 13:22:38 -04:00
Joey Hess	829ae91009	fix failing git-annex unused test case in v6 WorkTree.lookupFile was finding a key for a file that's deleted from the work tree, which is different than the v5 behavior (though perhaps the same as the direct mode behavior). Fix by checking that the work tree file exists before catting its key. Hopefully this won't slow down much, probably the catKey is much more expensive. I can't see any way to optimise this, except perhaps to make Command.Unused check if work tree files exist before/after calling lookupFile. But, it seems better to make lookupFile really only find keys for worktree files; that's what it's intended to do.	2015-12-30 14:23:31 -04:00
Joey Hess	5057fffccd	flush queue before cleaning cruft Else, queued file stages won't have reached the index, and it won't find everything. This evidently fixes a reversion in my work today, although I don't see how I broke it. It didn't use to flush the queue first, before, and worked somehow. Test suite for v5 is back to 100% green now.	2015-12-29 17:35:57 -04:00
Joey Hess	f3be28eedc	test suite noticed a direct mode reversion	2015-12-29 17:12:57 -04:00
Joey Hess	10ecc43790	rename	2015-12-29 17:02:14 -04:00
Joey Hess	996ae9b172	don't disable smudge filter while merging The smudge filter does need to be run, because if the key is in the local annex already (due to renaming, or a copy of a file added, or a new file added and its content has already arrived), git merge smudges the file and this should provide its content. This does probably mean that in merge conflict resolution, git smudges the existing file, re-copying all its content to it, and then the file is deleted. So, not efficient.	2015-12-29 16:36:21 -04:00
Joey Hess	24bbaa2346	avoid renaming file when auto-resolving conflict in annex pointer This is a behavior change for merge conflicts between locked files that both pointed to the same key, in different ways. Before, the conflict was resolved, but the file was renamed to .variant. This was unnecessary, because there was only one variant. Of course, this also handles conflicts between unlocked and locked, or even two unlocked files with different pointer contents.	2015-12-29 16:35:34 -04:00
Joey Hess	2e9341a47d	fix inode cache consistency bug when a merge unlocks a present file Since the file was present and locked, its annex object was not in the inode cache. So, despite not needing to update the annex object when the clean filter is run on the content by git merge, it does need to record the inode cache of the annex object. Otherwise, the annex object will be assumed to be bad, since its inode is not cached.	2015-12-29 16:26:27 -04:00
Joey Hess	b6b34f4916	automatic conflict resolution for v6 unlocked files Several tricky parts: * When the conflict is just between the same key being locked and unlocked, the unlocked version wins, and the file is not renamed in this case. * Need to update associated file map when conflict resolution renames an unlocked file. * git merge runs the smudge filter on the conflicting file, and actually overwrites the file with the same content it had before, and so invalidates its inode cache. This makes it difficult to know when it's safe to remove such files as conflict cruft, without going so far as to compare their entire contents. Dealt with this by preventing the smudge filter from populating the file when a merge is run. However, that also prevents the smudge filter being run for non-conflicting files, so eg moving a file won't put its new content into place. * Ideally, if a merge or a merge conflict resolution renames an unlocked file, the file in the work tree can just be moved, rather than copying the content to a new worktree file. This is attempted to be done in merge conflict resolution, but due to git merge's behavior of running smudge filters, what actually seems to happen is the old worktree file with the content is deleted and rewritten as a pointer file, so doesn't get reused. So, this is probably not as efficient as it optimally could be. If that becomes a problem, could look into running the merge in a separate worktree and updating the real worktree more efficiently, similarly to the direct mode merge. However, the direct mode merge had a lot of bugs, and I'd rather not use that more error-prone method unless really needed.	2015-12-29 15:41:09 -04:00
Joey Hess	645833774d	fix windows build	2015-12-28 12:44:04 -04:00
Joey Hess	121f5d5b0c	annex.thin Decided it's too scary to make v6 unlocked files have 1 copy by default, but that should be available to those who need it. This is consistent with git-annex not dropping unused content without --force, etc. * Added annex.thin setting, which makes unlocked files in v6 repositories be hard linked to their content, instead of a copy. This saves disk space but means any modification of an unlocked file will lose the local (and possibly only) copy of the old version. * Enable annex.thin by default on upgrade from direct mode to v6, since direct mode made the same tradeoff. * fix: Adjusts unlocked files as configured by annex.thin.	2015-12-27 15:59:59 -04:00
Joey Hess	54f87ef95f	get associated files from Keys database	2015-12-26 15:09:53 -04:00
Joey Hess	7593917147	cleanup	2015-12-26 15:09:47 -04:00
Joey Hess	289a3592c3	support v6 unlocked files This optimisation was not necessary, and didn't work for v6 unlocked files. Typically only a small number of files will be changed by a commit, so just catKey them all.	2015-12-26 15:04:26 -04:00
Joey Hess	60c36ef6ba	make views work with v6 unlocked files Have to only use the view index in one place; lookupFile was failing for unlocked files because it was run using the view index, which was empty.	2015-12-26 14:52:58 -04:00
Joey Hess	49fca49991	remove dead code	2015-12-26 14:45:07 -04:00
Joey Hess	f324ad24c1	improve comment	2015-12-26 13:47:36 -04:00
Joey Hess	0c03629173	clean up cruft in assistant fast rename code path	2015-12-22 18:03:47 -04:00
Joey Hess	d8a8c77a8f	move cleanOldKey into ingest	2015-12-22 16:55:49 -04:00
Joey Hess	cfaac52b88	populate unlocked files with newly available content when ingesting This can happen when ingesting a new file in either locked or unlocked mode, when some unlocked files in the repo use the same key, and the content was not locally available before.	2015-12-22 16:22:28 -04:00
Joey Hess	4f60234690	finish v6 support for assistant Seems to basically work now!	2015-12-22 15:23:27 -04:00
Joey Hess	4392140946	make linkAnnex detect when the file changes as it's being copied/linked in This fixes a race where the modified file ended up in annex/objects, and the InodeCache stored in the database was for the modified version, so git-annex didn't know it had gotten modified. The race could occur when the smudge filter was running; now it gets the InodeCache before generating the Key, which avoids the race.	2015-12-22 15:20:03 -04:00
Joey Hess	8e9608d7f0	refactoring no behavior changes	2015-12-22 13:42:58 -04:00
Joey Hess	ca2c977704	wip v6 support for assistant Files are not yet added to v6 repos in unlocked mode.	2015-12-21 18:41:15 -04:00
Joey Hess	35f6a78b66	fix reversion in v5 git-annex add of unlocked file In v5, lookupFile is supposed to only look at symlinks on disk (except when in direct mode). Note that v6 also has a bug when a locked file's symlink is deleted and is replaced with a new file. It sees that a link is staged and gets that key.	2015-12-16 14:27:12 -04:00
Joey Hess	38a23928e9	temporarily remove cached keys database connection The problem is that shutdown is not always called, particularly in the test suite. So, a database connection would be opened, possibly some changes queued, and then not shut down. One way this can happen is when using Annex.eval or Annex.run with a new state. A better fix might be to make both of them call Keys.shutdown (and be sure to do it even if the annex action threw an error). Complication: Sometimes they're run reusing an existing state, so shutting down a database connection could cause problems for other users of that same state. I think this would need a MVar holding the database handle, so it could be emptied once shut down, and another user of the database connection could then start up a new one if it got shut down. But, what if 2 threads were concurrently using the same database handle and one shut it down while the other was writing to it? Urgh. Might have to go that route eventually to get the database access to run fast enough. For now, a quick fix to get the test suite happier, at the expense of speed.	2015-12-16 14:05:26 -04:00
Joey Hess	7d0e79b9e1	Use git-annex init --version=6 to get v6 for now Not ready to make it default because of the direct mode upgrade needing to all happen at once.	2015-12-15 17:17:13 -04:00
Joey Hess	f9d077186a	implemented upgrade of direct mode repo to v6	2015-12-15 16:00:26 -04:00
Joey Hess	cdd27b8920	reorg	2015-12-15 15:34:28 -04:00
Joey Hess	2bc920e266	update inode cache to cover file even when nothing needs to be done to linkAnnex This covers the case where multiple files have the same content and are added with git add. Previously only the one that was linked to the annex got its inode cached; now both are.	2015-12-15 13:02:33 -04:00
Joey Hess	1dad3af3fc	checked getKeysPresent; it's ok for v6 unlocked files When a v6 unlocked file is removed from the work tree, unused doesn't show it. When it gets removed from the index, unused does show it. This is the same as a locked file.	2015-12-11 16:12:42 -04:00
Joey Hess	7790e059b2	finish v6 git-annex lock This was a doozy!	2015-12-11 15:28:34 -04:00
Joey Hess	50e83b606c	only make 1 hardlink max between pointer file and annex object If multiple files point to the same annex object, the user may want to modify them independently, so don't use a hard link. Also, check diskreserve when copying.	2015-12-11 14:00:21 -04:00
Joey Hess	c608a752a5	Merge branch 'master' into smudge	2015-12-11 13:50:31 -04:00
Joey Hess	abd66c7089	fsck: Failed to honor annex.diskreserve when checking a remote.	2015-12-11 13:50:27 -04:00
Joey Hess	c910b4e255	wip	2015-12-11 10:42:18 -04:00
Joey Hess	9dffd3d255	add generalized linkAnnex'	2015-12-10 16:08:19 -04:00
Joey Hess	06a8256bf6	always format pointer file with a trailing newline Before the smudge filter added a trailing newline, but other things that wrote formatPointer to a file did not. also some new pointer staging code to use later	2015-12-10 16:06:58 -04:00
Joey Hess	f80a3d8cd0	check InodeCache in inAnnex et al This avoids querying the database when the content file doesn't exist (or otherwise fails the provided check). However, it does add overhead of querying the database, and will certainly impact performance.	2015-12-10 14:51:04 -04:00
Joey Hess	2b8f6b8b2f	check inode cache in prepSendAnnex This does mean one query of the database every time an object is sent. May impact performance.	2015-12-10 14:50:52 -04:00
Joey Hess	3b2a7f216d	move	2015-12-10 14:20:38 -04:00
Joey Hess	3719d1b390	make clear when code is using deprecated direct mode files	2015-12-09 19:43:15 -04:00
Joey Hess	aa88851ec1	reorder	2015-12-09 19:38:37 -04:00
Joey Hess	ce73a96e4e	use InodeCache when dropping a key to see if a pointer file can be safely reset The Keys database can hold multiple inode caches for a given key. One for the annex object, and one for each pointer file, which may not be hard linked to it. Inode caches for a key are recorded when its content is added to the annex, but only if it has known pointer files. This is to avoid the overhead of maintaining the database when not needed. When the smudge filter outputs a file's content, the inode cache is not updated, because git's smudge interface doesn't let us write the file. So, dropping will fall back to doing an expensive verification then. Ideally, git's interface would be improved, and then the inode cache could be updated then too.	2015-12-09 17:54:54 -04:00
Joey Hess	5e8c628d2e	add inode cache to the db Renamed the db to keys, since it is various info about a Key. Dropping a key will update its pointer files, as long as their content can be verified to be unmodified. This falls back to checksum verification, but I want it to use an InodeCache of the key, for speed. But, I have not made anything populate that cache yet.	2015-12-09 17:00:37 -04:00
Joey Hess	3311c48631	move InodeSentinal from direct mode code to its own module Will be used outside of direct mode for v6 unlocked files, and is already used outside of direct mode when adding files to annex.	2015-12-09 15:52:11 -04:00
Joey Hess	8a818088a3	link/copy pointer files to object content when it's added	2015-12-09 15:27:29 -04:00
Joey Hess	751120c171	avoid pre-commit hook messing up new-style unlocked files in v6 repo	2015-12-09 15:18:54 -04:00
Joey Hess	78a6b8ce05	refactor and improve pointer file handling code	2015-12-09 14:27:43 -04:00
Joey Hess	712c9fc590	require "annex/objects/" before key in pointer files This removes ambiguity, because while someone might have "WORM--foo" in a file that's not intended to be a git-annex pointer file, "annex/objects/WORM--foo" is less likely. Also, `664cc987e8` had a caveat about symlink targets being parsed as pointer files, and now the same parser is used for both. I did not include any hash directories before the key in the pointer file, as they're not needed. However, if they were included, the parser would still work ok.	2015-12-07 15:45:08 -04:00
Joey Hess	664cc987e8	support pointer files Backend.lookupFile is changed to always fall back to catKey when operating on a file that's not a symlink. catKey is changed to understand pointer files, as well as annex symlinks. Before, catKey needed a file mode witness, to be sure it was looking at a symlink. That was complicated stuff. Now, it doesn't actually care if a file in git is a symlink or not; in either case asking git for the content of the file will get the pointer to the key. This does mean that git-annex will treat a link foo -> WORM--bar as a git-annex file, and also treats a regular file containing annex/objects/WORM--bar as a git-annex file. Calling catKey could make git-annex commands need to do more work than before. This would especially be the case if a repo contained many regular files, and only a few annexed files, as now git-annex will need to ask git about the contents of the regular files.	2015-12-07 15:35:36 -04:00
Joey Hess	62a2fba1cd	Merge branch 'master' into smudge	2015-12-07 12:29:34 -04:00
Joey Hess	2936153fc4	fix temp filename Was not putting it inside the temp dir, but next to it! This was just wrong, and it led to a longer filename than desired being used, leading to some bug reports.	2015-12-06 16:54:01 -04:00
Joey Hess	6e71094e7d	avoid too long temp dir template The filename might be at or close to the filename length limit, so using it as the template for the temp dir would then fail.	2015-12-06 16:42:40 -04:00
Joey Hess	e7f75b079d	don't let git-annex direct be run in a v6 repo	2015-12-04 16:33:09 -04:00
Joey Hess	ccc49861ca	add v6; keep v5 working for now and manual upgrade Since all places where a repo is used in direct mode need to have git-annex upgraded before the repo can safely be converted to v6, the upgrade needs to be manual for now. I suppose that at some point I'll want to drop all the direct mode support code. At that point, will stop supporting v5, and will need to auto-upgrade any remaining v5 repos. If possible, I'd like to carry the direct mode support for say, a year or so, to give people plenty of time to upgrade and avoid disruption.	2015-12-04 16:14:48 -04:00
Joey Hess	34ead644d9	auto-configure filter.annex.smudge and clean on init	2015-12-04 16:14:11 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	99b2a524a0	clean filter should update location log when adding new content to annex	2015-12-04 14:20:32 -04:00
Joey Hess	2c6454a2e2	basic clean filter working	2015-12-04 13:39:14 -04:00
Joey Hess	0d432dd1a4	annex object file mode for core.sharedRepository When core.sharedRepository is set, annex object files are not made mode 444, since that prevents a user other than the file owner from locking them. Instead, a mode such as 664 is used in this case.	2015-11-18 15:45:32 -04:00
Joey Hess	3449c0e8ec	avoid spawning file size polling thread when not in -J mode	2015-11-16 21:21:58 -04:00
Joey Hess	e97fce35a6	Display progress meter in -J mode when downloading from the web. Including in addurl, and get --from web, but also in S3 and External special remotes when a web url is known for content in those remotes.	2015-11-16 21:00:54 -04:00
Joey Hess	262c37c16e	add missing checkSaneLock wrapper for pidlocks	2015-11-16 15:35:41 -04:00
Joey Hess	bb86eebfbd	init: Automatically enable annex.pidlock when necessary.	2015-11-13 13:35:29 -04:00
Joey Hess	aaf1ef268d	convert from Utility.LockPool to Annex.LockPool everywhere	2015-11-12 18:13:37 -04:00
Joey Hess	aa4192aea6	pid locking configuration and abstraction layer for git-annex (not actually used anywhere yet)	2015-11-12 17:50:34 -04:00
Joey Hess	7c741302cc	assistant: Pass ssh-options through 3 more git pull/push calls that were missed before. It was used for regular pull, but not for regular push, tagged push, or the fallback fetching.	2015-11-10 16:52:30 -04:00
Joey Hess	7938b87864	add: Fix error recovery rollback to not move the ingested file content out of the annex back to the file, because other files may point to that same content. Instead, copy the ingested file content out to recover. That was not a data loss, but it came close!	2015-11-06 15:28:20 -04:00
Joey Hess	51e60259e1	fix replaceFile makeAnnexLink race replaceFile created a temp file, which was guaranteed to not overlap with another temp file. However, makeAnnexLink then deleted that file, in preparation for making the symlink in its place. This caused a race, since some other replaceFile could create a temp file, using the same name! I was able to reproduce the race easily running git-annex add -J10 in a directory with 100 files (all with different contents). Some files would get ingested into the annex, but their annex links would fail to be added. There could be other situations where this same problem could occur. Perhaps when the assistant is adding a file, if the user manually also ran git-annex add. Perhaps in cases not involving adding a file. The new replaceFile makes a temporary directory, which is guaranteed to be unique, and doesn't make a temp file in there. makeAnnexLink can thus create the symlink without problem and the race is avoided. Audited all calls to replaceFile to make sure that the old behavior of providing an empty temp file was not relied on. The general problem of asking for a temp file and deleting it as part of the process of using it could reach beyond replaceFile. Did some quick audits and didn't find other cases of it. Probably only symlink creation stuff would tend to make that mistake, mostly.	2015-11-06 15:08:19 -04:00
Joey Hess	31472161e4	merge git command queue when joining with concurrent thread	2015-11-05 18:21:48 -04:00
Joey Hess	a4dd8503b8	add regions to concurrent output still no progress displays when getting files etc, but a big improvement	2015-11-04 14:52:07 -04:00
Joey Hess	640dba43b6	enableremote: List uuids and descriptions of remotes that can be enabled, and accept either the uuid or the description in lieu of the name.	2015-10-26 14:55:40 -04:00
Joey Hess	806819be57	Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set. Instead, only display transport error if the configlist output doesn't include an annex.uuid line, even an empty one. A recent change made git-annex init try to get all the remote uuids, and so the transport error would be displayed by it. It was also displayed when eg, copying files to a remote that had no uuid yet.	2015-10-15 15:36:54 -04:00
Joey Hess	3879f6e6be	do tmp dir cleanup in error case too	2015-10-15 14:27:14 -04:00
Joey Hess	27eaa6f410	avoid making post-merge-conflict-resolution commit when no conflicts were resolved sync, merge, assistant: When git merge failed for a reason other than a conflicted merge, such as a crippled filesystem not allowing particular characters in filenames, git-annex would make a merge commit that could omit such files or otherwise be bad. Fixed by aborting the whole merge process when git merge fails for any reason other than a merge conflict.	2015-10-15 14:22:46 -04:00
Joey Hess	9e90c033d3	Changed drop ordering when using git annex sync --content or the assistant, to drop from remotes first and from the local repo last. This works better with the behavior changes to drop in many cases.	2015-10-14 12:33:02 -04:00
Joey Hess	1ff7610118	fix windows build	2015-10-12 15:48:59 -04:00
Joey Hess	f9adb905fc	Avoid unnecessary write to the location log when a file is unlocked and then added back with unchanged content. Implemented with no additional overhead of compares etc. This is safe to do for presence logs because of their locality of change; a given repo's presence logs are only ever changed in that repo, or in a repo that has just been actively changing the content of that repo. So, we don't need to worry about a split-brain situation where there'd be disagreement about the location of a key in a repo. And so, it's ok to not update the timestamp when that's the only change that would be made due to logging presence info.	2015-10-12 14:46:47 -04:00
Joey Hess	fa9333e99f	use action, not sideAction sideAction is for things not generally related to the current action being performed. And, it adds a newline after the side action. This was not the right thing to use for stuff like "checksum", where doing a checksum is part of the git annex get process, and indeed we want it to display "(checksum...) ok"	2015-10-11 13:29:44 -04:00
Joey Hess	3b89d5a20c	implement lockContent for ssh remotes	2015-10-09 16:55:41 -04:00
Joey Hess	e392ec112f	also generate a drop safety proof for move --from remote	2015-10-09 16:16:03 -04:00
Joey Hess	6a72045707	fix local dropping to not require extra locking of copies, but only that the local copy be locked for removal	2015-10-09 15:48:02 -04:00
Joey Hess	1043880432	improve message when drop failed due to no locked copy	2015-10-09 15:14:25 -04:00
Joey Hess	b021321aae	rename constructor	2015-10-09 15:01:33 -04:00
Joey Hess	45e1a7c361	verify local copy of content with locking	2015-10-09 14:57:32 -04:00
Joey Hess	4c6095b6f5	content locking during drop working for local git remotes Only ssh remotes lack locking now	2015-10-09 13:12:58 -04:00
Joey Hess	ceb5819538	finish and use lockContent interface	2015-10-09 12:36:04 -04:00
Joey Hess	cf79dffa4c	improve drop proof code	2015-10-09 11:09:46 -04:00
Joey Hess	f57ac29be1	refactor	2015-10-09 10:30:22 -04:00
Joey Hess	7f5958eec2	TrustedCopy is good enough to allow dropping By definition, a trusted repository is trusted to always have its location tracking log accurate. Thus, it should never be in a position where content is being dropped from it concurrently, as that would result in the location tracking log not being accurate.	2015-10-08 18:34:48 -04:00
Joey Hess	e4a33967a1	try harder to verify until at least one VerifiedCopyLock is obtained This avoids a failure where eg, we start with RecentlyVerifiedCopies for all remotes, and so didn't do any active verification, which is required. Also, dedup the list of VerifiedCopies when checking if we have enough, in case 2 copies of a UUID slip in.	2015-10-08 18:20:36 -04:00
Joey Hess	b17f5da6c9	require 1 locked copy while dropping from local or a remote See doc/bugs/concurrent_drop--from_presence_checking_failures.mdwn for discussion about why 1 locked copy is all we can require, and how this fixes concurrent dropping bugs. Note that, since nothing generates a VerifiedCopyLock yet, this commit breaks dropping temporarily.	2015-10-08 18:11:39 -04:00
Joey Hess	c75c79864d	support invalidating existing VerifiedCopys	2015-10-08 17:58:32 -04:00
Joey Hess	90f7c4b6a2	add VerifiedCopy data type There should be no behavior changes in this commit, it just adds a more expressive data type and adjusts code that had been passing around a [UUID] or sometimes a Maybe Remote to instead use [VerifiedCopy]. Although, since some functions were taking two different [UUID] lists, there's some potential for me to have gotten it horribly wrong.	2015-10-08 16:55:11 -04:00
Joey Hess	beedf1da25	unused import	2015-10-08 14:59:34 -04:00
Joey Hess	9cb9dab69b	I think this comment is stale/confusing; remove	2015-10-08 14:51:44 -04:00
Joey Hess	4d50958ed7	add lockContentShared Also, rename lockContent to lockContentExclusive inAnnexSafe should perhaps be eliminated, and instead use `lockContentShared inAnnex`. However, I'm waiting on that, as there are only 2 call sites for inAnnexSafe and it's fiddly.	2015-10-08 14:29:35 -04:00
Joey Hess	2def1d0a23	other 80% of avoiding verification when hard linking to objects in shared repo In `c6632ee5c8`, it actually only handled uploading objects to a shared repository. To avoid verification when downloading objects from a shared repository, was a lot harder. On the plus side, if the process of downloading a file from a remote is able to verify its content on the side, the remote can indicate this now, and avoid the extra post-download verification. As of yet, I don't have any remotes (except Git) using this ability. Some more work would be needed to support it in special remotes. It would make sense for tahoe to implicitly verify things downloaded from it; as long as you trust your tahoe server (which typically runs locally), there's cryptographic integrity. OTOH, despite bup being based on shas, a bup repo under an attacker's control could have the git ref used for an object changed, and so a bup repo shouldn't implicitly verify. Indeed, tahoe seems unique in being trustworthy enough to implicitly verify.	2015-10-02 14:35:12 -04:00
Joey Hess	7c7fe895f9	disabling verification also disables size verification It's not expensive to do size verification, but let's be consistent and turn it off too.	2015-10-02 12:38:02 -04:00
Joey Hess	c6632ee5c8	avoid verification when hard linking to objects in shared repository Such a repository is implicitly trusted, so there's no point.	2015-10-02 12:36:03 -04:00
Joey Hess	2fb3722ce9	Do verification of checksums of annex objects downloaded from remotes. * When annex objects are received into git repositories, their checksums are verified then too. * To get the old, faster, behavior of not verifying checksums, set annex.verify=false, or remote.<name>.annex-verify=false. * setkey, rekey: These commands also now verify that the provided file matches the key, unless annex.verify=false. * reinject: Already verified content; this can now be disabled by setting annex.verify=false. recvkey and reinject already did verification, so removed now duplicate code from them. fsck still does its own verification, which is ok since it does not use getViaTmp, so verification doesn't happen twice when using fsck --from.	2015-10-01 15:56:39 -04:00
Joey Hess	b72d3fbeba	rename function	2015-10-01 14:18:57 -04:00
Joey Hess	807ba6a903	refactor	2015-10-01 14:07:06 -04:00
Joey Hess	dc2f1f09b7	Improve robustness of direct mode merge, avoiding a crash if the index file is missing. I couldn't find a good way to make an empty index file (zero byte file won't do), so I punted and just don't make index.lock when there's no index yet. This means some other git process could race and write an index file at the same time as the merge is ongoing, in theory. Only happens in new repos though.	2015-09-22 13:00:18 -04:00
Joey Hess	b88739f0d0	avoid auto-enabling a remote that's already enabled	2015-09-14 15:34:15 -04:00
Joey Hess	c919489c3e	avoid autoenable of dead special remotes	2015-09-14 15:28:14 -04:00
Joey Hess	9cfb96c53d	Special remotes configured with autoenable=true will be automatically enabled when git-annex init is run.	2015-09-14 14:49:48 -04:00
Joey Hess	97962591d6	init: Fix reversion in detection of repo made with git clone --shared	2015-09-09 13:56:37 -04:00
Joey Hess	c242e248e8	Fix reversion in init when run as root, introduced in version 5.20150731.	2015-08-19 12:36:17 -04:00
Joey Hess	0f5d6c09ac	importfeed --relaxed: Avoid hitting the urls of items in the feed.	2015-08-19 12:24:55 -04:00
Joey Hess	23e9d3bb77	Fix setting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale. Oh boy, not again. So, another place that the filesystem encoding needs to be applied. Yay. In passing, I changed decodeBS so if a NUL is embedded in the input, the resulting FilePath doesn't get truncated at that NUL. This was needed to make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I didn't see any where this wouldn't make sense. When a FilePath is used to operate on the filesystem, it'll get truncated at a NUL anyway, whereas if a String is being used for something else, it might conceivably have a NUL in it, and we wouldn't want it to get truncated when going through decodeBS. (NB: There may be a speed impact from this change.)	2015-08-11 18:40:59 -04:00
Joey Hess	f7d7995172	clean	2015-08-04 17:07:45 -04:00
Joey Hess	3c971c414e	sshopts is never going to be null; the concat of it may be	2015-08-04 16:53:38 -04:00
Joey Hess	a6374b7a3d	typo	2015-08-04 15:44:46 -04:00
Joey Hess	f041a65c33	Windows: Fix bug that caused git-annex sync to fail due to missing environment variable. I think that the problem was caused by windows not having a concept of an env var that is set, but to the empty string. So, GIT_ANNEX_SSHOPTION got set to "" and was not seen as set at all. Easy fix, which also makes git-annex sync a little faster, is to not set GIT_SSH when GIT_ANNEX_SSHOPTION has no options. Might as well let git use ssh per usual in this case, no need to run git-annex as the proxy ssh command.	2015-08-04 15:27:48 -04:00
Joey Hess	6c15cdfcb8	proxy: Fix proxy git commit of non-annexed files in direct mode. * proxy: If a non-proxied git command, such as git revert, would normally fail because of unstaged files in the work tree, make the proxied command fail the same way.	2015-08-04 14:01:59 -04:00
Joey Hess	ea765ec022	windows build warning fixes	2015-08-03 15:54:29 -04:00
Joey Hess	9dfe03dbcd	Improve shutdown due to --time-limit, especially for fsck * Perform a clean shutdown when --time-limit is reached. This includes running queued git commands, and cleanup actions normally run when a command is finished. * fsck: Commit incremental fsck database when --time-limit is reached. Previously, some of the last files fscked did not make it into the database when using --time-limit. Note that this changes Annex.addCleanup hooks, to run after --time-limit expires. Fsck was using such a hook to clean up after a --incremental-schedule, and that shouldn't run when --time-limit expires. So, instead, moved that cleanup code to be run by cleanupIncremental. Resulted in some data type juggling.	2015-07-31 16:01:54 -04:00
Joey Hess	b30324fec7	init: Detect when the filesystem is crippled such that it ignores attempts to remove the write bit from a file, and enable direct mode. Seen with eg, NTFS fuse on linux.	2015-07-30 14:06:17 -04:00
Joey Hess	267f397d82	avoid calling copy when file DNE This avoids an ugly warning when running git annex fsck --from a rsync remote in a repo in direct mode.	2015-07-30 13:40:17 -04:00
Joey Hess	24800b1bf1	Only look at reflogs for relevant branches, not for git-annex branches This speeds it up quite a bit. May still be too slow in large repos.	2015-07-07 17:36:30 -04:00
Joey Hess	b11d2f5a8a	unused: --used-refspec can now be configured to look at refs in the reflog. This provides a way to not consider old versions of files to be unused after they have reached a specified age, when the old refs in the reflog expire. May be slow.	2015-07-07 17:13:50 -04:00
Joey Hess	f7dc20595e	refactor ls-tree params All in one place to avoid bugs like `174da80ddc`	2015-07-06 14:21:43 -04:00
Joey Hess	174da80ddc	bugfix: Pass --full-tree when using git ls-tree to get a list of files on the git-annex branch, so it works when run in a subdirectory. This bug affected git-annex unused, and potentially also transitions running code and other things.	2015-07-06 14:09:54 -04:00
Joey Hess	adba0595bd	use bloom filter in second pass of sync --all --content This is needed because when preferred content matches on files, the second pass would otherwise want to drop all keys. Using a bloom filter avoids this, and in the case of a false positive, a key will be left undropped that preferred content would allow dropping. Chances of that happening are a mere 1 in 1 million.	2015-06-16 18:50:13 -04:00
Joey Hess	a0a8127956	instance Hashable Key for bloomfilter	2015-06-16 18:37:41 -04:00
Joey Hess	8b74aec3ea	Increased the default annex.bloomaccuracy from 1000 to 10000000 This makes git annex unused use around 48 mb more memory than it did before, but the massive increase in accuracy makes this worthwhile for all but the smallest systems. Also, I want to use the bloom filter for sync --all --content, to avoid dropping files that the preferred content doesn't want, and 1/1000 false positives would be far too many in that use case, even if it were acceptable for unused. Actual memory use numbers: 1000: 21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k 1000000: 21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k 10000000: 21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k Based on these numbers, 10 million seemed a better pick than 1 million.	2015-06-16 18:12:00 -04:00
Joey Hess	8c46ea22c2	Added new "anything" preferred content expression, which matches all versions of all files.	2015-06-16 17:03:34 -04:00
Joey Hess	0a998032ed	Fix bug that prevented enumerating locally present objects in repos tuned with annex.tune.objecthash1=true Need to walk 1 level of subdirs less in this case. The git-annex branch traversal code didn't have a similar bug.	2015-06-11 15:15:05 -04:00
Joey Hess	de3bd11a2c	import --clean-duplicates: Fix bug that didn't count local or trusted repo's copy of a file as one of the necessary copies to allow removing it from the import location.	2015-06-03 13:15:38 -04:00
Joey Hess	d28e8fbfd5	get --incomplete: New option to resume any interrupted downloads.	2015-06-02 14:20:38 -04:00
Joey Hess	eb33569f9d	remove Params constructor from Utility.SafeCommand This removes a bit of complexity, and should make things faster (avoids tokenizing Params string), and probably involve less garbage collection. In a few places, it was useful to use Params to avoid needing a list, but that is easily avoided. Problems noticed while doing this conversion: * Some uses of Params "oneword" which was entirely unnecessary overhead. * A few places that built up a list of parameters with ++ and then used Params to split it! Test suite passes.	2015-06-01 13:52:23 -04:00
Joey Hess	a6d54e49a0	sync, remotedaemon: Pass configured ssh-options even when annex.sshcaching is disabled.	2015-05-30 22:01:52 -04:00
Joey Hess	83b262f1b6	fix windows build	2015-05-22 13:54:54 -04:00
Joey Hess	167539a354	better memoize core.sharedrepository handling It was memoized, but that was not used consistently. Move it to Types.GitConfig so it will auto-memoize.	2015-05-19 15:04:24 -04:00
Joey Hess	b47c9fd587	honor core.sharedRepository settings in lockContent The content file may not be owned by the user running git-annex, in which case, setting the owner write bit was not enough to let lockContent act on the file. However, with some core.sharedRepository configs, the file should be writable by the user's group. So, the thing to do is to call thawContent on it.	2015-05-19 14:53:19 -04:00
Joey Hess	f4e2093760	fix inAnnexSafe result for direct file that is being dropped It was returning Just False in this situation, which differed from indirect mode behavior. I don't think this led to any actual problems; things that checked if the file being dropped was present just failed to fail, and instead reported it wasn't present, possibly incorrectly. Hmm, it's possible that this could have made git annex fsck --from remote update the location log wrongly, if a remote was in direct mode, and was in the middle of trying to drop a key, and the drop later failed.	2015-05-19 14:26:07 -04:00
Joey Hess	1312e721ed	convert lockContent to use new LockPools Also cleaned up the code, avoiding creating a lock file if we're going to open it for create later anyway. And, if there's an exception while preparing to lock the file, but not at the point of actually taking the lock, throw an exception, instead of silently not locking and pretending to succeed. And, on Windows, always use lock file, even if the repo somehow got into indirect mode (maybe with cygwin git..)	2015-05-19 14:12:23 -04:00
Joey Hess	ecb0d5c087	use lock pools throughout git-annex The one exception is in Utility.Daemon. As long as a process only daemonizes once, which seems reasonable, and as long as it avoids calling checkDaemon once it's already running as a daemon, the fcntl locking gotchas won't be a problem there. Annex.LockFile has its own separate lock pool layer, which has been renamed to LockCache. This is a persistent cache of locks that persist until closed. This is not quite done; lockContent still needs to be converted.	2015-05-19 14:09:52 -04:00
Joey Hess	7ebf234616	Stale transfer lock and info files will be cleaned up automatically when get/unused/info commands are run. Deleting lock files is tricky, tricky stuff. I think I got it right!	2015-05-12 20:11:23 -04:00
Joey Hess	7299bbb639	don't clean up transfer lock file when retrying transfer This affected callers that used forwardRetry; if the 1st attempt failed it would clean up the transfer lock before retrying.	2015-05-12 19:43:24 -04:00
Joey Hess	8c2dd7d8ee	Fix an unlikely race that could result in two transfers of the same key running at once. As discussed in bug report.	2015-05-12 19:39:28 -04:00
Joey Hess	e25ecab7dd	convert to using Utility.Lockfile for transfer lock files Should be no behavior changes, just simplified code. The only actual difference is it doesn't truncate the lock file. I think that was a holdover from when transfer info was written to the lock file.	2015-05-12 19:36:16 -04:00
Joey Hess	61ccf95004	Avoid accumulating transfer failure log files unless the assistant is being used. Only the assistant uses these, and only the assistant cleans them up, so make only git annex transferkeys write them. There is one behavior change from this. If glacier is being used, and a manual git annex get --from glacier fails because the file isn't available yet, the assistant will no longer later see that failed transfer file and retry the get. Hope no-one depended on that old behavior.	2015-05-12 15:53:38 -04:00
Joey Hess	a812d598ef	Take space that will be used by running downloads into account when checking annex.diskreserve.	2015-05-12 15:20:22 -04:00
Joey Hess	e27b97d364	Merge branch 'master' into concurrentprogress Conflicts: Command/Fsck.hs Messages.hs Remote/Directory.hs Remote/Git.hs Remote/Helper/Special.hs Types/Remote.hs debian/changelog git-annex.cabal	2015-05-12 13:23:22 -04:00
Joey Hess	64a4553e0b	rename traverse to walk since Data.Traversable is imported by default in ghc 7.10	2015-05-10 16:43:09 -04:00
Joey Hess	08308dc9b3	fix build warning with ghc 7.10	2015-05-10 15:28:13 -04:00
Joey Hess	9f3e51dd51	move nubbing into function whose algo needs a nubbed list	2015-04-30 14:11:59 -04:00
Joey Hess	38c458b407	refactor	2015-04-30 14:02:56 -04:00
Joey Hess	5948c148fb	Make repo init more robust. The setDifferences that got added to initialize turns out to make a git commit, and before ensureCommit has been used. Thus, repo init can fail when the system has a broken hostname etc. Move the ensureCommit to the very first thing to avoid this kind of breakage.	2015-04-20 14:01:41 -04:00
Joey Hess	3a078ab357	When a key's size is unknown, still check the annex.diskreserve, and avoid getting content if the disk is too full. We can't check if there's enough disk space to download the content, but we can check if there's certainly not enough!	2015-04-17 21:29:15 -04:00
Joey Hess	86a2f9dc4d	Merge branch 'master' into concurrentprogress Conflicts: debian/changelog	2015-04-14 15:35:15 -04:00
Joey Hess	2b79e6fe08	a few hlints	2015-04-11 00:10:34 -04:00
Joey Hess	9971c82ead	refactor	2015-04-10 17:53:58 -04:00
Joey Hess	8077ccbd54	get, move, copy, mirror: Concurrent downloads and uploads are now supported! This works, and seems fairly robust. Clean get of 20 files at -J3. At -J10, there are some messages about ssh multiplexing, probably due to a race spinning up the ssh connection cacher. But, it manages to get all the files ok regardless. The progress bars are a scrambled mess though, due to bugs in ascii-progress, which I've already filed. Particularly this one: https://github.com/yamadapc/haskell-ascii-progress/issues/8	2015-04-10 17:08:07 -04:00
Joey Hess	0880c8319e	simplify and make more atomic	2015-04-10 15:16:17 -04:00
Joey Hess	ce0a82f493	contentlocation: New plumbing command.	2015-04-09 15:34:47 -04:00
Joey Hess	b99b8d5d4c	followup to bug I cannot reproduce, and analysis based presumptive fix	2015-04-09 14:03:44 -04:00
Joey Hess	42e46a8701	avoid using --literal-pathspecs with git older than 1.8.1 which added it Windows is still building with an older git.	2015-04-06 13:46:11 -04:00
Joey Hess	1d57f142f1	Merge branch 'concurrentprogress'	2015-04-04 15:01:00 -04:00
Joey Hess	2343f99c85	well along the way to fully quiet --quiet Came up with a generic way to filter out progress messages while keeping errors, for commands that use stderr for both. --json mode will disable command outputs too.	2015-04-04 14:34:03 -04:00
Joey Hess	ff2eeaf054	avoid progress bar for url download with --quiet	2015-04-03 20:38:56 -04:00
Joey Hess	bd110516c0	init: Improve fifo test to detect NFS systems that support fifos but not well enough for sshcaching. ssh tries to hard link a fifo, and if not, complains: muxserver_listen: link mux listener .git/annex/ssh/SHARD1@iabak.archiveteam.org.QK8zOCbtNebI7q54 => .git/annex/ssh/SHARD1@iabak.archiveteam.org: Operation not permitted	2015-04-03 14:57:10 -04:00
Joey Hess	0a6933771d	cleanup	2015-03-30 19:55:35 -04:00
Joey Hess	15d45186cc	use --literal-pathspecs globally, as a better way to avoid globbing This might be overkill; I only know I need it in ls-files, but other git commands can also do their own globbing, it turns out, and I am pretty sure I never want them to when git-annex is using them as plumbing. Test suite still passes and it looks ok.	2015-03-30 19:44:13 -04:00
Joey Hess	5be536e523	Fix bug introduced in the last release that broke git-annex sync when git-annex was installed from the standalone tarball. This was introduced by commit `450ee53ab6`. However, the same problem could affect other calls to programPath, specifically some in the assistant. So, I fixed it at a deeper level.	2015-03-27 12:55:18 -04:00
Joey Hess	3af4691978	Improve error message when --in @date is used and there is no reflog for the git-annex branch.	2015-03-26 11:15:15 -04:00
Joey Hess	798da6cf2e	Added a post-update-annex hook, which is run after the git-annex branch is updated. Needed for git update-server-info. See https://github.com/datalad/datalad/issues/1#issuecomment-84094406	2015-03-20 14:52:58 -04:00
Joey Hess	cf903d5a3c	fixup annex link target calculation when submodules are used in filesystems not supporting symlinks	2015-03-04 16:08:41 -04:00
Joey Hess	e322826e33	Submodules are now supported by git-annex! Seems to work, but still experimental until it's been tested more. When repositories are on filesystems not supporting symlinks, the .git dir symlink trick cannot be used. Since we're going to be in direct mode anyway, the .git dir symlink is not strictly needed. However, I have not fixed the code that creates new annex symlinks to handle this case -- the committed symlinks will be wrong. git annex sync happens to currently fail in a submodule using direct mode, because there's no HEAD ref. That also needs to be dealt with to get this fully working in crippled filesystems. Leaving http://github.com/datalad/datalad/issues/44 open until these issues are dealt with.	2015-03-02 16:43:44 -04:00
Joey Hess	450ee53ab6	When re-execing git-annex, use current program location, rather than ~/.config/git-annex/program, when possible. Most of the time, there will be no discrepancy between programPath and readProgramFile. But, the programFile might have been written by an old version of git-annex that is still installed, while a newer one is currently running. In this case, we want to run the same one that's currently running. This is especially important for things like the GIT_SSH=git-annex used for ssh connection caching. The only code that still uses readProgramFile directly is the upgrade code, which needs to know where the standalone git-annex was installed, in order to upgrade it.	2015-02-28 17:23:13 -04:00
Joey Hess	b9275b65f9	make programPath return FilePath not Maybe FilePath Looking at the few current callers, it's ok to have programPath throw an exception, in the unusual case where it cannot find git-annex.	2015-02-28 16:59:52 -04:00
Joey Hess	afb3e3e472	avoid crash when starting fsck --incremental when one is already running Turns out sqlite does not like having its database deleted out from underneath it. It might suffice to empty the table, but I would rather start each fsck over with a new database, so I added a lock file, and running incremental fscks use a shared lock. This leaves one concurrency bug left; running two concurrent fsck --more will lead to: "SQLite3 returned ErrorBusy while attempting to perform step." and one or both will fail. This is a concurrent writers problem.	2015-02-17 13:30:24 -04:00
Joey Hess	15107d2c5a	propagate ssh-options everywhere ssh caching is used * sync: Use the ssh-options git config when doing git pull and push. * remotedaemon: Use the ssh-options git config. Note that the rename env var means that if a new git-annex calls an old one for git-annex ssh, or a new calls an old, nothing much will go wrong; just ssh caching won't happen.	2015-02-12 16:14:53 -04:00
Joey Hess	5be7ba7ee5	The ssh-options git config is now used by gcrypt, rsync, and ddar special remotes that use ssh as a transport.	2015-02-12 15:44:10 -04:00
Joey Hess	7fce85adac	Improve race recovery code when committing to git-annex branch.	2015-02-09 18:34:48 -04:00
Joey Hess	b94eb9b22c	relFile does not have to be relative; rename to currFile	2015-02-06 16:03:02 -04:00
Joey Hess	c8163ce29a	use a Set	2015-01-28 18:17:10 -04:00
Joey Hess	b0575c621f	implement annex.tune.branchhash1 I hope this doesn't impact speed much -- it does have to pull out a value from Annex state every time it accesses the branch now. The test case I dropped has never caught any problems that I can remember, and would have been rather difficult to convert.	2015-01-28 17:17:26 -04:00
Joey Hess	009bd050c1	implement annex.tune.objecthashlower Split out Annex.DirHashes which never really belonged in Locations.	2015-01-28 16:52:08 -04:00
Joey Hess	e8c376e0ad	import Data.Default in Common	2015-01-28 16:11:28 -04:00
Joey Hess	ba3825441c	rework Differences data type Eliminated complexity and future proofed. The most important change is that all functions over Difference are now total; any Difference that can be expressed should be handled. Avoids needs for sanity checking of inputs, and version skew with the future. Also, the difference.log now serializes a [Difference], not a Differences. This saves space and keeps it simpler. Note that [Difference] might contain conflicting differences (eg, [Version5, Version6]). In this case, one of them needs to consistently win over the others, probably based on Ord.	2015-01-28 13:50:02 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatible ways.	2015-01-27 17:38:06 -04:00
Joey Hess	f50b6779f9	Fix default repository description created by git annex init, which got broken by the relative path changes in the last release.	2015-01-22 14:59:57 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	068aaf943b	on second thought, InodeCache should use getFileSize This is necessary for interop between inode caches created on unix and windows. Which is more important than supporting inodecaches for large keys with the wrong size, which are broken anyway. There should be no slowdown from this change, except on Windows.	2015-01-20 19:35:50 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	6035f94666	Windows: Fix running of the pre-commit-annex hook.	2015-01-20 14:48:16 -04:00
Joey Hess	f4de021a54	convert parentDir to be based on takeDirectory, but fixed for trailing /	2015-01-09 14:26:52 -04:00
Joey Hess	3bab5dfb1d	revert parentDir change Reverts `965e106f24` Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.	2015-01-09 13:11:56 -04:00
Joey Hess	184ad45b42	Merge branch 'master' into relativepaths	2015-01-06 21:10:01 -04:00
Joey Hess	d7f1449b2b	fix view generation code to work when run in a subdirectory; no longer needs to setCurrentDirectory to top of repo	2015-01-06 21:01:05 -04:00
Joey Hess	858d776352	Merge branch 'master' into relativepaths Conflicts: Locations.hs debian/changelog	2015-01-06 19:00:01 -04:00
Joey Hess	965e106f24	made parentDir return a Maybe FilePath; removed most uses of it parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.	2015-01-06 18:55:56 -04:00
Joey Hess	8a1c5956eb	absolute path to index file; test suite passes There are still known problems; for example git annex view a=b fails when run in a subdir of the repo.	2015-01-06 17:34:02 -04:00
Joey Hess	d8a2f658dd	direct mode merge relative path trickiness This fixes 9 test suite failures. There are some tricky things going on with the paths to the index file, and git's working directory, which are hard to get right with relative paths. So, I switched back to absolute here, at least for now. Only 2 test suite failures remain on this branch, but there are other potential problems the test suite doesn't catch. Including some calls to setCurrentDirectory -- I was wrong and git-annex does do that in a few places, like when generating a view.	2015-01-06 17:18:12 -04:00
Joey Hess	cd865c3b8f	Switch to using relative paths to the git repository. This allows the git repository to be moved while git-annex is running in it, with fewer problems. On Windows, this avoids some of the problems with the absurdly small MAX_PATH of 260 bytes. In particular, git-annex repositories should work in deeper/longer directory structures than before. See http://git-annex.branchable.com/bugs/__34__git-annex:_direct:_1_failed__34___on_Windows/ There are several possible ways this change could break git-annex: 1. If it changes its working directory while it's running, that would be Bad News. Good news everyone! git-annex never does so. It would also break thread safety, so all such things were stomped out long ago. 2. parentDir "." -> "" which is not a valid path. I had to fix one instance of this, and I should probably wipe all calls to parentDir out of the git-annex code base; it was never a good idea. 3. Things like relPathDirToFile require absolute input paths, and code assumes that the git repo path is absolute and passes it to it as-is. In the case of relPathDirToFile, I converted it to not make this assumption. Currently, the test suite has 16 failures.	2015-01-06 16:19:41 -04:00
Joey Hess	a4cf80f460	Windows: Fix handling of views of filenames containing '%'	2014-12-30 17:48:04 -04:00
Joey Hess	402bfff665	fix test case on windows "a:" is an absolute path, so viewedfile test cannot be run on it.	2014-12-30 16:04:06 -04:00
Joey Hess	33f1062bc3	Revert "temporary debugging code for windows autobuilder test suite failure" This reverts commit `0d9fbd18c1`.	2014-12-30 15:18:38 -04:00
Joey Hess	0d9fbd18c1	temporary debugging code for windows autobuilder test suite failure	2014-12-30 15:17:51 -04:00
Joey Hess	c9a3e80d32	fixed all remaining build warnings on Windows	2014-12-29 17:30:20 -04:00
Joey Hess	7e422269a6	move dummy uuids to Annex.UUID	2014-12-17 13:57:52 -04:00
Joey Hess	7ae16bb6f7	Revert "let url claims optionally include a suggested filename" This reverts commit `85df9c30e9`. Putting filename in the claim was a bad idea.	2014-12-11 14:09:57 -04:00
Joey Hess	85df9c30e9	let url claims optionally include a suggested filename	2014-12-11 12:47:57 -04:00
Joey Hess	6ecd3ff421	diffdriver: New git-annex command, to make git external diff drivers work with annexed files. Closes https://github.com/datalad/datalad/issues/18	2014-11-24 16:14:06 -04:00
Joey Hess	864086a956	proxy: for all your direct mode repository munging needs This allows bypassing the direct mode guard in a safe way to do all sorts of things including git revert, git mv, git checkout ... This commit was sponsored by the WikiMedia Foundation.	2014-11-12 15:51:46 -04:00
Joey Hess	5ccc2a2d7c	no longer used imports	2014-11-06 14:18:38 -04:00
Joey Hess	334f366979	Remove fixup code for bad bare repositories created by versions 5.20131118 through 5.20131127. That fixup code would accidentally fire when --git-dir was incorrectly pointed at the working tree of a git-annex repository, resulting in data loss. Closes: #768093	2014-11-04 18:04:19 -04:00
Joey Hess	0f6aaf8012	Windows: Fix crash when user.name is not set in git config.	2014-10-31 16:14:12 -04:00
Joey Hess	4edfda59c0	fix windows build	2014-10-16 15:48:30 -04:00
Joey Hess	1e59df083d	Use haskell setenv library to clean up several ugly workarounds for inability to manipulate the environment on windows. Didn't know that this library existed! This includes making git-annex not re-exec itself on start on windows, and making the test suite on Windows run tests without forking.	2014-10-15 20:33:52 -04:00
Joey Hess	db9121ecee	vicfg: Deleting configurations now resets to the default, where before it had no effect. Added a Default instance for TrustLevel, and was able to use that to clear up several other parts of the code too. This commit was sponsored by Stephan Schulz	2014-10-14 14:15:07 -04:00
Joey Hess	9fd95d9025	indent with tabs not spaces Found these with: git grep "^ " $(find -type f -name \*.hs) \|grep -v ': where' Unfortunately there is some inline hamlet that cannot use tabs for indentation. Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm leaving it as-is.	2014-10-09 15:09:26 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	0598412e5c	Fix transfer lock file FD leak that could occur when two separate git-annex processes were both working to perform the same set of transfers.	2014-09-11 13:53:26 -04:00
Joey Hess	b874f84086	New annex.hardlink setting. Closes: #758593 * New annex.hardlink setting. Closes: #758593 * init: Automatically detect when a repository was cloned with --shared, and set annex.hardlink=true, as well as marking the repository as untrusted. Had to reorganize Logs.Trust a bit to avoid a cycle between it and Annex.Init.	2014-09-05 13:44:09 -04:00
Joey Hess	6eb5c3f479	Do not preserve permissions and acls when copying files from one local git repository to another. Timestamps are still preserved as long as cp --preserve=timestamps is supported. This avoids cp -a overriding the default mode acls that the user might have set in a git repository. With GNU cp, this behavior change should not be a breaking change, because git-annex also uses rsync sometimes in the same situation, and has only ever preserved timestamps when using rsync. Systems without GNU cp will no longer use cp -a, but instead just cp. So, timestamps will no longer be preserved. Preserving timestamps when copying between repos is not guaranteed anyway. Closes: #729757	2014-08-26 17:10:25 -07:00
Joey Hess	2b234634f6	fix imports for windows	2014-08-23 16:27:24 -07:00
Joey Hess	aebcc395ff	use types to enforce that removeAnnex can only be called inside lockContent This fixed one bug where it needed to be and wasn't (in Assistant.Unused). And also found one place where lockContent was used unnecessarily (by drop --from remote). A few other places like uninit probably don't really need to lockContent, but it doesn't hurt to call it anyway. This commit was sponsored by David Wagner.	2014-08-20 20:13:47 -04:00
Joey Hess	1994771215	more lock file refactoring Also fixes test suite failures introduced in recent commits, where inAnnexSafe failed in indirect mode, since it tried to open the lock file ReadWrite. This is why the new checkLocked opens it ReadOnly. This commit was sponsored by Chad Horohoe.	2014-08-20 18:58:14 -04:00
Joey Hess	e386e26ef2	avoid trying to create a content file in order to lock it The nice refactoring in `ec7dd0446a` highlighted a bug in lockContent -- when the content is not present, this incorrectly created an empty lock file, using the same filename as the content file. This seems like it could result in empty objects, which fsck would detect and complain about. Both drop and move --to call lockContent, as does Remote.Git.dropKey -- I think we got lucky and this bug didn't show up because all of those only operate on files that are present. So this bug could only manifest if there was a race, and a file's content was dropped at just the wrong time, just as another process was about to drop it. (And then only if the other process's dropping failed, otherwise it'd delete the empty object file.) Hmm, move --from also called lockContent. Unnecessarily, since the content is not being removed from the local annex. In this case, the combination of the 2 bugs could result in an empty lock file being written, and then if the download of the content failed, left in the object directory as the content. This commit also optimises lockContent, avoiding an unnecessary doesFileExist test and instead just catching the exception that's thrown when the file doesn't exist. This commit was sponsored by Justine Lam.	2014-08-20 17:25:30 -04:00
Joey Hess	ec7dd0446a	more lock file refactoring	2014-08-20 17:03:04 -04:00
Joey Hess	d279180266	reorganize and refactor lock code Added a convenience Utility.LockFile that is not a windows/posix portability shim, but still manages to cut down on the boilerplate around locking. This commit was sponsored by Johan Herland.	2014-08-20 16:45:58 -04:00
Joey Hess	0a4d301051	fix lockFileShared to actually create lock file This was a bug, but it was only used for ssh locks and by the hook special remote locking. At least in the case of ssh locks, the lock files happened to already exist before this tried to use them, so the bug didn't cause anything to break.	2014-08-20 15:49:49 -04:00
Joey Hess	bf3133ebb0	whoops, I the debug prints	2014-08-20 12:14:56 -04:00
Joey Hess	96dc423e39	When accessing a local remote, shut down git-cat-file processes afterwards, to ensure that remotes on removable media can be unmounted. Closes: #758630 This does mean that eg, copying multiple files to a local remote will become slightly slower, since it now restarts git-cat-file after each copy. Should not be significant slowdown. The reason git-cat-file is run on the remote at all is to update its location log. In order to add an item to it, it needs to get the current content of the log. Finding a way to avoid needing to do that would be a good path to avoiding this slowdown if it does become a problem somehow. This commit was sponsored by Evan Deaubl.	2014-08-20 12:07:57 -04:00
Joey Hess	83dc82c232	forgot some lifts	2014-08-20 11:51:47 -04:00
Joey Hess	092041fab0	Ensure that all lock fds are close-on-exec, fixing various problems with them being inherited by child processes such as git commands. (With the exception of daemon pid locking.) This fixes part of #758630. I reproduced the assistant locking eg, a removable drive's annex journal lock file and forking a long-running git-cat-file process that inherited that lock. This did not affect Windows. Considered doing a portable Utility.LockFile layer, but git-annex uses posix locks in several special ways that have no direct Windows equivalent, and it seems like it would mostly be a complication. This commit was sponsored by Protonet.	2014-08-20 11:37:02 -04:00
Joey Hess	e0227dfedf	memoize construction of the Request -> Request function to apply the UrlOptions	2014-08-15 17:47:21 -04:00
Joey Hess	852185c242	git-annex-shell sendkey: Don't fail if a remote asks for a key to be sent that already has a transfer lock file indicating it's being sent to that remote. The remote may have moved between networks, or reconnected.	2014-08-15 14:17:05 -04:00
Joey Hess	bb6cec3461	direct: Avoid leaving file content in misctemp if interrupted.	2014-08-15 13:38:05 -04:00
Joey Hess	d8be828734	direct: Fix ugly warning messages. replaceFileOr was broken and ran the rollback action always. Luckily, for replaceFile, the rollback action was safe to run, since it just nuked a temp file that had already been moved into place. However, when `git annex direct` used replaceFileOr, its rollback printed a scary message: /home/joey/tmp/rrrr/.git/annex/misctmp/tmp32268: rename: does not exist (No such file or directory) There was actually no bad result though.	2014-08-12 13:00:08 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only finds stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	5aa2286e7b	Merge branch 'newchunks' I am happy enough with this to make it live!	2014-08-01 18:00:47 -04:00
Joey Hess	76d894f2e5	Display exception message when a transfer fails due to an exception. For example, I had a copy to a remote that was failing for an unknown reason. This let me see the exception was createDirectory: permission denied; the underlying problem being a permissions issue.	2014-07-30 15:57:19 -04:00
Joey Hess	f3e457a195	add missing Ord constraint (fixes android build) Probably the new ghc used on android is the root cause of needing this constraint.	2014-07-30 11:57:40 -04:00
Joey Hess	bc9e4697b9	better type for Retriever Putting a callback in the Retriever type allows for the callback to remove the retrieved file when it's done with it. I did not really want to make Retriever be fixed to Annex Bool, but when I tried to use Annex a, I got into some type of type mess.	2014-07-29 18:41:41 -04:00
Joey Hess	47e522979c	allow Retriever action to update the progress meter Needed for eg, Remote.External. Generally, any Retriever that stores content in a file is responsible for updating the meter, while ones that produce a lazy bytestring cannot update the meter, so are not asked to.	2014-07-29 17:18:49 -04:00
Joey Hess	7496355031	add some more exception handling primitives	2014-07-26 23:24:27 -04:00
Joey Hess	e2c44bf656	implement chunk logs Slightly tricky as they are not normal UUIDBased logs, but are instead maps from (uuid, chunksize) to chunkcount. This commit was sponsored by Frank Thomas.	2014-07-24 16:23:36 -04:00
Joey Hess	3e4cb0e7f9	fix windows build	2014-07-14 15:55:48 -04:00
Joey Hess	822f4619ae	resolvemerge: finish up by committing	2014-07-11 16:59:49 -04:00
Joey Hess	61a35de433	Deal with change in git 2.0 that made indirect mode merge conflict resolution leave behind old files. I think this is a git behavior change, but have not checked to be sure. Conflict cruft used to look like $foo~HEAD, but now just $foo is left behind as conflict cruft. With test case.	2014-07-11 16:56:19 -04:00
Joey Hess	cb66ca3a76	resolvemerge: New plumbing command that runs the automatic merge conflict resolver.	2014-07-11 16:45:18 -04:00
Joey Hess	1efa51f344	direct: Fix handling of case where a work tree subdirectory cannot be written to due to permissions. Running `git annex direct` would cause loss of data, because the object was moved to a temp file, which it then tried to replace the work tree file with, and on failure, the temp file got deleted. Now it's instead moved back into the annex object location.	2014-07-10 14:15:46 -04:00
Joey Hess	26ee27915a	refactor locking	2014-07-10 00:32:23 -04:00
Joey Hess	e5b88713a1	refactor	2014-07-10 00:16:53 -04:00
Joey Hess	d9d76cf98b	Fix minor FD leak in journal code. Minor because normally only 1 FD is leaked per git-annex run. However, the test suite leaks a few hundred FDs, and this broke it on the Debian autobuilders, which seem to have a tighter than usual ulimit. The leak was introduced by the lazy getDirectoryContents' that was introduced in `e6330988dd` in order to scale to millions of journal files -- if the lazy list was never fully consumed, the directory handle did not get closed. Instead, pull in openDirectory/readDirectory/closeDirectory code that I already developed and submitted in a patch to the haskell directory library earlier. Using this in journalDirty avoids the place that the lazy list caused a problem. And using it in stageJournal eliminates the need for getDirectoryContents'. The getJournalFiles* functions are switched back to using the regular strict getDirectoryContents. I'm not sure if those always consume the whole list, so this avoids any leak. And the things that call those are things like git annex unused, which also look at every file committed to the git-annex branch, so would need more work to scale to insane numbers of files anyway.	2014-07-09 23:36:53 -04:00
Joey Hess	c75193e88b	fix build warning	2014-07-09 15:39:19 -04:00
Joey Hess	84186ee626	fix windows build	2014-07-09 15:37:25 -04:00
Joey Hess	58acaf8026	prospective fix for bad_merge_commit_deleting_all_files Assuming my analysis of a race is correct. In any case, this certainly closes a race..	2014-07-09 15:08:19 -04:00
Joey Hess	ba42b67c70	Fix bug in automatic merge conflict resolution When one side is an annexed symlink, and the other side is a non-annexed symlink. In this case, git-merge does not replace the annexed symlink in the work tree with the non-annexed symlink, which is different from its handling of conflicts between annexed symlinks and regular files or directories. So, while git-annex generated the correct merge commit, the work tree didn't get updated to reflect it. See comments on bug for additional analysis. Did not add this to the test suite yet; just unloaded a truckload of firewood and am feeling lazy. This commit was sponsored by Adam Spiers.	2014-07-08 13:55:11 -04:00
Joey Hess	4a66cd3f91	assistant: Fix bug, introduced in last release, that caused the assistant to make many unnecessary empty merge commits.	2014-07-05 17:12:05 -04:00
Joey Hess	c90e4e8778	work around getDirectoryContents not streaming lazily	2014-07-04 17:59:26 -04:00
Joey Hess	e6330988dd	Fix memory leak when committing millions of changes to the git-annex branch Eg after git-annex add has run on 2 million files in one go. Slightly unhappy with the need to use a temp file here, but I cannot see any other alternative (see comments on the bug report). This commit was sponsored by Hamish Coleman.	2014-07-04 15:28:07 -04:00
Joey Hess	d41849bc23	support commit.gpgsign Support users who have set commit.gpgsign, by disabling gpg signatures for git-annex branch commits and commits made by the assistant. The thinking here is that a user sets commit.gpgsign intending the commits that they manually initiate to be gpg signed. But not commits made in the background, whether by a daemon or implicitly to the git-annex branch. gpg signing those would be at best a waste of CPU and at worst would fail, or flood the user with gpg passphrase prompts, or put their signature on changes they did not directly do. See Debian bug #753720. Also makes all commits done by git-annex go through a few central control points, to make such changes easier in future. Also disables commit.gpgsign in the test suite. This commit was sponsored by Antoine Boegli.	2014-07-04 11:53:51 -04:00
Joey Hess	fc80956092	really add non-date metadata too	2014-07-03 14:35:20 -04:00
Joey Hess	d0c1a22e7c	import metadata from feeds When annex.genmetadata is set, metadata from the feed is added to files that are imported from it. Reused the same feedtitle and itemtitle, feedauthor, itemauthor, etc names that are used in --template. Also added title and author, which are the item title/author if available, falling back to the feed title/author. These are more likely to be common metadata fields. (There is a small bit of duplication here, but once git gets around to packing the object, it will compress it away.) The itempubdate field is not included in the metadata as a string; instead it is used to generate year and month fields, same as is done when adding files with annex.genmetadata set. This commit was sponsored by Amitai Schlair, who coincidentally is responsible for ikiwiki generating nice feed metadata!	2014-07-03 14:15:00 -04:00
Joey Hess	7a8f8b5ac9	refactor	2014-06-16 18:59:23 -04:00
Joey Hess	b30de0dfd2	work around a bug in git http://marc.info/?l=git&m=140262402204212&w=2 This git bug manifested on FAT and Windows as the test suite failing in 3 places. All involved merge conflict resolution. It turned out that the associated file mappings were getting messed up, and that happened because this git bug lost track of what files were supposed to be symlinks. This commit was sponsored by Eric Kidd.	2014-06-12 22:00:02 -04:00
Joey Hess	4fe2e53f5b	finish fixing windows timezone madness Rather than calculating the TSDelta once, and caching it, this now reads the inode sentinel file's InodeCache file once, and then each time a new InodeCache is generated, looks at the sentinel file to get the current delta. This way, if the time zone changes while git-annex is running, it will adapt. This adds some inefficiency, but only on Windows, and only 1 stat per new file added. The worst inefficiency is that `git annex status` and `git annex sync` will now (on Windows) stat the inode sentinel file once per file in the repo. It would be more efficient to use getCurrentTimeZone, rather than needing to stat the sentinel file. This should be easy to do, once the time package gets my bugfix patch. This commit was sponsored by Jürgen Lüters.	2014-06-12 13:54:08 -04:00
Joey Hess	e4d7e2ebde	fix for Windows file timestamp timezone madness On Windows, changing the time zone causes the apparent mtime of files to change. This confuses git-annex, which naturally thinks this means the files have actually been modified (since THAT'S WHAT A MTIME IS FOR, BILL <sheesh>). Work around this stupidity, by using the inode sentinel file to detect if the timezone has changed, and calculate a TSDelta, which will be applied when generating InodeCaches. This should add no overhead at all on unix. Indeed, I sped up a few things slightly in the refactoring. Seems to basically work! But it has a big known problem: If the timezone changes while the assistant (or a long-running command) runs, it won't notice, since it only checks the inode cache once, and so will use the old delta for all new inode caches it generates for new files it's added. Which will result in them seeming changed the next time it runs. This commit was sponsored by Vincent Demeester.	2014-06-12 13:42:21 -04:00
Joey Hess	a44fd2c019	export CreateProcess fields from Utility.Process update code to avoid cwd and env redefinition warnings	2014-06-10 19:20:14 -04:00
Joey Hess	f08fcb5030	simplify	2014-06-09 20:32:11 -04:00
Joey Hess	ab72456bb3	avoid fast-forwarding when a merge conflict was auto-resolved	2014-06-09 20:10:12 -04:00
Joey Hess	d6711800ad	avoid bad commits after interrupted direct mode sync (or merge) It was possible for an interrupted sync or merge in direct mode to leave the work tree out of sync with the last recorded commit. This would result in the next commit seeing files missing from the work tree, and committing their removal. Now, a direct mode merge happens not only in a throwaway work tree, but using a temporary index file, and without any commits or index changes being made until the real work tree has been updated. If the merge is interrupted, the work tree may have some updated files, but worst case a commit will redundantly commit changes that come from the merge. This commit was sponsored by Tony Cantor.	2014-06-09 19:40:28 -04:00
Joey Hess	dcddacfd5c	fixed getting files from bare repos on windows	2014-06-05 15:54:06 -04:00
Joey Hess	eb86f1338f	wip	2014-06-05 15:31:23 -04:00
Joey Hess	1ab3d7c810	Windows: Fix bug introduced in last release that caused files in the git-annex branch to have lines terminated with \r.	2014-06-05 14:57:01 -04:00
Joey Hess	2dd274e4ca	webapp: When adding a new local repository, fix bug that caused its group and preferred content to be set in the current repository, even when not combining. There was a tricky bit here, when it does combine, the edit form is shown, and so the info needs to be committed to the new repository, but then pulled into the current one. And caches need to be invalidated for it to be visible in the edit form.	2014-05-29 20:17:05 -04:00
Ben Gamari	99b89b22fd	Use exceptions in place of deprecated MonadCatchIO-transformers	2014-05-28 17:03:40 -04:00
Joey Hess	95ca3bb022	Fix encoding of data written to git-annex branch. Avoid truncating unicode characters to 8 bits. Allow any encoding to be used, as with filenames (but utf8 is the sane choice). Affects metadata and repository descriptions, and preferred content expressions. The question of what's the right encoding for the git-annex branch is a vexing one. utf-8 would be a nice choice, but this leaves the possibility of bad data getting into a git-annex branch somehow, and this resulting in git-annex crashing with encoding errors, which is a failure mode I want to avoid. (Also, preferred content expressions can refer to filenames, and filenames can have any encoding, so limiting to utf-8 would not be ideal.) The union merge code already took care to not assume any encoding for a file. Except it assumes that any \n is a literal newline, and not part of some encoding of a character that happens to contain a newline. (At least utf-8 avoids using newline for anything except literal newlines.) Adapted the git-annex branch code to use this same approach. Note that there is a potential interop problem with Windows, since FileSystemEncoding doesn't work there, and instead things are always decoded as utf-8. If someone uses non-utf8 encoding for data on the git-annex branch, this can lead to an encoding error on windows. However, this commit doesn't actually make that any worse, because the union merge code would similarly fail with an encoding error on windows in that situation. This commit was sponsored by Kyle Meyer.	2014-05-27 14:16:33 -04:00
Joey Hess	79cf404e75	support being run by ssh as ssh-askpass replacement To use, set GIT_ANNEX_SSHASKPASS to point to a fifo or regular file (FIFO is better, avoids touching disk or multiple readers) that contains the password. Then set SSH_ASKPASS=git-annex, and when ssh runs it, it will tell ssh the password. This is not yet used..	2014-04-29 18:08:10 -04:00
Joey Hess	2807e14904	use a subdir of GIT_ANNEX_TMP for ssh connection caching sockets To prevent any possible collisions with other, non-socket files, like the xmppgit directory.	2014-04-20 16:56:01 -04:00
Joey Hess	a8bd7a607d	When init detects that git is not configured to commit, and sets user.email to work around the problem, also make it set user.name. I was able to reproduce git failing to commit despite user.email being set, in a test account on my laptop. The account had no GECOS information.	2014-04-20 14:17:57 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	915d038bec	reinit: New command that can initialize a new repository using the configuration of a previously known repository. Useful if a repository got deleted and you want to clone it back the way it was.	2014-04-15 20:13:35 -04:00
Joey Hess	138d25518d	Merge branch 'master' into remotecontrol Conflicts: doc/devblog/day_152__more_ssh_connection_caching.mdwn	2014-04-14 13:38:35 -04:00
Joey Hess	2ff9ba9f74	add missing Network.URI Ord instance for Debian stable	2014-04-14 13:25:49 -04:00
Joey Hess	a0ef99b3f9	don't try to use ssh connection caching for non-ssh urls	2014-04-13 21:39:04 -04:00
Joey Hess	a33b30d0c4	remotedaemon: When network connection is lost, close all cached ssh connections. This commit was sponsored by Cedric Staub.	2014-04-12 16:32:59 -04:00
Joey Hess	15917ec1a8	sync, assistant, remotedaemon: Use ssh connection caching for git pushes and pulls. For sync, saves 1 ssh connection per remote. For remotedaemon, the same ssh connection that is already open to run git-annex-shell notifychanges is reused to pull from the remote. Only potential problem is that this also enables connection caching when the assistant syncs with a ssh remote. Including the sync it does when a network connection has just come up. In that case, cached ssh connections are likely to be stale, and so using them would hang. Until I'm sure such problems have been dealt with, this commit needs to stay on the remotecontrol branch, and not be merged to master. This commit was sponsored by Alexandre Dupas.	2014-04-12 15:59:34 -04:00
Johan Kiviniemi	4025515616	Notification: Add action/status-dependent icon and urgency	2014-04-05 20:45:11 +03:00
Johan Kiviniemi	7760dfcc7f	Notification: summary is not optional Use the summary field instead of body.	2014-04-05 20:44:06 +03:00
Joey Hess	a772b23c62	fix warning on !dbus	2014-04-02 18:10:03 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	0df4848a9e	forget --drop-dead: Avoid removing the dead remote from the trust.log, so that if git remotes for it still exist anywhere, git annex info will still know it's dead and not show it.	2014-03-26 13:28:26 -04:00
Joey Hess	16025a4f12	fix build w/o DesktopNotification	2014-03-23 18:17:35 -04:00
Joey Hess	fb8a32cc7f	notifications on drop	2014-03-22 15:01:48 -04:00
Joey Hess	a5dcd0e4bd	fix failure notification	2014-03-22 14:26:36 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	f64c2d6138	toplevel lastchanged field	2014-03-19 19:10:55 -04:00
Joey Hess	1052eeface	Windows: Fix some filename encoding bugs. http://git-annex.branchable.com/bugs/Unicode_file_names_ignored_on_Windows/ Not a complete fix yet.	2014-03-19 15:57:56 -04:00
Joey Hess	caa97d1271	For each metadata field, there's now an automatically maintained "$field-lastchanged" that gives the timestamp of the last change to that field. Note that this is a nearly entirely free feature. The data was already stored in the metadata log in an easily accessible way, and already was parsed to a time when parsing the log. The generation of the metadata fields may even be done lazily, although probably not entirely (the map has to be evaluated to when queried).	2014-03-18 18:55:43 -04:00
Joey Hess	6a4dd42328	finish wiring up groupwanted	2014-03-15 17:08:55 -04:00
Joey Hess	3551d40b05	"standard" can now be used as a first-class keyword in preferred content expressions. For example "standard or (include=otherdir/*)" or even "not standard" Note that the implementation avoids any potential for loops (if a standard preferred content expression itself mentioned standard). This commit was sponsored by Jochen Bartl.	2014-03-14 15:04:33 -04:00
Joey Hess	ba02cd8a80	Fix ssh connection caching stop method to work with openssh 6.5p1, which broke the old method. Old ssh did not check the hostname passed to -O stop, so I had used "any". But now ssh does check it! I think this happened as part of the client-side hostname canonicalization changes in 6.5p1, but have not verified that introduced the problem. The symptom was that it would try to dns lookup "any", which often caused a bit of a delay at shutdown. And the old ssh connection kept running, so it would do it over and over again. Fixed by using localhost, which hopefully reliably resolves to some address that ssh will accept.. Also nukeFile the socket after ssh has been asked to shutdown, just in case.	2014-03-13 19:35:06 -04:00
Joey Hess	8e2997aa69	only run sshCleanup when the command actually used ssh connection caching Optimises query commands that do not. More importantly, avoids any ssh connection cleanup delay causing problems at the end of such commands.	2014-03-13 19:30:13 -04:00
Joey Hess	1f99a6778f	Fix direct mode getKeysPresent false positive & also sped up direct mode unused and unannex unused: In direct mode, files that are deleted from the work tree are no longer incorrectly detected as unused. Direct mode `git annex info` slows down a bit due to more stringent checking, but not by a lot.	2014-03-07 12:43:56 -04:00
Joey Hess	8ee3b47d2b	style	2014-03-04 22:55:40 -04:00
Joey Hess	14d1e878ab	sync: Automatically resolve merge conflict between an annexed file and a regular git file. This is a new feature, it was not handled before, since it's a bit of an edge case. However, it can be handled exactly the same as a file/dir conflict, just leave the non-annexed item alone. While implementing this, the core resolveMerge' function got a lot simpler and clearer. Note especially that where before there was an asymmetric call to stagefromdirectmergedir, now graftin is called symmetrically in both cases. And, in order to add that `graftin us`, the current branch needed to be known (if there is no current branch, there cannot be a merge conflict). This led to some cleanups of how autoMergeFrom behaved when there is no current branch. This commit was sponsored by Philippe Gauthier.	2014-03-04 19:35:55 -04:00
Joey Hess	99295f2c1d	factor out Annex.AutoMerge from Command.Sync	2014-03-04 16:26:15 -04:00
Joey Hess	4a847cdc08	finish fixing direct mode merge bug involving unstaged local files Added test cases for both ways this can happen, with a conflict involving a file, or a directory. Cleaned up resolveMerge to not touch the work tree in direct mode, which turned out to be the only way to handle things.. And makes it much nicer. Still need to run test suite on windows.	2014-03-04 02:03:15 -04:00
Joey Hess	fb88e0f02c	fix `1192d98721` to handle annexed files in conflicted merge In the case of a conflicted merge where the remote adds a directory, and we have a file (which is checked in), resolveMerge' will create the link, and so the fix for `1192d98721` looked at that, thought it was an unannexed file (it's not in the oldref), and preserved it. This is a hacky fix. It would be better for resolveMerge' to not update the work tree, at least in direct mode, and only stage the changes, which mergeDirectCleanUp could then move into tree. I want to make that change, but this is not the time to do it.	2014-03-03 17:09:53 -04:00
Joey Hess	1192d98721	sync: Fix bug in direct mode that caused a file not checked into git to be deleted when merging with a remote that added a file by the same name. (Thanks, jkt)	2014-03-03 14:57:16 -04:00
Joey Hess	04b77328ef	fix handling of nonexistent hook	2014-03-03 13:59:36 -04:00
Joey Hess	d0fce426c4	pre-commit-annex hook script to automatically extract metadata from lots of types of files Using the extract(1) program to do the heavy lifting. Decided to make git-annex run pre-commit-annex when committing. Since git-annex pre-commit also runs it, it'll be run when git commit is run too, via the pre-commit hook. This basically gives back the pre-commit hook that git-annex took away. The implementation avoids repeatedly looking for the hook script when the assistant is running and committing repeatedly; only checks if the hook is available once. To make the script simpler, made git-annex metadata -s field?=value only set a field when it's not already got a value. This commit was sponsored by bak.	2014-03-02 20:11:58 -04:00
Joey Hess	308d4b67f3	fix combining of FIlterValues	2014-03-02 15:44:14 -04:00
Joey Hess	7d9486a709	vadd: Allow listing multiple desired values for a field.	2014-03-02 15:36:45 -04:00
Joey Hess	c2e8c21ca6	view, vfilter: Add support for filtering tags and values out of a view, using !tag and field!=value. Note that negated globs are not supported. Would have complicated the code to add them, without changing the data type serialization in a non-backwards-compatible way. This commit was sponsored by Denver Gingerich.	2014-03-02 14:53:19 -04:00
Joey Hess	7ac37a7854	Probe for quvi version at run time. Overhead: git annex addurl runs quvi --version once. And more bloat to Annex state..	2014-02-28 14:54:02 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctmp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	06e9080f01	metadata: Field names are now case insensitive.	2014-02-25 18:45:09 -04:00
Joey Hess	b9147b4012	fix test to work on Windows	2014-02-25 18:09:45 -04:00
Joey Hess	003fc2b7e1	add UrlOptions sum type	2014-02-24 22:00:25 -04:00
Joey Hess	c69d6eb035	Make annex.web-options be used in several places that call curl.	2014-02-24 21:29:37 -04:00
Joey Hess	8d5158fa31	Preserve metadata when staging a new version of an annexed file. Performance impact: When adding a large tree of new files, this needs to do some git cat-file queries to check if any of the files already existed and might need a metadata copy. I tried a benchmark in a copy of my sound repository (so there was already a significant git tree to check against). Adding 10000 small files, with a cold cache: before: 1m48.539s after: 1m52.791s So, impact is 0.0004 seconds per file added. Which seems acceptable, so did not add some kind of configuration to enable/disable this. This commit was sponsored by Lisa Feilen.	2014-02-24 14:41:33 -04:00
Joey Hess	7498c5dd96	annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files	2014-02-23 00:08:29 -04:00
Joey Hess	fa6f553083	exclude derived metadata when extracting metadata from a viewed file	2014-02-22 18:16:28 -04:00
Joey Hess	079b35a1a8	views: add automatically constructed file location metadata When constructing views, metadata is available about the location of the file in the view's reference branch. Allows incorporating parts of the directory hierarchy in a view. For example `git annex view tag=* podcasts/=` makes a view in the form tag/showname. Performance impact: I benchmarked git annex view tag= in the conference proceedings repo to take 6.459s before this change, and 6.544s after. FWIW, I considered making the syntax for this be podcasts/, which might be easier for the user to learn. However, I think it's not as good: The user has to then juggle two different syntaxes, and podcasts/* will be expanded by the shell so they also need to quote it, while podcasts/=* is unlikely to be expanded by the shell. * It would allow for things like podcasts// and *.mp3 which do not map well into views. This commit was sponsored by Aurélien Pinceaux.	2014-02-22 16:27:53 -04:00
Joey Hess	73a5245502	prune imports	2014-02-22 14:58:05 -04:00
Joey Hess	cc0a576ab0	change directory encoding in ViewedFile such that the original directory can be extracted from it	2014-02-22 14:54:53 -04:00
Joey Hess	1435c4f149	factor out new module	2014-02-22 13:35:50 -04:00
Joey Hess	24f8136504	--metadata field=value can now use globs to match, and matches case insensitively, the same as git annex view field=value does. Also refactored glob code into its own module.	2014-02-21 18:34:34 -04:00
Joey Hess	db48b8a4a3	unused: Fix to actually detect unused keys when in direct mode.	2014-02-20 13:53:49 -04:00
Joey Hess	dd7b99c860	add tip about metadata driven views (and more flexible view filtering) While writing this documentation, I realized that there needed to be a way to stay in a view like tag=* while adding a filter like tag=work that applies to the same field. So, there are really two ways a view can be refined. It can have a new "field=explicitvalue" filter added to it, which does not change the "shape" of the view, but narrows the files it shows. Or, it can have a new view added, which adds another level of subdirectories. So, added a vfilter command, which takes explicit values to add to the filter, and rejects changes that would change the shape of the view. And, made vadd only accept changes that change the shape of the view. And, changed the View data type slightly; now components that can match multiple metadata values can be visible, or not visible. This commit was sponsored by Stelian Iancu.	2014-02-19 16:29:56 -04:00
Joey Hess	39ebfa1a2e	pre-commit: Update metadata when committing changes to annexed files within a view. So the user can now switch to a view and then move files around within it to manage metadata. For example, moving a file into a new directory when in the tags=* view adds a tag to it. Implementation is fairly efficient. One diff-index, which is no more expensive than the first stage of a git commit, followed by possibly some cat-file --batch traffic to find the key (when deleting a file). Very similar to what's done in direct mode when committing. And like direct mode when updating the WC after a merge, it has to buffer the diff-tree values in order to make 2 passes over them. When not in a view, pre-commit now does one extra git symbolic-ref, which is tiny overhead. This commit was sponsored by Andrew Eskridge.	2014-02-19 14:17:58 -04:00
Joey Hess	1fe6cd3c0d	decruft	2014-02-19 02:32:22 -04:00
Joey Hess	fb266e2da6	make view globs case-insensitive, memoized, and bring back TDFA I was careful to write the code so it's clear how laziness memoizes it, although it's likely that much less explicit currying would have had the same effect. Verified that the memoization works using a Debug.Trace.	2014-02-19 02:30:14 -04:00
Joey Hess	0b7ede2088	reject views with too many nested subdirs	2014-02-19 01:28:48 -04:00
Joey Hess	4e0be2792b	remove Read instance for Ref Removed instance, got it all to build using fromRef. (With a few things that really need to show something using a ref for debugging stubbed out.) Then added back Read instance, and made Logs.View use it for serialization. This changes the view log format.	2014-02-19 01:19:57 -04:00
Joey Hess	72c118152f	fix view changing when in subdir Failed reading some files with relative paths. This is a quick and dirty fix.	2014-02-18 20:57:14 -04:00
Joey Hess	9b51d43318	view: preserve toplevel dotfiles	2014-02-18 20:32:00 -04:00
Joey Hess	5790acc19f	improve view filenames	2014-02-18 20:01:44 -04:00
Joey Hess	67fd06af76	add git annex view command (And a vpop command, which is still a bit buggy.) Still need to do vadd and vrm, though this also adds their documentation. Currently not very happy with the view log data serialization. I had to lose the TDFA regexps temporarily, so I can have Read/Show instances of View. I expect the view log format will change in some incompatible way later, probably adding last known refs for the parent branch to View or something like that. Anyway, it basically works, although it's a bit slow looking up the metadata. The actual git branch construction is about as fast as it can be using the current git plumbing. This commit was sponsored by Peter Hogg.	2014-02-18 18:22:20 -04:00
Joey Hess	103dab702b	better data types	2014-02-17 00:38:33 -04:00
Joey Hess	e806152f77	split out types	2014-02-17 00:18:57 -04:00
Joey Hess	d7a95098fb	tricky view refining code that keeps track of whether the view is widening or narrowing	2014-02-16 22:44:28 -04:00
Joey Hess	410f603383	support globs when built w/o TDFA, just slower	2014-02-16 21:26:57 -04:00
Joey Hess	613f8f02e3	add another quickcheck property, and several edge cases handled	2014-02-16 21:00:12 -04:00
Joey Hess	81628d24c8	simplify type	2014-02-16 17:46:52 -04:00
Joey Hess	9633c67842	filter branches (incomplete) Promising work toward metadata driven filter branches. A few methods to construct them are stubbed out; all the data types and pure code seems good. This commit was sponsored by Walter Somerville.	2014-02-16 17:39:54 -04:00
Joey Hess	2075cdeb59	limiting files based on metadata Note that there is currently no caching, so --metadata foo=bar --metadata tag=blah will currently read the log 2x per file.	2014-02-13 02:24:30 -04:00
Joey Hess	9f7e76130e	add metadata command to get/set metadata Adds metadata log, and command. Note that unsetting field values seems to currently be broken. And in general this has had all of 2 minutes worth of testing. This commit was sponsored by Julien Lefrique.	2014-02-12 21:30:33 -04:00
Joey Hess	2d480602aa	random hlint (to give the autobuilder something new to build)	2014-02-11 00:41:19 -04:00
Joey Hess	43d17632f6	remove workaround for old bug `ef24751922` described a bug moving between remotes in direct mode; I can no longer reproduce it with this strange workaround removed. Also test suite still passes. Hope the broken code just got fixed in the meantime.	2014-02-06 17:36:14 -04:00
Joey Hess	897d877472	work around absNormPath not working on Windows When making git-annex links, we want unix-style paths in the link targets.	2014-02-06 17:17:35 -04:00
Joey Hess	a44e01c29c	--in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}"	2014-02-06 12:43:56 -04:00
Joey Hess	08afe3a1f6	fix failing test case on Windows ensure file being modified is all read before it's opened for write	2014-02-03 10:20:18 -04:00
Joey Hess	1572c460e8	avoid using openFile when withFile can be used Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.	2014-02-03 10:19:06 -04:00
Joey Hess	fd1382f96f	factor out utility function	2014-02-03 10:08:28 -04:00
Joey Hess	a542781f6a	remove some monkey faces	2014-02-01 17:14:38 -04:00
Joey Hess	1669e80e85	Windows: Avoid using unix-compat's rename, which refuses to rename directories. Opened a bug about this: https://github.com/jystic/unix-compat/issues/10	2014-01-29 15:19:03 -04:00
Joey Hess	721cc0cd22	rework annexed object locking in direct mode & support Windows Seems that locking of annexed objects when they're being dropped was broken in direct mode: * When taking the lock before dropping, it created the .git/annex/objects file, as an empty file. It seems that the dropping code deleted that, but that is not right, and for all I know could in some situation cause a corrupted object to leak out. * When the lock was checked, it actually tried to open each direct mode file, and checked if it was locked. Not the same lock used above, and could also fail if some consumer of the file locked it. Fixed this, and added windows support by switching direct mode to lock a .lck file.	2014-01-28 16:43:11 -04:00
Joey Hess	891c85cd88	use locking on Windows This is all the easy cases, where there was already a separate lock file.	2014-01-28 14:42:03 -04:00
Joey Hess	2ecd42b43b	remove debug print just saw it legitimately occur when 2 git-annex were running	2014-01-26 17:04:12 -04:00
Joey Hess	74b101d1dd	reorg	2014-01-26 16:36:31 -04:00
Joey Hess	b93e485ef1	added annex.secure-erase-command config option.	2014-01-24 12:58:52 -04:00
Joey Hess	3518c586cf	fix transfers of key with no associated file Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case..	2014-01-23 16:44:02 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	f2713a3bb9	benchmarked numcopies .gitattributes in preferred content Checking .gitattributes adds a full minute to a git annex find looking for files that don't have enough copies. 2:25 increases to 3:27. I feel this is too much of a slowdown to justify making it the default. So, exposed two versions of the preferred content expression, a slow one and a fast but approximate one. I'm using the approximate one in the default preferred content expressions to avoid slowing down the assistant.	2014-01-21 18:49:25 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	3159da2693	Add and use numcopiesneeded preferred content expression. * Add numcopiesneeded preferred content expression. * Client, transfer, incremental backup, and archive repositories now want to get content that does not yet have enough copies. This means the assistant will make copies of files that don't yet meet the configured numcopies, even to places that would not normally want the file. For example, if numcopies is 4, and there are 2 client repos and 2 transfer repos, and 2 removable backup drives, the file will be sent to both transfer repos in order to make 4 copies. Once a removable drive gets a copy of the file, it will be dropped from one transfer repo or the other (but not both). Another example, numcopies is 3 and there is a client that has a backup removable drive and two small archive repos. Normally once one of the small archives has a file, it will not be put into the other one. But, to satisfy numcopies, the assistant will duplicate it into the other small archive too, if the backup repo is not available to receive the file. I notice that these examples are fairly unlikely setups .. the old behavior was not too bad, but it's nice to finally have it really correct. .. Almost. I have skipped checking the annex.numcopies .gitattributes out of fear it will be too slow. This commit was sponsored by Florian Schlegel.	2014-01-20 17:35:29 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	73c420ffcf	much better command action handling for sync --content	2014-01-20 13:31:03 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	b6ba0bd556	sync --content: New option that makes the content of annexed files be transferred. Similar to the assistant, this honors any configured preferred content expressions. I am not entirely happy with the implementation. It would be nicer if the seek function returned a list of actions which included the individual file gets and copies and drops, rather than the current list of calls to syncContent. This would allow getting rid of the somewhat redundant display of "sync file [ok|failed]" after the get/put display. But, to do that, withFilesInGit would need to somehow be able to construct such a mixed action list. And it would be less efficient than the current implementation, which is able to reuse several values between eg get and drop. Note that currently this does not try to satisfy numcopies when getting/putting files (numcopies are of course checked when dropping files!) This makes it like the assistant, and unlike get --auto and copy --auto, which do duplicate files when numcopies is not yet satisfied. I don't know if this is the right decision; it only seemed to make sense to have this parallel the assistant as far as possible to start with, since I know the assistant works. This commit was sponsored by Øyvind Andersen Holm.	2014-01-19 17:49:54 -04:00
Joey Hess	8ce515ffe4	improve matcher data type to allow matching Keys, instead of just files (no behavior changes)	2014-01-18 14:51:55 -04:00
Joey Hess	207ac67aaa	avoid needing a build-dep on hxt for Data.AssocList	2014-01-14 16:42:10 -04:00
Joey Hess	d07f2d7865	Fix a long-standing bug that could cause the wrong index file to be used when committing to the git-annex branch, if GIT_INDEX_FILE is set in the environment. This typically resulted in git-annex branch log files being committed to the master branch and later showing up in the work tree. (These log files can be safely removed.)	2014-01-14 15:36:33 -04:00
Joey Hess	78c7c54fdb	also check diskreserve for quvi downloads	2014-01-04 15:38:59 -04:00
Joey Hess	f9e7b6cf61	addurl, importfeed: Honor annex.diskreserve as long as the size of the url can be checked. This adds a http HEAD before the download is done. That was already the case when the assistant was running, and it seems worth it to avoid filling up the whole disk, like happened to my server today.	2014-01-04 15:08:06 -04:00
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new variant is more space efficient (since it lacks the "timestamp=" hack), and easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatibility cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	b1d7474c1d	Auto-upgrade v3 indirect repos to v5 with no changes. This also fixes a problem when a direct mode repo was somehow set to v3 rather than v4, and so the automatic direct mode upgrade to v5 was not done.	2013-12-29 13:06:23 -04:00
Joey Hess	7d5b25515c	Add plumbing-level lookupkey command.	2013-12-15 14:02:23 -04:00
Joey Hess	bef567c31f	Fix direct mode's handling when modifications to non-annexed files are pulled from a remote. A bug prevented the files from being updated in the work tree, and this caused the modification to be reverted.	2013-12-12 15:57:09 -04:00
Joey Hess	c160bf9d88	format comment	2013-12-12 15:16:44 -04:00
Joey Hess	03932212ec	Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree. The assistant's commit code also always avoids git commit, for simplicity. Indirect mode sync still does a git commit -a to catch unstaged changes. Note that this means that direct mode sync no longer runs the pre-commit hook or any other hooks git commit might call. The git annex pre-commit hook action for direct mode is however explicitly run. (The assistant already ran git commit with hooks disabled, so no change there.)	2013-12-01 13:59:45 -04:00
Joey Hess	b25abdb3e6	fix reversion in relative paths to local remotes of direct mode repos `0980f3dae6` broke support for local remotes from direct mode repos, because the relative path was taken to be from the gitdir, rather than from the work tree.	2013-11-26 19:33:26 -04:00
Joey Hess	f913deab78	move programPath out of Config.Files to Annex.Path This works around horribleness in the Mavericks cpp, which falls over on the #if when configure is running. Moving it avoids the file being built at that point. But it's also a location that makes sense..	2013-11-24 16:03:03 -04:00
Joey Hess	e563c7e6f4	fsck distribution key	2013-11-23 21:58:39 -04:00
Joey Hess	b8e74bf489	fix standalone build of this module	2013-11-22 12:21:37 -04:00
Joey Hess	b876df6fdb	Ensure that core.sharedrepository is honored when creating the .git/annex directory.	2013-11-18 18:20:20 -04:00
Joey Hess	310c549b5a	Ensure execute bit is set on directories when core.sharedrepository is set.	2013-11-18 18:13:09 -04:00
Joey Hess	5561b46416	fix windows build	2013-11-18 11:05:16 -04:00
Joey Hess	d48b00ebed	Direct mode .git/annex/objects directories are no longer left writable Because that allowed writing to symlinks of files that are not present, which followed the link and put bad content in an object location. fsck: Fix up .git/annex/object directory permissions. This commit was sponsored by an anonymous bitcoin donor.	2013-11-15 14:52:03 -04:00
Joey Hess	b0f85b3e22	Fix direct mode merge bug when a direct mode file was deleted and replaced with a directory. An ordering problem caused the directory to not get created in this case. Thanks to Tim for the test cases.	2013-11-15 13:40:12 -04:00
Joey Hess	59ecc804cd	add new status command This works for both direct and indirect mode. It may need some performance tuning. Note that unlike git status, it only shows the status of the work tree, not the status of the index. So only one status letter, not two .. and since files that have been added and not yet committed do not differ between the work tree and the index, they are not shown. Might want to add display of the index vs the last commit eventually. This commit was sponsored by an unknown bitcoin contributor, whose contribution has been going up lately! ;)	2013-11-07 14:07:25 -04:00
Joey Hess	00c91816fb	Merge branch 'master' into directguard	2013-11-06 13:02:35 -04:00
Joey Hess	81117e8a9d	typo	2013-11-06 12:39:14 -04:00
Joey Hess	ee23be55fd	Fix exception handling bug that could cause .git/annex/index to be used for git commits outside the git-annex branch. Known to affect git-annex when used with the git shipped with Ubuntu 13.10.	2013-11-06 12:21:50 -04:00
Joey Hess	3802f2f270	work around lack of receive.denyCurrentBranch in direct mode Now that direct mode sets core.bare=true, git's normal prohibition about pushing into the currently checked out branch doesn't work. A simple fix for this would be an update hook which blocks the pushes.. but git hooks must be executable, and git-annex needs to be usable on eg, FAT, which lacks x bits. Instead, enabling direct mode switches the branch (eg master) to a special purpose branch (eg annex/direct/master). This branch is not pushed when syncing; instead any changes that git annex sync commits get written to master, and it's pushed (along with synced/master) to the remote. Note that initialization has been changed to always call setDirect, even if it's just setDirect False for indirect mode. This is needed because if the user has just cloned a direct mode repo, that nothing has synced with before, it may have no master branch, and only an annex/direct/master. Resulting in that branch being checked out locally too. Calling setDirect False for indirect mode moves back out of this branch, to a new master branch, and ensures that a manual "git push" doesn't push changes directly to the annex/direct/master of the remote. (It's possible that the user makes a commit w/o using git-annex and pushes it, but nothing I can do about that really.) This commit was sponsored by Jonathan Harrington.	2013-11-05 21:08:31 -04:00
Joey Hess	4510819215	v5 for direct mode, with automatic upgrade This includes storing the current state of the HEAD ref, which git annex sync is going to need, but does not make sync use it.	2013-11-05 17:05:03 -04:00
Joey Hess	0edd9ec03a	refactored hook setup	2013-11-05 15:29:56 -04:00
Joey Hess	230bfa9688	add --want-get and --want-drop options New --want-get and --want-drop options which can be used to test preferred content settings. For example, "git annex find --in . --want-drop"	2013-10-28 14:50:17 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	435ea52f3c	repair command: add handling of git-annex branch and index	2013-10-23 13:00:45 -04:00
Joey Hess	4f871f89ba	git-recover-repository 1/2 done	2013-10-20 17:50:51 -04:00
Joey Hess	19816bca41	update for DiffTree type change (which fixes assistant in subdir confusion bug)	2013-10-17 15:11:21 -04:00
Joey Hess	78acbfeb6a	ensure merge directory is empty before starting merge Don't want some past failed merge to lead to bad results, potentially.	2013-10-16 14:57:58 -04:00
Joey Hess	18f4d1b400	queue downloads of keys that fsck finds with bad content	2013-10-10 17:27:00 -04:00
Joey Hess	267c124f67	run ssh in the directory with its socket when stopping This guarantees that stopping an existing socket never fails. This might be the route out of the mess of needing to worry about socket lengths in general. However, it would need quite a lot of refactoring to make every place in git-annex that runs ssh run it with a cwd that was determined by the location of its connection caching socket. If this wasn't already such a mess, I'd consider even the thought of that API a bad idea..	2013-10-06 21:11:39 -04:00
Joey Hess	6f38426cb8	work around ssh brain-damage The control socket path passed to ssh needs to be 17 characters shorter than the maximum unix domain socket length, because ssh appends stuff to it to make a temporary filename. Closes: #725512 Also, take the shorter of the relative and the absolute paths to the socket. Typically the relative path will be a lot shorter (unless deep inside a subdirectory of the repository), and so using it will avoid flirting with the maximum safe socket lengths in more situations, and so lead to less breakage if all my attempts at fixing this are still buggy.	2013-10-06 20:59:36 -04:00
Joey Hess	f8880c4fe4	Automatically and safely detect and recover from dangling .git/annex/index.lock files, which would prevent git from committing to the git-annex branch, eg after a crash.	2013-10-03 15:43:08 -04:00
Joey Hess	83b4b8d589	rename confusing function The index.lck file is not a lock file. Kept the historical name for now as changing it would be work.	2013-10-03 15:06:58 -04:00
Joey Hess	f2ee4ef86d	ensure that commitBranch is only called when the journal is locked This is not strictly a requirement, since it does not actually update the journal. But it's a nice invariant to enforce.	2013-10-03 14:48:46 -04:00
Joey Hess	56c3f68a53	use types to partially prove correctness of journal locking code My implementation does not guard against double locking of the journal. But it does ensure that the journal is always locked when operated on, by using a type that is only produced by lockJournal, and which is required as a parameter of all functions that operate on the journal. Note that I had to add the fooStale functions for cases where it does not make sense to lock the journal when querying it. I was more concerned about ensuring that anything that modifies the journal is locked. setJournalFile's implementation ensures that any query of the journal will get one value or the other atomically, even if the journal is being changed at the time.	2013-10-03 14:41:57 -04:00
Joey Hess	7a9a16b337	lockJournal when running performTransitions This may not strictly be needed -- the transition code bypasses the journal. However, this ensures that the git-annex branch is only committed with the journal locked. This will allow for further improvements.	2013-10-03 14:37:46 -04:00
Joey Hess	12f6b9693a	Send a git-annex user-agent when downloading urls. Overridable with --user-agent option. Not yet done for S3 or WebDAV due to limitations of libraries used -- neither allows a user-agent header to be specified. This commit sponsored by Michael Zehrer.	2013-09-28 14:35:21 -04:00
Joey Hess	c45f5fbdb3	indirect: Better behavior when a file in direct mode is not owned by the user running the conversion.	2013-09-25 15:29:56 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	3588729f0d	completely solve catKey memory leak Since `006cf7976f` was incomplete, not being able to get the right mode of the file when the index differs from HEAD, this is a final workaround. Only buffering the start of the file in this case avoids leaking memory. This does not prevent git-cat-file being asked to output the whole file, which needs to be consumed, and can be slow. But this only happens in a rare edge case.	2013-09-19 20:09:03 -04:00
Joey Hess	006cf7976f	more completely solve catKey memory leak Done using a mode witness, which ensures it's fixed everywhere. Fixing catFileKey was a bear, because git cat-file does not provide a nice way to query for the mode of a file and there is no other efficient way to do it. Oh, for libgit2.. Note that I am looking at tree objects from HEAD, rather than the index. Because cat-file cannot show a tree object for the index. So this fix is technically incomplete. The only cases where it matters are: 1. A new large file has been directly staged in git, but not committed. 2. A file that was committed to HEAD as a symlink has been staged directly in the index. This could be fixed a lot better using libgit2.	2013-09-19 16:41:21 -04:00
Joey Hess	eb42bde19a	sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory.	2013-09-19 14:48:42 -04:00
Joey Hess	ab9dd6d8a0	sync: Fix bug that caused direct mode mappings to not be updated when merging files into the tree on Windows.	2013-09-13 13:49:28 -04:00
Joey Hess	a48a4e2f8a	automatically derive an annex-uuid from a gcrypt-uuids	2013-09-05 16:02:39 -04:00
Joey Hess	4079f9cfe8	avoid double commit during transition The second commit had some bad refs which resulted in the race detection code running. But that commit was unnecessary anyway, it only was there to merge in the other refs.	2013-09-03 16:33:15 -04:00
Joey Hess	db83cc82d6	Merge branch 'forget' Conflicts: debian/changelog	2013-09-03 14:36:00 -04:00
Joey Hess	67fda9e669	Honor core.sharedrepository when receiving and adding files in direct mode.	2013-09-03 13:35:49 -04:00
Joey Hess	0831e18372	forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch. Wrote nice pure transition calculator, and ugly code to stage its results into the git-annex branch. Also had to split up several Log modules that Annex.Branch needed to use, but that themselves used Annex.Branch. The transition calculator is limited to looking at and changing one file at a time. While this made the implementation relatively easy, it precludes transitions that do stuff like deleting old url log files for keys that are being removed because they are no longer present anywhere.	2013-08-31 17:51:13 -04:00
Joey Hess	2f57d74534	remove print	2013-08-29 20:28:45 -04:00
Joey Hess	6147652cc6	wording	2013-08-29 16:41:59 -04:00
Joey Hess	6cdac3a003	sync, assistant: Force push of the git-annex branch. Necessary to ensure it gets pushed to remotes after being rewritten by forget. See inline rationales for why I think this is safe!	2013-08-29 14:27:53 -04:00
Joey Hess	c181efe437	use --force in taggedPush This should make the assistant force update its tagged push branch after a transition like git annex forget.	2013-08-29 13:31:29 -04:00
Joey Hess	336d5ec349	Merge branch 'master' into forget	2013-08-29 13:23:02 -04:00
Joey Hess	d3af414568	typo	2013-08-28 17:05:07 -04:00
Joey Hess	4a915cd3cd	add forget command Works, more or less. --dead is not implemented, and so far a new branch is made, but keys no longer present anywhere are not scrubbed. git annex sync fails to push the synced/git-annex branch after a forget, because it's not a fast-forward of the existing synced branch. Could be fixed by making git-annex sync use assistant-style sync branches.	2013-08-28 16:41:13 -04:00
Joey Hess	fcd5c167ef	untested transition detection on merging, and transition running code	2013-08-28 15:57:42 -04:00
Joey Hess	46b6d75274	Youtube support! (And 53 other video hosts) When quvi is installed, git-annex addurl automatically uses it to detect when a page is a video, and downloads the video file. web special remote: Also support using quvi, for getting files, or checking if files exist in the web. This commit was sponsored by Mark Hepburn. Thanks!	2013-08-22 18:50:43 -04:00
Joey Hess	412dcb8017	Fix bug that caused typechanged symlinks to be assumed to be unlocked files, so they were added to the annex by the pre-commit hook.	2013-08-22 13:57:07 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	ddd46db09a	Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. Started with a problem when running addurl on a really long url, because the whole url is munged into the filename. Ended up doing a fairly extensive review for places where filenames could get too large, although it's hard to say I've not missed any.. Backend.Url had a 128 character limit, which is fine when the limit is 255, but not if it's a lot shorter on some systems. So check the pathconf() limit. Note that this could result in fromUrl creating different keys for the same url, if run on systems with different limits. I don't see this is likely to cause any problems. That can already happen when using addurl --fast, or if the content of an url changes. Both Command.AddUrl and Backend.Url assumed that urls don't contain a lot of multi-byte unicode, and would fail to truncate an url that did properly. A few places use a filename as the template to make a temp file. While that's nice in that the temp file name can be easily related back to the original filename, it could lead to `git annex add` failing to add a filename that was at or close to the maximum length. Note that in Command.Add.lockdown, the template is still derived from the filename, just with enough space left to turn it into a temp file. This is an important optimisation, because the assistant may lock down a bunch of files all at once, and using the same template for all of them would cause openTempFile to iterate through the same set of names, looking for an unused temp file. I'm not very happy with the relatedTemplate hack, but it avoids that slowdown. Backend.WORM does not limit the filename stored in the key. I have not tried to change that; so git annex add will fail on really long filenames when using the WORM backend. It seems better to preserve the invariant that a WORM key always contains the complete filename, since the filename is the only unique material in the key, other than mtime and size. Since nobody has complained about add failing (I think I saw it once?) on WORM, probably it's ok, or nobody but me uses it. There may be compatibility problems if using git annex addurl --fast or the WORM backend on a system with the 255 limit and then trying to use that repo in a system with a smaller limit. I have not tried to deal with those. This commit was sponsored by Alexander Brem. Thanks!	2013-07-30 19:18:29 -04:00
Joey Hess	7b0970b340	Fix inverted logic in last release's fix for data loss bug, that caused git-annex sync on FAT or other crippled filesystems to add symlink standin files to the annex.	2013-07-30 16:08:09 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	6ae2637eb1	For long hostnames, use a hash of the hostname to generate the socket file for ssh connection caching. This is ok to do now that the socket filename never needs to be mapped back to a hostname. Short hostnames will still appear in the clear, which is less obfuscated. So this cannot possibly make ssh connection caching fail for a hostname it used to work for.	2013-07-22 15:09:41 -04:00
Joey Hess	c6a020ad1f	stop cached ssh connection w/o needing to look up host and port Turns out that with -O stop -S socketfile, ssh does not need the real hostname, or port to be specified. This is because it simply talks to the ssh behind the socket and tells it to stop. So, can eliminate the conversion back from a socketfile to host and port. Which will allow using shorter filenames for sockets in the future.	2013-07-21 14:14:54 -04:00
Joey Hess	ecdfa40cbe	avoid false positives when detecting core.symlinks=false symlink standin files If the file is > 8192 bytes, it's certainly not a symlink file. And if it contains nuls or newlines or whitespace, it's certainly not a link to annexed content. But it might be a tarball containing a git-annex repo.	2013-07-20 19:28:02 -04:00
Joey Hess	ae341c1a37	avoid reading files that are not symlinks when core.symlinks=false This hack is only needed on FAT filesystems, so there's no point in doing it the rest of the time. And it's possible for there to be a false positive, so it's best to avoid the hack when possible.	2013-07-20 19:14:29 -04:00
Joey Hess	3e422cb5fa	fix uninit to delete content from annex when it ended up hard linked back to the work tree	2013-07-18 13:30:12 -04:00
Joey Hess	c1307b1388	fsck: Don't claim to fix direct mode when run on a symlink whose content is not present.	2013-07-08 17:29:42 -04:00
Joey Hess	d84a000e92	detect system with no dot in FQDN, where git commit will fail, and workaround Sigh, git is so fragile. Or rather, across the set of systems that use git-annex, there are so many horribly broken systems..	2013-07-05 12:24:28 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	72ab02ca48	avoid failure creating inode sentinal file Test suite on windows failed running git annex init in a bare clone of an annexed repo. The annex directory didn't exist when it tried to write the inode sentinal file.	2013-06-18 15:38:17 -04:00
Joey Hess	1312cffad0	Revert "Windows: Ssh connection caching is now supported." Yeah, that didn't actually work. Got error messages like it couldn't read from the control socket, so probably ssh doesn't really support that on Windows, at least the cygwin ssh build I'm using.	2013-06-17 22:13:28 -04:00
Joey Hess	07a17f58b7	Windows: Ssh connection caching is now supported. Turns out the socket stuff just works on windows.	2013-06-17 22:05:49 -04:00
Joey Hess	d80a0f62a4	avoid lazy read of file contents On Windows, that means the file could still be open when later code wants to delete it, which fails. Since we're only reading 8k anyway, just read it, strictly. However, avoid reading the whole file strictly, so no getContentsStrict here.	2013-06-17 21:12:09 -04:00
Joey Hess	b7674b464b	typo in comment	2013-06-17 20:45:04 -04:00
Joey Hess	0527c74c0f	assistant: In direct mode, objects are now only dropped when all associated files are unwanted. This avoids a repeated drop/get loop of a file that has a copy in an archive directory, and a copy not in an archive directory. (Indirect mode still has some buggy behavior in this area, since it does not keep track of associated files.) Closes: #712060	2013-06-15 14:44:43 -04:00
Joey Hess	92f036fcb4	avoid warnings when built with ghc 7.6	2013-06-02 15:01:58 -04:00
Joey Hess	eba9ee5bc6	remove debug print	2013-05-27 11:18:18 -04:00
Joey Hess	3b1aedea3d	Merge branch 'robustness'	2013-05-25 15:22:18 -04:00
Joey Hess	5eeea0fac9	make direct mode merge cleanup more robust If the cleanup of a single file fails for some reason, continue to clean up other files. This could happen because of a race. The merge pulls in a change to a file, which gets changed locally at the same time.	2013-05-25 15:22:16 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	729eab1f89	assistant: Work around git-cat-file's not reloading the index after files are staged. Argh.	2013-05-25 00:37:41 -04:00
Joey Hess	2b14fe2c98	refactor	2013-05-24 23:07:26 -04:00
Joey Hess	08c03b2af3	XMPP: Avoid redundant and unnecessary pushes. Note that this breaks compatibility with previous versions of git-annex, which will refuse to accept any XMPP pushes from this version.	2013-05-21 18:24:29 -04:00
Joey Hess	0cb34f3caa	update inode cache after copying content This was also tripped by the test suite's automatic conflict resolution test. Which also shows BTW that an unnecessary copy of content is done sometimes when merging in direct mode. Not going to try to speed that up now.	2013-05-20 17:11:40 -04:00
Joey Hess	d88be65495	didn't quite get removeDirect right before, this passes test suite	2013-05-20 16:28:33 -04:00
Joey Hess	3d8355d984	Fix a bug in the git-annex branch handling code that could cause info from a remote to not be merged and take effect immediately. This bug was turned up by the test suite, running fsck in direct mode. A repository was cloned, was put into direct mode, was fscked, and fsck incorrectly said that no copy existed of a file, that was actually present in origin. This turned out to occur because fsck first did an Annex.Branch.change, recording that it did not locally have the file. That was recorded in the journal. Since neither the git annex direct nor the fsck had yet needed to read any info from the branch, but had only made changes to it, the origin/git-annex branch was not yet merged in. So the journal got a location log entry written to it, but this did not include the location log info for the origin. When fsck then did an Annex.Branch.get, it trusted the journal was consistent, and returned it, again w/o merging from origin/git-annex. This latter behavior is the actual bug. Refer to commit `e9bfa8eaed` for the thinking behind it being ok to make a change to a file on the branch, without first merging the branch. That thinking still stands. However, it means that files in the journal cannot be trusted to be consistent if the branch has not been merged. So, to fix, just ensure the branch gets merged, even when reading from the journal. In tests, this does not seem to cause any extra merging. Except, of course, in the one case described above. But git annex add, etc, are able to make changes w/o first merging the branch.	2013-05-20 15:14:59 -04:00
Joey Hess	4c22c2261f	minor optimisation and warning fix	2013-05-20 13:58:41 -04:00
Joey Hess	f4ba19f2b8	direct mode bug fix: After a conflicted merge was automatically resolved, the content of a file that was already present could incorrectly be replaced with a symlink. The bug was in movein, which just replaceFile'd the file with a symlink, even if it already had the desired content, before trying to pull the content out of the annex and replace the symlink with it. That was ok-ish for non conflicted merges, where if the file existed it would be an old version of the content. But for conflicted merges, the automatic merge resolver has already run, and will have already put the desired content into the file for the local variant. Also, made removeDirect not trust that the associated files map is correct. Only if it can verify that another file has the content will it not move it into .git/annex/objects.	2013-05-20 13:41:09 -04:00
Joey Hess	345ee4f37c	Switch to MonadCatchIO-transformers for better handling of state while catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into effect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and wants state changes to be thrown away. Perhaps the state changes leave the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state!
The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.	2013-05-19 14:16:36 -04:00
Joey Hess	630a8b9ad2	warning	2013-05-19 12:43:44 -04:00
Joey Hess	1b616c5d37	improve handling of receiving object in direct mode when associated files are modified Before, if a direct mode repo had one or more associated files that were modified, moving the object into it would overwrite the associated files with the pristine object. Now, modified associated files are left unchanged. To ensure that, when an object is moved into a direct mode repo, it's not thrown away, it gets stored in indirect mode.	2013-05-17 16:25:18 -04:00
Joey Hess	94cb037aa3	store copy in inode cache too	2013-05-17 16:16:10 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	3240006c56	fix android build, broken by changes for windows port	2013-05-16 11:52:48 -04:00
Joey Hess	aba49995b6	Merge branch 'master' into windows	2013-05-15 19:18:04 -04:00
Joey Hess	4829eae883	fix toDirectGen bug introduced in `247b7e9e58`	2013-05-15 19:15:40 -04:00
Joey Hess	c62b54d80d	start one git-cat-file per index file This reverts `1c83b6c439` and properly fixes the issue discussed there. This makes git-annex behave much nicer in direct mode.	2013-05-15 18:46:38 -04:00
Joey Hess	25cb9a48da	fix the day's Windows permissions damage	2013-05-14 20:15:14 -04:00
Joey Hess	8a2ff023a3	convert from internal git path when checking symlink standin file	2013-05-14 15:08:40 -05:00
Joey Hess	15af92291f	Merge remote-tracking branch 'gnu/windows' into windows	2013-05-14 14:21:49 -05:00
Joey Hess	fee6cd4635	fix imports	2013-05-14 14:21:35 -05:00
Joey Hess	e7936b1a34	always try to read symlink; only fall back to looking inside file On Windows with Cygwin, checking out a git-annex repo will create symlinks on disk, so we need to always try to read the symlink, even when core.symlinks says they're not supported.	2013-05-14 14:18:47 -04:00
Joey Hess	17952a893e	fix imports	2013-05-14 13:53:29 -04:00
Joey Hess	43f2de8522	Merge branch 'windows' of git://git-annex.branchable.com into windows	2013-05-13 20:11:30 -05:00
Joey Hess	1093302eba	read inode cache file strictly to avoid failure to drop on windows Seems that Windows doesn't allow deleting a file that the same process has open. Here the inode cache file was read and the value from it gets used later. But due to laziness, the old file is still open when it gets deleted. Adding strictness avoids this problem. Of course, the file is small, so it's no problem to read it all strictly, so this is probably an improvement even outside of Windows.	2013-05-13 19:29:52 -05:00
Joey Hess	13b629c208	fix warnings	2013-05-13 15:30:18 -04:00
Joey Hess	25a8d4b11c	rename module	2013-05-12 19:19:28 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	2f3ce4c02f	fix	2013-05-12 15:43:59 -05:00
Joey Hess	838b984797	deal with dos path separators	2013-05-12 15:37:32 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	763cbda14f	fixup #if 0 stubs to use #ifndef mingw32_HOST_OS That's needed in files used to build the configure program. For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type. I may search and replace it to use the mingw32_HOST_OS thing later.	2013-05-10 16:57:21 -05:00
Joey Hess	6c74a42cc6	stub out POSIX stuff	2013-05-10 16:29:59 -05:00
Joey Hess	adde00f4f3	git-annex-shell: Ensure that received files can be read. Files transferred from some Android devices may have very broken permissions as received.	2013-05-06 17:30:57 -04:00
Joey Hess	247b7e9e58	direct: Fix a bug that could cause some files to be left in indirect mode. It's possible for files in indirect mode to have a direct mode mapping file. Probably from when they were in direct mode. In this case, toDirectGen tried to copy the content from the direct mode file that the mapping said had it. But, being in indirect mode, it didn't really have the content. So it did nothing. This fix makes it always move the content from .git/annex/objects/ when it's there.	2013-05-06 12:43:03 -04:00
Joey Hess	543ffa5b9f	work around git/environment/gecos/android suck I don't know why, but I can't seem to set the environment variables inside git-annex to work around the git error caused by android's crappy username and hostname settings. This workaround works, and that's all that's good about it.	2013-05-03 14:08:26 -04:00
Joey Hess	e23a7598e2	set EMAIL when GECOS workaround is needed Git fails on Android, because it gets some weird domain for local host like "localhost.(none)". This works around that. I made it always set EMAIL when GECOS workaround was needed (unless EMAIL is already set). It might be nicer to try to get the hostname.domain as git does, and only set it if that fails. But I don't want to be stuck trying to exactly duplicate whatever git is doing.	2013-05-03 11:52:04 -04:00
Joey Hess	0807211a67	thaw content directory in direct mode too A content directory can be frozen in direct mode. One way this can happen is if the content is transferred before direct mode has a mapping for it, so it's stored in the content directory. So, we need to thaw the content directory before doing things with it.	2013-04-30 19:33:43 -04:00
Joey Hess	11ca4cee34	refactor	2013-04-30 19:09:36 -04:00
Joey Hess	0ae8c82c53	per-IA-item content directories	2013-04-25 23:44:55 -04:00
Joey Hess	07580dc3df	sync: Bug fix, avoid adding to the annex the dummy symlinks used on crippled filesystems. The root of the problem is that toInodeCache sees a non-symlink, and so goes on and generates a new inode cache for the dummy symlink. Any place that toInodeCache, or sameFileStatus, or genInodeCache are called may need to deal with this case. Although many of them are ok. For example, prepSendAnnex calls sameInodeCache, which calls genInodeCache.. but if the file content is not present, the InodeCache generated for its standin file is appropriately not the same, and so it returns Nothing. I've audited some, but have to say I'm not happy with this; it should be handled at the type level somehow, or a toInodeCache wrapper be used that is aware of dummy symlinks. (The Watcher already dealt with it, via the guardSymlinkStandin function.)	2013-04-23 17:14:28 -04:00
Joey Hess	8a2d1988d3	expose Control.Monad.join I think I've been looking for that function for some time. Ie, I remember wanting to collapse Just Nothing to Nothing.	2013-04-22 20:24:53 -04:00
Joey Hess	9cb223a8b3	Detect systems that have no user name set in GECOS, and also don't have user.name set in git config, and put in a workaround so that commits to the git-annex branch (and the assistant) will still succeed despite git not liking the system configuration.	2013-04-22 15:36:34 -04:00
guilhem	a1eded8641	Allow rsync to use other remote shells. Introduced a new per-remote option 'annex-rsync-transport' to specify the remote shell that is to be used with rsync. In case the value is 'ssh', connections are cached unless 'sshcaching' is unset.	2013-04-13 19:26:24 -04:00
Joey Hess	4f5ceffead	implement massReplace This looks at the string one char at a time, which is hardly efficient.. but more than good enough for expanding variables in relatively short command lines.	2013-04-08 23:56:37 -04:00
Joey Hess	d440b6047b	Added annex.web-download-command setting.	2013-04-08 23:34:05 -04:00
Joey Hess	602baae12e	Bugfix: Direct mode no longer repeatedly checksums duplicated files. Fixed by storing a list of cached inodes for a key, instead of just one. Backwards compatibility note: An old git-annex version will fail to parse an inode cache file that has been written by a new version, and has multiple items. It will succeed if just one. So old git-annexes will have even worse behavior when there are duplicated files, if that is possible. I don't think it will be a problem. (Famous last words.) Also, note that it doesn't expire old and unused inode caches for a key. It would be possible to add this if needed; just look through the associated files for a key and if there are more cached inodes, throw out any not corresponding to associated files. Unless a file is being copied repeatedly and the old copy deleted, this lack of expiry should not be a problem.	2013-04-06 16:07:25 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	8a5b397ac4	hlint	2013-04-03 03:52:41 -04:00
Joey Hess	0b57113c42	cleanup	2013-04-02 19:45:52 -04:00
Joey Hess	38d61f934d	Update working tree files fully atomically This avoids commit churn by the assistant when eg, replacing a file with a symlink. But, just as importantly, it prevents the working tree being left with a deleted file if git-annex, or perhaps the whole system, crashes at the wrong time. (It also probably avoids confusing displays in file managers.)	2013-04-02 15:02:00 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	75a1c2f91a	cleanup debug print	2013-03-28 14:18:26 -04:00
Joey Hess	80c8c0e62a	comment typo	2013-03-18 13:17:43 -04:00
Joey Hess	7a77f98576	move comment to right place	2013-03-18 11:18:04 -04:00
Joey Hess	b3d3ece2ab	remove old debug print	2013-03-16 17:04:48 -04:00
Joey Hess	f7de51e8b6	Bugfix: Fix bug in inode cache sentinal check, which broke copying to local repos if the repo being copied from had moved to a different filesystem or otherwise changed all its inodes	2013-03-12 16:41:54 -04:00
Joey Hess	61c5e8736c	detect renames during commit, and .. um, do nothing special because it's lunch time But I'm well set up to fast-track direct mode adds for renames now.	2013-03-11 12:56:47 -04:00
Joey Hess	40df015d90	remove Eq instance for InodeCache There are two types of equality here, and which one is right varies, so this forces me to consider and choose between them. Based on this, I learned that the commit in git annex sync was always doing a strong comparison, even when in a repository where the inodes had changed. Fixed that.	2013-03-11 02:57:48 -04:00
Joey Hess	cbb6e1fae4	tag xmpp pushes with jid This fixes the issue mentioned in the last commit. Turns out just collecting UUID of clients behind a XMPP remote is insufficient (although I should probably still do it for other reasons), because a single remote repo might be connected via both XMPP and local pairing. So a way is needed to know when a push was received from any client using a given XMPP remote over XMPP, as opposed to via ssh.	2013-03-06 16:29:19 -04:00
Joey Hess	c23ea9e311	assistant: Get back in sync with XMPP remotes after network reconnection, and on startup. Make manualPull send push requests over XMPP. When reconnecting with remotes, those that are XMPP remotes cannot immediately be pulled from and scanned, so instead maintain a set of (probably) desynced remotes, and put XMPP remotes on it. (This set could be used in other ways later, if we can detect we're out of sync with other types of remotes.) The merger handles detecting when a XMPP push is received from a desynced remote, and triggers a scan then, if they have in fact diverged. This has one known bug: A single XMPP remote can have multiple clients behind it. When this happens, only the UUID of one client is recorded as the UUID of the XMPP remote. Pushes from the other XMPP clients will not trigger a scan. If the client whose UUID is expected responds to the push request, it'll work, but when that client is offline, we're SOL.	2013-03-06 15:09:31 -04:00
Joey Hess	974d075108	Run ssh with -T to avoid tty allocation and any login scripts that may do undesired things with it.	2013-03-04 23:36:07 -04:00
Joey Hess	0c13d3065e	git subcommand cleanup Pass subcommand as a regular param, which allows passing git parameters like -c before it. This was already done in the piping set of functions, but not the command running set.	2013-03-03 13:39:07 -04:00
Joey Hess	cbd53b4a8c	Makefile now builds using cabal, taking advantage of cabal's automatic detection of appropriate build flags. The only thing lost is ./ghci Speed: make fast used to take 20 seconds here, when rebuilding from touching Command/Unused.hs. With cabal, it's 29 seconds.	2013-02-27 02:39:22 -04:00
Joey Hess	2d9c046dea	annex.version is now set to 4 for direct mode repositories To avoid old versions of git-annex getting confused. There is no upgrade required though. We switch back to 3 when going from direct to indirect.	2013-02-26 15:13:10 -04:00
Joey Hess	e423190b11	fix	2013-02-24 17:40:14 -04:00
Joey Hess	6ff1ce76b7	hopefully fix a bug	2013-02-24 17:21:04 -04:00
Joey Hess	afb21353c8	remove debug print	2013-02-23 14:34:02 -04:00
Joey Hess	051476c2a9	squelch warning	2013-02-22 18:22:12 -04:00
Joey Hess	08854afa10	fix inverted logic	2013-02-22 17:01:48 -04:00
Joey Hess	4689fbde35	fix sameInodeCache to check the inode change sentinal This should fix the problem where the assistant, on Android, re-adds every file on startup.	2013-02-22 15:19:28 -04:00
Joey Hess	faa9d3c22b	work around broken getEnvironment on Android in the most important place: git annex init This resulted in a lot of user complaints that git annex init had git telling them they needed to run git config --global user.email .. which didn't work because even HOME was not passed into git.	2013-02-22 14:47:29 -04:00
Joey Hess	6f9be431e6	only create inode sentinal file when initializing a new repo	2013-02-20 13:55:53 -04:00
Joey Hess	00b465e213	shorter directory to external ssh socket Before it was too long to be used.	2013-02-19 17:31:08 -04:00
Joey Hess	624e34649f	Direct mode: Support filesystems like FAT which can change their inodes each time they are mounted.	2013-02-19 17:31:03 -04:00
Joey Hess	0f4cc559a7	Android: Support ssh connection caching.	2013-02-19 14:57:45 -04:00
Joey Hess	d799ef3182	set fileSystemEncoding when reading files that might be binary	2013-02-18 17:19:37 -04:00
Joey Hess	422dd28f0b	hlint	2013-02-18 02:39:40 -04:00
Joey Hess	9aa979edbd	types	2013-02-18 02:35:38 -04:00
Joey Hess	d7c93b8913	fully support core.symlinks=false in all relevant symlink handling code Refactored annex link code into nice clean new library. Audited and dealt with calls to createSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ createSymbolicLink linktarget file only when core.symlinks=true Assistant/WebApp/Configurators/Local.hs: createSymbolicLink link link test if symlinks can be made Command/Fix.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/FromKey.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/Indirect.hs: liftIO $ createSymbolicLink l f refuses to run if core.symlinks=false Init.hs: createSymbolicLink f f2 test if symlinks can be made Remote/Directory.hs: go [file] = catchBoolIO $ createSymbolicLink file f >> return True fast key linking; catches failure to make symlink and falls back to copy Remote/Git.hs: liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True ditto Upgrade/V1.hs: liftIO $ createSymbolicLink link f v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to readSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ catchMaybeIO $ readSymbolicLink file only when core.symlinks=true Assistant/Threads/Watcher.hs: ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) code that fixes real symlinks when inotify sees them It's ok to not fix pseudo-symlinks. Assistant/Threads/Watcher.hs: mlink <- liftIO (catchMaybeIO $ readSymbolicLink file) ditto Command/Fix.hs: stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do command only works in indirect mode Upgrade/V1.hs: getsymlink = takeFileName <$> readSymbolicLink file v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to isSymbolicLink.
(Typically used with getSymbolicLinkStatus, but that is just used because getFileStatus is not as robust; it also works on pseudolinks.) Remaining calls are all safe, because: Assistant/Threads/SanityChecker.hs: \| isSymbolicLink s -> addsymlink file ms only handles staging of symlinks that were somehow not staged (might need to be updated to support pseudolinks, but this is only a belt-and-suspenders check anyway, and I've never seen the code run) Command/Add.hs: if isSymbolicLink s \|\| not (isRegularFile s) avoids adding symlinks to the annex, so not relevant Command/Indirect.hs: \| isSymbolicLink s -> void $ flip whenAnnexed f $ only allowed on systems that support symlinks Command/Indirect.hs: whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do ditto Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f used to find unlocked files, only relevant in indirect mode Utility/FSEvents.hs: \| Files.isSymbolicLink s = runhook addSymlinkHook $ Just s Utility/FSEvents.hs: \| Files.isSymbolicLink s -> Utility/INotify.hs: \| Files.isSymbolicLink s -> Utility/INotify.hs: checkfiletype Files.isSymbolicLink addSymlinkHook f Utility/Kqueue.hs: \| Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change all above are lower-level, not relevant Audited and dealt with calls to isSymLink. Remaining calls are all safe, because: Annex/Direct.hs: \| isSymLink (getmode item) = This is looking at git diff-tree objects, not files on disk Command/Unused.hs: \| isSymLink (LsTree.mode l) = do This is looking at git ls-tree, not file on disk Utility/FileMode.hs:isSymLink :: FileMode -> Bool Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode low-level Done!!	2013-02-17 16:43:14 -04:00
Joey Hess	397082013a	proper fix for dropunused Now getKeysPresent checks that the key's content, not only its directory, exists. In direct mode, the inode cache file is used as a standin for the content. removeAnnex always removes the inode cache file, and drop and move --from always call removeAnnex, even if the object does not seem to be inAnnex, to ensure it's always deleted.	2013-02-15 17:58:49 -04:00
Joey Hess	5a8fb26d0a	Revert "Clean up direct mode cache and mapping info when dropping keys." This reverts commit `57780cb3a4`. This was buggy, it caused the direct mode cache to be lost when dropping keys, so when the file is gotten back, it's stored in indirect mode. Note to self: Do not attempt bug fixes at 6 am!	2013-02-15 16:37:57 -04:00
Joey Hess	5ea4b91fb4	start to support core.symlinks=false Utility functions to handle no symlink mode, and converted Annex.Content to use them; still many other places to convert.	2013-02-15 16:03:11 -04:00
Joey Hess	7ce30b534f	add: Improved detection of files that are modified while being added. In indirect mode, now checks the inode cache to detect changes to a file. Note that a file can still be changed if a process has it open for write, after landing in the annex. In direct mode, some checking of the inode cache was done before, but from a much later point, so fewer modifications could be detected. Now it's as good as indirect mode. On crippled filesystems, no lock down is done before starting to add a file, so checking the inode cache is the only protection we have.	2013-02-14 16:54:36 -04:00
Joey Hess	a52f8f382b	split out Utility.InodeCache	2013-02-14 16:17:40 -04:00
Joey Hess	47477b2807	crippled filesystem support, probing and initial support git annex init probes for crippled filesystems, and sets direct mode, as well as `annex.crippledfilesystem`. Avoid manipulating permissions of files on crippled filesystems. That would likely cause an exception to be thrown. Very basic support in Command.Add for crippled filesystems; avoids the lock down entirely since doing it needs both permissions and hard links. Will make this better soon.	2013-02-14 14:15:26 -04:00
Joey Hess	f202d997f4	Now uses the Haskell uuid library, rather than needing a uuid program. Been meaning to do this for some time; Android port was last straw. Note that newer versions of the uuid library have a Data.UUID.V4 that generates random UUIDs slightly more cleanly, but Debian has an old version of the library, so I do it slightly round-about.	2013-02-10 14:52:54 -04:00
Joey Hess	57780cb3a4	Clean up direct mode cache and mapping info when dropping keys. These files were left behind, and made getKeysPresent find keys that were not present. It would be expensive to make getKeysPresent check that the actual key files are present (it just lists the directories). But that's not needed if we just clean up the stale cache and mapping files. To handle systems that were in direct mode and got switched back with stale direct mode files, made cleanObjectLoc remove all files in the key's directory. git annex unused will still list keys that are gone but for which the stale direct mode files exist. To deal with that, made dropunused remove the key's directory even if the key does not seem to be present.	2013-02-07 08:28:40 -04:00
Joey Hess	af3a25ee03	Deal with stale mappings for deleted file in direct mode. The most common way for a mapping to be stale is when a file was deleted, or renamed. Nothing updates the mappings for deletions yet. But they can also become stale in other ways. For example a file can be modified. So, the mapping is not trusted to be consistent. When we get a key, only replace symlinks that still point to that key with its content. When we drop a key, only put back symlinks for files that still have the direct mode content.	2013-02-05 16:48:00 -04:00
Joey Hess	0e3f931f37	add another setting to GitConfig	2013-01-28 00:33:19 +11:00
Joey Hess	103b572d8e	ensure that content directory is thawed when writing direct mode mapping and cache files	2013-01-26 20:09:15 +11:00
Joey Hess	f86462b475	allow lazy reading of map contents Don't explicitly close; hGetContents will close when read is done.	2013-01-18 13:16:16 -04:00
Joey Hess	e481ca7658	some more direct mode fixes Avoid a crash if a mapping contains files that no longer exist. This could happen because eg, one was deleted and a commit has not yet been done to update the mapping. Fix path calculation.	2013-01-18 12:39:26 -04:00
Joey Hess	5c58e9c101	Avoid filename encoding errors when writing direct mode mappings.	2013-01-18 12:26:45 -04:00
Joey Hess	bbf0e74f72	Fix direct mode mapping code to always store direct mode filenames relative to the top of the repository, even when operating inside a subdirectory.	2013-01-18 12:20:08 -04:00
Joey Hess	85c564ea94	In direct mode, files with the same key are no longer hardlinked, as that would cause a surprising behavior if modifying one, where the other would also change.	2013-01-14 11:56:37 -04:00
Joey Hess	a6a5ed8121	check for direct mode file change when copying to a local git remote	2013-01-10 11:45:44 -04:00
Joey Hess	1bc49b7158	Special remotes now all rollback storage of keys that get modified during the transfer, which can happen in direct mode.	2013-01-09 18:42:29 -04:00
Joey Hess	858ad6783b	add works in direct mode Also, changed sync to no longer automatically add files in direct mode. That was only necessary before because add didn't work.	2013-01-06 17:24:22 -04:00
Joey Hess	909f67443f	Fix transferring files to special remotes in direct mode.	2013-01-06 14:29:01 -04:00
Joey Hess	e457be7631	direct: Avoid hardlinking symlinks that point to the same content when the content is not present.	2013-01-06 13:57:53 -04:00
Joey Hess	1c83b6c439	work around a very strange git-cat-file behavior Sometimes it seems that git-cat-file --batch stops getting info for files in the current repo, when ":file" is fed to it. I have not reproduced this at the command line, but only when using git annex whereis and git annex move inside a direct mode repo. Those failed, because cat-file returned "file missing". OTOH, git annex find works fine, despite passing the same file to cat-file. It seems that the failing commands first asked cat-file to show a file on the git-annex branch. Perhaps it got "stuck" on that branch? But I cannot reproduce it running cat-file by hand. Most strange. HEAD is a workaround for this extreme weirdness, since I spent a good 2 hours struggling with it already.	2013-01-05 17:06:24 -04:00
Joey Hess	1cdf2b923d	assistant: Make expensive transfer scan work fully in direct mode. The expensive scan uses lookupFile, but in direct mode, that doesn't work for files that are present. So the scan was not finding things that are present that need to be uploaded. (It did find things not present that needed to be downloaded.) Now lookupFile also works in direct mode. Note that it still prefers symlinks on disk to info committed to git, in direct mode. This is necessary to make things like Assistant.Threads.Watcher.onAddSymlink work correctly, when given a new symlink not yet checked into git (or replacing a file checked into git).	2013-01-05 15:57:53 -04:00
Joey Hess	4008590c68	type based git config handling for remotes Still a couple of places that use git config ad-hoc, but this is most of it done.	2013-01-01 13:58:14 -04:00
Joey Hess	7f7c31df1c	type based git config handling Now there's a Config type, that's extracted from the git config at startup. Note that laziness means that individual config values are only looked up and parsed on demand, and so we get implicit memoization for all of them. So this is not only prettier and more type safe, it optimises several places that didn't have explicit memoization before. As well as getting rid of the ugly explicit memoization code. Not yet done for annex.<remote>.* configuration settings.	2012-12-29 23:10:18 -04:00
Joey Hess	2fdefc656b	fix logic error breaking direct mode assistant autocommit of modified files	2012-12-28 16:00:19 -04:00
Joey Hess	eb40227d15	assistant direct mode file add/change bookkeeping When a file is changed in direct mode, the old content is probably lost (at least from the local repo), and bookkeeping needs to be updated to reflect this. Also, synthetic add events are generated at assistant startup, so make it detect when the file has not really changed, and avoid re-adding it. This does add the overhead of querying the running git cat-file for the key that's recorded in git for the file, each time a file is added or modified in direct mode.	2012-12-25 15:48:15 -04:00
Joey Hess	ddb0adb998	more quickcheck fun	2012-12-19 16:36:19 -04:00
Joey Hess	93c430c2a4	comment	2012-12-19 12:46:35 -04:00
Joey Hess	97d670b0d5	normalise associated files Sometimes ./file will be passed in, and sometimes file; need to treat these the same.	2012-12-19 12:44:24 -04:00
Joey Hess	05ec4587dd	partial and incomplete automatic merging in direct mode Handles our file right, but not theirs.	2012-12-18 17:15:16 -04:00
Joey Hess	53dbcce645	direct mode merging works! Automatic merge resolution code needs to be fixed to preserve objects from direct mode files.	2012-12-18 15:04:44 -04:00
Joey Hess	5df3c66a85	added direct and indirect commands	2012-12-13 15:44:56 -04:00
Joey Hess	cfe354eccd	whitespace fix	2012-12-13 00:46:30 -04:00
Joey Hess	ffdd08fd2e	Merge branch 'master' into desymlink	2012-12-13 00:46:10 -04:00
Joey Hess	0d50a6105b	whitespace fixes	2012-12-13 00:45:27 -04:00
Joey Hess	b080a58b76	Merge branch 'master' into desymlink Conflicts: Annex/CatFile.hs Annex/Content.hs Git/LsFiles.hs Git/LsTree.hs	2012-12-13 00:29:06 -04:00
Joey Hess	f87a781aa6	finished where indentation changes	2012-12-13 00:24:19 -04:00
Joey Hess	e7b8cb0063	direct mode committing	2012-12-12 19:20:38 -04:00
Joey Hess	f2ed0f9659	fix associated files to not fall back to object location	2012-12-12 13:11:59 -04:00
Joey Hess	752b5354ab	make parent directory	2012-12-12 13:05:50 -04:00
Joey Hess	9d133270c2	update	2012-12-10 15:02:44 -04:00
Joey Hess	514957914d	direct mode mappings now updated by git annex sync Still lots to do to make sync handle direct mode, but this is a good first step.	2012-12-10 14:37:24 -04:00
Joey Hess	b4c6da9cbd	Got object sending working in direct mode. However, I don't yet have a reliable way to deal with files being modified while they're being transferred. I have code that detects it on the sending side, but the receiver is still free to move the wrong content into its annex, and record that it has the content. So that's not acceptable, and I'll need to work on it some more. However, at this point I can use a direct mode repository as a remote and transfer files from and to it.	2012-12-08 17:03:39 -04:00
Joey Hess	664765e757	update the cache automatically when moving objects in or out	2012-12-08 13:13:36 -04:00
Joey Hess	ef24751922	support for checking presence of objects in direct mode Also for dropping objects in direct mode. Checking presence reliably needs a cache of mtime, size, and inode. This way, if a file is modified, keys that point to it are no longer present. Also, the code for restoring the symlink when removing objects is unnecessarily messy. calcGitLink was generating links starting with "../../remote/.git/", when running "git annex move --from remote". I put in a workaround, but calcGitLink should probably be fixed. There is not yet support for getting objects from repositories in direct mode; it still looks for content in .git/annex/objects, and there's no one place I can change to fix that. Also, getting objects from direct mode repositories is problematic since they can be changed while the object is being transferred. It probably needs to quarantine it first.	2012-12-07 17:29:55 -04:00
Joey Hess	3898d8c091	support for storing files in direct mode	2012-12-07 14:53:02 -04:00
Joey Hess	99a8a5297c	--auto fixes * get/copy --auto: Transfer data even if it would exceed numcopies, when preferred content settings want it. * drop --auto: Fix dropping content when there are no preferred content settings.	2012-12-06 13:22:16 -04:00
Joey Hess	b5a9560a1b	squelch warning	2012-11-26 16:30:46 -04:00
Joey Hess	da6fb44446	finished XMPP pairing! This includes keeping track of which buddies we're pairing with, to know which PairAck are legitimate.	2012-11-05 17:43:17 -04:00
Joey Hess	9767562f65	rsync special remote: Include annex-rsync-options when running rsync to test a key's presence. Also, use the new withQuietOutput function to avoid running the shell to /dev/null stderr in two other places.	2012-10-28 13:51:14 -04:00
Joey Hess	3417c55189	remove git-annex branch read cache This cache prevented noticing changes made by another process. The case I just ran into involved the assistant dropping a file, which cached its presence info. Then the same file was downloaded again, but the assistant didn't know its presence info had changed. I don't see a way to keep this cache. Will instead rely on the OS level file cache, for files in the journal. May need to add more higher-level caching of info that it's ok to have a potentially stale copy of, although much of git-annex already does so.	2012-10-19 14:25:15 -04:00
Joey Hess	e7780a39f5	Preferred content path matching bugfix. When in a subdir, both the normal filepath, and the filepath relative to the top of the git repo are needed for matching. The former for key lookup, and the latter for include/exclude to match against. Previously, key lookup didn't work in this situation.	2012-10-17 16:01:09 -04:00
Joey Hess	3156febec8	disable ssh connection caching for standalone builds The standalone build does not bundle its own ssh, so should be built to support as wide an array of ssh versions as possible, so turn off connection caching. Unfortunately, as implemented this forces a full rebuild when building the standalone binary, and of course it makes it somewhat slower. This is not ideal, but neither is probing the ssh version every time it's run (slow), or once when initializing a repo (fragile).	2012-10-15 14:49:40 -04:00
Joey Hess	97ea08e2d1	Avoid unsetting HOME when running certain git commands. Closes: #690193 Setting GIT_INDEX_FILE clobbers the rest of the environment, making git not read ~/.gitconfig, and blow up if GECOS didn't have a name for the user. I'm not entirely happy with getEnvironment being run every time now, that's somewhat expensive. It may make sense to just set GIT_COMMITTER_* and GIT_AUTHOR_*, but I worry that clobbering the rest could break PATH, or GIT_PATH, or something else that might be used by a command run in here. And caching the environment is not a good idea either; it can change..	2012-10-11 12:58:24 -04:00
Joey Hess	39be7eea40	add standard group selector to repo edit form	2012-10-10 16:04:28 -04:00
Joey Hess	9da7dd8874	webapp: configure new repos to use the standard preferred content settings	2012-10-10 15:35:10 -04:00
Joey Hess	3490977d97	webapp: put new repos in standard groups I'm using transfer for most things, both removable drives and cloud storage, because it's the safest choice. We'll see if it makes sense to prompt for the group when setting this up, or let the user pick something else after the fact.	2012-10-10 15:27:25 -04:00
Joey Hess	f9b81c7a75	refactor	2012-10-10 15:15:56 -04:00
Joey Hess	5ac15149cc	assistant: Now honors preferred content settings when deciding what to transfer. Both when queueing downloads, and uploads, consults the preferred content settings. I didn't make it check yet when requeing failed transfers or queuing deferred downloads; dealing with the preferred content settings (or indeed, other settings) changing while the assistant is running still needs work.	2012-10-09 12:18:41 -04:00
Joey Hess	fee40dd374	generalized Annex.Wanted this should make it easy to use from inside the assistant, where everything is an AssociatedFile.	2012-10-08 17:14:01 -04:00
Joey Hess	836561e057	fix inverted logic for shouldDrop	2012-10-08 16:12:02 -04:00
Joey Hess	1eedf495c3	make copy --to check preferred content of the remote	2012-10-08 16:06:56 -04:00
Joey Hess	34e7faf71a	uninit: Unset annex.version. Closes: #689852	2012-10-07 16:04:03 -04:00
Joey Hess	47314c0fad	fix last zombies in the assistant Made Git.LsFiles return cleanup actions, and everything waits on processes now, except of course for Seek.	2012-10-04 19:56:32 -04:00
Joey Hess	5594bf0643	more zombie fighting I'm down to 9 places in the code that can produce unwaited for zombies. Most of these are pretty innocuous, at least for now, are only used in short-running commands, or commands that run a set of actions and explicitly reap zombies after each one. The one from Annex.Branch.files could be trouble later, since both Command.Fsck and Command.Unused can trigger it, and the assistant will be doing those eventually. Ditto the one in Git.LsTree.lsTree, which Command.Unused uses. The only ones currently affecting the assistant though, are in Git.LsFiles. Several threads use several of those. (And yeah, using pipes or ResourceT would be a less ad-hoc approach, but I don't really feel like ripping my entire code base apart right now to change a foundation monad. Maybe one of these days..)	2012-10-04 18:47:31 -04:00
Joey Hess	bc83179a76	Test that uuid -m works, falling back to plain uuid if not.	2012-09-25 10:48:20 -04:00
Joey Hess	3887432c54	fixes for transfer resume Fix resuming of downloads, which do not have a transfer info file to read. When checking upload progress, use the MVar, rather than re-reading the info file. Catch exceptions in the transfer action. Required a tryAnnex.	2012-09-24 13:18:16 -04:00
Joey Hess	e8188ea611	flip catchDefaultIO	2012-09-17 00:18:07 -04:00
Joey Hess	ba0334116c	more descriptive name for oneshot	2012-09-15 20:46:38 -04:00
Joey Hess	750c4ac6c2	bugfix: avoid staging but not committing changes to git-annex branch Branch.get is not able to see changes that have been staged to the index but not committed. This is a limitation of git cat-file --batch; when reading from the index, as opposed to from a branch, it does not notice changes made after the first time it reads the index. So, had to revert the changes made in `1f73db3469` to make annex.alwayscommit=false stage changes. Also, ensure that Branch.change and Branch.get always see changes at all points during a commit, by not deleting journal files when staging to the index. Delete them only after committing the branch. Before, there was a race during commits where a different git-annex could see out-of-date info from the branch while a commit was in progress. That's also done when updating the branch to merge in remote branches. In the case where the local git-annex branch has had changes pushed into it that are not yet reflected in the index, and there are journalled changes as well, a merge commit has to be done.	2012-09-15 20:15:16 -04:00
Joey Hess	a1f93f06fd	eliminate some commits to the git-annex branch Commits used to be made to the git-annex branch whenever there were journalled changes from a previous command, and the current command looked up the value of a file. This no longer happens. This means that transferkey, which is a oneshot command that stages changes, can be run multiple times by the assistant, without each of them committing the changes made by the command before. Which will be a lot faster and use less space by batching up the commits. Commits still happen if a remote git-annex branch has been changed and is merged in.	2012-09-15 18:36:42 -04:00
Joey Hess	ca45cea113	Revert "add catFileIndex" This interface is not a good idea, because a running git cat-file --batch does not notice when existing files in the index are changed.	2012-09-15 18:30:53 -04:00
Joey Hess	e1baf48d88	add catFileIndex	2012-09-15 17:06:10 -04:00
Joey Hess	87fb9c690e	remove withIndexUpdate helper	2012-09-15 15:48:21 -04:00
Joey Hess	5573911d25	Disable ssh connection caching if the path to the control socket would be too long (and use relative path to minimise path to the control socket).	2012-09-13 19:26:39 -04:00
Joey Hess	c9b3b8829d	thread safe git-annex index file use	2012-08-24 20:50:39 -04:00
Joey Hess	5c3e14649e	avoid unnecessary transfer scans when syncing a disconnected remote Found a very cheap way to determine when a disconnected remote has diverged, and has new content that needs to be transferred: Piggyback on the git-annex branch update, which already checks for divergence. However, this does not check if new content has appeared locally while disconnected, that should be transferred to the remote. Also, this does not handle cases where the two git repos are in sync, but their content syncing has not caught up yet. This code could have its efficiency improved: * When multiple remotes are synced, if any one has diverged, they're all queued for transfer scans. * The transfer scanner could be told whether the remote has new content, the local repo has new content, or both, and could optimise its scan accordingly.	2012-08-22 15:05:57 -04:00
Joey Hess	9fc94d780b	better readProcess	2012-07-19 00:57:40 -04:00
Joey Hess	1db7d27a45	add back debug logging Make Utility.Process wrap the parts of System.Process that I use, and add debug logging to them. Also wrote some higher-level code that allows running an action with handles to a process's stdin or stdout (or both), and checking its exit status, all in a single function call. As a bonus, the debug logging now indicates whether the process is being run to read from it, feed it data, chat with it (writing and reading), or just call it for its side effect.	2012-07-19 00:46:52 -04:00
Joey Hess	d1da9cf221	switch from System.Cmd.Utils to System.Process Test suite now passes with -threaded! I traced back all the hangs with -threaded to System.Cmd.Utils. It seems it's just crappy/unsafe/outdated, and should not be used. System.Process seems to be the cool new thing, so converted all the code to use it instead. In the process, --debug stopped printing commands it runs. I may try to bring that back later. Note that even SafeSystem was switched to use System.Process. Since that was a modified version of code from System.Cmd.Utils, it needed to be converted too. I also got rid of nearly all calls to forkProcess, and all calls to executeFile, which I'm also doubtful about working well with -threaded.	2012-07-18 18:00:24 -04:00
Joey Hess	05310538ef	more debugging	2012-07-18 13:31:00 -04:00
Joey Hess	75b6ee81f9	avoid ByteString.Char8 where not needed Its truncation behavior is a red flag, so avoid using it in these places where only raw ByteStrings are used, without looking at the data inside.	2012-06-20 13:13:40 -04:00
Joey Hess	e0095b0bdc	fishy commit	2012-06-14 00:01:48 -04:00
Joey Hess	942d8f7298	hlint	2012-06-12 11:32:06 -04:00
Joey Hess	ca9ee21bd7	crazy optimisation Crazy like a fox..	2012-06-10 19:58:34 -04:00
Joey Hess	c5707c84d3	queue size fix Increase queue size for update-index actions, because otherwise they'll never be flushed.	2012-06-10 13:56:04 -04:00
Joey Hess	d45a9a7831	refactor and function name cleanup (oops, I had a calcMerge and a calc_merge!)	2012-06-08 00:29:39 -04:00
Joey Hess	20f425be19	make watch use the queue May not work. Certainly needs to flush the queue from time to time when only symlink changes are being made.	2012-06-07 15:40:44 -04:00
Joey Hess	0a11b35d89	extend Git.Queue to be able to queue more than simple git commands While I was in there, I noticed and fixed a bug in the queue size calculations. It was never encountered only because Queue.add was only ever run with 1 file in the list.	2012-06-07 15:19:44 -04:00
Joey Hess	b819f644ad	close the git add race There's a race adding a new file to the annex: The file is moved to the annex and replaced with a symlink, and then we git add the symlink. If someone comes along in the meantime and replaces the symlink with something else, such as a new large file, we add that instead. Which could be bad.. This race is fixed by avoiding using git add, instead the symlink is directly staged into the index. It would be nice to make `git annex add` use this same technique. I have not done so yet because it currently runs git update-index once per file, which would slow down `git annex add`. A future enhancement would be to extend the Git.Queue to include the ability to run update-index with a list of Streamers.	2012-06-06 14:29:10 -04:00
Joey Hess	993e6459a3	factor out nukeFile	2012-06-06 13:13:13 -04:00
Joey Hess	27cfeca4ea	Merge branch 'master' into watch	2012-06-06 02:16:21 -04:00
Joey Hess	f1bd72ea54	factor out generic update-index code from unionmerge code	2012-06-06 00:10:34 -04:00
Joey Hess	7a6fb8ae4e	flush the git queue when a new type of action is being added to it This allows the queue to be used in a single process for multiple possibly conflicting commands, like add and rm, without running them out of order. This assumes that running the same git subcommand with different parameters cannot itself conflict.	2012-06-04 20:41:22 -04:00
Joey Hess	bb4f31a0ee	Clean up handling of git directory and git worktree. Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn whether a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.	2012-05-18 17:03:12 -04:00
Joey Hess	f7d8982672	Fix use of several config settings annex.ssh-options, annex.rsync-options, annex.bup-split-options. And adjust types to avoid the bugs that broke several config settings recently. Now "annex." prefixing is enforced at the type level.	2012-05-05 20:16:56 -04:00
Joey Hess	76102c1c75	display "Recording state in git..." when staging the journal A bit tricky to avoid printing it twice in a row when there are queued git commands to run and journal to stage. Added a generic way to run an action that may output multiple side messages, with only the first displayed.	2012-04-27 13:54:33 -04:00
Joey Hess	e0b7012ccc	uninit: Clear annex.uuid from .git/config. Closes: #670639	2012-04-27 12:21:38 -04:00
Joey Hess	84ac8c58db	Add annex.httpheaders and annex.httpheader-command config settings Allow custom headers to be sent with all HTTP requests. (Requested by the Internet Archive)	2012-04-22 01:13:09 -04:00
Joey Hess	ed79596b75	noop	2012-04-21 23:32:33 -04:00
Joey Hess	bee420bd2d	in which I discover void void :: Functor f => f a -> f () -- ah, of course that's useful :)	2012-04-21 23:06:19 -04:00
Joey Hess	cab63b89f2	cache parsed core.sharedrepository	2012-04-21 19:42:49 -04:00
Joey Hess	b98b69e8c6	honor core.sharedRepository when making all the other files in the annex Lock files, directories, etc.	2012-04-21 19:36:03 -04:00
Joey Hess	7e45712d19	better file mode setting code	2012-04-21 16:01:56 -04:00
Joey Hess	b4a5e39ee6	Support git's core.sharedRepository configuration This is incomplete, it does not honor it yet for hash directories and other annex bookkeeping files. Some of that is not needed for a bare repo; some of it may be.	2012-04-21 15:36:52 -04:00
Joey Hess	b65e257b13	inverted logic	2012-04-20 16:16:13 -04:00
Joey Hess	262017e17d	export a more generalized checkDiskSpace	2012-04-20 16:06:10 -04:00
Joey Hess	e38a839a80	Rewrote free disk space checking code Moving the portability handling into a small C library cleans up things a lot, avoiding the pain of unpacking structs from inside haskell code.	2012-03-22 17:32:47 -04:00
Joey Hess	f1398b5583	use new getConfig	2012-03-22 17:32:47 -04:00
Joey Hess	4eb5112681	rationalize getConfig getConfig got a remote-specific config, and this confusing name caused it to be used a couple of places that only were interested in global configs. Rename to getRemoteConfig and make getConfig only get global configs. There are no behavior changes here, but remote.<name>.annex-web-options never actually worked (and per-remote web options is a very unlikely to be useful case so I didn't make it work), so fix the documentation for it.	2012-03-22 17:32:47 -04:00
Joey Hess	188e2edc41	status: Prints available local disk space, or shows if git-annex doesn't know.	2012-03-21 21:55:02 -04:00
Joey Hess	181d2ccd20	Improve detection of inability to check free disk space. Don't check if configure indicated checks won't work. This should fix a FTBFS on mipsel, where configure correctly detects the checks won't work, while garbage is returned for disk space info at git-annex runtime. It also means that, when built via cabal, disk space checks are not enabled, unfortunately.	2012-03-21 21:21:20 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	b325694645	getKeysPresent is now fully lazy .. Allowing it to be used by things in constant space! Random statistics: git annex status has gone from taking 239 mb of memory and 26 seconds in a repo, to 8 mb and 13 seconds. The trick here is the unsafeInterleaveIO, and the form of the function's recursion, which I cribbed heavily from System.IO.HVFS.Utils.recurseDirStat. The difference is, this one goes to a limited depth and avoids statting everything.	2012-03-11 18:04:58 -04:00
Joey Hess	ff3644ad38	status: Fixed to run in nearly constant space. Before, it leaked space due to caching lists of keys. Now all necessary data about keys is calculated as they stream in. The "nearly constant" is due to getKeysPresent, which builds up a lot of [] thunks as it traverses .git/annex/objects/. Will deal with it later.	2012-03-11 17:15:58 -04:00
Joey Hess	d08ee1a9d2	syscall optimisation	2012-03-06 13:56:20 -04:00
Joey Hess	12b89a3eb8	configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly.	2012-02-25 19:15:29 -04:00
Joey Hess	1f73db3469	improve alwayscommit=false mode Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes. Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.) In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.	2012-02-25 16:18:55 -04:00
Joey Hess	b49c0c2633	add annex.alwayscommit option To avoid commits of data to the git-annex branch after each command is run, set annex.alwayscommit=false. Its data will then be committed less frequently, when a merge or sync is done.	2012-02-25 15:31:42 -04:00
Joey Hess	bd66f962d3	Deal with NFS problem that caused a failure to remove a directory when removing content from the annex. I was able to reproduce this on linux using the kernel's nfs server and mounting localhost:/. Determined that removing the directory fails when the just-deleted file in it was locked. Considered dropping the lock before removing the directory, but this would complicate parts of the code that should not need to worry about locking. So instead, ignore the failure to remove the directory in this case. While I was at it, made it attempt to remove both levels of hash directories, in case they're empty.	2012-02-24 16:30:47 -04:00
Joey Hess	a1e52f0ce5	hlint	2012-02-16 00:44:51 -04:00
Joey Hess	52c5b164d8	Added an annex.queuesize setting useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.	2012-02-15 11:14:19 -04:00
Joey Hess	03c559f8d6	tweak	2012-02-14 14:51:26 -04:00
Joey Hess	7ebd98d8d8	fix memory leak when staging the journal The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.	2012-02-14 14:37:59 -04:00
Joey Hess	a40ec5e03e	Fixed a memory leak due to excessive strictness when committing journal files. When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.	2012-02-14 11:20:34 -04:00
Joey Hess	cbaebf538a	rework git check-attr interface Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that `cad8824852` was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.	2012-02-13 23:52:21 -04:00
Joey Hess	d55f3c0716	Fix teardown of stale cached ssh connections.	2012-02-09 21:49:46 -04:00
Joey Hess	146c36ca54	IO exception rework ghc 7.4 complains about use of System.IO.Error to catch exceptions. Ok, use Control.Exception, with variants specialized to only catch IO exceptions.	2012-02-03 16:47:24 -04:00
Joey Hess	b81d662cbf	Avoid repeated location log commits when a remote is receiving files. Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation. This is used by git-annex-shell and other places where changes are made to a remote's location log.	2012-01-28 15:41:52 -04:00
Joey Hess	ba6088b249	rename readMaybe to readish a stricter (but also partial) readMaybe is getting added to base	2012-01-23 17:00:10 -04:00
Joey Hess	eb9001044f	order user provided params after connection caching params So the user can override them.	2012-01-20 17:32:32 -04:00
Joey Hess	6ef82665de	add annex.sshcaching config setting	2012-01-20 17:15:46 -04:00
Joey Hess	47250a153a	ssh connection caching Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits. Note: The rsync special remote does not yet participate in the ssh connection caching.	2012-01-20 17:14:56 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	effaa298fa	optimise fsck --from normal git remotes For a local git remote, can symlink the file. For a git remote using rsync, can preseed any local content. There are a few reasons to use fsck --from on a normal git remote. One is if it's using gitosis or similar, and you don't have shell access to run git annex locally. Another reason could be if you just want to fsck certain files of a bare remote.	2012-01-19 17:10:44 -04:00
Joey Hess	81856c3175	add a configure check for StatFS This way, the build log will indicate whether StatFS can be relied on. I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense. Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain. Also, the Makefile was missing the sources target used when building with cabal.	2012-01-15 13:49:32 -04:00
Joey Hess	a3d97e0c85	tweak	2012-01-14 14:31:16 -04:00
Joey Hess	5e2b4e16ba	avoid multiple unnecessary stats of the index file Up to one per file processed.	2012-01-14 12:07:36 -04:00
Joey Hess	abdacf58ed	tweaks	2012-01-11 00:06:54 -04:00
Joey Hess	16e7178f20	reorg	2012-01-10 15:29:10 -04:00
Joey Hess	a3a9f87047	log: New command that displays the location log for files, showing each repository they were added to and removed from. This needs to run git log on the location log files to get at all past versions of the file, which tends to be a bit slow. It would be possible to make a version optimised for showing the location logs for every key. That would only need to run git log once, so would be faster, but it would need to process an enormous amount of data, so would not speed up the individual file case. In the future it would be nice to support log --format. log --json also doesn't work right yet.	2012-01-06 15:40:07 -04:00
Joey Hess	aa0882691b	Added remote.name.annex-web-options configuration setting, which can be used to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same).	2012-01-02 14:20:20 -04:00
Joey Hess	252376d639	Merge branch 'master' into autosync	2011-12-30 20:38:59 -04:00
Joey Hess	52104dae6f	refactor	2011-12-30 18:36:40 -04:00
Joey Hess	925b6390aa	add forceUpdate This code is picked from my tweak-fetch branch, which already did the needed refactoring.	2011-12-30 15:57:28 -04:00
Joey Hess	6d4382a89e	Merge branch 'new-monad-control'	2011-12-24 23:02:42 -04:00
Joey Hess	ee3b5b2a42	use Common in a few more modules	2011-12-20 14:37:53 -04:00
Joey Hess	ef28b3fef7	split out Git/Command.hs	2011-12-14 15:56:11 -04:00
Joey Hess	02f1bd2bf4	split more stuff out of Git.hs	2011-12-14 15:43:13 -04:00
Joey Hess	25b2cc4148	move commit to Git.Branch	2011-12-13 15:08:44 -04:00
Joey Hess	13fff71f20	split out three modules from Git Constructors and configuration make sense in separate modules. A separate Git.Types is needed to avoid cycles.	2011-12-13 15:06:49 -04:00
Joey Hess	46588674b0	avoid closing pipe before all the shas are read from it Could have just used hGetContentsStrict here, but that would require storing all the shas in memory. Since this is called at the end of a git-annex run, it may have created a lot of shas, so I avoid that memory use and stream them out like before.	2011-12-12 21:41:37 -04:00
Joey Hess	0e45b762a0	broke out Git/HashObject.hs	2011-12-12 21:24:55 -04:00
Joey Hess	31a0c07ee9	broke out Git/Branch.hs and reorganized	2011-12-12 21:12:51 -04:00
Joey Hess	543d0d2501	split out Git/Ref.hs	2011-12-12 18:30:33 -04:00
Joey Hess	da95cbadca	split out Annex/Journal.hs	2011-12-12 18:03:28 -04:00
Joey Hess	98dfc0c9b0	split out Annex/BranchState.hs	2011-12-12 17:38:46 -04:00
Joey Hess	b2f934e07a	update comment	2011-12-12 17:24:12 -04:00
Joey Hess	79345ad5fc	optimisation avoids a redundant call to git show-ref	2011-12-12 03:30:47 -04:00
Joey Hess	f9cd3f6ad1	optimisation avoids a useless diff from git-annex..refs/heads/git-annex	2011-12-12 02:31:07 -04:00
Joey Hess	2332afb4bc	cleanup	2011-12-12 02:04:48 -04:00
Joey Hess	29b88ad657	avoid redundant call to updateIndex commitBranch calls updateIndex	2011-12-11 21:46:21 -04:00
Joey Hess	c4c965d602	detect and recover from branch push/commit race Dealing with a race without using locking is exceedingly difficult and tricky. Fully tested, I hope. There are three places left where the branch can be updated, that are not covered by the race recovery code. Let's prove they're all immune to the race: 1. tryFastForwardTo checks to see if a fast-forward can be done, and then does git-update-ref on the branch to fast-forward it. If a push comes in before the check, then either no fast-forward will be done (ok), or the push set the branch to a ref that can still be fast-forwarded (also ok). If a push comes in after the check, the git-update-ref will undo the ref change made by the push. It's as if the push did not come in, and the next git-push will see this, and try to re-do it. (acceptable) 2. When creating the branch for the very first time, an empty index is created, and a commit of it made to the branch. The commit's ref is recorded as the current state of the index. If a push came in during that, it will be noticed the next time a commit is made to the branch, since the branch will have changed. (ok) 3. Creating the branch from an existing remote branch involves making the branch, and then getting its ref, and recording that the index reflects that ref. If a push creates the branch first, git-branch will fail (ok). If the branch is created and a racing push is then able to change it (highly unlikely!) we're still ok, because it first records the ref into the index.lck, and then updates the index. The race can cause the index.lck to have the old branch ref, while the index has the newly pushed branch merged into it, but that only results in an unnecessary update of the index file later on.	2011-12-11 20:41:35 -04:00
Joey Hess	e04852c8af	Merge branch 'master' into new-monad-control Conflicts: git-annex.cabal	2011-12-11 16:55:36 -04:00
Joey Hess	cfbbda99f4	optimize index updating The last branch ref that the index was updated to is stored in .git/annex/index.lck, and the index only updated when the current branch ref differs. (The .lck file should later be used for locking too.) Some more optimization is still needed, since there is some redundancy in calls to git show-ref.	2011-12-11 16:14:59 -04:00
Joey Hess	8680c415de	slow, stupid, and safe index updating Always merge the git-annex branch into .git/annex/index before making a commit from the index. This ensures that, when the branch has been changed in any way (by a push being received, or changes pulled directly into it, or even by the user checking it out, and committing a change), the index reflects those changes. This is much too slow; it needs to be optimised to only update the index when the branch has really changed, not every time. Also, there is an unhandled race, when a change is made to the branch right after the index gets updated. I left it in for now because it's unlikely and I didn't want to complicate things with additional locking yet.	2011-12-11 15:05:53 -04:00
Joey Hess	0ba4b1de18	move a file location to Locations.hs	2011-12-11 14:14:28 -04:00
Joey Hess	eecaf42485	no need to show, it's a string	2011-12-10 12:30:31 -04:00
Joey Hess	d64132a43a	hslint	2011-12-09 01:57:13 -04:00
Joey Hess	f3a2f60abc	adjust to build with monad-control-0.3 I had to, I hope temporarily, lose my nice Annex newtype, and use a type synonym. This because I cannot find a way to derive a MonadBaseControl instance of the Annex newtype. I've emailed Bas van Dijk in hope he can help get the newtype back. Otherwise appears to build & work.	2011-12-05 22:51:37 -04:00
Joey Hess	598eb2e2da	cleanup	2011-11-30 12:01:15 -04:00
Joey Hess	da9cd315be	add support for using hashDirLower in addition to hashDirMixed Supporting multiple directory hash types will allow converting to a different one, without a flag day. gitAnnexLocation now checks which of the possible locations have a file. This means more statting of files. Several places currently use gitAnnexLocation and immediately check if the returned file exists; those need to be optimised.	2011-11-28 22:43:51 -04:00
Joey Hess	6869e6023e	support .git/annex on a different disk than the rest of the repo The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unnecessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.	2011-11-28 16:17:55 -04:00
Joey Hess	128b4bd015	tweaks	2011-11-19 15:57:08 -04:00
Joey Hess	0fa1d136dc	tweak	2011-11-19 15:40:40 -04:00
Joey Hess	1ffd54ef78	ensure branch exists before trying to update it The branch may not exist, if .git/annex has been copied over from another repo (or a corrupted repo). I suppose it could also have gotten deleted somehow. Without this, there is a confusing failure.	2011-11-16 18:56:06 -04:00
Joey Hess	9290095fc2	improve type signatures with a Ref newtype In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for those. Note that this does not prevent mixing up of eg, refs and branches at the type level. Since git really doesn't care, except rare cases like git update-ref, or git tag -d, that seems ok for now. There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref may or may not be a tree-ish, depending on the object type, so there seems no point in trying to represent it at the type level.	2011-11-16 02:41:46 -04:00
Joey Hess	272a67921c	better name	2011-11-16 01:46:46 -04:00
Joey Hess	21a925dcf1	merge: Now runs in constant space. Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.	2011-11-15 23:28:01 -04:00
Joey Hess	04edae6791	Optimised union merging; now only runs git cat-file once.	2011-11-12 17:45:12 -04:00
Joey Hess	e9bfa8eaed	avoid unnecessary auto-merge when only changing a file in the branch. Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivalent: 1. pull from remote 2. union merge 3. read file from branch 4. modify file and write to branch vs. 1. read file from branch 2. modify file and write to branch 3. pull from remote 4. union merge After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).	2011-11-12 15:15:57 -04:00
Joey Hess	897bf938f6	merge: Improve commit messages to mention what was merged.	2011-11-12 14:51:19 -04:00
Joey Hess	637b5feb45	lint	2011-11-11 01:52:58 -04:00
Joey Hess	49d2177d51	factored out some useful error catching methods	2011-11-10 20:57:28 -04:00
Joey Hess	9570421251	better message when content is locked	2011-11-10 02:59:13 -04:00
Joey Hess	a218ce41cf	exclusive locks, ugh	2011-11-09 22:15:33 -04:00
Joey Hess	cf0174c922	content locking I've tested that this solves the cyclic drop problem. Have not looked at cyclic move, etc.	2011-11-09 21:54:42 -04:00
Joey Hess	d3e1a3619f	safer inannex checking git-annex-shell inannex now always returns 0, 1, or 100 (the last when it's unclear if content is currently in the index due to it currently being moved or dropped). (Actual locking code still not yet written.)	2011-11-09 18:33:15 -04:00
Joey Hess	8ce7e73f74	reorg to allow taking content lock The lock will only persist during the perform stage, so the content must be removed from the annex then, rather than in the cleanup stage. (No lock is actually taken yet.)	2011-11-09 16:54:18 -04:00
Joey Hess	56b8194470	cleanup	2011-11-09 01:33:20 -04:00
Joey Hess	bf460a0a98	reorder repo parameters last Many functions took the repo as their first parameter. Changing it consistently to be the last parameter allows doing some useful things with currying, that reduce boilerplate. In particular, g <- gitRepo is almost never needed now, instead use inRepo to run an IO action in the repo, and fromRepo to get a value from the repo. This also provides more opportunities to use monadic and applicative combinators.	2011-11-08 16:27:20 -04:00
Joey Hess	b11a63a860	clean up read/show abuse Avoid ever using read to parse a non-haskell formatted input string. show :: Key is arguably still show abuse, but displaying Keys as filenames is just too useful to give up.	2011-11-08 00:17:54 -04:00
Joey Hess	63a292324d	add a UUID type Should have done this a long time ago.	2011-11-07 15:59:16 -04:00
Joey Hess	f229911715	optimization The last commit added some git-log calls to a merge. This removes some, by only merging branches that have unique refs.	2011-11-06 15:33:15 -04:00
Joey Hess	c99fb58909	merge: Use fast-forward merges when possible. Thanks Valentin Haenel for a test case showing how non-fast-forward merges could result in an ongoing pull/merge/push cycle. While the git-annex branch is fast-forwarded, git-annex's index file is still updated using the union merge strategy as before. There's no other way to update the index that would be any faster. It is possible that a union merge and a fast-forward result in different file contents: Files should have the same lines, but a union merge may change their order. If this happens, the next commit made to the git-annex branch will have some unnecessary changes to line orders, but the consistency of data should be preserved. Note that when the journal contains changes, a fast-forward is never attempted, which is fine, because committing those changes would be vanishingly unlikely to leave the git-annex branch at a commit that already exists in one of the remotes. The real difficulty is handling the case where multiple remotes have all changed. git-annex does find the best (ie, newest) one and fast forwards to it. If the remotes are diverged, no fast-forward is done at all. It would be possible to pick one, fast forward to it, and make a merge commit to the rest, but I see no benefit to adding that complexity. Determining the best of N changed remotes requires N*2+1 calls to git-log, but these are fast git-log calls, and N is typically small. Also, typically some or all of the remote refs will be the same, and git-log is not called to compare those. In the real world I expect this will almost always add only 1 git-log call to the merge process. (Which already makes N anyway.)	2011-11-06 15:22:40 -04:00
Joey Hess	5f3dd3d246	ensure directory exists when locking journal Fixes git annex init in a bare repository that already has a git-annex branch.	2011-11-02 15:09:19 -04:00
Joey Hess	1826b3bd67	cleanup	2011-10-27 18:01:52 -04:00
Joey Hess	373cad993d	Sped up some operations on remotes that are on the same host. Specifically, disabled trying to update the git-annex branch on the remote, since that data is never used by operations that act on such remotes. Also, when copying content to such a remote, skip committing the presence information changes to its git-annex branch. Leaving it in the journal there is ok: Any command run on the remote that needs the info will flush the journal. This may partially solve this bug: http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/ Although I still see unreaped git processes piling up when doing a copy --to.	2011-10-27 14:55:06 -04:00
Joey Hess	91366c896d	clean Annex stuff out of Utility/	2011-10-16 00:04:26 -04:00
Joey Hess	ee9af605bc	break out non-log stuff to separate module	2011-10-15 17:47:03 -04:00
Joey Hess	1a29b5b52e	reorganize log modules no code changes	2011-10-15 16:21:08 -04:00
Joey Hess	b505ba83e8	minor syntax changes	2011-10-11 14:43:45 -04:00
Joey Hess	025ded4a2d	tweaks	2011-10-10 17:37:44 -04:00
Joey Hess	f0153f9fd7	fix a race Another process may stage journalled files before the lock is taken, so need to get the list of journalled files afterwards. It's unfortunate this means getting the directory contents twice, but it seems better to do that than sometimes take the lock unnecessarily.	2011-10-09 16:19:09 -04:00
Joey Hess	dfee6e1ed6	better layout And a theoretical fix to branchstate cache invalidation, but not a bug that could actually happen.	2011-10-07 13:59:34 -04:00
Joey Hess	82e655efd0	performance fix It was checking if it needed to merge on every branch access, fix it to only check once.	2011-10-07 13:38:56 -04:00
Joey Hess	44fc358885	avoid merging multiple branches that point to the same tree avoids git warning "error: duplicate parent xxx ignored"	2011-10-07 13:37:01 -04:00
Joey Hess	3acdba3995	faster union merge of multiple branches into index only write index once	2011-10-07 13:36:48 -04:00
Joey Hess	6a6ea06cee	rename	2011-10-05 16:02:51 -04:00
Joey Hess	cfe21e85e7	rename	2011-10-04 00:59:08 -04:00
Joey Hess	ff21fd4a65	factor out Annex exception handling module	2011-10-04 00:34:04 -04:00
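
Note on commit e9bfa8eaed above: its message argues that "merge, then modify" and "modify, then merge" leave the same logical content in a log file. The following is a minimal sketch of that argument only, not git-annex's actual code. It assumes a union merge can be modelled as "keep every line from both sides, dropping exact duplicates" and that a modification appends one line. The function names and the log line format are hypothetical, made up for illustration.

```python
def union_merge(local, remote):
    """Assumed model of a union merge: keep all lines from both sides,
    dropping exact duplicates and preserving first-seen order."""
    merged = list(local)
    seen = set(local)
    for line in remote:
        if line not in seen:
            merged.append(line)
            seen.add(line)
    return merged


def record_change(log, line):
    """Assumed model of 'modify file and write to branch': append one line."""
    return log + [line]


# Hypothetical log lines; the real log format is not shown in this listing.
local = ["t1 present repo-A"]
remote = ["t1 present repo-A", "t2 present repo-B"]
change = "t3 present repo-C"

# Sequence 1: pull and union merge first, then modify.
merge_first = record_change(union_merge(local, remote), change)

# Sequence 2: modify first, then pull and union merge.
write_first = union_merge(record_change(local, change), remote)

# Same set of lines either way; only the line order may differ.
same_logical_content = set(merge_first) == set(write_first)
same_line_order = merge_first == write_first
```

In this sketch both sequences end with the same three lines, but in a different order, which matches the commit message's caveat about line order.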


1971 commits