git-annex

Author	SHA1	Message	Date
Joey Hess	586db7f06d	Avoid making a commit when upgrading from direct mode to v7 Three reasons: * Committing as part of an upgrade is very unusual and unexpected. * The commit was failing with a weird error message when done during an automatic upgrade. * Let me remove more of that sweet^Whorrible direct mode code.	2019-08-26 16:35:44 -04:00
Joey Hess	689d1fcc92	remove most remnants of direct mode A few remain, as needed for upgrades, and for accessing objects from remotes that are direct mode repos that have not been converted yet.	2019-08-26 16:27:48 -04:00
Joey Hess	20741b1eb4	Automatically convert direct mode repositories to v7 with adjusted unlocked branches * Automatically convert direct mode repositories to v7 with adjusted unlocked branches and set annex.thin. * init: When run on a crippled filesystem with --version=5, will error out, since version 7 is needed for adjusted unlocked branch. * direct: This command always errors out as direct mode is no longer supported. * indirect: This command has become a deprecated noop. * proxy: This command is deprecated because it was only needed in direct mode. (But it continues to work.) Also removed mentions of direct mode throughout the documentation. I have not removed all the direct mode code yet.	2019-08-26 15:05:25 -04:00
Joey Hess	f6fb4b8cdb	avoid side message when doing automatic upgrade to v7 An automatic upgrade is supposed to be silent.	2019-08-26 13:54:52 -04:00
Joey Hess	5877a15d7b	fix hard links when upgrading from direct mode When upgrading a direct mode repo to v7 with adjusted unlocked branches, fix a bug that prevented annex.thin from taking effect for the files in working tree. The hard links used to be ok, but commit `8e22114735` accidentally broke them. It repopulates the worktree file, which is already a hard link, and when it's creating the new file, the link count is already 2, and so it doesn't make a hard link then.	2019-08-26 13:54:39 -04:00
Joey Hess	1e02360283	remove only case	2019-08-26 13:28:28 -04:00
Joey Hess	2fd27c6df5	assistant: When creating a new repository use v7 adjusted branches with annex.thin Rather than direct mode, which this is a small step on the path to removing. Init on a crippled filesystem already used v7 adjusted branches, and like that, this doesn't pose any interoperability issues with old versions of git-annex that clone the same repo, because files are only unlocked on the adjusted branch.	2019-08-26 12:54:14 -04:00
Joey Hess	b599e8e6ac	move module only used by assistant	2019-08-26 12:32:45 -04:00
Joey Hess	bb16a26109	use headExists Turns out that `7be690f326` broke the test suite on the i386ancient builder. There, git show-ref --verify HEAD fails with "'HEAD' - not a valid ref". Apparently git 2.1.4 didn't support that. headExists works there and does the same thing.	2019-08-19 11:12:19 -04:00
Joey Hess	f845636e30	correct license to AGPL This code was already AGPL, except for the bit split out to Utility/MD5.hs in commit `426053cb6c`. That commit accidentally updated the license of this file from AGPL to GPL. Thanks to Sean Whitton for spotting this.	2019-08-17 14:08:07 -04:00
Joey Hess	e4a8366162	fix edge case failure in prop_view_roundtrips "./" made it fail, because that gets eliminated	2019-08-16 11:35:32 -04:00
Joey Hess	dc672863c3	init: Install working hook scripts when run on a crippled filesystem and on Windows	2019-08-13 15:14:17 -04:00
Joey Hess	868942e19b	fix unused module import warnings when building on windows	2019-08-08 12:18:53 -04:00
Joey Hess	8ba4de2d9c	remove unused import	2019-07-30 12:16:41 -04:00
Joey Hess	5080a7be1e	fix build	2019-07-29 12:41:45 -04:00
Joey Hess	426053cb6c	Corrected some license statements In `40ecf58d4b` I changed the license of code I wrote from GPL to AGPL. But, two files containing code I wrote combined with code by others were updated to say their license is AGPL, while in fact part of it was (the code I wrote) but part remained under the original license (the code written by others). Remote/Ddar.hs is now changed entirely back to GPL 3. Annex/DirHashes.hs stays AGPL, but I broke out Utility/MD5.hs with the code not written by me, and corrected its license statement to GPL-2, which is the actual version of the GPL included with the code in its original distribution at http://www.cs.ox.ac.uk/people/ian.lynagh/md5/	2019-07-28 14:27:33 -04:00
Joey Hess	4c5a489f3e	avoid build warning when built w/o magic-mime	2019-07-22 11:03:26 -04:00
Joey Hess	7fd650355e	merge from http-client-restricted I made some improvements to its API after splitting it out of git-annex, so merge those back in. This is groundwork for removing the embedded copy of it and depending on it. Also moved the managerResponseTimeout disabling to Annex.Url as it's git-annex specific. This commit was sponsored by Ethan Aubin on Patreon.	2019-07-17 16:48:50 -04:00
Joey Hess	7be690f326	check headRef not Branch.current Support running v7 upgrade in a repo where there is no branch checked out, but HEAD is set directly to some other ref. This commit was sponsored by Jack Hill on Patreon.	2019-07-16 12:36:29 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would complicate backporting new versions of git-annex to Debian 9 (stretch), which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	26c54d6ea3	make metered more generic Allow it to be used when the Key is not known.	2019-06-25 12:33:36 -04:00
Joey Hess	8355dba5cc	plumb MeterUpdate into getKey No behavior changes, but this shows everywhere that a progress meter could be displayed when hashing a file to add to the annex. Many of the places don't make sense to display a progress meter though, eg when importing the copy of the file probably swamps the hashing of the file.	2019-06-25 11:43:24 -04:00
Joey Hess	84e729fda5	fix init default description reversion init: Fix a reversion in the last release that prevented automatically generating and setting a description for the repository. Seemed best to factor out uuidDescMapRaw that does not have the default mempty description behavior. I don't much like that behavior, but I know things depend on it. One thing in particular is `git annex info` which lists the uuids and descriptions; if the current repo has been initialized in some way that means it does not have a description, it would not show up w/o that. (Not only repos created due to this bug might lack that. For example a repo that was marked dead and had --drop-dead delete its git-annex branch info, and then came back from the dead would similarly not be in the uuid.log. Also there have been other versions of git-annex that didn't set a default description; for years there was no default description.)	2019-06-20 20:30:24 -04:00
Joey Hess	ba433bdc85	refactor	2019-06-19 20:19:38 -04:00
Joey Hess	26f0f8b20f	optimisation Avoid an unnecessary STM transaction. This will happen when the worker pool is not completely full of the new stage, which is the common case. In the uncommon case, this adds only a tiny bit of overhead for the extra traversal of the worker pool. And the thread is going to block for some time anyway.	2019-06-19 20:13:19 -04:00
Joey Hess	37d505dd6b	avoid STM deadlock When all worker threads are running and enteringStage is called, it waits for an idle slot. If all of the other threads then call it in turn, a deadlock occurs. This is the same problem I didn't actually fix in `5a9842d7ed`. Fixed by doing two separate STM transactions, the first replaces its active thread with an idle thread, and the second waits for another idle thread. That guarantees there will eventually be an idle thread to find. The changes to WorkerPool were necessary because it can't add an idle thread containing the Annex state and go on to run an action using that same state, so I had to remove the Annex state from IdleWorker.	2019-06-19 18:15:25 -04:00
Joey Hess	9671248fff	speed up enteringStage in non-concurrent mode Avoid an STM transaction. Also got rid of UnallocatedWorkerPool.	2019-06-19 15:47:54 -04:00
Joey Hess	05a908c3c9	fix oops	2019-06-19 14:52:44 -04:00
Joey Hess	9d36c826c0	use fine-grained WorkerStages when transferring and verifying This means that Command.Move and Command.Get don't need to manually set the stage, and is a lot cleaner conceptually. Also, this makes Command.Sync.syncFile use the worker pool better. In the scenario where it first downloads content and then uploads it to some other remotes, it will start in TransferStage, then enter VerifyStage and then go back to TransferStage for each transfer to the remotes. Before, it entered CleanupStage after the download, and stayed in it for the upload, so too many transfer jobs could run at the same time. Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent inside onLocal. That has an Annex state for the remote, with no worker pool. So the resulting calls to enteringStage won't block in there. While Remote.Git.copyToRemote does do checksum verification, I realized that should not use a verification slot in the WorkerPool to do it. Because, it's reading back from eg, a removable disk to checksum. That will contend with other writes to that disk. It's best to treat that checksum verification as just part of the transfer. So, removed the todo item about that, as there's nothing needing to be done.	2019-06-19 13:24:20 -04:00
Joey Hess	53882ab4a7	make WorkerStage an open type Rather than limiting it to PerformStage and CleanupStage, this opens it up so any number of stages can be added as needed by commands. Each concurrent command has a set of stages that it uses, and only transitions between those can block waiting for a free slot in the worker pool. Calling enteringStage for some other stage does not block, and has very little overhead. Note that while before the Annex state was duplicated on the first call to commandAction, this now happens earlier, in startConcurrency. That means that seek stage actions that use startConcurrency and then modify Annex state won't modify the state of worker threads they then start. I audited all of them, and only Command.Seek did so; prepMerge changes the working directory and so has to come before startConcurrency. Also, the remote list is built before duplicating the state, which means that it gets built earlier now than it used to. This would only have an effect of making commands that end up not needing to perform any actions unnecessarily build the remote list (only when they're run with concurrency enabled), but that's a minor overhead compared to commands seeking through the work tree and determining they don't need to do anything.	2019-06-19 13:05:03 -04:00
Joey Hess	8e5ea28c26	finish CommandStart transition The hoped for optimisation of CommandStart with -J did not materialize. In fact, not running CommandStart in parallel is slower than -J3. So, CommandStart are still run in parallel. (The actual bad performance I've been seeing with -J in my big repo has to do with building the remoteList.) But, this is still progress toward making -J faster, because it gets rid of the onlyActionOn roadblock in the way of making CommandCleanup jobs run separate from CommandPerform jobs. Added OnlyActionOn constructor for ActionItem which fixes the onlyActionOn breakage in the last commit. Made CustomOutput include an ActionItem, so even things using it can specify OnlyActionOn. In Command.Move and Command.Sync, there were CommandStarts that used includeCommandAction, so output messages, which is no longer allowed. Fixed by using startingCustomOutput, but that's still not quite right, since it prevents message display for the includeCommandAction run inside it too.	2019-06-12 13:24:01 -04:00
Joey Hess	436f107715	make CommandStart return a StartMessage The goal is to be able to run CommandStart in the main thread when -J is used, rather than unnecessarily passing it off to a worker thread, which incurs overhead that is significant when the CommandStart is going to quickly decide to stop. To do that, the message it displays needs to be displayed in the worker thread, after the CommandStart has run. Also, the change will mean that CommandStart will no longer necessarily run with the same Annex state as CommandPerform. While its docs already said it should avoid modifying Annex state, I audited all the CommandStart code as part of the conversion. (Note that CommandSeek already sometimes runs with a different Annex state, and that has not been a source of any problems, so I am not too worried that this change will lead to breakage going forward.) The only modification of Annex state I found was it calling allowMessages in some Commands that default to noMessages. Dealt with that by adding a startCustomOutput and a startingUsualMessages. This lets a command start with noMessages and then select the output it wants for each CommandStart. One bit of breakage: onlyActionOn has been removed from commands that used it. The plan is that, since a StartMessage contains an ActionItem, when a Key can be extracted from that, the parallel job runner can run onlyActionOn' automatically. Then commands won't need to worry about this detail. Future work. Otherwise, this was a fairly straightforward process of making each CommandStart compile again. Hopefully other behavior changes were mostly avoided. In a few cases, a command had a CommandStart that called a CommandPerform that then called showStart multiple times. I have collapsed those down to a single start action. The main command to perhaps suffer from it is Command.Direct, which used to show a start for each file, and no longer does. Another minor behavior change is that some commands used showStart before, but had an associated file and a Key available, so were changed to ShowStart with an ActionItemAssociatedFile. That will not change the normal output or behavior, but --json output will now include the key. This should not break it for anyone using a real json parser.	2019-06-06 17:13:54 -04:00
Joey Hess	258a7c5cd1	add Key to all ActionItem constructors	2019-06-06 12:53:24 -04:00
Joey Hess	659640e224	separate queue for cleanup actions When running multiple concurrent actions, the cleanup phase is run in a separate queue from the main action queue. This can make some commands faster, because less time is spent on bookkeeping in between each file transfer. But as far as I can see, nothing will be sped up much by this yet, because all the existing cleanup actions are very light-weight. This is just groundwork for deferring checksum verification to cleanup time. This change does mean that if the user expects -J2 will mean that they see no more than 2 jobs running at a time, they may be surprised to see 4 in some cases (if the cleanup actions are slow enough to notice). It might also make sense to enable background cleanup without the -J, for at least one cleanup action. Indeed, that's the behavior that -J1 has now. At some point in the future, it may make sense to make the behavior with no -J the same as -J1. The only reason it's not currently is that git-annex can build w/o concurrent-output, and also any bugs in concurrent-output (such as perhaps misbehaving on non-VT100 compatible terminals) are avoided by default by only using it when -J is used.	2019-06-05 17:54:35 -04:00
Joey Hess	c04b2af3e1	improved WorkerPool abstraction No behavior changes.	2019-06-05 14:26:48 -04:00
Joey Hess	082e1f1738	Don't try to import .git directories from special remotes Because git does not support storing git repositories inside a git repository.	2019-06-04 15:14:20 -04:00
Joey Hess	67c06f5121	add back support for ftp urls Add back support for ftp urls, which was disabled as part of the fix for security hole CVE-2018-10857 (except for configurations which enabled curl and bypassed public IP address restrictions). Now it will work if allowed by annex.security.allowed-ip-addresses.	2019-05-30 14:51:34 -04:00
Joey Hess	1871295765	rename annex.security.allowed-http-addresses Renamed annex.security.allowed-http-addresses to annex.security.allowed-ip-addresses because it is not really specific to the http protocol, also limiting eg, git-annex's use of ftp and via youtube-dl, several other protocols. The old name for the config will still work. If both old and new name are set, the new name will win.	2019-05-30 12:43:40 -04:00
Joey Hess	a14f6ce758	fix repo description setting bugs * init: When the repository already has a description, don't change it. * describe: When run with no description parameter it used to set the description to "", now it will error out.	2019-05-23 12:51:01 -04:00
Joey Hess	16a2bed710	avoid build warning on Windows about unused import	2019-05-23 12:15:33 -04:00
Joey Hess	e06feb7316	honor preferred content when importing Importing from a special remote honors its preferred content too; unwanted files are not imported. But, some preferred content expressions can't be checked before files are imported, and trying to import with such an expression will fail. Tested this with scenarios including changing the preferred content expression and making sure merging the import didn't delete files that were no longer wanted. There was one minor inefficiency mentioned in the todo that I punted on.	2019-05-21 14:38:06 -04:00
Joey Hess	0bd39c1315	remove a TODO I checked yesterday	2019-05-21 12:54:39 -04:00
Joey Hess	3b9a19171a	Merge branch 'master' into preferred	2019-05-21 11:34:45 -04:00
Joey Hess	5e1221ad53	Improve shape of commit tree when importing from unversioned special remotes Make the import have the previous import as a parent, so eg `git log --stat` displays a useful diff. Also a minor optimisation, only calculate the depth of the imported history once.	2019-05-21 11:32:54 -04:00
Joey Hess	97fd9da6e7	add back non-preferred files to imported tree Prevents merging the import from deleting the non-preferred files from the branch it's merged into. adjustTree previously appended the new list of items to the old, which could result in it generating a tree with multiple files with the same name. That is not good and confuses some parts of git. Gave it a function to resolve such conflicts. That allowed dealing with the problem of what happens when the import contains some files (or subtrees) with the same name as files that were filtered out of the export. The files from the import win.	2019-05-20 16:43:52 -04:00
Joey Hess	568af1073e	filter exported tree through remote's preferred content setting The filtering is fairly efficient as far as building the trees goes, since it reuses adjustTree. But it still needs to traverse the whole tree, and look up the keys used by every file. The tree that gets recorded to export.log is the filtered tree. This way, resuming an interrupted sync to an export uses it without needing to recalculate it. And, a change to the preferred content settings of the remote will result in a different tree, so the export will be updated accordingly. The original tree is still used in the remote tracking branch. That branch represents the special remote as a git remote, and if it were a normal git remote, the tree in its head would not be affected by preferred content.	2019-05-20 11:54:55 -04:00
Joey Hess	354c0eb57f	support standard and groupwanted in keyless mode Only when the preferred content expression includes them will a parse failure due to them needing keys result in the preferred content expression not parsing in keyless mode.	2019-05-14 14:59:03 -04:00
Joey Hess	9411a7c93c	matching preferred content before key is known This will let import try to match preferred content expressions before downloading the content and generating its key. If an expression needs a key, preferredContentParser with preferredContentKeylessTokens will fail to parse it. standard and groupwanted are not in preferredContentKeylessTokens because they may refer to an expression that refers to a key. That needs further work to support them.	2019-05-14 14:28:23 -04:00
Joey Hess	aa7710982b	avoid list lookup by parseToken Minor optimisation to parsing of a preferred content expression.	2019-05-14 13:11:29 -04:00
Joey Hess	c1957b6aeb	whitespace	2019-05-14 13:01:50 -04:00
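
Commit 1871295765 above states that the old config name keeps working after the rename, and that the new name wins when both are set. A minimal sketch of that precedence rule follows, in Python for brevity. The function `allowed_ip_addresses` and the dict standing in for the git config are hypothetical illustrations, not git-annex's actual implementation (which is written in Haskell).

```python
# Hypothetical sketch of renamed-config precedence; not git-annex code.
# Only the two config names are taken from the commit message.
NEW_KEY = "annex.security.allowed-ip-addresses"
OLD_KEY = "annex.security.allowed-http-addresses"


def allowed_ip_addresses(config):
    """Return the configured value, preferring the new name.

    `config` is a plain dict standing in for the parsed git config.
    Returns None when neither name is set.
    """
    if NEW_KEY in config:
        # The new name wins whenever it is present.
        return config[NEW_KEY]
    # Otherwise fall back to the old name, if it is set at all.
    return config.get(OLD_KEY)
```

Checking the new name first means a repository that still carries only the old setting behaves as before, while one that has both follows the new one.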


1286 commits