git-annex

Author	SHA1	Message	Date
Joey Hess	25ba8156bc	improve benchmark --databases * benchmark: Changed --databases to take a parameter specifying the size of the database to benchmark. * benchmark --databases: Display size of the populated database. * benchmark --databases: Improve the "addAssociatedFile to (new)" benchmark to really add new values, not overwriting old values.	2019-11-21 17:25:20 -04:00
Joey Hess	6f35b576d7	encourage use of import from directory special remote rather than legacy interface	2019-11-19 13:30:27 -04:00
Joey Hess	890330f0fe	make --json-error-messages capture url download errors Convert Utility.Url to return Either String so the error message can be displayed in the annex monad and so captured. (When curl is used, its errors are still not caught.)	2019-11-12 13:52:38 -04:00
Joey Hess	0be23bae2f	refactor Better to not have a single function module, and better to have a more specific type than Bool. This commit was sponsored by Jack Hill on Patreon	2019-11-11 19:10:52 -04:00
Joey Hess	3b34d123ed	Added annex.allowsign option. This commit was sponsored by Ilya Shlyakhter on Patreon.	2019-11-11 16:28:56 -04:00
Joey Hess	25f912de5b	benchmark: Add --databases to benchmark sqlite databases Rescued from commit `11d6e2e260` which removed db benchmarks in favor of benchmarking arbitrary git-annex commands. Which is nice and general, but microbenchmarks are useful too.	2019-10-29 16:59:27 -04:00
Joey Hess	4a3f3a2cb5	make git add only annex when configured by annex.largefiles	2019-10-24 14:17:29 -04:00
Joey Hess	168f91efec	avoid warning over name	2019-10-24 11:46:40 -04:00
Joey Hess	bd197be3ad	annex.gitaddtoannex configuration Added annex.gitaddtoannex configuration. Setting it to false prevents git add from usually adding files to the annex. (Unless the file was annexed before, or a renamed annexed file is detected.) Currently left at true; some users are encouraging it be set to false.	2019-10-23 15:29:46 -04:00
Joey Hess	ec08b66bda	shouldAnnex: check isInodeKnown Renamed unlocked files are now detected, and will always be annexed, unless annex.largefiles disallows it. This allows for git add's behavior to later be changed to otherwise not annex files (whether by default or as a config option), without worrying about the rename case. This is not a major behavior change; annexing is still the default. But there is one case where the behavior is changed, I think for the better: touch f; git -c annex.largefiles=nothing add f; git add bigfile; git commit -m ...; mv bigfile f; git add f. Before, git-annex would see that f was previously not annexed, and so the renamed bigfile content gets added to git. Now, it notices that the inode is the one that bigfile used, and so it annexes it. This potentially slows down git add a lot in some repositories because of the poor performance of isInodeKnown when there are a lot of unlocked files. Configuring annex.largefiles avoids the speed hit.	2019-10-23 14:49:45 -04:00
Joey Hess	3d4aab38ce	remove obsolete comment	2019-10-21 13:51:38 -04:00
Joey Hess	668b878995	remove recently added and unnecessary cwd parameter I later made Utility.Su change back to the cwd, so this parameter is not needed.	2019-10-21 13:48:52 -04:00
Joey Hess	9a5d9019ba	Deal with pkexec changing to root's home directory when running a command. Wow, that's not documented anywhere, and seems like a major gotcha in pkexec. Broke enable-tor.	2019-10-21 12:39:19 -04:00
Joey Hess	9828f45d85	add RemoteStateHandle This solves the problem of sameas remotes trampling over per-remote state. Used for: * per-remote state, of course * per-remote metadata, also of course * per-remote content identifiers, because two remote implementations could in theory generate the same content identifier for two different pieces of content While chunk logs are per-remote data, they don't use this, because the number and size of chunks stored is a common property across sameas remotes. External special remote had a complication, where it was theoretically possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or EXPORTSUPPORTED. Since the uuid of the remote is typically generated in Remote.setup, it would only be possible to pass a Maybe RemoteStateHandle into it, and it would otherwise have to construct its own. Rather than go that route, I decided to send an ERROR in this case. It seems unlikely that any existing external special remote will be affected. They would have to make up a git-annex key, and set state for some reason during INITREMOTE. I can imagine such a hack, but it doesn't seem worth complicating the code in such an ugly way to support it. Unfortunately, both TestRemote and Annex.Import needed the Remote to have a new field added that holds its RemoteStateHandle.	2019-10-14 13:51:42 -04:00
Joey Hess	debafcba2b	autoenable sameas remotes	2019-10-11 15:52:40 -04:00
Joey Hess	ec778888d2	got enableremote working for sameas Also the assistant can enable sameas remotes, should work, but not tested.	2019-10-11 15:11:08 -04:00
Joey Hess	35d7ffe128	initremote --sameas fully working And using sameas remotes is working. Moved annex-config-uuid setting out of Remote.Helper.Special. EnableRemote will also have to set it.	2019-10-11 14:19:10 -04:00
Joey Hess	91eed85fd4	add sameas inherited configs to newConfig This makes initremote --sameas work with encryption inherited.	2019-10-11 13:05:20 -04:00
Joey Hess	59908586f4	rename RemoteConfigKey to RemoteConfigField And some associated renames. I was going to have some values named fooKeyKey otherwise..	2019-10-10 15:44:05 -04:00
Joey Hess	d1130ea04a	get rid of hardcoded "name" lookups Support "sameas-name" being set instead. In RenameRemote, rename which ever of the two is set.	2019-10-10 13:25:10 -04:00
Joey Hess	97b499a4dc	use sameas-name and sameas-uuid for sameas remotes initremote --sameas=remotename sets sameas-name and sameas-uuid Using sameas-name rather than name prevents old git-annex initremote from enabling a sameas remote by name, since it would not handle it correctly.	2019-10-10 12:32:05 -04:00
Joey Hess	61b384d2b7	add --sameas option, not yet used	2019-10-01 12:36:25 -04:00
Joey Hess	2b55a2b882	remotedaemon: Don't list --stop in help since it's not supported. Also, move out of plumbing section. When using tor, the remotedaemon is part of the user's workflow, as it runs the tor hidden service.	2019-09-30 14:40:46 -04:00
Joey Hess	090898a138	adjust --lock: This enters an adjusted branch where files are locked. Straightforward, except for the issue of how to reverse LockAdjustment. With --unlock, a commit that modifies/adds unlocked files gets reverse adjusted to use locked files. That's fairly reasonable, I think. But reversing --lock by unlocking all modified files feels wrong. Maybe that's just because repositories typically seem to still have mostly locked files in them (unless one is in an adjusted unlocked branch of course!) It may be that eventually how to reverse both will need to be configurable, I don't know.	2019-09-27 14:23:25 -04:00
Joey Hess	53fd746705	avoid some build warnings on windows	2019-09-12 14:11:19 -04:00
Joey Hess	99b509572d	post-receive hook updateInstead emulation cleanup The code is only needed because for a long time, git-annex didn't install hooks in repos on crippled filesystems. Now it does, and they work at least on FAT (where all files are executable) and Windows. It would be possible to remove this code in v8 simply by re-installing the hooks.	2019-09-11 14:41:51 -04:00
Joey Hess	061231621e	Merge branch 'master' into v7-default	2019-09-10 16:06:43 -04:00
Joey Hess	0af7ebdc2a	info: Display trust level when getting info on a uuid, same as on a remote.	2019-09-01 16:48:46 -04:00
Joey Hess	f845195354	Added annex.autoupgraderepository configuration Can be set to false to prevent any automatic repository upgrades. Also, removed direct mode specific upgrade code in Annex.Init, and made needsUpgrade always include the name/path of the repo, so if there's a problem it's clear what repo has the problem. And, made needsUpgrade catch any exceptions that might occur during the upgrade, so it can display a more useful error message than just the exception.	2019-09-01 13:42:26 -04:00
Joey Hess	3f0eef4baa	v7 for all repositories * Default to v7 for new repositories. * Automatically upgrade v5 repositories to v7.	2019-08-30 14:09:14 -04:00
Joey Hess	4f59ac05b6	info: remove "repository mode" info: Removed the "repository mode" from its output (including the --json output) since with the removal of direct mode, there is no repository mode.	2019-08-29 14:12:22 -04:00
Joey Hess	36cf61d752	simplification Whether or not there's a false index, it can't Restage here. When there's a false index, restaging would alter it and not the real index, but it fails anyway because that index is locked. When there's not a false index, the index is locked, and so restaging can't alter it.	2019-08-28 15:46:35 -04:00
Joey Hess	da6f4d8887	remove direct mode support from Annex.Content No longer used. The only possible user of it would be code in Upgrade.V5, so I verified that the parts of Annex.Content it used were not used to manipulate direct mode files.	2019-08-27 13:14:06 -04:00
Joey Hess	3a0842d9f8	fix bug introduced in direct mode conversion oops, the code was "if direct && not present" and I removed the direct which made the wrong path be taken.	2019-08-27 12:29:05 -04:00
Joey Hess	a51a479fb9	fix a couple warnings	2019-08-27 12:24:31 -04:00
Joey Hess	689d1fcc92	remove most remnants of direct mode A few remain, as needed for upgrades, and for accessing objects from remotes that are direct mode repos that have not been converted yet.	2019-08-26 16:27:48 -04:00
Joey Hess	20741b1eb4	Automatically convert direct mode repositories to v7 with adjusted unlocked branches * Automatically convert direct mode repositories to v7 with adjusted unlocked branches and set annex.thin. * init: When run on a crippled filesystem with --version=5, will error out, since version 7 is needed for adjusted unlocked branch. * direct: This command always errors out as direct mode is no longer supported. * indirect: This command has become a deprecated noop. * proxy: This command is deprecated because it was only needed in direct mode. (But it continues to work.) Also removed mentions of direct mode throughout the documentation. I have not removed all the direct mode code yet.	2019-08-26 15:05:25 -04:00
Joey Hess	c650389118	info: error out when file matching options used on non-directory When file matching options are specified when getting info of something other than a directory, they won't have any effect, so error out to avoid confusion. This commit was sponsored by mo on Patreon.	2019-08-24 13:20:19 -04:00
Joey Hess	88c61dea00	typo	2019-08-13 13:36:52 -04:00
Joey Hess	3049271fd0	fix build warnings	2019-08-13 13:12:41 -04:00
Joey Hess	b87ea12b6b	git-annex merge branch * merge: When run with a branch parameter, merges from that branch. This is especially useful when using an adjusted branch, because it applies the same adjustment to the branch before merging it.	2019-08-09 13:21:15 -04:00
Joey Hess	70b71bf660	have init --version fail when repo is already initialized with other version init: When the repo is already initialized, and --version requests a different version, error out rather than silently not changing the version.	2019-08-08 14:13:02 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would hinder backporting new versions of git-annex to Debian 9 (stretch), which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	d2cc747d66	add back setDirect, lost in recent commit Oops, thank goodness for test suite that found this..	2019-06-25 13:38:18 -04:00
Joey Hess	42c386fc47	add: Display progress meter when hashing files. * add: Display progress meter when hashing files. * add: Support --json-progress option.	2019-06-25 13:12:47 -04:00
Joey Hess	8355dba5cc	plumb MeterUpdate into getKey No behavior changes, but this shows everywhere that a progress meter could be displayed when hashing a file to add to the annex. Many of the places don't make sense to display a progress meter though, eg when importing the copy of the file probably swamps the hashing of the file.	2019-06-25 11:43:24 -04:00
Joey Hess	7264203eb1	importfeed: When there's a problem parsing the feed, --debug will output the feed content that was downloaded. And let the user know about it in the failure messages.	2019-06-20 12:37:07 -04:00
Joey Hess	9d36c826c0	use fine-grained WorkerStages when transferring and verifying This means that Command.Move and Command.Get don't need to manually set the stage, and is a lot cleaner conceptually. Also, this makes Command.Sync.syncFile use the worker pool better. In the scenario where it first downloads content and then uploads it to some other remotes, it will start in TransferStage, then enter VerifyStage and then go back to TransferStage for each transfer to the remotes. Before, it entered CleanupStage after the download, and stayed in it for the upload, so too many transfer jobs could run at the same time. Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent inside onLocal. That has an Annex state for the remote, with no worker pool. So the resulting calls to enteringStage won't block in there. While Remote.Git.copyToRemote does do checksum verification, I realized that should not use a verification slot in the WorkerPool to do it. Because, it's reading back from eg, a removable disk to checksum. That will contend with other writes to that disk. It's best to treat that checksum verification as just part of the transfer. So, removed the todo item about that, as there's nothing needing to be done.	2019-06-19 13:24:20 -04:00
Joey Hess	53882ab4a7	make WorkerStage an open type Rather than limiting it to PerformStage and CleanupStage, this opens it up so any number of stages can be added as needed by commands. Each concurrent command has a set of stages that it uses, and only transitions between those can block waiting for a free slot in the worker pool. Calling enteringStage for some other stage does not block, and has very little overhead. Note that while before the Annex state was duplicated on the first call to commandAction, this now happens earlier, in startConcurrency. That means that seek stage actions that use startConcurrency and then modify Annex state won't modify the state of worker threads they then start. I audited all of them, and only Command.Seek did so; prepMerge changes the working directory and so has to come before startConcurrency. Also, the remote list is built before duplicating the state, which means that it gets built earlier now than it used to. This would only have an effect of making commands that end up not needing to perform any actions unnecessarily build the remote list (only when they're run with concurrency enabled), but that's a minor overhead compared to commands seeking through the work tree and determining they don't need to do anything.	2019-06-19 13:05:03 -04:00
Joey Hess	04cc470201	run download checksum verification in separate job pool get, move, copy, sync: When -J or annex.jobs has enabled concurrency, checksum verification uses a separate job pool than is used for downloads, to keep bandwidth saturated. Not yet done for upload checksum verification, but that only affects remotes on local disks.	2019-06-17 14:58:02 -04:00
Joey Hess	ba2551da6f	add startingNoMessage Fixes the last wart in the StartMessage transition. A few commands include other CommandStart actions that generate output, and do not themselves need to display a start/end message.	2019-06-12 14:11:23 -04:00
Joey Hess	8e5ea28c26	finish CommandStart transition The hoped for optimisation of CommandStart with -J did not materialize. In fact, not running CommandStart in parallel is slower than -J3. So, CommandStart are still run in parallel. (The actual bad performance I've been seeing with -J in my big repo has to do with building the remoteList.) But, this is still progress toward making -J faster, because it gets rid of the onlyActionOn roadblock in the way of making CommandCleanup jobs run separate from CommandPerform jobs. Added OnlyActionOn constructor for ActionItem which fixes the onlyActionOn breakage in the last commit. Made CustomOutput include an ActionItem, so even things using it can specify OnlyActionOn. In Command.Move and Command.Sync, there were CommandStarts that used includeCommandAction, so output messages, which is no longer allowed. Fixed by using startingCustomOutput, but that's still not quite right, since it prevents message display for the includeCommandAction run inside it too.	2019-06-12 13:24:01 -04:00
Joey Hess	436f107715	make CommandStart return a StartMessage The goal is to be able to run CommandStart in the main thread when -J is used, rather than unnecessarily passing it off to a worker thread, which incurs overhead that is significant when the CommandStart is going to quickly decide to stop. To do that, the message it displays needs to be displayed in the worker thread, after the CommandStart has run. Also, the change will mean that CommandStart will no longer necessarily run with the same Annex state as CommandPerform. While its docs already said it should avoid modifying Annex state, I audited all the CommandStart code as part of the conversion. (Note that CommandSeek already sometimes runs with a different Annex state, and that has not been a source of any problems, so I am not too worried that this change will lead to breakage going forward.) The only modification of Annex state I found was it calling allowMessages in some Commands that default to noMessages. Dealt with that by adding a startCustomOutput and a startingUsualMessages. This lets a command start with noMessages and then select the output it wants for each CommandStart. One bit of breakage: onlyActionOn has been removed from commands that used it. The plan is that, since a StartMessage contains an ActionItem, when a Key can be extracted from that, the parallel job runner can run onlyActionOn' automatically. Then commands won't need to worry about this detail. Future work. Otherwise, this was a fairly straightforward process of making each CommandStart compile again. Hopefully other behavior changes were mostly avoided. In a few cases, a command had a CommandStart that called a CommandPerform that then called showStart multiple times. I have collapsed those down to a single start action. The main command to perhaps suffer from it is Command.Direct, which used to show a start for each file, and no longer does. Another minor behavior change is that some commands used showStart before, but had an associated file and a Key available, so were changed to ShowStart with an ActionItemAssociatedFile. That will not change the normal output or behavior, but --json output will now include the key. This should not break it for anyone using a real json parser.	2019-06-06 17:13:54 -04:00
Joey Hess	258a7c5cd1	add Key to all ActionItem constructors	2019-06-06 12:53:24 -04:00
Joey Hess	082e1f1738	Don't try to import .git directories from special remotes Because git does not support storing git repositories inside a git repository.	2019-06-04 15:14:20 -04:00
Joey Hess	a14f6ce758	fix repo description setting bugs * init: When the repository already has a description, don't change it. * describe: When run with no description parameter it used to set the description to "", now it will error out.	2019-05-23 12:51:01 -04:00
Joey Hess	e06feb7316	honor preferred content when importing Importing from a special remote honors its preferred content too; unwanted files are not imported. But, some preferred content expressions can't be checked before files are imported, and trying to import with such an expression will fail. Tested this with scenarios including changing the preferred content expression and making sure merging the import didn't delete files that were no longer wanted. There was one minor inefficiency mentioned in the todo that I punted on.	2019-05-21 14:38:06 -04:00
Joey Hess	97fd9da6e7	add back non-preferred files to imported tree Prevents merging the import from deleting the non-preferred files from the branch it's merged into. adjustTree previously appended the new list of items to the old, which could result in it generating a tree with multiple files with the same name. That is not good and confuses some parts of git. Gave it a function to resolve such conflicts. That allowed dealing with the problem of what happens when the import contains some files (or subtrees) with the same name as files that were filtered out of the export. The files from the import win.	2019-05-20 16:43:52 -04:00
Joey Hess	568af1073e	filter exported tree through remote's preferred content setting The filtering is fairly efficient as far as building the trees goes, since it reuses adjustTree. But it still needs to traverse the whole tree, and look up the keys used by every file. The tree that gets recorded to export.log is the filtered tree. This way resumes of interrupted sync to an export uses it without needing to recalculate it. And, a change to the preferred content settings of the remote will result in a different tree, so the export will be updated accordingly. The original tree is still used in the remote tracking branch. That branch represents the special remote as a git remote, and if it were a normal git remote, the tree in its head would not be affected by preferred content.	2019-05-20 11:54:55 -04:00
Joey Hess	354c0eb57f	support standard and groupwanted in keyless mode Only when the preferred content expression includes them will a parse failure due to them needing keys result in the preferred content expression not parsing in keyless mode.	2019-05-14 14:59:03 -04:00
Joey Hess	9411a7c93c	matching preferred content before key is known This will let import try to match preferred content expressions before downloading the content and generating its key. If an expression needs a key, preferredContentParser with preferredContentKeylessTokens will fail to parse it. standard and groupwanted are not in preferredContentKeylessTokens because they may refer to an expression that refers to a key. That needs further work to support them.	2019-05-14 14:28:23 -04:00
Joey Hess	2d33122215	avoid ingest lockdown file escaping the withOtherTmp call Fixes bug that caused git-annex to fail to add a file when another git-annex process cleaned up the temp directory it was using. Solution is just to push withOtherTmp out to a higher level, so that the whole ingest process can be completed inside it. But in the assistant, that was not practical to do, since withOtherTmp runs in the Annex monad and the assistant does not. Worked around by introducing a separate temp directory that only the assistant uses for lockdown. Since only one assistant can run at a time, it's easy to clean up that directory of old cruft at startup.	2019-05-07 13:04:57 -04:00
Joey Hess	bf7ecd6892	fix export subtree reversion Fix reversion in last release that caused wrong tree to be written to remote tracking branch after an export of a subtree. The invariant "commitsha should have the treesha as its tree" was not met due to a bug. Guarantee it's met by catting the commitsha to find its actual tree. A little bit slower, but this is not run often.	2019-05-06 13:57:13 -04:00
Joey Hess	700a3f2787	Merge branch 'master' into import-from-s3	2019-05-01 14:30:52 -04:00
Joey Hess	2bd0e07ed8	make merge commit on export that preserves the import history	2019-05-01 13:13:00 -04:00
Joey Hess	1503b86a14	make import tree from remote generate a merge commit This way no history is lost, neither what was exported to the remote, nor the history of changes that is imported from it. No complicated correlation of two possibly very different histories is needed, just record what we know and then git merge will do a good job. Also, it notices when the remote tracking branch doesn't need to be updated, and avoids doing anything, so noop remotes are super cheap. The only catch here is that, since the commits generated for imports from the remote don't have a stable date or author/committer, each (non-noop) import generates different commits for the same imported trees. So, when the imported remote tracking branch is merged into master and then a change is imported again, there will be an extra series of commits, which will get more and more expensive each time. This seems to call for making stable commits for imports. Also that seems a good idea to make importing in several repositories have the same result.	2019-04-30 16:13:21 -04:00
Joey Hess	9dd764e6f7	Added mimeencoding= term to annex.largefiles expressions. * Added mimeencoding= term to annex.largefiles expressions. This is probably mostly useful to match non-text files with eg "mimeencoding=binary" * git-annex matchexpression: Added --mimeencoding option.	2019-04-30 12:17:22 -04:00
Joey Hess	0f78b4db09	distinguish between feed download and parse failures	2019-04-21 10:35:08 -04:00
Joey Hess	c57695007b	prevent renaming to name already in use Also, look up the name in the special remote log first, only fall back to remote name/uuid/description lookup if it fails. This should avoid violating least surprise in cases where the special remote they wish to rename is not enabled, or has a git remote with a different name.	2019-04-16 12:23:46 -04:00
Joey Hess	c0c38e986d	added renameremote command	2019-04-15 13:49:03 -04:00
Joey Hess	f95f340c73	sync: When listing contents on an import remote fails, proceed with other syncing instead of aborting Switch listContents to being a proper CommandStart, so if it throws an exception, it will be treated like any other command action that fails. downloadImport apparently does not ever throw an exception, and itself uses commandAction, so it can't be a CommandStart.	2019-04-10 17:02:56 -04:00
Joey Hess	3d6f1b7dba	Made git-annex sync --content much faster when all the remotes it's syncing with are export/import remotes It was unnecessarily going over all files and checking preferred content against no remotes.	2019-04-10 12:42:10 -04:00
Joey Hess	37041b629d	improve messages around export/import conflicts A conflict can be caused by either export or import when the remote supports both.	2019-04-09 13:03:59 -04:00
Joey Hess	5ab97333e4	import: Let --force overwrite symlinks, not only regular files The docs already implied this should work.	2019-03-18 16:40:15 -04:00
Joey Hess	d5ee5fef65	fsck: Detect situations where annex.thin has caused data loss to the content of locked files. In particular, when two files had the same content, and one was unlocked and modified, with annex.thin that can corrupt the content of the annex object, and so fsck on the other file should detect that. getKeyStatus was relying on Database.Keys.getAssociatedFiles to tell when a file is unlocked, but that can false positive because the database can list old associated files. Instead, separate out the case of unlocked object which has multiple hardlinks when annex.thin is in use.	2019-03-18 15:59:43 -04:00
Joey Hess	8758f9c561	addurl --file: Fix a bug that made youtube-dl be used unnecessarily when adding an html url that does not contain any media.	2019-03-18 13:34:29 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of source files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	28e46d947a	avoid sync --content trying to sendKey to exporttree remotes	2019-03-11 14:09:46 -04:00
Joey Hess	057999f0fc	fix sync --content with remote.name.annex-tracking-branch=master:subdir It was exporting the whole tree not just the subdir. Now tested fully working in both directions.	2019-03-11 14:07:52 -04:00
Joey Hess	8ae0db925b	fix name of annex-tracking-branch config	2019-03-11 13:56:59 -04:00
Joey Hess	e46e40bf05	fix update of export tracking branch when exporting branch:subdir	2019-03-11 13:44:12 -04:00
Joey Hess	2912429640	better indicate when special remotes do not support renameExport Avoid a warning message when renameExport is not supported, and just fall back to deleting with a subsequent re-upload. Especially needed for importtree remotes, where renameExport needs to be disabled. This changes the external special remote protocol, but in a backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older version of git-annex will cause it to make renameExport return False.	2019-03-11 12:53:24 -04:00
Joey Hess	c755788256	sync: import when annex-tracking-branch is configured This works, and tested syncing both gets changes from a special remote and sends changes to it, keeping it fully in sync nicely! But have not tried it with a subdir configured.	2019-03-09 13:57:49 -04:00
Joey Hess	ca1a3caaa8	refactor	2019-03-09 13:34:57 -04:00
Joey Hess	633021e135	--no-push and remote.name.annex-push prevent exporting trees to special remotes Users may want sync to only export, or only import and this is broadly analogous to push and pull, so it makes sense to use the same configuration for it.	2019-03-09 13:21:49 -04:00
Joey Hess	d9ee048d85	doc updates for import	2019-03-09 13:10:30 -04:00
Joey Hess	e412129523	concurrency and status messages when downloading from import	2019-03-08 12:33:44 -04:00
Joey Hess	e3a704224f	fix export db locking deadlock	2019-03-07 16:06:02 -04:00
Joey Hess	be6085cfe5	fix option parser Alternative doesn't combine the subparsers the way I wanted. Unfortunately this new parser has suboptimal usage because everything is all jumbled together.	2019-03-06 13:10:29 -04:00
Joey Hess	5767b1b00d	avoid updating tracking branch when transfer to export throws exception	2019-03-05 16:51:13 -04:00
Joey Hess	aaacf431d8	handle importtree=yes config For now, it's only allowed when exporttree=yes is also set. That simplified the implementation, but could later be changed if there's a remote that makes sense to be an import but not an export. However, it may work just as well to make a remote be readonly to prevent export to it while still allowing import.	2019-03-04 16:07:35 -04:00
Joey Hess	18d7a1dbbb	make export and sync update special remote tracking branch The branch is only updated once the export is 100% complete. This way, if an export is started but interrupted and so the remote does not yet contain some of the files, an import will make a commit on the old branch, and so won't delete the missing files.	2019-03-01 16:35:48 -04:00
Joey Hess	519cadd1de	refactor RemoteTrackingBranch Not specific to Import; export will use it too.	2019-03-01 14:47:56 -04:00
Joey Hess	d28b0a8bd0	use disconnected history for import tracking branch This avoids the first merge from it deleting all files in the current branch, which was very surpring and unwanted behavior.	2019-03-01 14:33:29 -04:00
Joey Hess	45aacd888b	import downloader complete (untested) Made some api changes. listImportableContents needs to provide the size of the data, so the downloader can check disk free space. retrieveExportWithContentIdentifier is passed the filepath to write to. Use temporary "CID" key during download of a ContentIdentifier from a remote, so withTmp can be used and then move the content to the real key once it's known.	2019-02-27 13:15:02 -04:00
Joey Hess	f4b773e9a1	incomplete action to download files from import	2019-02-26 15:25:28 -04:00
Joey Hess	b6e2a5e9c2	reorg	2019-02-26 14:22:08 -04:00
Joey Hess	e4e464da65	import command is updating tracking branch	2019-02-26 13:15:48 -04:00
Joey Hess	5afe4135c2	import --from option parsing	2019-02-26 12:06:19 -04:00
Joey Hess	4747fa923d	export: Deprecated the --tracking option. Instead, users can configure remote.<name>.annex-tracking-branch themselves.	2019-02-23 15:54:33 -04:00
Joey Hess	8fdea8f444	WIP Added graftTree but it's buggy. Should use graftTree in Annex.Branch.graftTreeish; it will be faster than the current implementation there. Started Annex.Import, but untested and it doesn't yet handle tree grafting.	2019-02-21 17:32:59 -04:00
Joey Hess	7c25cc7715	fix build	2019-02-20 17:31:08 -04:00
Joey Hess	c3f47ba389	make .noannex file prevent repo fixups Avoid performing repository fixups for submodules and git-worktrees when there's a .noannex file that will prevent git-annex from being used in the repository. This change is ok as long as the .noannex file is really going to prevent git-annex from being used. But, init --force could override the file. Which would result in the repo being initialized without the fixups having run. To avoid that situation decided to change init, to not let --force be used to override a .noannex file. Instead the user can just delete the file.	2019-02-05 14:43:23 -04:00
Joey Hess	b080699a95	fromkey --json * fromkey: Added --json. * fromkey --batch output changed to support using it with --json. The old output was not parseable for any useful information, so this is not expected to break anything.	2019-02-05 14:03:29 -04:00
Joey Hess	7b46b43c48	fromkey: Made idempotent If the worktree file already exists, and is annexed and uses the same key, avoid failing, nothing needs to be done. Had to add lookupFileNotHidden to handle the case where an adjust --hide-missing is in use, and the worktree file was hidden due to the object content being missing. lookupFile would return the key of the hidden file, but it makes sense that after fromkey succeeds, the worktree must contain the file it was supposed to set up.	2019-02-05 13:13:13 -04:00
Joey Hess	7b9701675e	Display progress bar when getting files from export remotes And moved the progress bar display into storeExport as well. This commit was sponsored by John Pellman on Patreon.	2019-01-31 13:34:12 -04:00
Joey Hess	9cebfd7002	purify exportActions Purifying exportActions will allow introspecting and modifying it, which is needed to add progress bar display to it. Only S3 and WebDAV ran an Annex action while constructing ExportActions. There was a small performance gain from them doing that, since a resource was able to be prepared and reused for multiple actions by Command.Export. As seen in commit `809cfbbd8a` and `5d394023eb` S3 and WebDAV actually create a new handle for each access in normal, non-export use. It doesn't seem worth making export use of them marginally more efficient than normal use. It would be better to do that work upfront when constructing the remote. Or perhaps use a MVar to cache a handle. This commit was sponsored by Nick Piper on Patreon.	2019-01-30 15:11:40 -04:00
Joey Hess	ad1d422dd7	fix false positive in export conflict detection Like the earlier fixed one in Command.Export, it occurred when the same tree was exported by multiple clones. Previous fix was incomplete since several other places looked at the list of exported trees to detect when there was an export conflict. Added a single unified function to avoid missing any places it needed to be fixed. This commit was sponsored by mo on Patreon.	2019-01-30 12:36:30 -04:00
Joey Hess	a9593a43e9	explain why numcopies is not checked in performUnexport	2019-01-26 12:52:56 -04:00
Joey Hess	68198e803e	fix build with old version of optparse-applicative	2019-01-18 14:20:44 -04:00
Joey Hess	50a9a77148	fix build with old version of feed	2019-01-18 14:16:22 -04:00
Joey Hess	f8e7ea77fc	check present when testing readonly too The object is supposed to be present on the readonly remote; have to assume the location log is right about that, so the presence check should succeed.	2019-01-17 16:08:25 -04:00
Joey Hess	d5f2463702	misctmp cleanup * Switch to using .git/annex/othertmp for tmp files other than partial downloads, and make stale files left in that directory when git-annex is interrupted be cleaned up promptly by subsequent git-annex processes. * The .git/annex/misctmp directory is no longer used and git-annex will delete anything lingering in there after it's 1 week old. Also, in Annex.Ingest, made the filename it uses in the tmp dir be prefixed with "ingest-" to avoid potentially using a filename used by some other code.	2019-01-17 16:02:22 -04:00
Joey Hess	8555169e71	testremote: Support testing readonly remotes with the --test-readonly option This commit was sponsored by Ilya Shlyakhter on Patreon.	2019-01-17 12:44:52 -04:00
Joey Hess	96aba8eff7	Revert "cache the serialization of a Key" This reverts commit `4536c93bb2`. That broke Read/Show of a Key, and unfortunately Key is read in at least one place; the GitAnnexDistribution data type. It would be worth bringing this optimisation back, but it would need either a custom Read/Show instance that preserves back-compat, or wrapping Key in a data type that contains the serialization, or changing how GitAnnexDistribution is serialized. Also, the Eq instance would need to compare keys with and without a cached serialization the same.	2019-01-16 16:21:59 -04:00
Joey Hess	f0a57825e2	shorten some too-long descriptions	2019-01-16 14:16:32 -04:00
Joey Hess	4536c93bb2	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. It means that every place a Key has any of its fields changed, the cache has to be dropped. I've grepped and found them all. But, it would be better to avoid that gotcha somehow..	2019-01-14 16:37:28 -04:00
Joey Hess	5d98cba923	use ByteStrings when reading annex symlinks and pointers Now there's a ByteString used all the way from disk to Key. The main complication in this conversion was the use of fromInternalGitPath in several places to munge things on Windows. The things that used that were changed to parse the ByteString using either path separator. Also some code that had read from files to a String lazily was changed to read a minimal strict ByteString.	2019-01-14 15:37:08 -04:00
Joey Hess	0acbbf208f	use fileKey here This doesn't change behavior in any way worth mentioning, but it's the right thing to do.	2019-01-14 13:22:33 -04:00
Joey Hess	303e828b7c	rest of the deserializeKey renaming	2019-01-14 13:17:47 -04:00
Joey Hess	1791447cc8	avoid creating work tree files in subdirectories in an edge case A keyName could contain "/", though this is unlikely and certainly only ever could happen with WORM keys. The change to addunused to escape that is no problem at all. The change to VariantFile to escape it means that different versions of git-annex could resolve a merge conflict differently in this case, which is unfortunate. There would be different .variant files used, so the two resolutions would themselves merge together without additional conflicts, but the user would have to clean up the extra .variant files.	2019-01-14 13:14:25 -04:00
Joey Hess	d3ab5e626b	rename key2file and file2key What these generate is not really suitable to be used as a filename, which is why keyFile and fileKey further escape it. These are just serializing Keys. Also removed a quickcheck test that was very unlikely to test anything useful, since it relied on random chance creating something that looks like a serialized key. The other test is sufficient for testing what that was intended to test anyway.	2019-01-14 13:03:35 -04:00
Joey Hess	727767e1e2	make everything build again after ByteString Key changes	2019-01-11 16:39:46 -04:00
Joey Hess	6f66b53a30	newtype Group to ByteString This may speed up queries for things in groups, due to Eq and Ord being faster.	2019-01-09 15:05:49 -04:00
Joey Hess	cb375977a6	follow-on changes from MetaData type changes Including writing and parsing the metadata log files with bytestring-builder and attoparsec.	2019-01-07 15:51:05 -04:00
Joey Hess	5ba14b5095	build cleanly when benchmark flag is not enabled	2019-01-05 08:09:28 -04:00
Joey Hess	11d6e2e260	new improved benchmark command that can benchmark anything git-annex does	2019-01-04 13:46:36 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	894716512d	add a UUIDDesc type containing a ByteString Groundwork for handling uuid.log using ByteString	2019-01-01 16:17:54 -04:00
Joey Hess	9cc6d5549b	convert UUID from String to ByteString This should make == comparison of UUIDs somewhat faster, and perhaps a few other operations around maps of UUIDs etc. FromUUID/ToUUID are used to convert String, which is still used for all IO of UUIDs. Eventually the hope is those instances can be removed, and all git-annex branch log files etc use ByteString throughout, for a real speed improvement. Note the use of fromRawFilePath / toRawFilePath -- while a UUID usually contains only alphanumerics and so could be treated as ascii, it's conceivable that some git-annex repository has been initialized using a UUID that is not only not a canonical UUID, but contains high unicode or invalid unicode. Using the filesystem encoding avoids any problems with such a thing. However, a NUL in a UUID seems extremely unlikely, so I didn't use encodeBS / decodeBS to avoid their extra overhead in handling NULs. The Read/Show instance for UUID luckily serializes the same way for ByteString as it did for String.	2019-01-01 14:45:33 -04:00
Joey Hess	f4bde87525	fix layout	2019-01-01 12:31:03 -04:00
Joey Hess	6512b40bac	importfeed: Better error message when downloading the feed fails It used to display the "bad feed content" message indicating there were no enclosures found, which was misleading when the http request for the feed failed. This commit was sponsored by Ewen McNeill on Patreon.	2018-12-30 16:14:55 -04:00
Joey Hess	f943138508	avoid unnecessary monad	2018-12-30 15:59:15 -04:00
Joey Hess	365286279f	unused: Update suggested git log message to see where data was previously used so it will also work with v7 unlocked pointer files.	2018-12-19 13:53:49 -04:00
Joey Hess	6d381df0e6	sync --content: Fix dropping unwanted content from the local repository This fixes a bug with the numcopies counting when using sync --content. It did not always pass the local repo uuid to handleDropsFrom, and so the numcopies counting was off by one, and unwanted local content would only be dropped when there were numcopies+1 remote copies. Also, support dropping local content that has reached an exporttree remote that is not untrusted (currently only S3 remotes with versioning).	2018-12-18 13:58:12 -04:00
Joey Hess	904be4e6be	add --branch option to git-annex find and mildly deprecate findref in favor of it No deprecation warning at run time, just one on the man page. One thing findref remains able to do that find cannot is to run in a bare repo. Find was made to refuse to run in a bare repo because it seemed confusing for it to not list any files ever in that situation. It would be better for find --branch to work in a bare repo but not without --branch but I don't currently have a way to do that. Probably a better solution would be to make git-annex in a bare repo default to --branch master or something like that instead of --all. This commit was sponsored by Denis Dzyubenko on Patreon.	2018-12-09 14:10:37 -04:00
Joey Hess	029ae8d4db	support findref and --branch with file matching options * findref: Support file matching options: --include, --exclude, --want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin * Commands supporting --branch now apply file matching options --include, --exclude, --want-get, --want-drop to filenames from the branch. Previously, combining --branch with those would fail to match anything. * add, import, findref: Support --time-limit. This commit was sponsored by Jake Vosloo on Patreon.	2018-12-09 13:38:35 -04:00
Joey Hess	e89bb4361b	distinguish between cached and uncached creds p2p and multicast creds are not cached the same way that s3 and webdav creds are. The difference is that p2p and multicast obtain the creds themselves, as part of a process like pairing. So they're storing the only extant copy of the creds. In s3 and webdav etc the creds are provided by the cloud storage provider. This is a fine difference, but I do think it's a reasonable difference. If the user wants to prevent s3 and webdav etc creds from being stored unencrypted on disk, they won't feel the same about p2p auth tokens used for tor, or a multicast encryption key, or for that matter their local ssh private key. This commit was sponsored by Fernando Jimenez on Patreon.	2018-12-04 14:09:18 -04:00
Joey Hess	aa8243df4c	dropunused edge case when annex.thin caused unused object to be modified dropunused: When an unused object file has gotten modified, eg due to annex.thin being set, don't silently skip it, but display a warning and let --force drop it. This commit was sponsored by Ethan Aubin.	2018-12-04 12:20:34 -04:00
Joey Hess	83109affd1	remove leftovers from removed TestSuite build flag Test suite is always built, so this can be simplified.	2018-11-19 12:39:16 -04:00
Joey Hess	c8bd5710b1	check onlyActionOn in Drop * drop -J: Avoid processing the same key twice at the same time when multiple annexed files use it. This prevents a drop of a key conflicting with another drop of the same key. This commit was sponsored by Brock Spratlen on Patreon.	2018-11-15 15:43:51 -04:00
Joey Hess	71cc9cfaa2	improve smudge --clean behavior on outside work tree files smudge: When passed a file located outside the working tree, eg by git diff, avoid erroring out. This commit was sponsored by Ewen McNeill on Patreon.	2018-11-15 13:04:40 -04:00
Joey Hess	c3fa1f2b08	avoid redundant export uploads export, sync --content: Avoid unnecessarily trying to upload files to an exporttree remote that already contains the files. When the export was originally made in one repo and now git-annex is running in a different repo, the export database is not yet populated with information about the exportLocation of files. So, it was trying to upload the files to the export, even when it already contained them. sync --content would first download the content from the export, and then re-upload the content back. And this also led to "not available" failures for each file that was not locally present yet. Fix: Just use checkPresentExport before uploading; if it succeeds update the database. This is a surprising oversight, it's possible it fixes a reversion because I would have thought I'd have noticed this problem when originally developing exporttree remotes. This commit was sponsored by Jochen Bartl on Patreon.	2018-11-14 11:47:40 -04:00
Joey Hess	d65df7ab21	improve messages around export conflicts When an export conflict prevents accessing a special remote, be clearer about what the problem is and how to resolve it. This commit was sponsored by Trenton Cronholm on Patreon.	2018-11-13 15:50:06 -04:00
Joey Hess	abe4b7ebd6	importfeed: Avoid erroring out when a feed has been repeatedly broken That can leave other imported files not checked into git, because the git command queue is not flushed when git-annex errors out. And since it only happens once git-annex has concluded a feed is broken, it's an intermittent bug, worst kind. Been seeing it for a while, only tracked down today. Instead, by returning False, git-annex importfeed will cleanly shutdown and still exit nonzero. This commit was sponsored by Denis Dzyubenko on Patreon.	2018-11-04 17:41:49 -04:00
Joey Hess	5ab0f48ffb	high-res mtimes Cache high-resolution mtimes for improved detection of modified files in v7 (and direct mode). Including on Windows. With back-compat support so old low-res mtimes won't break anything, and so the new information also won't break old versions of git-annex.	2018-10-30 00:41:26 -04:00
Joey Hess	2e9f128dea	moved module and relicensed	2018-10-29 23:13:36 -04:00
Joey Hess	5d97898a7c	touch files with high-resolution timestamp Needs unix 2.7.2, but that was included in ghc 8.0.1 (and much older) so not really a new dep.	2018-10-29 22:25:21 -04:00
Joey Hess	4431b82bce	migrate: Fix failure to migrate from URL keys. (Reversion introduced in version 6.20180926)	2018-10-29 16:36:36 -04:00
Joey Hess	a622488758	remove CHECKURL-MULTI single url response special case Removed undocumented special case in handling of a CHECKURL-MULTI response with only a single file listed. Rather than ignoring the url that was in the response, use it. This allows external special remotes that want to provide some better url to do so, although I don't entirely agree with using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an undocumented special case that throws data away is just not a good idea. This could in theory break some external special remote program that relied on the current behavior, but it seems unlikely that it would because such a program must already handle the multiple url case, unless it only ever provides a single url response to CHECKURL-MULTI. Make addurl --file work with a single item CHECKURL-MULTI response. It already did for external special remotes due to the special case, but now it also will for builtin ones like the BitTorrent special remote. This commit was sponsored by Ilya Shlyakhter on Patreon.	2018-10-29 14:52:12 -04:00
Joey Hess	9f87133bf5	snap --version= to auto-upgrade This makes --version=6 still work, despite v6 not being in supportedVersions. Which is useful for scripts that use it. I didn't document it on the man page, because it's indistinguishable from an automatic upgrade after initting as v6.	2018-10-26 11:44:05 -04:00
Joey Hess	234842a347	v7 Install new git hooks in this version. This does beg the question of what to do if git later gets eg a post-smudge hook, that could run git-annex smudge --update. I think the thing to do in that case would be to make git-annex smudge --update install the new hooks. That way, as the user uses git-annex, the hook would be created pretty quickly and without needing any extra syscalls except for when git-annex smudge --update is called. I considered doing something like that for installation of the post-checkout and post-merge hooks, which would have avoided the need for v7. But the only place it was cheap to do it would be in git-annex smudge which could cheaply notice that smudge.log didn't exist yet and so know the hooks needed to be installed. But since smudge used to populate pointer files, it would be quite surprising if a single git checkout/merge failed to update the work tree, and so that idea didn't work out. The other reason for v7 is psychological -- users don't need to worry about whether they might be running an old version of git-annex that doesn't support their v7 repository very well. And bug reports about "v6" have gotten a bit of a bad association in my head since they often hit one of the known limitations and didn't realize it was experimental. newtyped RepoVersion Int to avoid needing 2 comparisons in versionSupportsUnlockedPointers etc. Also it's just nicer. This commit was sponsored by John Pellman on Patreon.	2018-10-25 18:24:23 -04:00
Joey Hess	c28ca8294f	optimize smudge --clean of unmodified file Usually, git won't run clean filter when a file is unmodified. But, when git checkout runs git annex smudge --update, it populates the pointer runs git update-index, which sees the file has changed and runs git annex smudge --clean, which was checksumming the file unnecessarily as it re-ingested it. With annex.thin set, this is the difference between git checkout of a branch with a 1 gb file taking 30s and 0.1s. This commit was sponsored by Brett Eisenberg on Patreon.	2018-10-25 16:46:46 -04:00
Joey Hess	ca7de61454	git post-checkout and post-merge hooks * init, upgrade: Install git post-checkout and post-merge hooks that run git annex smudge --update. * precommit: Run git annex smudge --update, because the post-merge hook is not run when there is a merge conflict. So the work tree will be updated when a commit is made to resolve the merge conflict. * Note that git has no hooks run after git stash or git cherry-pick, so the user will have to manually run git annex smudge --update after such commands. Nothing currently installs the hooks into v6 repos that already exist. Something will need to be done about that, either move this behavior to v7, or document that the user will need to manually fix up their v6 repos. This commit was sponsored by Eric Drechsel on Patreon.	2018-10-25 15:59:51 -04:00
Joey Hess	917a2c6095	defer updating unlocked files until after smudge filter The smudge filter no longer provides git with annexed file content, to avoid a git memory leak, and because that did not honor annex.thin. git annex smudge --update has to be run after a checkout to update unlocked files in the working tree with annexed file contents. No hooks yet to run it. This commit was sponsored by Nick Piper on Patreon.	2018-10-25 15:08:20 -04:00
Joey Hess	f2a4db724c	remove redundant test populatePointerFile checks the same thing	2018-10-25 14:31:45 -04:00
Joey Hess	fcca7adaff	instrument P2P --debug with connection and thread info For debugging http://git-annex.branchable.com/bugs/annex_get_-J_16_via_ssh_stalls_/ This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-10-22 15:52:11 -04:00
Joey Hess	4a6ebb1034	make sync update adjusted branch to hide/unhide This completes initial support for --hide-missing, although the assistant still needs to be updated and it perhaps needs to be sped up, and maybe there needs to be a way for git-annex get to operate on missing files. Opened some more todos for those things. This commit was sponsored by Henrik Riomar.	2018-10-20 14:22:28 -04:00
Joey Hess	4a788fbb3b	sync --content now supports --hide-missing adjusted branches This relies on git ls-files --with-tree, which I'm using in a way that its man page does not document. Hm. I emailed the git list to try to get the docs improved, but at least the git test suite does test the same kind of use case I'm using here. Performance impact when not in an adjusted branch is limited to some additional MVar accesses, and a single git call to determine the name of the current branch. So very minimal. When in an adjusted branch, the performance impact is in Annex.WorkTree.lookupFile, which starts doing an equal amount of work for files that didn't exist as it already did for files that were unlocked. This commit was sponsored by Jochen Bartl on Patreon.	2018-10-19 17:51:25 -04:00
Joey Hess	8be5a7269a	refactor getCurrentBranch Both Command.Sync and Annex.Ingest had their own versions of this. The one in Annex.Ingest used Git.Branch.currentUnsafe, but does not seem to need it. That is only checking to see if it's in an adjusted unlocked branch, and when in an adjusted branch, the branch does in fact exist, so the added check that Git.Branch.current does is fine. This commit was sponsored by Denis Dzyubenko on Patreon.	2018-10-19 17:29:18 -04:00
Joey Hess	24838547e2	adjust --hide-missing * At long last there's a way to hide annexed files whose content is missing from the working tree: git-annex adjust --hide-missing * When already in an adjusted branch, running git-annex adjust again will update the branch as needed. This is mostly useful with --hide-missing to hide/unhide files after their content has been dropped or received. Still needs integration with sync and the assistant, and not as fast as it could be, but already usable. This commit was sponsored by Ethan Aubin.	2018-10-18 15:32:42 -04:00
Joey Hess	a6c8de84b6	improve types to allow combining some adjustments Combinations like --hide-missing --unlocked seem very useful. On the other hand, combining --fix with --unlock doesn't make sense because a file can be either unlocked or a symlink that can be fixed, but not both. Changed the serialization of HideMissingAdjustment in passing, but it has not actually been used yet so nothing will be broken. This commit was sponsored by Trenton Cronholm on Patreon.	2018-10-18 12:59:05 -04:00
Joey Hess	558520d27a	fix rekey/migrate bookkeeping in v6 After `220317df5a` the test suite still detected a problem; migrate of an unlocked file replaced it with a pointer file rather than a file with the content. This was a bookkeeping problem; the worktree file was being copied to the object file and the inode cache updated, but if that database write didn't get flushed in time, later checks would think the content was not present. Fixed by copying the object file to the worktree file instead, which avoids needing to update the inode cache. Also, only copy when there's a hard link to break, not always. This commit was sponsored by Brock Spratlen on Patreon.	2018-10-16 17:18:21 -04:00
Joey Hess	220317df5a	v6: fix migrate of unlocked file After commit `b2bafdb2fc` the test suite threw up a failure migrating unlocked files. I'm not clear how that commit broke it (presumably by inAnnex reporting the right information now), but the actual problem is plain: The inodecache for the worktree file is generated, but then the file is replaced with a copy (unnecessarily unless annex.link is set, but the code always does so) and so linkToAnnex/linkAnnex then fails because it notices the inode cache is not valid. This commit was sponsored by Jake Vosloo on Patreon.	2018-10-16 16:45:08 -04:00
Joey Hess	bdf6783b92	improve error message	2018-10-16 15:52:40 -04:00
Joey Hess	40dba8e933	prevent find running in bare repo	2018-10-16 10:44:09 -04:00
Joey Hess	38d691a10f	removed the old Android app Running git-annex linux builds in termux seems to work well enough that the only reason to keep the Android app would be to support Android 4-5, which the old Android app supported, and which I don't know if the termux method works on (although I see no reason why it would not). According to [1], Android 4-5 remains on around 29% of devices, down from 51% one year ago. [1] https://www.statista.com/statistics/271774/share-of-android-platforms-on-mobile-devices-with-android-os/ This is a rather large commit, but mostly very straightforward removal of android ifdefs and patches and associated cruft. Also, removed support for building with very old ghc < 8.0.1, and with yesod < 1.4.3, and without concurrent-output, which were only being used by the cross build. Some documentation specific to the Android app (screenshots etc) needs to be updated still. This commit was sponsored by Brett Eisenberg on Patreon.	2018-10-13 01:41:11 -04:00
Joey Hess	91b799d1a6	export: Fix false positive in export conflict detection It occurred when the same tree was exported by multiple clones. nub out identical trees. This commit was sponsored by Jochen Bartl on Patreon.	2018-10-09 15:54:12 -04:00
Joey Hess	451171b7c1	clean up url removal presence update * rmurl: Fix a case where removing the last url left git-annex thinking content was still present in the web special remote. * SETURLPRESENT, SETURIPRESENT, SETURLMISSING, and SETURIMISSING used to update the presence information of the external special remote that called them; this was not documented behavior and is no longer done. Done by making setUrlPresent and setUrlMissing only update presence info for the web, and only when the url is a web url. See the comment for reasoning about why that's the right thing to do. In AddUrl, had to make it update location tracking, to handle the non-web-url case. This commit was sponsored by Ewen McNeill on Patreon.	2018-10-04 17:35:49 -04:00
Joey Hess	53526136e8	move commandAction out of CmdLine.Seek This is groundwork for nested seek loops, eg seeking over all files and then performing commandActions on a list of remotes, which can be done concurrently. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-10-01 14:12:06 -04:00
Joey Hess	9adee3f2fb	sync: Warn when a remote's export is not updated to the current tree because export tracking is not configured. Only display the warning when the current branch has a tree that is not the same as the tree in the export. Note that it doesn't check to see if the current tree is in incompleteExportedTreeish; it might be worth checking that and reminding the user about an incomplete export, but when export tracking is not configured, they are probably not in the right clone of the repository to resolve the incomplete export. This commit was sponsored by Ethan Aubin.	2018-09-27 15:41:18 -04:00
Joey Hess	6134431254	clean P2P protocol shutdown on EOF try 2 Same goal as `b18fb1e343` but without breaking backwards compatibility. Just return IO exceptions when running the P2P protocol, so that git-annex-shell can detect eof and avoid the ugly message. This commit was sponsored by Ethan Aubin.	2018-09-25 16:49:59 -04:00
Joey Hess	4ecba916a1	annex.maxextensionlength Added annex.maxextensionlength for use cases where extensions longer than 4 characters are needed. This commit was sponsored by Henrik Riomar on Patreon.	2018-09-24 12:10:18 -04:00
Joey Hess	1d1054faa6	added -z Added -z option to git-annex commands that use --batch, useful for supporting filenames containing newlines. It only controls input to --batch, the output will still be line delimited unless --json or etc is used to get some other output. While git often makes -z affect both input and output, I don't like tying them together, and making it affect output would have been a significant complication, and also git-annex output is generally not intended to be machine parsed, unless using --json or a format option. Commands that take pairs like "file key" still separate them with a space in --batch mode. All such commands take care to support filenames with spaces when parsing that, so there was no need to change it, and it would have needed significant changes to the batch machinery to separate those with a null. To make fromkey and registerurl support -z, I had to give them a --batch option. The implicit batch mode they enter when not provided with input parameters does not support -z as that would have complicated option parsing. Seemed better to move these toward using the same --batch as everything else, though the implicit batch mode can still be used. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-09-20 16:11:47 -04:00
Joey Hess	50217f62a1	avoid duplicate add action for v6 unlocked modified file The new second pass sees the file as type changed because the first pass's changes have typically not reached git yet. So, have to explicitly check for unmodified files in the second pass. Note that, if the file has been touched but not really modified, the first pass will handle it, and so the second pass does nothing. This commit was sponsored by Jochen Bartl on Patreon.	2018-09-12 15:20:34 -04:00
Joey Hess	2743224658	change v6 git-annex add of staged unmodified unlocked file v6: When a file is unlocked but has not been modified, and the unlocking is only staged, git-annex add did not lock it. Now it will, for consistency with how modified files are handled and with v5. Note the removal of the sameInodeCache check. Otherwise it would see that the unmodified file is unmodified and stop there. That check seems to have been copied from the direct mode branch. But, direct mode had a specific reason to check for unmodified content, that does not apply to v6. The second pass means there is potential for a race, eg the unlocked file could be modified in between the first and second passes. No problem with that, since both passes do the same thing. This commit was sponsored by Jake Vosloo on Patreon.	2018-09-12 14:00:05 -04:00
Joey Hess	fcff64f8bb	optimisation: avoid stat call This commit was sponsored by Paul Walmsley on Patreon.	2018-09-05 17:26:12 -04:00
Joey Hess	d65a081f3f	improve message	2018-09-02 16:17:50 -04:00
Joey Hess	d0ef049cca	comment typo	2018-09-02 16:16:08 -04:00
Joey Hess	5c99f6247e	per-remote metadata storage Actually very straightforward reuse of the metadata log file code. Although I had to add a todo item as git-annex forget won't clean up dead remote's metadata yet. This would be worth adding to the external special remote interface sometime. Have not opened a todo though, guess I'll wait until something needs it. This commit was supported by the NSF-funded DataLad project.	2018-08-31 12:23:22 -04:00
Joey Hess	76f32012af	avoid sync/assistant drop from appendonly Make git-annex sync and the assistant skip trying to drop from appendonly remotes since it's just going to fail. git-annex drop and similar commands will still try to drop from appendonly, so the user will see failure messages when they try to do that. To do otherwise would be confusing since the user has explicitly asked for a drop with those commands. This commit was supported by the NSF-funded DataLad project.	2018-08-30 11:23:57 -04:00
Joey Hess	8b39db20b5	export appendonly support Make `git annex export` check appendonly when removing a file from an export, and not update the location log, since the remote still contains the content. This commit was supported by the NSF-funded DataLad project.	2018-08-30 11:18:20 -04:00
Joey Hess	6001b3cf45	fix build warning	2018-08-28 13:17:06 -04:00
Joey Hess	10138056dc	v6: avoid accidental conversion when annex.largefiles is not configured v6: When annex.largefiles is not configured for a file, running git add or git commit, or otherwise using git to stage a file will add it to the annex if the file was in the annex before, and to git otherwise. This is to avoid accidental conversion. Note that git-annex add's behavior has not changed, for reasons explained in the added comment. Performance: No added overhead when annex.largefiles is configured. When not configured, there is an added call to catObjectMetaData, which involves a round trip through git cat-file --batch. However, the earlier catKeyFile primes the cache for it. This commit was supported by the NSF-funded DataLad project.	2018-08-27 14:51:10 -04:00
Joey Hess	98fd7ec6c9	recover from race between git mv+commit and git-annex get Last of the known v6 races. This also makes git add of a pointer file populate it when its content is present in the annex. Which makes sense to do, I think. This commit was supported by the NSF-funded DataLad project.	2018-08-22 16:01:50 -04:00
Joey Hess	7ee3b02d49	replace stack trace with an explanation	2018-08-20 21:26:07 -04:00
Joey Hess	48e9e12961	finally fixed v6 get/drop git status After updating the worktree for a get/drop, update git's index, so git status will not show the files as modified. What actually happens is that the index update removes the inode information from the index. The next git status (or similar) run then has to do some work. It runs the clean filter. So, this depends on the clean filter being reasonably fast and on git not leaking memory when running it. Both problems were fixed in `a96972015d`, but only for git 2.5. Anyone using an older git will see very expensive git status after a get/drop. This uses the same git update-index queue as other parts of git-annex, so the actual index update is fairly efficient. Of course, updating the index does still have some overhead. The annex.queuesize config will control how often the index gets updated when working on a lot of files. This is an imperfect workaround... Added several todos about new problems this workaround causes. Still, this seems a lot better than the old behavior. This commit was supported by the NSF-funded DataLad project.	2018-08-14 16:23:58 -04:00
Joey Hess	a96972015d	massive v6 add speed/memory improvement v6 add: Take advantage of improved SIGPIPE handler in git 2.5 to speed up the clean filter by not reading the file content from the pipe. This also avoids git buffering the whole file content in memory. When built with an older git, still consumes stdin. If built with a newer git and used with an older one, it breaks, but that's acceptable -- checking the git version every time would make repeated smudge runs slow. This commit was supported by the NSF-funded DataLad project.	2018-08-09 18:17:46 -04:00
Joey Hess	12460fcea6	make --batch honor matching options When --batch is used with matching options like --in, --metadata, etc, only operate on the provided files when they match those options. Otherwise, a blank line is output in the batch protocol. Affected commands: find, add, whereis, drop, copy, move, get In the case of find, the documentation for --batch already said it honored the matching options. The docs for the rest didn't, but it makes sense to have them honor them. While this is a behavior change, why specify the matching options with --batch if you didn't want them to apply? Note that the batch output for all of the affected commands could already output a blank line in other cases, so batch users should already be prepared to deal with it. git-annex metadata didn't seem worth making support the matching options, since all it does is output metadata or set metadata, the use cases for using it in combination with the matching options seem small. Made it refuse to run when they're combined, leaving open the possibility for later support if a use case develops. This commit was sponsored by Brett Eisenberg on Patreon.	2018-08-08 12:07:06 -04:00
Joey Hess	4d4d238a08	add missing type signature	2018-08-06 15:41:44 -04:00
Joey Hess	38ddd6072d	addurl: Include filename in --json-progress output when known.	2018-08-06 12:53:44 -04:00
Joey Hess	ae11394efa	added annex.commitmessage Added annex.commitmessage config that can specify a commit message for the git-annex branch instead of the usual "update". This commit was supported by the NSF-funded DataLad project.	2018-08-02 14:06:06 -04:00
Joey Hess	fd5a392006	cache remotes via annex-speculate-present Added remote.name.annex-speculate-present config that can be used to make cache remotes. Implemented it in Remote.keyPossibilities, which is used by the get/move/copy/mirror commands, and nothing else. This way, things like whereis will not show content that's speculatively present. The assistant and sync --content were not using Remote.keyPossibilities, and were changed to use it. The efficiency hit should be small; Remote.keyPossibilities is only used before transferring a file, which is the expensive operation. And, it's only doing one lookup of the remoteList and a very cheap filter over it. Note that, git-annex still updates the location log when copying content to a remote with annex-speculate-present set. In this case, the location tracking will indicate that content is present in the remote. This may not be wanted for caches, or may not be a real problem for them. TBD. This commit was supported by the NSF-funded DataLad project.	2018-08-01 14:28:05 -04:00
Joey Hess	cc2cb46857	unused --from: Allow specifying a repository by uuid or description. This commit was sponsored by Jake Vosloo on Patreon.	2018-07-11 16:01:35 -04:00
Joey Hess	79ac177ea5	improve tmp file cleanup If youtubeDl fails, remove the tmp file. Here tmp is the html file downloaded to check if the url is html, not what youtube-dl might have started to download. If the tmp file were retained, a re-run of addurl would try to resume downloading it, which the web server might not support, causing the resume to fail. And it's a smallish html page anyway so no benefit to keeping it for such a resume.	2018-06-28 12:51:51 -04:00
Joey Hess	dc6cb6aa5f	Merge branch 'later'	2018-06-25 21:59:20 -04:00
Joey Hess	6091b7b9db	info: Display uuid and description when a repository is identified by uuid, and for "here".	2018-06-24 17:38:18 -04:00
Joey Hess	b657242f5d	enforce retrievalSecurityPolicy Leveraged the existing verification code by making it also check the retrievalSecurityPolicy. Also, prevented getViaTmp from running the download action at all when the retrievalSecurityPolicy is going to prevent verifying and so storing it. Added annex.security.allow-unverified-downloads. A per-remote version would be nice to have too, but would need more plumbing, so KISS. (Bill the Cat reference not too over the top I hope. The point is to make this something the user reads the documentation for before using.) A few calls to verifyKeyContent and getViaTmp, that don't involve downloads from remotes, have RetrievalAllKeysSecure hard-coded. It was also hard-coded for P2P.Annex and Command.RecvKey, to match the values of the corresponding remotes. A few things use retrieveKeyFile/retrieveKeyFileCheap without going through getViaTmp. * Command.Fsck when downloading content from a remote to verify it. That content does not get into the annex, so this is ok. * Command.AddUrl when using a remote to download an url; this is new content being added, so this is ok. This commit was sponsored by Fernando Jimenez on Patreon.	2018-06-21 13:37:01 -04:00
Joey Hess	28720c795f	limit url downloads to whitelisted schemes Security fix! Allowing any schemes, particularly file: and possibly others like scp: allowed file exfiltration by anyone who had write access to the git repository, since they could add an annexed file using such an url, or using an url that redirected to such an url, and wait for the victim to get it into their repository and send them a copy. * Added annex.security.allowed-url-schemes setting, which defaults to only allowing http and https URLs. Note especially that file:/ is no longer enabled by default. * Removed annex.web-download-command, since its interface does not allow supporting annex.security.allowed-url-schemes across redirects. If you used this setting, you may want to instead use annex.web-options to pass options to curl. With annex.web-download-command removed, nearly all url accesses in git-annex are made via Utility.Url via http-client or curl. http-client only supports http and https, so no problem there. (Disabling one and not the other is not implemented.) Used curl --proto to limit the allowed url schemes. Note that this will cause git annex fsck --from web to mark files using a disallowed url scheme as not being present in the web. That seems acceptable; fsck --from web also does that when a web server is not available. youtube-dl already disabled file: itself (probably for similar reasons). The scheme check was also added to youtube-dl urls for completeness, although that check won't catch any redirects it might follow. But youtube-dl goes off and does its own thing with other protocols anyway, so that's fine. Special remotes that support other domain-specific url schemes are not affected by this change. In the bittorrent remote, aria2c can still download magnet: links. The download of the .torrent file is otherwise now limited by annex.security.allowed-url-schemes. This does not address any external special remotes that might download an url themselves. Current thinking is all external special remotes will need to be audited for this problem, although many of them will use http libraries that only support http and not curl's menagerie. The related problem of accessing private localhost and LAN urls is not addressed by this commit. This commit was sponsored by Brett Eisenberg on Patreon.	2018-06-16 11:57:50 -04:00
Joey Hess	391a83c985	remove unused value	2018-06-14 12:32:36 -04:00
Joey Hess	b6e4ed9aa7	export: re-send lost exported files after fsck notices they're gone When content has been lost from an export remote and git-annex fsck --from remote has noticed it's gone, re-running git-annex export or git-annex sync --content will re-upload it. Note that normally there's no way to remove a single file from an export. doc/design/exporting_trees_to_special_remotes.mdwn talks about this in the section "dropping from exports and copying to exports". But, if a file is somehow deleted or corrupted on the export, and fsck notices this, it will update the location log to say it's missing. So, checking the location log when determining if a file needs to be sent to the export will let such missing files be added back in. There's otherwise no way to do so. It does not fall afoul of the races documented in the abovementioned section, I think. This commit was sponsored by Ryan Newton on Patreon.	2018-06-14 12:22:12 -04:00
Joey Hess	a5f598a6aa	remove use of remoteGitConfig Unfortunately one more use remains.. This should be just as fast as the other method. The remote's Git.Repo has already had its config read, so Annex.new's call to Git.Config.read is a noop. This commit was sponsored by andrea rota.	2018-06-05 13:15:04 -04:00
Joey Hess	67e46229a5	change Remote.repo to Remote.getRepo This is groundwork for letting a repo be instantiated the first time it's actually used, instead of at startup. The only behavior change is that some old special cases for xmpp remotes were removed. Where before git-annex silently did nothing with those no-longer supported remotes, it may now fail in some way. The additional IO action should have no performance impact as long as it's simply return. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon	2018-06-04 15:30:26 -04:00
Joey Hess	2e6a6024c2	avoid unnecessary version output differences in different contexts Show operating system and repository version list when run outside a git repo too. Also made it only display the local repository version when in a git-annex repo. Before it showed "unknown" when run in a git repo that was not git-annex initialized. That seemed like confusing behavior. This commit was sponsored by Jochen Bartl on Patreon.	2018-06-04 12:26:18 -04:00
Joey Hess	1c8ee99b46	Fix build with ghc 8.4+, which broke due to the Semigroup Monoid change https://prime.haskell.org/wiki/Libraries/Proposals/SemigroupMonoid I am not happy with the fragile pile of CPP boilerplate required to support ghc back to 7.0, which git-annex still targets for both the android build and the standalone build targeting old linux kernels. It makes me unlikely to want to use Semigroup more in git-annex, because the benefit of the abstraction is swamped by the ugliness. I actually considered ripping out all the Semigroup instances, but some are needed to use optparse-applicative. The problem, I think, is they made this transition on too fast a timeline. (Although ironically, work on it started in 2015 or earlier!) In particular, Debian oldstable is not out of security support, and it's not possible to follow the simpler workarounds documented on the wiki and have it build on oldstable (because the semigroups package in it is too old). I have only tested this build with ghc 8.2.2, not the newer and older versions that branches of the CPP support. So there could be typos, we'll see. This commit was sponsored by Brock Spratlen on Patreon.	2018-05-30 12:28:43 -04:00
Joey Hess	c3064edac9	setpresentkey: Added --batch support (for ronnypfa) This commit was sponsored by Peter on Patreon.	2018-05-27 14:56:14 -04:00
Joey Hess	85f9360d9b	GIT_ANNEX_SHELL_APPENDONLY Makes it allow writes, but not deletion of annexed content. Note that securing pushes to the git repository is left up to the user. This commit was sponsored by Jack Hill on Patreon.	2018-05-25 13:17:56 -04:00
Joey Hess	2da2ae0919	fix migration bug and make fsck warn * migrate: Fix bug in migration between eg SHA256 and SHA256E, that caused the extension to be included in SHA256 keys, and omitted from SHA256E keys. (Bug introduced in version 6.20170214) * migrate: Check for above bug when migrating from SHA256 to SHA256 (and same for SHA1 to SHA1 etc), and remove the extension that should not be in the SHA256 key. * fsck: Detect and warn when keys need an upgrade, either to fix up from the above migrate bug, or to add missing size information (a long ago transition), or because of a few other past key related bugs. This commit was sponsored by Henrik Riomar on Patreon.	2018-05-23 14:07:51 -04:00
Joey Hess	2fabd7cdb5	remove the older move --force, which never behaved as documented and seems useless * move: --force was accidentally enabling two unrelated behaviors since 6.20180427. The older behavior, which has never been well documented and seems almost entirely useless, has been removed. * copy: --force no longer does anything. This commit was sponsored by Øyvind Andersen Holm.	2018-05-21 13:21:19 -04:00
Joey Hess	442e607b0a	Don't allow entering a view with staged or unstaged changes. In some cases, unstaged changes are safe, eg dotfiles in the top which are not affected by a view. Or non-annexed files in general which would prevent view branch checkout from proceeding. But in other cases, particularly unstaged changes to annexed files, entering a view would wipe out those changes! And so don't allow entering a view with any unstaged changes. Staged changes are not safe when entering a view, because the changes get committed to the view branch, and so the user is unlikely to remember them when they exit the view, and so will effectively lose them, even if they're still present in the view branch. Also, improved the git status parser, although the improvement turned out to not really be needed. This commit was sponsored by Eric Drechsel on Patreon.	2018-05-14 16:51:06 -04:00
Joey Hess	d7021d420f	reuse hashes of dotfiles/dirs/submodules when entering view This fixes a crash when a git submodule has a name starting with a dot. Such a submodule might contain dotfiles that are intended to be used when inside the view (since a dot-directory that's not a submodule was already preserved when entering a view). So, rather than eliminating the submodule from the view, its git ls-files --stage hash is copied over into the view. dotfiles/dirs have their git ls-files --stage hashes similarly copied over to the view. This is more efficient and simpler than the old method, and also won't break if git ever adds a new type of tree item, like was done with submodules. Since the content of dotfiles in the working tree is no longer hashed when entering a view, when there are unstaged modifications, they are not included in the view branch. Entering the view branch still works, but git checkout shows "M .dotfile", and git diff will show the unstaged changes. This seems like an improvement over the old behavior. Also made Command.View not delete empty directories that are submodules when entering a view, while still deleting other empty directories. This commit was supported by the NSF-funded DataLad project.	2018-05-14 15:35:20 -04:00
Joey Hess	0b7f6d24d3	rename BlobType and add submodule to it This was badly named, it's not a blob necessarily, but anything that a tree can refer to. Also removed the Show instance which was used for serialization to git format, instead use fmtTreeItemType. This commit was supported by the NSF-funded DataLad project.	2018-05-14 14:45:41 -04:00
Joey Hess	2fc768ce72	avoid git annex info remote buffering list of keys This leaves git annex unused --from remote still using loggedKeysFor and buffering more than ought to be necessary, but I can't see a way to improve that.	2018-04-26 16:13:05 -04:00
Joey Hess	bea0ad220a	avoid --all buffering list of all keys In Annex.Branch.branch, the (++) was killing laziness. Rewrote so it streams lazily. filterM also kills laziness, so made loggedKeys use a Unchecked type, and check if the key is dead in the seek loop. Note that loggedKeysFor still buffers, so git-annex info <remote> and git-annex unused --from remote still use more memory than necessary. Also removed some unused functions from Annex.Journal.	2018-04-26 16:00:20 -04:00
Joey Hess	9807e5bead	fix webapp opening in termux Open real url not html shim since android and file:// urls is a nasty kettle of fish. This commit was sponsored by John Pellman on Patreon.	2018-04-25 14:38:42 -04:00
Joey Hess	89e1a05a8f	Fix mangling of --json output of utf-8 characters when not running in a utf-8 locale As long as all code imports Utility.Aeson rather than Data.Aeson, and no Strings that may contain utf-8 characters are used for eg, object keys via T.pack, this is guaranteed to fix the problem everywhere that git-annex generates json. It's kind of annoying to need to wrap ToJSON with a ToJSON', especially since every data type that has a ToJSON instance has to be ported over. However, that only took 50 lines of code, which is worth it to ensure full coverage. I initially tried an alternative approach of a newtype FileEncoded, which had to be used everywhere a String was fed into aeson, and chasing down all the sites would have been far too hard. Did consider creating an intentionally overlapping instance ToJSON String, and letting ghc fail to build anything that passed in a String, but am not sure that wouldn't pollute some library that git-annex depends on that happens to use ToJSON String internally. This commit was supported by the NSF-funded DataLad project.	2018-04-16 16:21:21 -04:00
Joey Hess	f56594af9e	finish fixing inverted Ord for TrustLevel Flipped all comparisons. When a TrustLevel list was wanted from Trusted downwards, used Down to compare it in that order. This commit was sponsored by mo on Patreon.	2018-04-13 15:17:54 -04:00
Joey Hess	a0e4b9678b	fix inverted Ord for TrustLevel (intermediate commit) This commit removes the Ord and Enum instances, commenting out all code that depends on them, to make sure that all code affected by the inversion fix has been identified. (Assuming no ifdefs involve TrustLevel.) The next commit will fix up all the identified code.	2018-04-13 14:50:14 -04:00
Joey Hess	1831cc4a7d	remove unused import	2018-04-13 14:43:29 -04:00
Joey Hess	64980db7d9	move: Avoid drops that make bad situations worse, but otherwise allow See the big comment at the bottom of Command.Drop for the full details. (The --safe/--unsafe options were never released.) This commit was sponsored by Jake Vosloo on Patreon.	2018-04-13 14:36:43 -04:00
Joey Hess	4b8c289154	display addurl url not file The file gets displayed after download is complete, so this is the simplest way to avoid redundant display.	2018-04-13 01:37:46 -04:00
Joey Hess	4cda021acc	remove redundant meter This was stacked with another one, resulting in an extra newline	2018-04-13 01:23:09 -04:00
Joey Hess	b4a2bcaf4c	add missing newline between importfeed and subsequent addurl got lost when wget was eliminated	2018-04-13 01:12:22 -04:00
Joey Hess	af8546990d	move: --safe/--unsafe and potential drop race fix move: Added --safe option, which makes move honor numcopies settings. Also --unsafe enables the default behavior, anticipating that the default may one day change. This commit was sponsored by Ethan Aubin.	2018-04-09 16:20:10 -04:00
Joey Hess	ae530f043e	disentangle copy and move option parsing	2018-04-09 14:38:46 -04:00
Joey Hess	0106752db2	refactor FromToHereOptions	2018-04-09 14:29:28 -04:00
Joey Hess	c34152777b	Use http-conduit for url downloads by default, annex.web-options enables curl * For url downloads, git-annex now defaults to using a http library, rather than wget or curl. But, if annex.web-options is set, it will use curl. To use the .netrc file, run: git config annex.web-options --netrc * git-annex no longer uses wget (and wget is no longer shipped with git-annex builds). Note that curl is always run in silent mode, since the new API for download has a MeterUpdate and doesn't make way for curl progress output. It might be worth writing a parser for curl's progress output to update the meter when using it, but I didn't bother with this edge case for now. This commit was supported by the NSF-funded DataLad project.	2018-04-06 17:36:20 -04:00
Joey Hess	6cb5b7294f	info: Changed sorting of numcopies stats table, so it's ordered by the variance from the desired number of copies. Compare these... numcopies stats: numcopies -1: 1986 numcopies +0: 1170 numcopies -2: 769 numcopies +1: 716 numcopies -4: 696 numcopies -3: 485 numcopies -6: 230 numcopies -5: 111 numcopies -7: 91 numcopies -9: 9 numcopies stats: numcopies +1: 716 numcopies +0: 1170 numcopies -1: 1986 numcopies -2: 769 numcopies -3: 485 numcopies -4: 696 numcopies -5: 111 numcopies -6: 230 numcopies -7: 91 numcopies -9: 9 I feel that the former is a jumbled mess that doesn't tell much overall, while the second shows pretty clearly that most files are within 1 degree of the desired number of copies, with some outliers without enough.	2018-04-05 14:54:39 -04:00
Joey Hess	817ebb5765	info: Added "combined size of repositories containing these files" stat when run on a directory This commit was sponsored by andrea rota.	2018-04-05 14:44:58 -04:00
Joey Hess	9b98d3f630	better HTTP connection reuse Enable HTTP connection reuse across multiple files, when git-annex uses http-conduit. Before, a new Manager was created each time Utility.Url used it. Now, a single Manager gets created the first time, so connections are reused. Doesn't help when external programs are used for url download, but does speed up addurl --fast, fsck --from web, etc. Testing fsck --fast --from web with 3 files, over high-latency satellite internet, it sped up from 19.37s to 14.96s. This commit was supported by the NSF-funded DataLad project.	2018-04-04 15:39:40 -04:00
Joey Hess	2ec07bc29f	Avoid running annex.http-headers-command more than once.	2018-04-04 15:15:08 -04:00
Joey Hess	46d4316954	implement annex.retry et al Added annex.retry, annex.retry-delay, and per-remote versions to configure transfer retries. This commit was supported by the NSF-funded DataLad project.	2018-03-29 13:04:07 -04:00
Joey Hess	ae75eb06bc	exporttree support for adb special remote This commit was sponsored by Michael Magin.	2018-03-27 16:28:41 -04:00
Joey Hess	ed81762c86	avoid compiler warning add type sig so it's clear createtfile returns unit	2018-03-15 13:21:32 -04:00
Joey Hess	10d3b7fc62	Fix reversion introduced in 6.20171214 that caused concurrent transfers to incorrectly fail with "transfer already in progress". Avoid creating transfer info file before transfer lock is created and locked. The wrong order for one thing caused transfer info to be overwritten when a transfer was already in progress. But worse, it caused checkTransfer to see the transfer info, and so lock the transfer lock in order to verify the transfer was not in progress. Which in a concurrent situation, prevented the transferrer from locking the transfer lock, so it failed with "transfer already in progress". Note that the transferinfo command does not lock the transfer lock before creating the transfer info. But, that's only run after recvkey is running, and recvkey does lock the transfer lock, so that seems more or less ok. (Other than being a super complicated legacy mess that the P2P code has mostly obsoleted now.) This commit was supported by the NSF-funded DataLad project.	2018-03-14 18:55:34 -04:00
Joey Hess	31e1adc005	deal with unlocked files P2P protocol version 1 adds VALID|INVALID after DATA; INVALID means the file was detected to change content while it was being sent and so we may not have received the valid content of the file. Added new MustVerify constructor for Verification, which forces verification even when annex.verify=false etc. This is used when INVALID and in protocol version 0. As well as changing git-annex-shell p2pstdio, this makes git-annex tor remotes always force verification, since they don't yet use protocol version 1. Previously, annex.verify=false could skip verification when using tor remotes, and let bad data into the repository. This commit was sponsored by Jack Hill on Patreon.	2018-03-13 14:27:14 -04:00
Joey Hess	e16b069331	use total size from DATA Noticed that getting a key whose size is not known resulted in a progress display that didn't include the percent complete. Fixed for P2P by making the size sent with DATA be used to update the meter's total size. In order for rateLimitMeterUpdate to also learn the total size, had to make it be passed the Meter, and some other reorg in Utility.Metered was also done so that --json-progress can construct a Meter to pass to rateLimitMeterUpdate. When the fallback rsync is done, the progress display still doesn't include the percent complete. Only way to fix that seems to be to let rsync display its output again, but that would conflict with git-annex's own progress meter, which is also being displayed. This commit was sponsored by Henrik Riomar on Patreon.	2018-03-12 21:46:58 -04:00
Joey Hess	596af7cbc4	move protocol version stuff to the Net free monad Needs to be in Net not Local, so that Net actions can take the protocol version into account. This commit was sponsored by an anonymous bitcoin donor.	2018-03-12 15:20:51 -04:00
Joey Hess	c81768d425	version the P2P protocol Unfortunately ReceiveMessage didn't handle unknown messages the way it was documented to; client sending VERSION would cause the server to return an ERROR and hang up. Fixed that, but old releases of git-annex use the P2P protocol for tor and will still have that behavior. So, version is not negotiated for Remote.P2P connections, only for Remote.Git connections, which will support VERSION from their first release. There will need to be a later flag day to change Remote.P2P; left a commented out line that is the only thing that will need to be changed then. Version 1 of the P2P protocol is not implemented yet, but updated the docs for the DATA change that will be allowed by that version. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-03-12 14:36:35 -04:00
Joey Hess	6a59bc4845	use P2P protocol for drop Not yet used for everything else, but this is enough to verify that it works, and do some benchmarking. Some bugfixes included, which got it working. Also fallback to old actions has been verified to work correctly. Benchmarked dropping one thousand files from a ssh remote on localhost. Using the old git-annex 40.867 seconds. With the P2P protocol 9.905 seconds! This commit was sponsored by Jochen Bartl on Patreon.	2018-03-08 16:56:17 -04:00
Joey Hess	c036a380b2	p2p ssh connection pools Much like Remote.P2P, there's a pool of connections to a peer, in order to support concurrent operations. Deals with old git-annex-ssh on the remote that does not support p2pstdio, by only trying once to use it, and remembering if it's not supported. Made p2pstdio send an AUTH_SUCCESS with its uuid, which serves the dual purposes of something to detect to see that the connection is working, and a way to verify that it's connected to the right uuid. (There's a redundant uuid check since the uuid field is sent by git_annex_shell, but I anticipate that being removed later when the legacy git-annex-shell stuff gets removed.) Not entirely happy with Remote.Git.runSsh's behavior when the proto action fails. Running the fallback will work ok, but what will we do when the fallbacks later get removed? It might be better to try to reconnect, in case the connection got closed. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-03-08 15:11:31 -04:00
Joey Hess	6ddfa9807b	implemented git-annex-shell p2pstdio Not yet used by git-annex, but this will allow faster transfers etc than using individual ssh connections and rsync. Not called git-annex-shell p2p, because git-annex p2p does something else and I don't want two subcommands with the same name between the two for sanity reasons. This commit was sponsored by Øyvind Andersen Holm.	2018-03-07 15:38:01 -04:00
Joey Hess	f4103744c3	make sure that lockContentShared is always paired with an inAnnex check lockContentShared had a screwy caveat that it didn't verify that the content was present when locking it, but in the most common case, eg indirect mode, it failed to lock when the content is not present. That led to a few callers forgetting to check inAnnex when using it, but the potential data loss was unlikely to be noticed because it only affected direct mode I think. Fix data loss bug when the local repository uses direct mode, and a locally modified file is dropped from a remote repository. The bug caused the modified file to be counted as a copy of the original file. (This is not a severe bug because in such a situation, dropping from the remote and then modifying the file is allowed and has the same end result.) And, in content locking over tor, when the remote repository is in direct mode, it neglected to check that the content was actually present when locking it. This could cause git annex drop to remove the only copy of a file when it thought the tor remote had a copy. So, make lockContentShared do its own inAnnex check. This could perhaps be optimised for direct mode, to avoid the check then, since locking the content necessarily verifies it exists there, but I have not bothered with that. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-03-07 14:23:52 -04:00
Joey Hess	ba53f60801	refactor	2018-03-06 15:14:53 -04:00
Joey Hess	db057dcff0	fix sync bug in direct mode sync: Fix bug that prevented pulling changes into direct mode repositories that were committed to remotes using git commit rather than git-annex sync. This commit was supported by the NSF-funded DataLad project.	2018-02-26 14:10:03 -04:00
Joey Hess	cb3b73df6c	importfeed: Fix a failure when downloading with youtube-dl and the destination subdirectory does not exist yet. Noticed while running this (which a user posted in a comment they deleted for some reason): git-annex importfeed https://vimeo.com/logiingimars/videos/rss The filename that youtube-dl suggests included a subdirectory, which didn't exist, so renaming to it failed. This commit was sponsored by mo on Patreon.	2018-02-22 13:20:19 -04:00
Joey Hess	6583448bab	add --json-error-messages (not yet implemented) Added --json-error-messages option, which includes error messages in the json output, rather than outputting them to stderr. The actual redirection of errors is not implemented yet, this is only the docs and option plumbing. This commit was supported by the NSF-funded DataLad project.	2018-02-19 14:32:15 -04:00
Joey Hess	42ba888875	optimise for case where there are no required contents Avoid reading location log in this case.	2018-02-08 14:16:00 -04:00
Joey Hess	7f5c6a28a6	fsck: Warn when required content is not present in the repository that requires it. This commit was sponsored by Jack Hill on Patreon.	2018-02-08 14:08:41 -04:00
Joey Hess	cfbfb3ab9a	inprogress: Avoid showing failures for files not in progress.	2018-01-24 20:43:19 -04:00

2374 commits