This could in theory allow importing subsets of files with less memory
use. Rather than building up a big import list and then filtering it to
a smaller list of wanted files, support optionally filtering wanted
files first.
So far, the directory special remote implements it and will probably use
less memory. (Since dirContentsRecursiveSkipping does lazy streaming.)
Implementation in Remote.S3 is incomplete and fails to compile. Bit of a
mess with ResourceT needing to use Annex.
Also, in Remote.S3, filtering is not done for old versions.
And mkImportableContentsUnversioned now does work that is redundant
with filterwanted.
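To illustrate the idea, a standalone sketch (not git-annex's actual
code) of a lazily streamed recursive listing, which lets a wanted
predicate discard entries as they are produced, instead of
materializing the full list and filtering afterwards:

    import System.Directory (listDirectory, doesDirectoryExist)
    import System.FilePath ((</>))
    import System.IO.Unsafe (unsafeInterleaveIO)

    -- Lazily streamed recursive directory listing. Files after the
    -- first are yielded as the consumer demands them, so the filter
    -- below does not retain the whole listing in memory.
    lazyDirContents :: FilePath -> IO [FilePath]
    lazyDirContents top = go [top]
      where
        go [] = return []
        go (p:ps) = do
            isdir <- doesDirectoryExist p
            if isdir
                then do
                    entries <- map (p </>) <$> listDirectory p
                    go (entries ++ ps)
                else do
                    rest <- unsafeInterleaveIO (go ps)
                    return (p : rest)

    -- Filter wanted files while streaming, rather than afterwards.
    importWanted :: (FilePath -> Bool) -> FilePath -> IO [FilePath]
    importWanted wanted dir = filter wanted <$> lazyDirContents dir
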
When importing from a special remote, support preferred content expressions
that use terms that match on keys (eg "present", "copies=1"). Such terms
are ignored when importing, since the key is not known yet.
When "standard" or "groupwanted" is used, the terms in those
expressions also get pruned accordingly.
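The pruning amounts to roughly this (a simplified sketch with
hypothetical types, not git-annex's actual preferred content
matcher):

    type Key = String

    -- Simplified matcher terms: some need a key, some only a file.
    data Term = NeedsKey (Key -> Bool) | NeedsFile (FilePath -> Bool)

    -- When importing, the key is not known yet, so terms that match
    -- on keys are pruned, ie treated as always matching.
    matchImporting :: FilePath -> [Term] -> Bool
    matchImporting f = all match
      where
        match (NeedsKey _) = True
        match (NeedsFile p) = p f
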
This does allow setting preferred content to "not (copies=1)" to make a
special remote into a "source" type of repository. Importing from it will
import all files. Then exporting to it will drop all files from it.
In the case of setting preferred content to "present", it's pruned on
import, so everything gets imported from it. Then on export, it's applied,
and everything in it is left on it, and no new content is exported to it.
Since the old behavior on these preferred content expressions was for
importtree to error out, there's no backwards compatibility to worry about.
Except that sync/pull/etc will now import where before it errored out.
This can reduce the size of the branch by up to 8%. My test was
running git-annex add 1000 times, adding one file each time.
Previously, lots of different high-resolution timestamps were
recorded; after eliminating those and packing, the git repo was
8% smaller.
Due to the use of vector clocks, high resolution timestamps are
not necessary to make clear which information is most recent when,
eg, a value is changed repeatedly in the same second. In such a
case, the vector clock will be advanced to the next second after
the last modification. For example, when running:

    git-annex numcopies 1; git-annex numcopies 2

the first command records the current second, while the second
command records the second after that, even if both run within the
same second.
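A simplified sketch of that advancing behavior (not the actual
implementation):

    -- Vector clocks here are whole seconds.
    newtype VectorClock = VectorClock Integer
        deriving (Eq, Ord, Show)

    -- When writing a new value, use the current second, unless the
    -- previous write was in the same (or a later) second; then
    -- advance to the second after the last modification.
    advance :: Integer -> Maybe VectorClock -> VectorClock
    advance now Nothing = VectorClock now
    advance now (Just (VectorClock prev))
        | now > prev = VectorClock now
        | otherwise = VectorClock (prev + 1)
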
As for conflicting information written to two different clones of the
repository, this will make git-annex sometimes pick information that
was written earlier in a second over information written later in the
same second. Usually git-annex does not write conflicting information,
but there are some cases where it could. Eg, storing an object on a remote
can update the remote state log with some state. If two repos both store the
same object, and end up storing different remote state for some reason,
this can result in the one that ran a tiny bit earlier winning. Such a situation
seems unlikely to be user visible. And a small amount of clock skew could
already result in such things.
The only case I can think of where this might be a user visible change
is if a configuration command like git-annex numcopies is being run
in 2 clones of a repository on the same machine at very
close to the same time. Then the user will know which they ran last,
and git-annex won't.
If that did become a problem, this could be dialed back to logging,
eg, milliseconds, with still some space saving.
migrate: Support adding size to URL keys that were added with --relaxed, by
running eg: git-annex migrate --backend=URL foo
Since URL keys cannot be generated, that used to fail. Make it notice that
the backend is not changed, and just get the size of the content.
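Conceptually (using a simplified stand-in for the real Key type),
migrating to the same backend just fills in the missing size:

    -- Simplified stand-in for git-annex's Key type.
    data Key = Key
        { keyBackend :: String
        , keyName :: String
        , keySize :: Maybe Integer
        } deriving (Show)

    -- When the backend is unchanged, a migration can just fill in
    -- the size from the content, rather than generating a new key
    -- (which is not possible for URL keys).
    addSize :: Integer -> Key -> Key
    addSize sz k = k { keySize = Just sz }
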
Sponsored-by: Brock Spratlen on Patreon
pull, sync: When operating on content, automatically hard link objects
that have been migrated.
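The hard linking itself is conceptually just this (an illustrative
sketch with hypothetical object paths; real code also has to handle
locking and fall back to copying on filesystems without hard link
support):

    import System.Posix.Files (createLink)

    -- When a migration maps an old key to a new key and the old
    -- object file is present, hard link it into place as the new
    -- object file, avoiding a re-download.
    linkMigrated :: FilePath -> FilePath -> IO ()
    linkMigrated oldobj newobj = createLink oldobj newobj
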
Added annex.syncmigrations config that can be set to false to prevent
pull and sync from migrating object content.
I think that true is a good default for this config, because it avoids
users having to re-download migrated content or learning about migration.
But, some users will surely not like it, whether because it does take some
time (especially for the first git-annex branch scan when there is a long
history), or because they want to deal with it manually, or because their
filesystem doesn't support hard links and they don't want it to copy
objects.
Sponsored-by: k0ld on Patreon
And avoid migrate --update/--apply migrating when the new key was already
present in the repository, and got dropped. Luckily, the location log
allows distinguishing that from the new key never having been present!
That is mostly useful for --apply because otherwise dropped files would
keep coming back until the old objects were reaped as unused. But it
seemed to make sense to also do it for --update, for consistency in edge
cases if nothing else. One case where --update can use it is when one
branch got migrated earlier, and we dropped the file, and now another
branch has migrated the same file.
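The distinction is roughly this (a simplified sketch; the real
location log records per-repository lines with timestamps):

    data LogStatus = InfoPresent | InfoMissing
        deriving (Eq)

    -- Nothing ever logged: the new key was never present.
    -- All entries missing: it was present once and got dropped.
    wasDropped :: [LogStatus] -> Bool
    wasDropped [] = False
    wasDropped ls = all (== InfoMissing) ls
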
Sponsored-by: Jack Hill on Patreon
This only avoids extra work and a warning message. It seems likely that
in such a situation, the user does not want migrations to insecure
hashes, and so it's best to ignore them as much as possible. If
the user merges a branch that switches annexed files to an insecure
hash, they will notice that the file contents are unavailable,
and git-annex get will tell them the problem then. So it does not seem
useful to have migrate --update also complain about it.
Could use some more testing.
When the old key is not present, Command.ReKey.linkKey' will return
False, so this handles that case ok.
But, I do wonder if distributed migration may need to deal with the old
key getting copied into the repository later. In that situation,
re-running migrate --update won't link it to the new key. It may be that
some users will need that. They can delete .git/annex/migrate.log and
run it again, but that is not a good user interface. Maybe either have
a way to re-run all distributed migrations, or record migrations
in a database and scan the db to find migrations to do in a future run?
Sponsored-by: Kevin Mueller on Patreon
The git log is outputting the diff, but this only looks at the new
files. When we have a new file, we can get the old filename by just
replacing "new" with "old". And then use branchFileRef to refer to it
allows catting the old key.
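That rename is mechanical; a sketch, assuming the migration tree
lays files out under top-level old/ and new/ directories (other
paths pass through unchanged):

    import System.FilePath (splitDirectories, joinPath)

    -- Map a filename under the "new" directory to its counterpart
    -- under "old".
    oldFromNew :: FilePath -> FilePath
    oldFromNew f = case splitDirectories f of
        ("new":rest) -> joinPath ("old":rest)
        ds -> joinPath ds
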
While this does have to skip past the old files in the diff, it's still
faster than calling git diff separately.
Sponsored-by: Nicholas Golder-Manning on Patreon
This is most of the way there, but not quite working.
The layout of migrate.tree/ needs to be changed to follow this approach.
git log will list all the files in tree order, so the new layout needs
to alternate old and new keys. Can that be done? git may not document
tree order, or may not preserve it here.
Alternatively, change to using git log --format=raw and extract
the tree header from that, then use
git diff --raw $tree:migrate.tree/old $tree:migrate.tree/new
That will be a little more expensive, but only when there are lots of
migrations.
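Extracting the tree header would be straightforward, since each
commit record in git log --format=raw includes an unindented
"tree <sha>" line. A sketch:

    import Data.List (isPrefixOf)

    -- Pull the tree shas out of git log --format=raw output.
    treeShas :: String -> [String]
    treeShas = map (drop 5) . filter ("tree " `isPrefixOf`) . lines
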
Sponsored-by: Joshua Antonishen on Patreon