git-annex

Author	SHA1	Message	Date
yarikoptic	28029d6668	original report / question	2024-06-18 13:57:23 +00:00
Joey Hess	e2fd2ee2bd	update	2024-06-17 09:31:44 -04:00
Joey Hess	3970bbb03b	Merge branch 'master' into proxy	2024-06-17 09:29:34 -04:00
Joey Hess	64afbb0b93	don't count clusters as copies, continued Handled limitCopies, as well as everything using fromNumCopies and fromMinCopies. This should be everything, probably. Note that, git-annex info displays a count of repositories, which still includes cluster. I think that's ok. It would be possible to filter out clusters there, but to the user they're pretty much just another repository. The numcopies displayed by eg `git-annex info .` does not include clusters.	2024-06-16 15:14:53 -04:00
Joey Hess	780367200b	remove dead nodes when loading the cluster log This is to avoid inserting a cluster uuid into the location log when only dead nodes in the cluster contain the content of a key. One reason why this is necessary is Remote.keyLocations, which excludes dead repositories from the list. But there are probably many more. Implementing this was challenging, because Logs.Location importing Logs.Cluster which imports Logs.Trust which imports Remote.List resulted in an import cycle through several other modules. Resorted to making Logs.Location not import Logs.Cluster, and instead it assumes that Annex.clusters gets populated when necessary before it's called. That's done in Annex.Startup, which is run by the git-annex command (but not other commands) at early startup in initialized repos. Or, is run after initialization. Note that in Remote.Git, it is unable to import Annex.Startup, because Remote.Git importing Logs.Cluster leads to the same import cycle. So ensureInitialized is not passed annexStartup in there. Other commands, like git-annex-shell currently don't run annexStartup either. So there are cases where Logs.Location will not see clusters. So it won't add any cluster UUIDs when loading the log. That's ok, the only reason to do that is to make display of where objects are located include clusters, and to make commands like git-annex get --from treat keys as being located in a cluster. git-annex-shell certainly does not do anything like that, and I'm pretty sure Remote.Git (and callers to Remote.Git.onLocalRepo) don't either.	2024-06-16 14:39:44 -04:00
beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec	f707baf908	Added a comment	2024-06-15 07:37:07 +00:00
beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec	0062ac1b49	Added a comment: Grafting? a special remote for tuned migration	2024-06-15 00:57:27 +00:00
Joey Hess	b3370a191c	insert cluster UUIDs when loading location logs, and omit when saving Inline isClusterUUID for speed.	2024-06-14 18:06:28 -04:00
Joey Hess	570ceffe8d	broke out initcluster One benefit of this is that a typo in annex-cluster-node config won't init a new cluster. Also it gets the cluster description set and is consistent with initremote.	2024-06-14 17:23:11 -04:00
Joey Hess	846903e9bb	update todo list for this month whew that's gonna be a lot	2024-06-14 15:23:43 -04:00
Joey Hess	bbf261487d	add git-annex updatecluster command Seems to work fine, making the right changes to the git-annex branch.	2024-06-14 15:02:01 -04:00
Joey Hess	2844230dfe	add git configs for clusters	2024-06-14 12:20:17 -04:00
Joey Hess	de1d795dfe	cache getClusters in Annex state	2024-06-14 11:16:01 -04:00
Joey Hess	9895e6659d	update	2024-06-13 19:08:04 -04:00
Joey Hess	6d59118b29	unique uuid namespace for clusters	2024-06-13 17:56:53 -04:00
Joey Hess	aa56d433d5	implement cluster.log Not used yet. (Or tested.) I did consider making the log start with the uuid of the node, followed by the cluster uuid (or uuids). That would perhaps mean a smaller write to the git-annex branch when adding a node, but overall the log file would be larger, and it will be read and cached near to startup on most git-annex runs.	2024-06-13 16:00:58 -04:00
Joey Hess	d16e19b8ca	comment	2024-06-13 14:30:32 -04:00
Joey Hess	ebebc04273	comment	2024-06-13 13:40:04 -04:00
Joey Hess	6ea78ec867	partial reproducer	2024-06-13 13:03:38 -04:00
Joey Hess	01f5015f30	update	2024-06-13 11:44:39 -04:00
Joey Hess	5e0acd1842	more cluster thoughts	2024-06-13 10:48:31 -04:00
Joey Hess	90e3b8b44f	avoided the strangeness of the cluster's proxy location tracking being wrong	2024-06-13 10:34:19 -04:00
Joey Hess	ffd7c745ff	update	2024-06-13 06:49:36 -04:00
Joey Hess	d8daabe9ec	Merge branch 'master' of ssh://git-annex.branchable.com	2024-06-13 06:44:22 -04:00
Joey Hess	22a329c57e	copied over some changes from proxy branch	2024-06-13 06:43:59 -04:00
Joey Hess	3cc48279ad	more thoughts on clusters	2024-06-13 06:41:42 -04:00
Joey Hess	555d7e52d3	more thoughts on clusters	2024-06-12 17:30:55 -04:00
Joey Hess	0ebb107974	update	2024-06-12 15:21:23 -04:00
Joey Hess	46a1fcb3ea	avoid git syncing with instantiate proxied remotes These remotes have no url configured, so git pull and push will fail. git-annex sync --content etc can still sync with them otherwise. Also, avoid git syncing twice with the same url. This is for cases where a proxied remote has been manually configured and so does have a url. Or perhaps proxied remotes will get configured like that automatically later.	2024-06-12 15:10:03 -04:00
Joey Hess	a986a20034	designing clusters	2024-06-12 14:57:26 -04:00
Joey Hess	e70e3473b3	on cycles	2024-06-12 13:52:17 -04:00
Joey Hess	44464e4410	update	2024-06-12 12:37:14 -04:00
Joey Hess	67d1e2a459	updates	2024-06-12 12:02:25 -04:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92	c855b50f04		2024-06-12 15:42:42 +00:00
Joey Hess	dfdda95053	proxy updates location tracking information This does mean a redundant write to the git-annex branch. But, it means that two clients can be using the same proxy, and after one sends a file to a proxied remote, the other only has to pull from the proxy to learn about that. It does not need to pull from every remote behind the proxy (which it couldn't do anyway as git repo access is not currently proxied). Anyway, the overhead of this in git-annex branch writes is no worse than eg, sending a file to a repository where git-annex assistant is running, which then sends the file on to a remote, and updates the git-annex branch then. Indeed, when the assistant also drops the local copy, that results in more writes to the git-annex branch.	2024-06-12 11:37:14 -04:00
Joey Hess	96853cd833	finish P2P protocol proxying CONNECT is not supported by git-annex-shell p2pstdio, but for proxying to tor-annex remotes, it will be supported, and will make a git pull/push to a proxied remote work the same with that as it does over ssh, eg it accesses the proxy's git repo not the proxied remote's git repo. The p2p protocol docs say that NOTIFYCHANGES is not always supported, and it looked annoying to implement it for this, and it also seems pretty useless, so make it be a protocol error. git-annex remotedaemon will already be getting change notifications from the proxy's git repo, so there's no need to get additional redundant change notifications for proxied remotes that would be for changes to the same git repo.	2024-06-12 10:40:51 -04:00
Joey Hess	f98605bce7	a local git remote cannot proxy Prevent listProxied from listing anything when the proxy remote's url is a local directory. Proxying does not work in that situation, because the proxied remotes have the same url, and so git-annex-shell is not run when accessing them, instead the proxy remote is accessed directly. I don't think there is any good way to support this. Even if the instantiated git repos for the proxied remotes somehow used an url that caused it to use git-annex-shell to access them, planned features like `git-annex copy --to proxy` accepting a key and sending it on to nodes behind the proxy would not work, since git-annex-shell is not used to access the proxy. So it would need to use something to access the proxy that causes git-annex-shell to be run and speaks P2P protocol over it. And we have that. It's a ssh connection to localhost. Of course, it would be possible to take ssh out of that mix, and swap in something that does not have encryption overhead and authentication complications, but otherwise behaves the same as ssh. And if the user wants to do that, GIT_SSH does exist.	2024-06-12 10:16:04 -04:00
Joey Hess	c6e0710281	proxying to local git remotes works This just happened to work correctly. Rather surprisingly. It turns out that openP2PSshConnection actually also supports local git remotes, by just running git-annex-shell with the path to the remote. Renamed "P2PSsh" to "P2PShell" to make this clear.	2024-06-12 10:10:11 -04:00
Joey Hess	178da0dc99	Merge branch 'master' into proxy	2024-06-12 09:49:30 -04:00
Joey Hess	345494e3b4	expanding on the exporttree=yes design	2024-06-12 09:43:59 -04:00
yarikoptic	c6f2a5d372	TODO for log --key	2024-06-12 13:20:29 +00:00
Joey Hess	5beaffb412	proxying PUT now working The almost identical code duplication between relayDATA and relayDATA' is very annoying. I tried quite a few things to parameterize them, but the type checker is having fits when I try it.	2024-06-11 16:56:52 -04:00
Joey Hess	ed4fda098b	todo	2024-06-11 15:15:58 -04:00
Joey Hess	a2f4a8eddf	proxying GET now working Memory use is small and constant; receiveBytes returns a lazy bytestring and it does stream. Comparing speed of a get of a 500 mb file over proxy from origin-origin, vs from the same remote over a direct ssh: joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from origin-origin get bigfile (from origin-origin...) ok (recording state in git...) 1.89user 0.67system 0:10.79elapsed 23%CPU (0avgtext+0avgdata 68716maxresident)k 0inputs+984320outputs (0major+10779minor)pagefaults 0swaps joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from direct-ssh get bigfile (from direct-ssh...) ok 1.79user 0.63system 0:10.49elapsed 23%CPU (0avgtext+0avgdata 65776maxresident)k 0inputs+1024312outputs (0major+9773minor)pagefaults 0swaps So the proxy doesn't add much overhead even when run on the same machine as the client and remote. Still, piping receiveBytes into sendBytes like this does suggest that the proxy could be made to use less CPU resources by using `sendfile()`.	2024-06-11 15:09:43 -04:00
Joey Hess	09b5e53f49	set annex.uuid in proxy's Repo getRepoUUID looks at that, and was seeing the annex.uuid of the proxy. Which caused it to unnecessarily set the git config. Probably also would have led to other problems.	2024-06-11 13:40:50 -04:00
yarikoptic	b96ff82871	Added a comment	2024-06-11 17:36:51 +00:00
Joey Hess	657a91527a	update	2024-06-11 13:22:03 -04:00
Joey Hess	dd429ba8fe	Merge branch 'master' of ssh://git-annex.branchable.com	2024-06-11 13:08:45 -04:00
Joey Hess	5bb7f8cd64	Merge branch 'master' into proxy	2024-06-11 13:08:23 -04:00
Joey Hess	d2e3c5c89f	update	2024-06-11 13:07:53 -04:00
NewUser	124c1313bb		2024-06-11 13:31:01 +00:00
Joey Hess	501d65eeab	started implementing git-annex-shell proxy So far, it negotiates VERSION with both parties. This is a tricky dance. Untested.	2024-06-10 18:01:36 -04:00
Joey Hess	7b1548dbfa	correct AUTH-SUCCESS and AUTH-FAILURE It's AUTH_SUCCESS internally in git-annex, but the line based serialization uses AUTH-SUCCESS.	2024-06-10 15:06:27 -04:00
Joey Hess	649b87bedd	Merge branch 'master' into proxy	2024-06-10 14:26:18 -04:00
Joey Hess	d2576e5f1a	git-annex-shell: accept uuid of remote that proxying is enabled for For NotifyChanges and also for the fallthrough case where git-annex-shell passes a command off to git-shell, proxying is currently ignored. So every remote that is accessed via a proxy will be treated as the same git repository. Every other command listed in cmdsMap will need to check if Annex.proxyremote is set, and if so handle the proxying appropriately. Probably only P2PStdio will need to support proxying. For now, everything else refuses to work when proxying. The part of that I don't like is that there's the possibility a command later gets added to the list that doesn't check proxying. When proxying is not enabled, it's important that git-annex-shell not leak information that it would not have exposed before. Such as the names or uuids of remotes. I decided that, in the case where a repository used to have proxying enabled, but no longer supports any proxies, it's ok to give the user a clear error message indicating that proxying is not configured, rather than a confusing uuid mismatch message. Similarly, if a repository has proxying enabled, but not for the requested repository, give a clear error message. A tricky thing here is how to handle the case where there is more than one remote, with proxying enabled, with the specified uuid. One way to handle that would be to plumb the proxyRemoteName all the way through from the remote git-annex to git-annex-shell, eg as a field, and use only a remote with the same name. That would be very intrusive though. Instead, I decided to let the proxy pick which remote it uses to access a given Remote. And so it picks the least expensive one. The client after all doesn't necessarily know any details about the proxy's configuration. This does mean though, that if the least expensive remote is not accessible, but another remote would have worked, an access via the proxy will fail.	2024-06-10 12:44:35 -04:00
Joey Hess	783eb8879a	notes on behavior	2024-06-10 11:07:04 -04:00
jlueters@79a910340cdff27611c6a650c108afbe2f61c5f6	daa2c6cce1		2024-06-10 14:24:34 +00:00
Joey Hess	25a6ab6f11	Avoid grafting in export tree objects that are missing They could be missing due to an interrupted git-annex at just the wrong time during a prior graft, after which the tree objects got garbage collected. Or they could be missing because of manual messing with the git-annex branch, eg resetting it to back before the graft commit. Sponsored-by: Dartmouth College's OpenNeuro project	2024-06-07 16:51:50 -04:00
Joey Hess	b32c4c2e98	atomic git-annex branch update when regrafting in transition Fix a bug where interrupting git-annex while it is updating the git-annex branch could lead to git fsck complaining about missing tree objects. Interrupting git-annex while regraftexports is running in a transition that is forgetting git-annex branch history would leave the repository with a git-annex branch that did not contain the tree shas listed in export.log. That lets those trees be garbage collected. A subsequent run of the same transition then regrafts the trees listed in export.log into the git-annex branch. But those trees have been lost. Note that both sides of `if neednewlocalbranch` are atomic now. I had thought only the True side needed to be, but I do think there may be cases where the False side needs to be as well. Sponsored-by: Dartmouth College's OpenNeuro project	2024-06-07 16:34:10 -04:00
Joey Hess	6568ba4904	Merge branch 'master' into proxy	2024-06-07 12:35:47 -04:00
Joey Hess	43ff697f25	update status and design work on proxy encryption and chunking	2024-06-07 12:35:04 -04:00
Joey Hess	a0e59c1d17	comment	2024-06-07 12:35:00 -04:00
Joey Hess	5aaa285083	Merge branch 'master' into proxy	2024-06-07 10:43:13 -04:00
Joey Hess	058726ee86	next step identified	2024-06-06 18:06:45 -04:00
Joey Hess	d59383beaf	update	2024-06-06 17:25:22 -04:00
Joey Hess	9bc4dd635c	update	2024-06-06 17:23:51 -04:00
Joey Hess	a72d0f69d0	filter out illegal remote names when reading proxy log	2024-06-06 12:51:30 -04:00
Joey Hess	d208b03e5d	Merge branch 'master' into proxy	2024-06-06 12:42:18 -04:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	1e6b4f324a	removed	2024-06-06 13:40:26 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	6274d16102	Added a comment	2024-06-06 11:23:55 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	d4993248eb	Added a comment	2024-06-06 11:23:34 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	a1e1af35af		2024-06-06 10:29:21 +00:00
nobodyinperson	6985c62a47	Added a comment	2024-06-06 09:09:03 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	7dbfb16415		2024-06-05 17:45:49 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	93b11da4db	Added a comment	2024-06-05 17:34:32 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	6b4ae7b635		2024-06-05 17:22:04 +00:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd	ca687413ef	Added a comment	2024-06-05 16:53:51 +00:00
Joey Hess	1761e971ee	status update after day 1 of new project	2024-06-04 14:55:54 -04:00
Joey Hess	f97f4b8bdb	Added updateproxy command and remote.name.annex-proxy configuration So far this only records proxy information on the git-annex branch.	2024-06-04 14:52:03 -04:00
Joey Hess	3df70c5c0c	implementation plan	2024-06-04 07:51:33 -04:00
Joey Hess	6375e3be3b	received funding to work on this, which comes with a schedule	2024-06-04 06:53:59 -04:00
Joey Hess	ac3fe92956	comment	2024-06-04 06:41:14 -04:00
Joey Hess	3db94f1b71	Merge branch 'master' of ssh://git-annex.branchable.com	2024-06-04 06:40:08 -04:00
Joey Hess	3be7163771	update	2024-06-04 06:40:04 -04:00
Joey Hess	5992e1729a	fixed by git release	2024-06-04 06:39:08 -04:00
nobodyinperson	c606b6a35d	Added a comment: Yes, GitLab fixed!	2024-06-04 07:38:47 +00:00
datamanager	82b891de7a	Added a comment: GitLab fixed?	2024-06-04 01:18:25 +00:00
Joey Hess	61ed0b3f03	root cause analysis	2024-06-03 13:56:43 -04:00
yarikoptic	4a48933867	Added a comment	2024-06-03 17:54:43 +00:00
Joey Hess	c382555cf8	comment	2024-06-03 12:31:55 -04:00
jkniiv	313a0285e5	a small clarification	2024-06-01 22:11:32 +00:00
jkniiv	5badd2ae4e	report on git-remote-annex on Windows not quite working	2024-06-01 21:59:27 +00:00
Joey Hess	0e96f0acd8	add news item for git-annex 10.20240531	2024-05-31 12:32:42 -04:00
Joey Hess	a51c5d1cde	some analysis	2024-05-31 11:47:59 -04:00
yarikoptic	8706a6faf1	report on git repo getting broken	2024-05-31 14:38:58 +00:00
yarikoptic	d313dc22e3	reporting that annex merge should not merge into main branch	2024-05-31 13:49:17 +00:00
Joey Hess	d8cf23ffdb	tweak	2024-05-30 13:31:49 -04:00
Joey Hess	69c9e8c11c	tweak	2024-05-30 13:30:57 -04:00
Joey Hess	19454917eb	tweak	2024-05-30 13:30:33 -04:00
Joey Hess	3a48eafce4	tweaks	2024-05-30 13:30:10 -04:00
Joey Hess	adf17f5038	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-30 13:26:44 -04:00
Joey Hess	f877afe930	tip	2024-05-30 13:26:34 -04:00
Joey Hess	0155abfba4	git-remote-annex: Support urls like annex::https://example.com/foo-repo Using the usual url download machinery even allows these urls to need http basic auth, which is prompted for with git-credential. Which opens the possibility for urls that contain a secret to be used, eg the cipher for encryption=shared. Although the user is currently on their own constructing such an url, I do think it would work. Limited to httpalso for now, for security reasons. Since both httpalso (and retrieving this very url) is limited by the usual annex.security.allowed-ip-addresses configs, it's not possible for an attacker to make one of these urls that sets up a httpalso url that opens the garage door. Which is one class of attacks to keep in mind with this thing. It seems that there could be either a git-config that allows other types of special remotes to be set up this way, or special remotes could indicate when they are safe. I do worry that the git-config would encourage users to set it without thinking through the security implications. One remote config might be safe to access this way, but another config, for one with the same type, might not be. This will need further thought, and real-world examples to decide what to do.	2024-05-30 12:24:16 -04:00
yarikoptic	d23ae92da8	Added a comment	2024-05-30 14:34:32 +00:00
yarikoptic	285a7ff3c3	Added a comment	2024-05-30 14:29:43 +00:00
Joey Hess	3f33616068	security	2024-05-29 22:55:06 -04:00
Joey Hess	efa684ab8a	todo	2024-05-29 18:21:17 -04:00
yarikoptic	f186485fab	Added a comment	2024-05-29 18:31:16 +00:00
yarikoptic	09626c8114	Added a comment: odd odd odd	2024-05-29 18:25:23 +00:00
yarikoptic	e05564c297	Added a comment: odd odd odd	2024-05-29 18:25:11 +00:00
yarikoptic	60a7dea828	get is silently stuck.	2024-05-29 18:14:44 +00:00
Joey Hess	98762a2f96	group: Added --list option Seemed to make sense to exclude groups used only by dead repositories.	2024-05-29 13:37:35 -04:00
Joey Hess	09a0552489	split off todo, comment	2024-05-29 13:16:36 -04:00
Joey Hess	14daed9db7	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-29 13:00:34 -04:00
Joey Hess	e19916f54b	add config-uuid to annex:: url for --sameas remotes And use it to set annex-config-uuid in git config. This makes using the origin special remote work after cloning. Without the added Logs.Remote.configSet, instantiating the remote will look at the annex-config-uuid's config in the remote log, which will be empty, and so it will fail to find a special remote. The added deletion of files in the alternatejournaldir is just to make 100% sure they don't get committed to the git-annex branch. Now that they contain things that definitely should not be committed.	2024-05-29 12:50:00 -04:00
derphysiker	dfb0c4683c	Added a comment	2024-05-29 06:58:16 +00:00
Joey Hess	b0ff819850	clarify which rclone special remote Now that there are several.	2024-05-28 16:56:27 -04:00
Joey Hess	06cf131ef6	document using git-remote-annex with httpalso	2024-05-28 16:52:36 -04:00
Joey Hess	bbf49c9de7	httpalso just worked, with one small issue to fix	2024-05-28 16:26:16 -04:00
Joey Hess	2106cb0fce	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-28 16:10:13 -04:00
Joey Hess	cb7f15e733	clean up man page	2024-05-28 15:29:38 -04:00
yarikoptic	4ba024cc08	Added a comment	2024-05-28 17:51:02 +00:00
Joey Hess	d2efa141bb	update	2024-05-28 13:36:27 -04:00
Joey Hess	2ffe077cc2	git-remote-annex: brought back max-git-bundles config An incremental push that gets converted to a full push due to this config results in the inManifest having just one bundle in it, and the outManifest listing every other bundle. So it actually takes up more space on the special remote. But, it speeds up clone and fetch to not have to download a long series of bundles for incremental pushes.	2024-05-28 13:28:19 -04:00
Joey Hess	e9a2e4e94d	comment	2024-05-28 13:00:54 -04:00
Joey Hess	cb9f7b5646	update	2024-05-28 12:50:54 -04:00
Joey Hess	14443fd307	update	2024-05-28 12:46:56 -04:00
Joey Hess	3318d25c65	adjust unlocked execute bit handling When building an adjusted unlocked branch, make pointer files executable when the annex object file is executable. This slows down git-annex adjust --unlock/--unlock-present by needing to stat all annex object files in the tree. Probably not a significant slowdown compared to other work they do, but I have not benchmarked. I chose to leave git-annex adjust --unlock marked as stable, even though get or drop of an object file can change whether it would make the pointer file executable. Partly because making it unstable would slow down re-adjustment, and partly for symmetry with the handling of an unlocked pointer file that is executable when the content is dropped, which does not remove its execute bit.	2024-05-28 12:39:42 -04:00
Joey Hess	1bb819f597	retitle and comment	2024-05-28 12:07:58 -04:00
Joey Hess	e19f56e7d8	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-28 10:27:50 -04:00
Joey Hess	c6669990fb	update	2024-05-28 09:19:00 -04:00
nobodyinperson	f6c0f55ad1	Added a comment: Yep, would be nice!	2024-05-28 12:18:59 +00:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92	bab6d3e58f	Added a comment: Re: worktree provisioning	2024-05-28 12:06:39 +00:00
Joey Hess	c2483f6e6d	update	2024-05-27 22:44:35 -04:00
derphysiker	f90511ec43		2024-05-27 18:33:44 +00:00
Joey Hess	0975e792ea	git-remote-annex: Fix error display on clone cleanupInitialization gets run when an exception is thrown, so needs to avoid throwing exceptions itself, as that would hide the error message that the user needs to see.	2024-05-27 13:28:05 -04:00
Joey Hess	a766475d14	split out a todo	2024-05-27 12:50:46 -04:00
Joey Hess	e64add7cdf	git-remote-annex: support importtree=yes remotes When exporttree=yes is also set. Probably it would also be possible to support ones with only importtree=yes, by enabling exporttree=yes for the remote only when using git-remote-annex, but let's keep this simple... I'm not sure what gets recorded in .git/annex/ state differently in the two cases that might cause a problem when doing that. Note that the full annex:: urls generated and displayed for such a remote omit the importtree=yes. Which is ok, cloning from such an url uses an exporttree=remote, but the git-annex branch doesn't get written by this program, so once the real config is available from the git-annex branch, it will still function as an importtree=yes remote.	2024-05-27 12:35:42 -04:00
Joey Hess	5a48f7b34e	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-27 11:16:25 -04:00
Joey Hess	6f51ba740d	comment	2024-05-27 11:16:21 -04:00
derphysiker	9442937865	Added a comment	2024-05-25 13:00:08 +00:00
Joey Hess	5e0d9c2029	comment	2024-05-24 17:32:50 -04:00
Joey Hess	4c8e57b907	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-24 17:16:39 -04:00
Joey Hess	19418e81ee	git-remote-annex: Display full url when using remote with the shorthand url	2024-05-24 17:15:31 -04:00
derphysiker	2d380f9941	Added a comment	2024-05-24 20:32:15 +00:00
Joey Hess	04a256a0f8	work around git "defense in depth" breakage with git clone checking for hooks This git bug also broke git-lfs, and I am confident it will be reverted in the next release. For now, cloning from an annex:: url wastes some bandwidth on the next pull by not caching bundles locally. If git doesn't fix this in the next version, I'd be tempted to rethink whether bundle objects need to be cached locally. It would be possible to instead remember which bundles have been seen and their heads, and respond to the list command with the heads, and avoid unbundling them again in fetch. This might even be a useful performance improvement in the latter case. It would be quite a complication to a currently simple implementation though.	2024-05-24 15:49:53 -04:00
Joey Hess	6ccd09298b	convert srcref to a sha This fixes pushing a new ref that is the same as something already pushed. In findotherprereq, it compares two shas, which didn't work when one is actually not a sha but a ref. This is one of those cases where Sha being an alias for Ref makes it hard to catch mistakes. One of these days those need to be differentiated at the type level, but not today..	2024-05-24 15:33:35 -04:00
Joey Hess	96c66a7ca9	bug	2024-05-24 15:15:42 -04:00
Joey Hess	58301e40d2	sync with special remotes with an annex:: url Check explicitly for an annex:: url, not just any url. While no built-in special remotes set an url, except ones that can be synced with, it seems possible that some external special remote sets an url for its own use, but did not expect it to be used by git-annex sync et al. The assistant also syncs with them.	2024-05-24 14:57:29 -04:00
Joey Hess	22bf23782f	initremote, enableremote: Added --with-url to enable using git-remote-annex Also sets remote.name.fetch to a typical value, same as git remote add does.	2024-05-24 14:29:36 -04:00
Joey Hess	7d61a99da3	todo	2024-05-24 13:57:33 -04:00
Joey Hess	2670508b97	also broke git-remote-annex	2024-05-24 13:35:45 -04:00
Joey Hess	b792b128a0	verified checkprereq The case documented in its comment worked in a test push and clone.	2024-05-24 13:06:29 -04:00
Joey Hess	90580a2fad	comment	2024-05-24 12:57:11 -04:00
Joey Hess	54fa0e2f79	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-24 12:53:48 -04:00
Joey Hess	1a3c60cc8e	git-remote-annex: avoid bundle object leakage in push race or interrupted push Locally record the manifest before uploading it or any bundles, and read it on the next push. Any bundles from the push that are not included in the currently being pushed manifest will get added to the outManifest, and so eventually get deleted. This deals with an interrupted push that is not resumed and instead something else is pushed. And it deals with a push race that overwrites the manifest. Of course, this can't help if one of those situations is followed by the local repo being deleted. But that's equivalent to doing a git-annex copy of a new annexed file to a special remote and then deleting the special repo w/o pushing. In either case the special remote ends up with an object in it that git-annex doesn't know about.	2024-05-24 12:47:32 -04:00
derphysiker	fc7655324e		2024-05-23 20:49:06 +00:00
Joey Hess	4a77c77d2e	comment	2024-05-22 06:21:27 -04:00
Joey Hess	264c51b4f4	comment	2024-05-22 06:06:18 -04:00
Joey Hess	19ddbf0d74	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-22 04:26:52 -04:00
Joey Hess	4131e31f5c	PATH_MAX	2024-05-22 04:26:36 -04:00
TTTTAAAx	f332234c84		2024-05-22 06:27:30 +00:00
nobodyinperson	1a2bd28a52	Added a comment	2024-05-22 05:02:49 +00:00
datamanager	4b64964072	Added a comment	2024-05-21 23:32:34 +00:00
datamanager	01a085c27d	Added a comment	2024-05-21 23:27:17 +00:00
datamanager	d5ab807b55	Added a comment: sourcehut plays nicely	2024-05-21 22:39:28 +00:00
datamanager	5a5a4452f8	Scrub my identifying information!	2024-05-21 22:36:22 +00:00
Joey Hess	5fb307f1c5	comment	2024-05-21 17:47:55 -04:00
Joey Hess	aff6c12949	muh2	2024-05-21 17:43:09 -04:00
Joey Hess	6fe63e4615	muh	2024-05-21 17:42:16 -04:00
Joey Hess	938e714a11	bleh	2024-05-21 17:32:49 -04:00
Joey Hess	10a60183e1	guard pushEmpty	2024-05-21 12:12:44 -04:00
Joey Hess	14c79373c4	update	2024-05-21 12:05:44 -04:00
Joey Hess	be8de26b68	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-21 11:53:34 -04:00
Joey Hess	b3d7ae51f0	fix edge case where git-annex branch does not have config for enabled special remote One way this could happen is cloning an empty special remote. A later fetch would then fail.	2024-05-21 11:27:49 -04:00
Joey Hess	3e7324bbcb	only delete bundles on pushEmpty This avoids some apparently otherwise unsolvable problems involving races that resulted in the manifest listing bundles that were deleted. Removed the annex-max-git-bundles config because it can't actually result in deleting old bundles. It would still be possible to have a config that controls how often to do a full push, which would avoid needing to download too many bundles on clone, as well as needing to checkpresent too many bundles in verifyManifest. But it would need a different name and description.	2024-05-21 11:13:27 -04:00
Joey Hess	f544946b09	update	2024-05-21 10:20:30 -04:00
Joey Hess	f191f52343	force pushing also does a full push	2024-05-21 10:10:49 -04:00
Joey Hess	b042dfeb0e	emptying pushes only delete	2024-05-21 09:52:35 -04:00
Joey Hess	5d40759470	formalize problem description	2024-05-21 09:35:46 -04:00
nobodyinperson	1ab1ea0bcc	Added a comment	2024-05-21 13:25:54 +00:00
datamanager	2135514bc7	Added a comment: Not git's fault, but probably your forge's	2024-05-21 13:12:23 +00:00
nobodyinperson	04aa259cfa	Added a comment: Probably not git annex related, but a new git 'feature'	2024-05-21 10:49:58 +00:00
nobodyinperson	f0923985aa	Added a comment: Seeing this for the first time today as well	2024-05-21 10:41:08 +00:00
datamanager	d94ce5319b	A correction, and small update	2024-05-21 01:16:50 +00:00
datamanager	6280fac98b	Initial thread posting	2024-05-21 01:16:02 +00:00
Joey Hess	644ed44ec1	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-20 15:52:44 -04:00
Joey Hess	3a38520aac	avoid interrupted push leaving remote without a manifest Added a backup manifest key, which is used if the main manifest key is not present. When uploading a new Manifest, it makes sure that it never drops one key except when the other key is present. It's entirely possible for the two manifest keys to get out of sync, due to races. The main one wins when it's present, it is possible for the main one being dropped to expose the backup one, which has a different push recorded.	2024-05-20 15:41:09 -04:00
Joey Hess	594ca2fd3a	update	2024-05-20 14:52:06 -04:00
Joey Hess	34a6db4f15	improve recovery from interrupted push On push, first try to drop all outManifest keys listed in the current manifest file, which resumes from an interrupted push that didn't get a chance to delete those keys. The new manifest gets its outManifest populated with the keys that were in the old manifest, plus any of the keys that were unable to be dropped. Note that it would be possible for uploadManifest to skip dropping old keys at all. The old keys would get dropped on the next push. But it seems better to delete stuff immediately rather than waiting. And the extra work is limited to push and typically is small. A remote where dropKey always fails will result in an outManifest that grows longer and longer. It would be possible to check if the remote has appendonly = True and avoid populating the outManifest. Of course, an appendonly remote will grow with every git push anyway. And currently only Remote.GitLFS sets that, which can't be used as a git-remote-annex remote anyway.	2024-05-20 13:49:45 -04:00
btester4	cdcf558170		2024-05-20 07:47:09 +00:00
nobodyinperson	034d1a80ba	Added a comment: Importing specific directories from sdcard and internal storage	2024-05-19 08:08:46 +00:00
yarikoptic	cb952e762d	Added a comment	2024-05-16 22:22:50 +00:00
yarikoptic	3337236bd9	initial report about multiple UUIDs and names for the same remote	2024-05-16 22:17:49 +00:00
Joey Hess	7c7136b6b9	devblog	2024-05-16 13:23:36 -04:00
Joey Hess	ce60211881	add incremental vs full push race to todo with plan to deal with it	2024-05-16 09:37:28 -04:00
Joey Hess	468de43d66	Merge branch 'master' into git-remote-annex	2024-05-15 17:49:12 -04:00
Joey Hess	b1b6e35d4c	reorg todo	2024-05-15 17:41:55 -04:00
Joey Hess	adcebbae47	clean up git-remote-annex git-annex branch handling Implemented alternateJournal, which git-remote-annex uses to avoid any writes to the git-annex branch while setting up a special remote from an annex:: url. That prevents the remote.log from being overwritten with the special remote configuration from the url, which might not be 100% the same as the existing special remote configuration. And it prevents an overwrite from deleting other stuff that was already in the remote.log. Also, when the branch was created by git-remote-annex, only delete it at the end if nothing else has been written to it by another command. This fixes the race condition described in `797f27ab05`, where git-remote-annex set up the branch and git-annex init and other commands were run at the same time and their writes to the branch were lost.	2024-05-15 17:33:38 -04:00
Joey Hess	d24d8870c5	todo	2024-05-15 14:33:13 -04:00
Joey Hess	2dfffa0621	bugfix When pushing branch foo, we don't want to delete other tracking branches. In particular, a full push needs all the tracking branches.	2024-05-14 16:17:27 -04:00
Joey Hess	169e673ad4	result of some testing	2024-05-14 16:01:24 -04:00
Joey Hess	7dd2a67c41	fix names of new git configs	2024-05-14 15:33:47 -04:00
Joey Hess	0722c504c5	update docs for git-remote-annex	2024-05-14 15:31:16 -04:00
Joey Hess	23c4125ed4	mention other commands shipped with git-annex in SEE ALSO in man page	2024-05-14 15:23:45 -04:00
Joey Hess	24af51e66d	git-annex unused --from remote skips its git-remote-annex keys This turns out to only be necessary in edge cases. Most of the time, git-annex unused --from remote doesn't see git-remote-annex keys at all, because it does not record a location log for them. On the other hand, git-annex unused does find them, since it does not rely on the location log. And that's good because they're a local cache that the user should be able to drop. If, however, the user ran git-annex unused and then git-annex move --unused --to remote, the keys would have a location log for that remote. Then git-annex unused --from remote would see them, and would consider them unused. Even when they are present on the special remote they belong to. And that risks losing data if they drop the keys from the special remote, but didn't expect it would delete git branches they had pushed to it. So, make git-annex unused --from skip git-remote-annex keys whose uuid is the same as the remote.	2024-05-14 15:17:40 -04:00
Joey Hess	0bf72ef103	max-git-bundles config for git-remote-annex	2024-05-14 14:23:40 -04:00
Joey Hess	8ad768fdba	todo	2024-05-14 13:58:35 -04:00
Joey Hess	6f1039900d	prevent using git-remote-annex with unsuitable special remote configs I hope to support importtree=yes eventually, but it does not currently work. Added remote.<name>.allow-encrypted-gitrepo that needs to be set to allow using it with encrypted git repos. Note that even encryption=pubkey uses a cipher stored in the git repo to encrypt the keys stored in the remote. While it would be possible to not encrypt the GITBUNDLE and GITMANIFEST keys, and then allow using encryption=pubkey, it doesn't currently work, and that would be a complication that I doubt is worth it.	2024-05-14 13:52:20 -04:00
Joey Hess	e154c6da92	bug report (copied from email)	2024-05-13 17:11:34 -04:00
Joey Hess	8bf6dab615	update	2024-05-13 14:42:25 -04:00
Joey Hess	ddf05c271b	fix cloning from an annex:: remote with exporttree=yes Updating the remote list needs the config to be written to the git-annex branch, which was not done for good reasons. While it would be possible to instead use Remote.List.remoteGen without writing to the branch, I already have a plan to discard git-annex branch writes made by git-remote-annex, so the simplest fix is to write the config to the branch. Sponsored-by: k0ld on Patreon	2024-05-13 14:35:17 -04:00
Joey Hess	552b000ef1	update	2024-05-13 14:30:18 -04:00
Joey Hess	34eae54ff9	git-remote-annex support exporttree=yes remotes Put the annex objects in .git/annex/objects/ inside the export remote. This way, when importing from the remote, they will be filtered out. Note that, when importtree=yes, content identifiers are used, and this means that pushing to a remote updates the git-annex branch. Urk. Will need to try to prevent that later, but I already had a todo about that for other reasons. Untested! Sponsored-By: Brock Spratlen on Patreon	2024-05-13 11:48:00 -04:00
Joey Hess	3f848564ac	refuse to fetch from a remote that has no manifest Otherwise, it can be confusing to clone from a wrong url, since it fails to download a manifest and so appears as if the remote exists but is empty. Sponsored-by: Jack Hill on Patreon	2024-05-13 09:47:21 -04:00
Joey Hess	97b309b56e	extend manifest with keys to be deleted This will eventually be used to recover from an interrupted fullPush and drop the old bundle keys it was unable to delete. Sponsored-by: Luke T. Shumaker on Patreon	2024-05-13 09:09:33 -04:00
Joey Hess	dfb09ad1ad	preparing to merge git-remote-annex Update its todo with remaining items. Add changelog entry. Simplified internals document to no longer be notes to myself, but target users who want to understand how the data is stored and might want to extract these repos manually. Sponsored-by: Kevin Mueller on Patreon	2024-05-10 15:06:15 -04:00
Joey Hess	ff5193c6ad	Merge branch 'master' into git-remote-annex	2024-05-10 14:20:36 -04:00
Joey Hess	947cf1c345	back to annex:: for git-remote-annex url Oh, turns out git needs two colons to use a gitremote-helper. Ok.	2024-05-07 14:37:29 -04:00
Joey Hess	c7731cdbd9	add Backend.GitRemoteAnnex Making GITBUNDLE be in the backend list allows those keys to be hashed to verify, both when git-remote-annex downloads them, and by other transfers and by git fsck. GITMANIFEST is not in the backend list, because those keys will never be stored in .git/annex/objects and can't be verified in any case. This does mean that git-annex version will include GITBUNDLE in the list of backends. Also documented these in backends.mdwn Sponsored-by: Kevin Mueller on Patreon	2024-05-07 13:54:08 -04:00
Joey Hess	483887591d	working toward git-remote-annex using a special remote Not quite there yet. Also, changed the format of GITBUNDLE keys to use only one '-' after the UUID. A sha256 does not contain that character, so can just split at the last one. Amusingly, the sha256 will probably not actually be verified. A git bundle contains its own checksums that git uses to verify it. And if someone wanted to replace the content of a GITBUNDLE object, they could just edit the manifest to use a new one whose sha256 does verify. Sponsored-by: Nicholas Golder-Manning	2024-05-06 16:28:04 -04:00
Joey Hess	f4ba6e0c1e	add annex: url parser Changed the format of the url to use annex: rather than annex:: The reason is that in the future, might want to support an url that includes an uriAuthority part, eg: annex://foo@example.com:42/358ff77e-0bc3-11ef-bc49-872e6695c0e3?type=directory&encryption=none&directory=/mnt/foo/ To parse that foo@example.com:42 as an uriAuthority it needs to start with annex: rather than annex:: That would also need something to be done with uriAuthority, and also the uriPath (the UUID) is prefixed with "/" in that example. So the parser won't handle that example currently. But this leaves the possibility for expansion. Sponsored-by: Joshua Antonishen on Patreon	2024-05-06 14:50:41 -04:00
Joey Hess	306ea42447	improve git-remote-annex docs renamed the git config to something shorter too	2024-05-06 13:06:22 -04:00
Joey Hess	0be9f7a2c6	add UUID to GITBUNDLE The UUID is included in the GITMANIFEST in order to allow a single key/value store to be used to store several special remotes, without any namespacing. In that situation though, if the same ref is pushed to two special remotes, it will result in git bundles with the same content. Which is ok, until a re-push happens to one of the special remotes. At that point, the old git bundle will be deleted. That will prevent fetching it from the other special remote, where the re-push has not happened. Adding the UUID avoids this problem.	2024-05-06 12:51:44 -04:00
Joey Hess	a8cef2bf85	added man page for git-remote-annex And document remote.<name>.git-remote-annex-max-bundles which will configure it. datalad-annex uses a similar url format, but with some enhancements. See https://github.com/datalad/datalad-next/blob/main/datalad_next/gitremotes/datalad_annex.py I added the UUID to the URL, because it is needed in order to pick out which manifest file to use. The design allows for a single key/value store to have several special remotes all stored in it, and so the manifest includes the UUID in its name. While datalad-annex allows datalad-annex::<url>?, and allows referencing pieces of the url in the parameters, needing the UUID prevents git-remote-annex from supporting that syntax. And anyway, it is a complication and I want to keep things simple for now. Sponsored-by: unqueued on Patreon	2024-05-06 12:48:04 -04:00
Joey Hess	90b389369f	fix name of gitremote-helpers The git man page has that name.	2024-05-06 12:07:05 -04:00
Joey Hess	4007d7234b	update	2024-05-06 11:36:43 -04:00
Joey Hess	5f61667f27	note on cycles	2024-05-02 12:22:04 -04:00
Joey Hess	4c538b0bb9	question	2024-05-02 11:15:35 -04:00
Joey Hess	883328b615	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-02 11:11:19 -04:00
Joey Hess	008ffd5cb5	update with presigned url idea Sponsored-by: Dartmouth College's OpenNeuro project	2024-05-02 11:10:23 -04:00
NewUser	54d3cc6ed6		2024-05-02 14:32:26 +00:00
NewUser	7e3b48a388		2024-05-02 14:31:22 +00:00
lell	13b21662c8		2024-05-02 09:06:21 +00:00
Yaroslav Halchenko	9c2ab31549	Fix compatable typo (yet to add to codespell) === Do not change lines below === { "chain": [], "cmd": "git-sedi compatable compatible", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^	2024-05-01 15:46:25 -04:00
Yaroslav Halchenko	aa9f9333ea	one typo spotted visually	2024-05-01 15:46:18 -04:00
Joey Hess	1cbf89f48f	Merge branch 'master' of ssh://git-annex.branchable.com	2024-05-01 15:27:48 -04:00
Joey Hess	cbaf2172ab	started on a design for P2P protocol over HTTP Added to git-annex_proxies todo because this is something OpenNeuro would need in order to use the git-annex proxy. Sponsored-by: Dartmouth College's OpenNeuro project	2024-05-01 15:26:51 -04:00
yarikoptic	f70ae767dc	question about assessing size of keys in tagged commits	2024-05-01 19:06:56 +00:00
Joey Hess	d28adebd6b	number list	2024-05-01 12:19:12 -04:00
Joey Hess	0d0c891ff9	add headers for tocs	2024-05-01 12:18:14 -04:00
Joey Hess	4cd2c980d2	toc	2024-05-01 12:14:59 -04:00
Joey Hess	901e02ccc3	design work on proxies for exporttree=yes Sponsored-by: Dartmouth College's OpenNeuro project	2024-05-01 12:07:57 -04:00
Joey Hess	e7333aa505	fix link	2024-05-01 11:08:57 -04:00
Joey Hess	9cdbcedc37	additional design work on proxies Sponsored-by: Dartmouth College's OpenNeuro project	2024-05-01 11:08:10 -04:00
Joey Hess	a612fe7299	add todo linking to two design docs and some related todos Tagging with projects/openneuro as Christopher Markiewicz has oked them funding at least the initial design work on this.	2024-05-01 11:04:20 -04:00
Joey Hess	5b36e6b4fb	comments	2024-04-30 16:08:46 -04:00
Joey Hess	fa0bcba86e	add news item for git-annex 10.20240430	2024-04-30 15:27:37 -04:00
Joey Hess	d4ed1d9977	comment	2024-04-30 15:20:25 -04:00
Joey Hess	f3cca8a9f8	applied patch	2024-04-30 15:17:38 -04:00

34469 commits