git-annex

Author	SHA1	Message	Date
Joey Hess	d0938d730b	Merge branch 'master' into balanced	2024-08-30 11:01:39 -04:00
Joey Hess	242c525659	lookupkey: Allow using --ref in a bare repository.	2024-08-30 10:55:48 -04:00
Joey Hess	23d44aa4aa	use live reposizes in balanced preferred content	2024-08-27 10:17:43 -04:00
Joey Hess	3f8675f339	more LiveUpdate plumbing	2024-08-24 09:28:41 -04:00
Joey Hess	eb841ab004	plumb in LiveUpdate to copy/get/move/mirror. copy and get do check preferred content, so need to prepareLiveUpdate. move and mirror do not, but copy is implemented using move, so move also needed to have a LiveUpdate plumbed through.	2024-08-24 09:20:58 -04:00
Joey Hess	418fbf3f2f	NoLiveUpdate for export and import. While these do check preferred content, it would not make sense to use balanced preferred content with them.	2024-08-24 09:19:12 -04:00
Joey Hess	c3d40b9ec3	plumb in LiveUpdate (WIP). Each command that first checks preferred content (and/or required content) and then does something that can change the sizes of repositories needs to call prepareLiveUpdate, and plumb it through the preferred content check and the location log update. So far, only Command.Drop is done. Many other commands that don't need to do this have been updated to keep working. There may be some calls to NoLiveUpdate in places where that should be done. All will need to be double checked. Not currently in a compilable state.	2024-08-23 16:35:12 -04:00
Joey Hess	3fe67744b1	display new empty repos in maxsize table. A new repo that has no location log info yet, but has an entry in uuid.log has 0 size, so make RepoSize aware of that. Note that a new repo that does not yet appear in uuid.log will still not be displayed. When a remote is added but not synced with yet, it has no uuid.log entry. If git-annex maxsize is used to configure that remote, it needs to appear in the maxsize table, and the change to Command.MaxSize takes care of that.	2024-08-22 07:03:22 -04:00
Joey Hess	a643699b7b	display ">100%" when past maxsize. This is to avoid a value like 1000% causing the table to not align.	2024-08-21 20:52:54 -04:00
Joey Hess	2ec4602e36	fix column width	2024-08-21 12:18:16 -04:00
Joey Hess	d4b2f8201d	add %full field to table	2024-08-19 11:41:48 -04:00
Joey Hess	99514f9d18	maxsize overview display and --json support	2024-08-18 12:08:13 -04:00
Joey Hess	f985c58d8e	consistently don't show sizes of empty repositories. This used to be the case, and when matching options are used, that code path still omits them, so also omit them in the getRepoSize code path.	2024-08-17 15:09:16 -04:00
Joey Hess	b62b58b50b	git-annex info speed up using getRepoSizes	2024-08-17 14:54:31 -04:00
Joey Hess	8239824d92	consistently omit clusters when calculating RepoSizes. updateRepoSize is only called on the UUID of a repository, not any cluster it might be a node of. But overLocationLogs and overLocationLogsJournal were including cluster UUIDs. So it was inconsistent. Currently I don't see any reason to calculate RepoSize for a cluster. It's not even clear what it should mean, the total size of all nodes, or the amount of information stored in the cluster in total?	2024-08-17 11:24:14 -04:00
Joey Hess	8ac2685b33	calcBranchRepoSizes without journal files. This will be used to prime the RepoSizes database, which will always contain values that correspond to information in the git-annex branch, so without anything from journal files. Factored out overJournalFileContents which will later be used to update Annex.reposizes to include information from journal files. This will be particularly important to support private UUIDs which only ever get to journal files and not to the branch.	2024-08-14 03:19:30 -04:00
Joey Hess	f612ebb934	avoid changing git-annex info behavior. `5afbea25e7` changed it to ignore journal files that did not correspond to a key in the git-annex branch. However, when there is a private journal, that can happen. Neither behavior is fully correct, so keep the old incorrect behavior rather than introducing a new differently incorrect behavior. I plan to eventually make git-annex info use Annex.reposizes instead of calculating it itself, and once Annex.reposizes handles this all correctly, this will be a moot problem.	2024-08-13 14:17:20 -04:00
Joey Hess	5afbea25e7	avoid counting size of keys that are in the journal twice. In calcRepoSizes and also git-annex info, when a key was in the journal, it was passed to the callback twice, so the calculated size was wrong.	2024-08-13 13:23:39 -04:00
Joey Hess	467d80101a	improve handling of unmerged git-annex branches in readonly repo. git-annex info was displaying a message that didn't make sense in context. In calcRepoSizes, it seems better to return the information from the git-annex branch, rather than giving up. Especially since balanced preferred content uses it, and we can't just give up evaluating a preferred content expression if git-annex is to be usable in such a readonly repo. Commit `6d7ecd9e5d` nobly wanted git-annex to behave the same with such unmerged branches as it does when it can merge them. But for the purposes of preferred content, it seems to me there's a sense that such an unmerged branch is the same as a remote we have not pulled from. The balanced preferred content will either way operate under outdated information, and so make not the best choices.	2024-08-13 13:13:12 -04:00
Joey Hess	0c3771beb1	add	2024-08-12 18:50:58 -04:00
Joey Hess	1265d7e5df	implement maxsize log and command. maxsize: New command to tell git-annex how large the expected maximum size of a repository is. vicfg: Include maxsize configuration.	2024-08-11 15:41:26 -04:00
Joey Hess	1224f1c183	improve usage	2024-08-11 14:37:18 -04:00
Joey Hess	3ea835c7e8	proxied exporttree=yes versionedexport=yes remotes are not untrusted. This removes versionedExport, which was only used by the S3 special remote. Instead, versionedexport=yes is a common way for remotes to indicate that they are versioned.	2024-08-08 15:24:19 -04:00
Joey Hess	c84d1a9462	update export db after rename from annexobjects location. This allows git-annex post-receive, on the first push to the remote to see that it is able to get a key from it in order to upload it back. Also avoided actively checking if the source remote contains a key. The location log is good enough. If the location log is wrong, the export of that file will fail with an informative message.	2024-08-08 14:03:02 -04:00
Joey Hess	a2eb3b450a	post-receive: use the exporttree=yes remote as a source. This handles cases where a single key is used by multiple files in the exported tree. When using `git-annex push`, the key's content gets stored in the annexobjects location, and then when the branch is pushed, it gets renamed from the annexobjects location to the first exported file. For subsequent exported files, a copy of the content needs to be made. This causes it to download the key from the remote in order to upload another copy to it. This is not needed when using `git push` followed by `git-annex copy --to` the proxied remote, because the received key is stored at all export locations then. Also, fixed handling of the synced branch push, it was exporting master when synced/master was pushed. Note that currently, the first push to the remote does not see that it is able to get a key from it in order to upload it back. It displays "(not available)". The second push is able to. Since git-annex push pushes first the synced branch and then the branch, this does end up with a full export being made, but it is not quite right.	2024-08-08 13:49:53 -04:00
Joey Hess	7294d23d78	export: Added --from option. This is similar to git-annex copy --from --to, in that it downloads a local copy, locks it for removal, uploads it, and drops it. Removal of the temporary local copy is done without verifying numcopies for the same reason as that command. I do wonder, looking at this, if there's a race where the local copy gets used as a copy to allow some other drop in the narrow window after it is downloaded and before it gets locked for removal. That would need some other repository to have an out of date location log that says the repository contains a copy of the key, in order for it to try to use it as a copy. If there is such a race, git-annex copy/move would also be vulnerable to it. It would be better to lock it for removal before starting to download it! That is possible in v10 repositories, which do use a separate content lock file. Note that, when the exported tree contains several files that use the same key, it will be downloaded repeatedly, once per time needed to upload it. It would be possible to avoid that extra work, but it would complicate this since the local copy would need to be preserved, locked for removal, until the end. Also, that would mean that interrupting the export would leave possibly a lot of temporarily downloaded keys in the local repository, while currently it can only leave one.	2024-08-08 12:08:55 -04:00
Joey Hess	bd677bb65a	avoid warning in startDispose. When a file never got exported to the remote, and is now being removed from the exported tree, it tried to rename, which failed, and displayed an ugly warning: "unexport d m8 rename failed (/home/joey/tmp/bench2/d/m8: renameFile:renamePath:rename: does not exist (No such file or directory)); deleting instead ok"	2024-08-08 11:59:16 -04:00
Joey Hess	01edd186e9	update proxied exporttree=yes remote on receive of sync branch. Since git-annex sync sends the sync branch first, and only displays the output of the push to the sync branch, this makes git-annex post-receive's output when updating the exported tree be visible when syncing. This also makes syncing with a non-bare repository still update the exported tree, even when the checked out branch is not able to be updated. The sync branch gets sent regardless.	2024-08-07 13:11:06 -04:00
Joey Hess	55adbb6694	avoid trying to export tree to proxied exporttree=yes remotes. This avoids a lot of ugly messages when syncing with such a remote. The export tree happens on the proxy side.	2024-08-07 13:00:19 -04:00
Joey Hess	6d96734128	updateproxy, updatecluster check annexobjects=yes. updateproxy, updatecluster: Prevent using an exporttree=yes special remote that does not have annexobjects=yes, since it will not work.	2024-08-07 12:27:24 -04:00
Joey Hess	3289b1ad02	proxying to exporttree=yes annexobjects=yes basically working. It works when using git-annex sync/push/assist, or when manually sending all content to the proxied remote before pushing to the proxy remote. But when the push comes before the content is sent, sending content does not update the exported tree.	2024-08-06 14:21:23 -04:00
Joey Hess	a535eaa176	rename from annexobjects location on export (When possible, of course it may not be there, or it may get renamed from there for another exported file first. Or the remote may not support renames.) This avoids redundant uploads. An example case where this is important: Proxying to an exporttree remote, a file is uploaded to it but is not yet in an exported tree. When the exported tree is pushed, the remote needs to be updated by exporting to it. In this case, the proxy doesn't have a copy of the file, so it would need to download it from annexobjects before uploading it to the final location. With this optimisation, it can just rename it. However: If a key is used twice in an exported tree, it seems a proxy will need to download and reupload anyway. Unless a copy operation is added to exporttree remotes.	2024-08-04 12:19:10 -04:00
Joey Hess	a3d96474f2	rename to annexobjects location on unexport. This avoids needing to re-upload the file again to get it to the annexobjects location, which git-annex sync was doing when it was preferred content. If the file is not preferred content, sync will drop it from the annexobjects location. If the file has been deleted from the tree, it will remain in the annexobjects location until an unused/dropunused pass is done.	2024-08-04 11:58:07 -04:00
Joey Hess	a4a06404d4	sync --content with annexobjects=true exporttree remotes	2024-08-03 11:39:23 -04:00
Joey Hess	c4352adf6a	in unexport, check for annexobjects presence before updating location log. The key may still be in the annexobjects location.	2024-08-02 18:43:10 -04:00
Joey Hess	960daf210b	run with noMessages. This avoids extraneous output from p2phttp, including eg, progress displays when transferring to proxied special remotes.	2024-07-29 13:35:08 -04:00
Joey Hess	fbbedae497	add --clusterjobs option and default to 1. The default of 1 is not ideal at all, but it avoids an accidental M*N causing so much concurrency it becomes unusable.	2024-07-28 10:36:22 -04:00
Joey Hess	d1faa13d6a	implement proxy connection pool. removeOldestProxyConnectionPool will be inefficient the larger the pool is. A better data structure could be more efficient. Eg, make each value in the pool include the timestamp of its oldest element, then the oldest value can be found and modified, rather than rebuilding the whole Map. But, for pools of a few hundred items, this should be fine. It's O(n*n log n) or so. Also, when more than 1 connection with the same pool key exists, it's efficient even for larger pools, since removeOldestProxyConnectionPool is not needed. The default of 1 idle connection could perhaps be larger, like the number of jobs? Otoh, it seems good to ramp up and down the number of connections, which does happen. With 1, there is at most one stale connection, which might cause a request to fail.	2024-07-26 17:03:31 -04:00
Joey Hess	ad025b8e5e	clean up protocol version for proxying. The proxy always checks the protocol version of a remote before talking to it in a version-specific way, so the protocol version in the ProxyParams is the client's protocol version. The remote will always be at the same or an older protocol version than the client. Note that in relayDATAFinish, when the client is at protocol version 0, the remote must thus be as well, and that's why its version is not checked in the case for that. With that clarified, it's evident that, in P2P.Http.State, there's no need to look at the proxied remote's protocol version at all.	2024-07-26 13:49:05 -04:00
Joey Hess	cc1da2d516	http p2p proxy is now largely working	2024-07-26 10:44:10 -04:00
Joey Hess	6ef6ad808f	use a record to reduce the huge number of parameters	2024-07-25 15:23:18 -04:00
Joey Hess	3d14e2cf58	http server support for proxies, incomplete. Refactored git-annex-shell code so this can use checkCanProxy'. At this point all that remains is opening a proxy connection, and using a proxy connection.	2024-07-25 13:19:24 -04:00
Joey Hess	0bdeafc2c4	use annex+http for accessing proxies. Doesn't work yet on the http server side, which is throwing 502 bad gateway.	2024-07-25 12:00:57 -04:00
Joey Hess	7bd616e169	Remote.Git retrieveKeyFile works with annex+http urls. This includes a bugfix to serveGet, it hung at the end.	2024-07-24 10:28:44 -04:00
Joey Hess	73ffb58456	p2phttp support https	2024-07-23 15:37:36 -04:00
Joey Hess	b7149e897b	add --bind option and listen to both ipv4 and ipv6 by default	2024-07-23 15:19:56 -04:00
Joey Hess	4e15b786ca	Remote.Git checkpresent works with annex+http urls.	2024-07-23 14:31:32 -04:00
Joey Hess	b0eed55d4f	factor out http server and client into own modules. To avoid a cycle when Remote.Git uses the client.	2024-07-23 14:12:38 -04:00
Joey Hess	6bbc4565e6	started wiring p2phttp into Remote.Git but we have a cycle, ugh	2024-07-23 13:53:10 -04:00
Joey Hess	5c39652235	starting support for remote.name.annexUrl set to annex+http. In this case, Remote.Git should not use that url for all access to the repository. It will only be used for annex operations, which isn't done yet.	2024-07-23 09:12:21 -04:00

2941 commits