Commit graph

4695 commits

Author SHA1 Message Date
Joey Hess
22a329c57e
copied over some changes from proxy branch 2024-06-13 06:43:59 -04:00
Joey Hess
555d7e52d3
more thoughts on clusters 2024-06-12 17:30:55 -04:00
Joey Hess
0ebb107974
update 2024-06-12 15:21:23 -04:00
Joey Hess
46a1fcb3ea
avoid git syncing with instantiate proxied remotes
These remotes have no url configured, so git pull and push will fail.
git-annex sync --content etc can still sync with them otherwise.

Also, avoid git syncing twice with the same url. This is for cases where
a proxied remote has been manually configured and so does have a url.
Or perhaps proxied remotes will get configured like that automatically
later.
2024-06-12 15:10:03 -04:00
Joey Hess
a986a20034
designing clusters 2024-06-12 14:57:26 -04:00
Joey Hess
e70e3473b3
on cycles 2024-06-12 13:52:17 -04:00
Joey Hess
44464e4410
update 2024-06-12 12:37:14 -04:00
Joey Hess
67d1e2a459
updates 2024-06-12 12:02:25 -04:00
Joey Hess
dfdda95053
proxy updates location tracking information
This does mean a redundant write to the git-annex branch. But,
it means that two clients can be using the same proxy, and after
one sends a file to a proxied remote, the other only has to pull from
the proxy to learn about that. It does not need to pull from every
remote behind the proxy (which it couldn't do anyway as git repo
access is not currently proxied).

Anyway, the overhead of this in git-annex branch writes is no worse
than eg, sending a file to a repository where git-annex assistant
is running, which then sends the file on to a remote, and updates
the git-annex branch then. Indeed, when the assistant also drops
the local copy, that results in more writes to the git-annex branch.
2024-06-12 11:37:14 -04:00
Joey Hess
96853cd833
finish P2P protocol proxying
CONNECT is not supported by git-annex-shell p2pstdio, but for proxying
to tor-annex remotes, it will be supported, and will make a git pull/push
to a proxied remote work the same with that as it does over ssh,
eg it accesses the proxy's git repo not the proxied remote's git repo.

The p2p protocol docs say that NOTIFYCHANGES is not always supported,
and it looked annoying to implement it for this, and it also seems
pretty useless, so make it be a protocol error. git-annex remotedaemon
will already be getting change notifications from the proxy's git repo,
so there's no need to get additional redundant change notifications for
proxied remotes that would be for changes to the same git repo.
2024-06-12 10:40:51 -04:00
Joey Hess
f98605bce7
a local git remote cannot proxy
Prevent listProxied from listing anything when the proxy remote's
url is a local directory. Proxying does not work in that situation,
because the proxied remotes have the same url, and so git-annex-shell
is not run when accessing them, instead the proxy remote is accessed
directly.

I don't think there is any good way to support this. Even if the instantiated
git repos for the proxied remotes somehow used an url that caused it to use
git-annex-shell to access them, planned features like `git-annex copy --to
proxy` accepting a key and sending it on to nodes behind the proxy would not
work, since git-annex-shell is not used to access the proxy.

So it would need to use something to access the proxy that causes
git-annex-shell to be run and speaks P2P protocol over it. And we have that.
It's a ssh connection to localhost. Of course, it would be possible to
take ssh out of that mix, and swap in something that does not have
encryption overhead and authentication complications, but otherwise
behaves the same as ssh. And if the user wants to do that, GIT_SSH
does exist.
2024-06-12 10:16:04 -04:00
Joey Hess
c6e0710281
proxying to local git remotes works
This just happened to work correctly. Rather surprisingly. It turns out
that openP2PSshConnection actually also supports local git remotes,
by just running git-annex-shell with the path to the remote.

Renamed "P2PSsh" to "P2PShell" to make this clear.
2024-06-12 10:10:11 -04:00
yarikoptic
c6f2a5d372 TODO for log --key 2024-06-12 13:20:29 +00:00
Joey Hess
5beaffb412
proxying PUT now working
The almost identical code duplication between relayDATA and relayDATA'
is very annoying. I tried quite a few things to parameterize them, but
the type checker is having fits when I try it.
2024-06-11 16:56:52 -04:00
Joey Hess
ed4fda098b
todo 2024-06-11 15:15:58 -04:00
Joey Hess
a2f4a8eddf
proxying GET now working
Memory use is small and constant; receiveBytes returns a lazy bytestring
and it does stream.

Comparing speed of a get of a 500 mb file over proxy from origin-origin,
vs from the same remote over a direct ssh:

joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from origin-origin
get bigfile (from origin-origin...)
ok
(recording state in git...)
1.89user 0.67system 0:10.79elapsed 23%CPU (0avgtext+0avgdata 68716maxresident)k
0inputs+984320outputs (0major+10779minor)pagefaults 0swaps

joey@darkstar:~/tmp/bench/client>/usr/bin/time git-annex get bigfile --from direct-ssh
get bigfile (from direct-ssh...)
ok
1.79user 0.63system 0:10.49elapsed 23%CPU (0avgtext+0avgdata 65776maxresident)k
0inputs+1024312outputs (0major+9773minor)pagefaults 0swaps

So the proxy doesn't add much overhead even when run on the same machine as
the client and remote.

Still, piping receiveBytes into sendBytes like this does suggest that the proxy
could be made to use less CPU resouces by using `sendfile()`.
2024-06-11 15:09:43 -04:00
Joey Hess
09b5e53f49
set annex.uuid in proxy's Repo
getRepoUUID looks at that, and was seeing the annex.uuid of the proxy.
Which caused it to unncessarily set the git config. Probably also would
have led to other problems.
2024-06-11 13:40:50 -04:00
Joey Hess
657a91527a
update 2024-06-11 13:22:03 -04:00
Joey Hess
d2e3c5c89f
update 2024-06-11 13:07:53 -04:00
Joey Hess
43ff697f25
update status and design work on proxy encryption and chunking 2024-06-07 12:35:04 -04:00
Joey Hess
058726ee86
next step identified 2024-06-06 18:06:45 -04:00
Joey Hess
d59383beaf
update 2024-06-06 17:25:22 -04:00
ruslan@302cb7f8d398fcce72f88b26b0c2f3a53aaf0bcd
ca687413ef Added a comment 2024-06-05 16:53:51 +00:00
Joey Hess
1761e971ee
status update after day 1 of new project 2024-06-04 14:55:54 -04:00
Joey Hess
3df70c5c0c
implementation plan 2024-06-04 07:51:33 -04:00
Joey Hess
6375e3be3b
recieved funding to work on this, which comes with a schedule 2024-06-04 06:53:59 -04:00
Joey Hess
5992e1729a
fixed by git release 2024-06-04 06:39:08 -04:00
Joey Hess
adf17f5038
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-30 13:26:44 -04:00
Joey Hess
0155abfba4
git-remote-annex: Support urls like annex::https://example.com/foo-repo
Using the usual url download machinery even allows these urls to need
http basic auth, which is prompted for with git-credential. Which opens
the possibility for urls that contain a secret to be used, eg the cipher
for encryption=shared. Although the user is currently on their own
constructing such an url, I do think it would work.

Limited to httpalso for now, for security reasons. Since both httpalso
(and retrieving this very url) is limited by the usual
annex.security.allowed-ip-addresses configs, it's not possible for an
attacker to make one of these urls that sets up a httpalso url that
opens the garage door. Which is one class of attacks to keep in mind
with this thing.

It seems that there could be either a git-config that allows other types
of special remotes to be set up this way, or special remotes could
indicate when they are safe. I do worry that the git-config would
encourage users to set it without thinking through the security
implications. One remote config might be safe to access this way, but
another config, for one with the same type, might not be. This will need
further thought, and real-world examples to decide what to do.
2024-05-30 12:24:16 -04:00
yarikoptic
d23ae92da8 Added a comment 2024-05-30 14:34:32 +00:00
yarikoptic
285a7ff3c3 Added a comment 2024-05-30 14:29:43 +00:00
Joey Hess
3f33616068
security 2024-05-29 22:55:06 -04:00
Joey Hess
efa684ab8a
todo 2024-05-29 18:21:17 -04:00
Joey Hess
09a0552489
split off todo, comment 2024-05-29 13:16:36 -04:00
Joey Hess
e19916f54b
add config-uuid to annex:: url for --sameas remotes
And use it to set annex-config-uuid in git config. This makes
using the origin special remote work after cloning.

Without the added Logs.Remote.configSet, instantiating the remote will
look at the annex-config-uuid's config in the remote log, which will be
empty, and so it will fail to find a special remote.

The added deletion of files in the alternatejournaldir is just to make
100% sure they don't get committed to the git-annex branch. Now that
they contain things that definitely should not be committed.
2024-05-29 12:50:00 -04:00
Joey Hess
bbf49c9de7
httpalso just worked, with one small issue to fix 2024-05-28 16:26:16 -04:00
Joey Hess
cb9f7b5646
update 2024-05-28 12:50:54 -04:00
Joey Hess
14443fd307
update 2024-05-28 12:46:56 -04:00
Joey Hess
e19f56e7d8
Merge branch 'master' of ssh://git-annex.branchable.com 2024-05-28 10:27:50 -04:00
Joey Hess
c6669990fb
update 2024-05-28 09:19:00 -04:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92
bab6d3e58f Added a comment: Re: worktree provisioning 2024-05-28 12:06:39 +00:00
Joey Hess
c2483f6e6d
update 2024-05-27 22:44:35 -04:00
Joey Hess
0975e792ea
git-remote-annex: Fix error display on clone
cleanupInitialization gets run when an exception is thrown, so needs to
avoid throwing exceptions itself, as that would hide the error message
that the user needs to see.
2024-05-27 13:28:05 -04:00
Joey Hess
a766475d14
split out a todo 2024-05-27 12:50:46 -04:00
Joey Hess
e64add7cdf
git-remote-annex: support importrree=yes remotes
When exporttree=yes is also set. Probably it would also be possible to
support ones with only importtree=yes, by enabling exporttree=yes for
the remote only when using git-remote-annex, but let's keep this
simple... I'm not sure what gets recorded in .git/annex/ state
differently in the two cases that might cause a problem when doing that.

Note that the full annex:: urls generated and displayed for such a
remote omit the importree=yes. Which is ok, cloning from such an url
uses an exporttree=remote, but the git-annex branch doesn't get written
by this program, so once the real config is available from the git-annex
branch, it will still function as an importree=yes remote.
2024-05-27 12:35:42 -04:00
Joey Hess
19418e81ee
git-remote-annex: Display full url when using remote with the shorthand url 2024-05-24 17:15:31 -04:00
Joey Hess
04a256a0f8
work around git "defense in depth" breakage with git clone checking for hooks
This git bug also broke git-lfs, and I am confident it will be reverted
in the next release.

For now, cloning from an annex:: url wastes some bandwidth on the next
pull by not caching bundles locally.

If git doesn't fix this in the next version, I'd be tempted to rethink
whether bundle objects need to be cached locally. It would be possible to
instead remember which bundles have been seen and their heads, and
respond to the list command with the heads, and avoid unbundling them
agian in fetch. This might even be a useful performance improvement in
the latter case. It would be quite a complication to a currently simple
implementation though.
2024-05-24 15:49:53 -04:00
Joey Hess
6ccd09298b
convert srcref to a sha
This fixes pushing a new ref that is the same as something already
pushed. In findotherprereq, it compares two shas, which didn't work when
one is actually not a sha but a ref.

This is one of those cases where Sha being an alias for Ref makes it
hard to catch mistakes. One of these days those need to be
differentiated at the type level, but not today..
2024-05-24 15:33:35 -04:00
Joey Hess
96c66a7ca9
bug 2024-05-24 15:15:42 -04:00
Joey Hess
58301e40d2
sync with special remotes with an annex:: url
Check explicitly for an annex:: url, not just any url. While no built-in
special remotes set an url, except ones that can be synced with, it
seems possible that some external special remote sets an url for its own
use, but did not expect it to be used by git-annex sync et al.

The assistant also syncs with them.
2024-05-24 14:57:29 -04:00
Joey Hess
22bf23782f
initremote, enableremote: Added --with-url to enable using git-remote-annex
Also sets remote.name.fetch to a typical value, same as git remote add does.
2024-05-24 14:29:36 -04:00
Joey Hess
7d61a99da3
todo 2024-05-24 13:57:33 -04:00
Joey Hess
2670508b97
also broke git-remote-annex 2024-05-24 13:35:45 -04:00
Joey Hess
b792b128a0
verified checkprereq
The case documented in its comment worked in a test push and clone.
2024-05-24 13:06:29 -04:00
Joey Hess
1a3c60cc8e
git-remote-annex: avoid bundle object leakage in push race or interrupted push
Locally record the manifest before uploading it or any bundles,
and read it on the next push. Any bundles from the push that are
not included in the currently being pushed manifest will get added
to the outManifest, and so eventually get deleted.

This deals with an interrupted push that is not resumed and instead
something else is pushed. And it deals with a push race that overwrites
the manifest.

Of course, this can't help if one of those situations is followed by
the local repo being deleted. But that's equivilant to doing a git-annex
copy of a new annexed file to a special remote and then deleting the
special repo w/o pushing. In either case the special remote ends up with
a object in it that git-annex doesn't know about.
2024-05-24 12:47:32 -04:00
Joey Hess
264c51b4f4
comment 2024-05-22 06:06:18 -04:00
Joey Hess
4131e31f5c
PATH_MAX 2024-05-22 04:26:36 -04:00
Joey Hess
5fb307f1c5
comment 2024-05-21 17:47:55 -04:00
Joey Hess
938e714a11
bleh 2024-05-21 17:32:49 -04:00
Joey Hess
10a60183e1
guard pushEmpty 2024-05-21 12:12:44 -04:00
Joey Hess
14c79373c4
update 2024-05-21 12:05:44 -04:00
Joey Hess
b3d7ae51f0
fix edge case where git-annex branch does not have config for enabled special remote
One way this could happen is cloning an empty special remote.
A later fetch would then fail.
2024-05-21 11:27:49 -04:00
Joey Hess
3e7324bbcb
only delete bundles on pushEmpty
This avoids some apparently otherwise unsolveable problems involving
races that resulted in the manifest listing bundles that were deleted.

Removed the annex-max-git-bundles config because it can't actually
result in deleting old bundles. It would still be possible to have a
config that controls how often to do a full push, which would avoid
needing to download too many bundles on clone, as well as needing to
checkpresent too many bundles in verifyManifest. But it would need a
different name and description.
2024-05-21 11:13:27 -04:00
Joey Hess
f544946b09
update 2024-05-21 10:20:30 -04:00
Joey Hess
b042dfeb0e
emptying pushes only delete 2024-05-21 09:52:35 -04:00
Joey Hess
5d40759470
formalize problem description 2024-05-21 09:35:46 -04:00
Joey Hess
3a38520aac
avoid interrupted push leaving remote without a manifest
Added a backup manifest key, which is used if the main manifest key is
not present. When uploading a new Manifest, it makes sure that it never
drops one key except when the other key is present.

It's entirely possible for the two manifest keys to get out of sync, due
to races. The main one wins when it's present, it is possible for the
main one being dropped to expose the backup one, which has a different
push recorded.
2024-05-20 15:41:09 -04:00
Joey Hess
594ca2fd3a
update 2024-05-20 14:52:06 -04:00
Joey Hess
34a6db4f15
improve recovery from interrupted push
On push, first try to drop all outManifest keys listed in the current
manifest file, which resumes from an interrupted push that didn't
get a chance to delete those keys.

The new manifest gets its outManifest populated with the keys that were
in the old manifest, plus any of the keys that were unable to be
dropped.

Note that it would be possible for uploadManifest to skip dropping old
keys at all. The old keys would get dropped on the next push. But it
seems better to delete stuff immediately rather than waiting. And the
extra work is limited to push and typically is small.

A remote where dropKey always fails will result in an outManifest that
grows longer and longer. It would be possible to check if the remote
has appendonly = True and avoid populating the outManifest. Of course,
an appendonly remote will grow with every git push anyway. And currently
only Remote.GitLFS sets that, which can't be used as a git-remote-annex
remote anyway.
2024-05-20 13:49:45 -04:00
Joey Hess
ce60211881
add incremental vs full push race to todo
with plan to deal with it
2024-05-16 09:37:28 -04:00
Joey Hess
b1b6e35d4c
reorg todo 2024-05-15 17:41:55 -04:00
Joey Hess
adcebbae47
clean up git-remote-annex git-annex branch handling
Implemented alternateJournal, which git-remote-annex
uses to avoid any writes to the git-annex branch while setting up
a special remote from an annex:: url.

That prevents the remote.log from being overwritten with the special
remote configuration from the url, which might not be 100% the same as
the existing special remote configuration.

And it prevents an overwrite deleting of other stuff that was
already in the remote.log.

Also, when the branch was created by git-remote-annex, only delete it
at the end if nothing else has been written to it by another command.
This fixes the race condition described in
797f27ab05, where git-remote-annex
set up the branch and git-annex init and other commands were
run at the same time and their writes to the branch were lost.
2024-05-15 17:33:38 -04:00
Joey Hess
d24d8870c5
todo 2024-05-15 14:33:13 -04:00
Joey Hess
2dfffa0621
bugfix
When pushing branch foo, we don't want to delete other tracking
branches. In particular, a full push needs all the tracking branches.
2024-05-14 16:17:27 -04:00
Joey Hess
169e673ad4
result of some testing 2024-05-14 16:01:24 -04:00
Joey Hess
0722c504c5
update docs for git-remote-annex 2024-05-14 15:31:16 -04:00
Joey Hess
23c4125ed4
mention other commands shipped with git-annex in SEE ALSO in man page 2024-05-14 15:23:45 -04:00
Joey Hess
24af51e66d
git-annex unused --from remote skips its git-remote-annex keys
This turns out to only be necessary is edge cases. Most of the
time, git-annex unused --from remote doesn't see git-remote-annex keys
at all, because it does not record a location log for them.

On the other hand, git-annex unused does find them, since it does not
rely on the location log. And that's good because they're a local cache
that the user should be able to drop.

If, however, the user ran git-annex unused and then git-annex move
--unused --to remote, the keys would have a location log for that
remote. Then git-annex unused --from remote would see them, and would
consider them unused. Even when they are present on the special remote
they belong to. And that risks losing data if they drop the keys from
the special remote, but didn't expect it would delete git branches they
had pushed to it.

So, make git-annex unused --from skip git-remote-annex keys whose uuid
is the same as the remote.
2024-05-14 15:17:40 -04:00
Joey Hess
0bf72ef103
max-git-bundles config for git-remote-annex 2024-05-14 14:23:40 -04:00
Joey Hess
8ad768fdba
todo 2024-05-14 13:58:35 -04:00
Joey Hess
6f1039900d
prevent using git-remote-annex with unsuitable special remote configs
I hope to support importtree=yes eventually, but it does not currently
work.

Added remote.<name>.allow-encrypted-gitrepo that needs to be set to
allow using it with encrypted git repos.

Note that even encryption=pubkey uses a cipher stored in the git repo
to encrypt the keys stored in the remote. While it would be possible to
not encrypt the GITBUNDLE and GITMANIFEST keys, and then allow using
encryption=pubkey, it doesn't currently work, and that would be a
complication that I doubt is worth it.
2024-05-14 13:52:20 -04:00
Joey Hess
8bf6dab615
update 2024-05-13 14:42:25 -04:00
Joey Hess
ddf05c271b
fix cloning from an annex:: remote with exporttree=yes
Updating the remote list needs the config to be written to the git-annex
branch, which was not done for good reasons. While it would be possible
to instead use Remote.List.remoteGen without writing to the branch, I
already have a plan to discard git-annex branch writes made by
git-remote-annex, so the simplest fix is to write the config to the
branch.

Sponsored-by: k0ld on Patreon
2024-05-13 14:35:17 -04:00
Joey Hess
552b000ef1
update 2024-05-13 14:30:18 -04:00
Joey Hess
34eae54ff9
git-remote-annex support exporttree=yes remotes
Put the annex objects in .git/annex/objects/ inside the export remote.
This way, when importing from the remote, they will be filtered out.

Note that, when importtree=yes, content identifiers are used, and this
means that pushing to a remote updates the git-annex branch. Urk.
Will need to try to prevent that later, but I already had a todo about
that for other reasons.

Untested!

Sponsored-By: Brock Spratlen on Patreon
2024-05-13 11:48:00 -04:00
Joey Hess
3f848564ac
refuse to fetch from a remote that has no manifest
Otherwise, it can be confusing to clone from a wrong url, since it fails
to download a manifest and so appears as if the remote exists but is empty.

Sponsored-by: Jack Hill on Patreon
2024-05-13 09:47:21 -04:00
Joey Hess
97b309b56e
extend manifest with keys to be deleted
This will eventually be used to recover from an interrupted fullPush
and drop the old bundle keys it was unable to delete.

Sponsored-by: Luke T. Shumaker on Patreon
2024-05-13 09:09:33 -04:00
Joey Hess
dfb09ad1ad
preparing to merge git-remote-annex
Update its todo with remaining items.

Add changelog entry.

Simplified internals document to no longer be notes to myself, but
target users who want to understand how the data is stored
and might want to extract these repos manually.

Sponsored-by: Kevin Mueller on Patreon
2024-05-10 15:06:15 -04:00
Yaroslav Halchenko
9c2ab31549
Fix compatable typo (yet to add to codespell)
=== Do not change lines below ===
{
 "chain": [],
 "cmd": "git-sedi compatable compatible",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
2024-05-01 15:46:25 -04:00
Joey Hess
cbaf2172ab
started on a design for P2P protocol over HTTP
Added to git-annex_proxies todo because this is something OpenNeuro
would need in order to use the git-annex proxy.

Sponsored-by: Dartmouth College's OpenNeuro project
2024-05-01 15:26:51 -04:00
Joey Hess
d28adebd6b
number list 2024-05-01 12:19:12 -04:00
Joey Hess
0d0c891ff9
add headers for tocs 2024-05-01 12:18:14 -04:00
Joey Hess
4cd2c980d2
toc 2024-05-01 12:14:59 -04:00
Joey Hess
e7333aa505
fix link 2024-05-01 11:08:57 -04:00
Joey Hess
a612fe7299
add todo linking to two design docs and some related todos
Tagging with projects/openneuro as Christopher Markiewicz has oked
them funding at least the initial design work on this.
2024-05-01 11:04:20 -04:00
Joey Hess
5b36e6b4fb
comments 2024-04-30 16:08:46 -04:00
Joey Hess
f3cca8a9f8
applied patch 2024-04-30 15:17:38 -04:00
Joey Hess
1f37d0b00d
promote comment to todo 2024-04-30 15:13:59 -04:00
Joey Hess
84611e7ee6
todo 2024-04-26 04:03:10 -04:00
Joey Hess
e3c5f0079d
Merge branch 'master' of ssh://git-annex.branchable.com 2024-04-25 17:01:32 -04:00
Joey Hess
d895df1010
update 2024-04-25 17:01:17 -04:00
ErrGe
cafa9af811 2024-04-22 15:37:04 +00:00
ErrGe
67d92c3aee 2024-04-22 15:36:26 +00:00
ErrGe
649909cc94 2024-04-22 15:35:18 +00:00
Joey Hess
c410b2bb73
annex.maxextensions configuration
Controls how many filename extensions to preserve.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-04-18 14:23:38 -04:00
Joey Hess
b700c48b15
comment 2024-04-18 13:50:19 -04:00
Joey Hess
a1fd72b91a
update to focus on why this is still open 2024-04-18 12:40:53 -04:00
ErrGe
e44513cfe7 Added a comment: hook idea implementation is cool, but usage is not so simple for the enduser 2024-04-18 01:17:02 +00:00
Joey Hess
d55e3f5fe2
Merge branch 'master' of ssh://git-annex.branchable.com 2024-04-17 15:27:09 -04:00
Joey Hess
d372553540
rclone special remote
Added rclone special remote, which can be used without needing to install
the git-annex-remote-rclone program. This needs a new version of rclone,
which supports "rclone gitannex".

This is implemented as a variant of an external special remote, that
runs "rclone gitannex" instead of the usual git-annex-remote- command.
Parameterized Remote.External to support that.

Sponsored-by: Luke T. Shumaker on Patreon
2024-04-17 15:20:37 -04:00
Joey Hess
5c542c0382
update 2024-04-17 13:11:17 -04:00
yarikoptic
6bcb2f7f02 original possible todo on extension 2024-04-17 13:30:23 +00:00
mih
66222e9354 Added a comment: Need for more than HEAD/URL? 2024-04-15 05:00:58 +00:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92
eaece30f90 Added a comment: prior art 2024-04-13 20:30:56 +00:00
mih
08be93a5cb Thoughts on support special remotes that compute keys instead of downloading 2024-04-11 12:26:40 +00:00
Joey Hess
4bb5b7c519
comment 2024-04-10 13:00:16 -04:00
Joey Hess
38b1e8a36e
todo 2024-04-10 12:46:27 -04:00
Joey Hess
00593523c6
Merge branch 'master' of ssh://git-annex.branchable.com 2024-04-09 12:59:01 -04:00
Joey Hess
2c73845d90
multiple -m second try
Test suite passes this time. When committing the adjusted branch, use
the old method to make a message that old git-annex can consume. Also
made the code accept the new message, so that eventually
commitTreeExactMessage can be removed.

Sponsored-by: Kevin Mueller on Patreon
2024-04-09 12:56:47 -04:00
nobodyinperson
e3a58a2710 Added a comment: or use numcopies for safety 2024-04-09 10:48:10 +00:00
nobodyinperson
f020a804a2 Added a comment: when someone names files like keys, they probably want trouble 🙃 2024-04-09 10:43:16 +00:00
Joey Hess
69546f73ca
comment 2024-04-08 16:57:53 -04:00
lukasz.opiola@8b366725db99c2a5e0e638d1a5d57d457d0bdad4
5cb2186cd5 2024-04-08 13:56:32 +00:00
lukasz.opiola@8b366725db99c2a5e0e638d1a5d57d457d0bdad4
58180abd6b 2024-04-08 13:55:19 +00:00
lukasz.opiola@8b366725db99c2a5e0e638d1a5d57d457d0bdad4
a1616be8d6 2024-04-08 13:53:48 +00:00
Joey Hess
cefba1c4dc
Merge branch 'master' of ssh://git-annex.branchable.com 2024-04-08 09:16:34 -04:00
Joey Hess
967f887b95
update 2024-04-08 09:16:27 -04:00
grawity@dec5f8ddda45c421809e4687d9950f9ed2a03e46
3dfe165a05 2024-04-08 09:52:49 +00:00
Joey Hess
6401a1eed5
Merge branch 'master' of ssh://git-annex.branchable.com 2024-04-06 19:52:26 -04:00
Joey Hess
5060185a7b
project we started at Distribits 2024-04-06 19:52:07 -04:00
nobodyinperson
bae7076e43 Added a comment: more use cases for configurable default preferred content 2024-04-06 15:12:50 +00:00
Joey Hess
e216bd5f10
mention reversion 2024-04-02 17:33:20 -04:00
Joey Hess
a8dd85ea5a
Revert "multiple -m"
This reverts commit cee12f6a2f.

This commit broke git-annex init run in a repo that was cloned from a
repo with an adjusted branch checked out.

The problem is that findAdjustingCommit was not able to identify the
commit that created the adjusted branch. It seems that there is an extra
"\n" at the end of the commit message that it does not expect.

Since backwards compatability needs to be maintained, cannot just make
findAdjustingCommit accept it with the "\n". Will have to instead
have one commitTree variant that uses the old method, and use it for
adjusted branch committing.
2024-04-02 17:29:07 -04:00
Joey Hess
cee12f6a2f
multiple -m
sync, assist, import: Allow -m option to be specified multiple times, to
provide additional paragraphs for the commit message.

The option parser didn't allow multiple -m before, so there is no risk of
behavior change breaking something that was for some reason using multiple
-m already.

Pass through to git commands, so that the method used to assemble the
paragrahs is whatever git does. Which might conceivably change in the
future.

Note that git commit-tree has supported -m since git 1.7.7. commitTree
was probably not using it since it predates that version. Since the
configure script prevents building git-annex with git older than 2.1,
there is no risk that it's not supported now.

Sponsored-by: Nicholas Golder-Manning on Patreon
2024-03-27 15:58:27 -04:00
Joey Hess
e32a5166a0
Merge branch 'master' of ssh://git-annex.branchable.com 2024-03-26 18:19:29 -04:00
Joey Hess
8d35ea976c
todo 2024-03-26 18:19:23 -04:00
yarikoptic
6b837d17c2 Added a comment 2024-03-26 19:13:15 +00:00
Joey Hess
962da7bcf9
update for new rclone gitannex command 2024-03-26 13:48:43 -04:00
Joey Hess
331f9dd764
link to commit 2024-03-25 14:51:36 -04:00
Joey Hess
f04d9574d6
fix transfer lock file for Download to not include uuid
While redundant concurrent transfers were already prevented in most
cases, it failed to prevent the case where two different repositories were
sending the same content to the same repository. By removing the uuid
from the transfer lock file for Download transfers, one repository
sending content will block the other one from also sending the same
content.

In order to interoperate with old git-annex, the old lock file is still
locked, as well as locking the new one. That added a lot of extra code
and work, and the plan is to eventually stop locking the old lock file,
at some point in time when an old git-annex process is unlikely to be
running at the same time.

Note that in the case of 2 repositories both doing eg
`git-annex copy foo --to origin`
the output is not that great:

copy b (to origin...)
  transfer already in progress, or unable to take transfer lock
git-annex: transfer already in progress, or unable to take transfer lock
97%   966.81 MiB      534 GiB/s 0sp2pstdio: 1 failed

  Lost connection (fd:14: hPutBuf: resource vanished (Broken pipe))

  Transfer failed

Perhaps that output could be cleaned up? Anyway, it's a lot better than letting
the redundant transfer happen and then failing with an obscure error about
a temp file, which is what it did before. And it seems users don't often
try to do this, since nobody ever reported this bug to me before.
(The "97%" there is actually how far along the *other* transfer is.)

Sponsored-by: Joshua Antonishen on Patreon
2024-03-25 14:47:46 -04:00
Joey Hess
7044232696
todo 2024-03-13 11:04:06 -04:00
Joey Hess
eb2cd944d9
update 2024-03-08 14:32:29 -04:00
Joey Hess
ad966e5e7b
update 2024-03-08 13:43:31 -04:00
Joey Hess
1bf02029f9
small problem 2024-03-05 13:45:31 -04:00
Joey Hess
3874b7364f
add todo for tracking free space in repos via git-annex branch
For balanced preferred content perhaps, or just for git-annex info
display.

Sponsored-by: unqueued on Patreon
2024-03-05 13:16:42 -04:00
Joey Hess
a6a7b8320a
Merge branch 'master' of ssh://git-annex.branchable.com 2024-03-01 16:53:13 -04:00
Joey Hess
e7652b0997
implement URL to VURL migration
This needs the content to be present in order to hash it. But it's not
possible for a module used by Backend.URL to call inAnnex because that
would entail a dependency loop. So instead, rely on the fact that
Command.Migrate calls inAnnex before performing a migration.

But, Command.ExamineKey calls fastMigrate and the key may or may not
exist, and it's not wanting to actually perform a migration in any case.
To handle that, had to add an additional value to fastMigrate to
indicate whether the content is inAnnex.

Factored generateEquivilantKey out of Remote.Web.

Note that migrateFromURLToVURL hardcodes use of the SHA256E backend.
It would have been difficult not to, given all the dependency loop
issues. But --backend and annex.backend are used to tell git-annex
migrate to use VURL in any case, so there's no config knob that
the user could expect to configure that.

Sponsored-by: Brock Spratlen on Patreon
2024-03-01 16:42:02 -04:00
Joey Hess
cb50cdcc58
todo 2024-03-01 15:14:45 -04:00
Joey Hess
def94fbff6
update 2024-03-01 13:48:51 -04:00
Joey Hess
1b0de3021a
avoid double checksum when downloading VURL from web for 1st time
Sponsored-by: Jack Hill on Patreon
2024-03-01 13:44:40 -04:00
Joey Hess
4046f17ca0
incremental verification for VURL
Sponsored-by: Brett Eisenberg on Patreon
2024-03-01 13:33:29 -04:00
yarikoptic
283e071bcb has potential in DANDI project 2024-02-29 23:31:05 +00:00
Joey Hess
62e4c9d3b8
add future todo 2024-02-29 17:52:58 -04:00
Joey Hess
c72df19784
verifyKeyContent for VURL
VURL is now fully working, though needs more testing.

Still need to implement verifyKeyContentIncrementally but it works
without it.

Sponsored-by: Luke T. Shumaker on Patreon
2024-02-29 17:44:21 -04:00
Joey Hess
cc17ac423b
implement isCryptographicallySecureKey for VURL
Considerable difficulty to work around an import cycle. Had to move the
list of backends (except for VURL) to Backend.Variety to VURL could use
it.

Sponsored-by: Kevin Mueller on Patreon
2024-02-29 17:26:35 -04:00
Joey Hess
0f7143d226
support VURL backend
Not yet implemented is recording hashes on download from web and
verifying hashes.

addurl --verifiable option added with -V short option because I
expect a lot of people will want to use this.

It seems likely that --verifiable will become the default eventually,
and possibly rather soon. While old git-annex versions don't support
VURL, that doesn't prevent using them with keys that use VURL. Of
course, they won't verify the content on transfer, and fsck will warn
that it doesn't know about VURL. So there's not much problem with
starting to use VURL even when interoperating with old versions.

Sponsored-by: Joshua Antonishen on Patreon
2024-02-29 13:48:51 -04:00
Joey Hess
8f40e0269b
comment 2024-02-27 13:36:07 -04:00
Joey Hess
90b7c2d93c
fix link 2024-02-27 13:20:24 -04:00
Joey Hess
df7230deac
comment and todo 2024-02-27 12:44:34 -04:00
Joey Hess
f7be26d4e3
close 2024-02-22 12:50:12 -04:00
Joey Hess
2b3740e7b4
comment 2024-02-22 12:45:57 -04:00
Joey Hess
891a0076a6
move misplaced bug or todo to a better place 2024-02-22 11:21:39 -04:00
Joey Hess
3475b09c3e
pre-commit: Avoid committing the git-annex branch
Except when a commit is made in a view, which changes metadata.

Make the assistant commit the git-annex branch after git commit of working
tree changes.

This allows using the annex.commitmessage-command in the assistant to
generate a commit message for the git-annex branch that relies on state
gathered during the commit of the working tree. Eg, it might reuse the
commit message.

Note that, when not using the assistant, a git-annex add still commits
the git-annex branch, so such a annex.commitmessage-command set up would
not work then. But if someone is using the assistant and wants
programmatic control over commit messages, this is useful. Someone not
using the assistant can get the same result by using annex.alwayscommit=false
during the git-annex add, and git-annex merge after they git commit.

pre-commit was never really intended to commit the git-annex branch
(except after recording changed metadata), but the assistant did sort of
rely on it. It does later commit the git-annex branch before pushing to
remotes, but I didn't want to risk building up lots of uncommitted changes
to it if that didn't happen frequently.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-12 14:42:11 -04:00
Joey Hess
68e99513f0
added annex.commitmessage-command config
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-12 14:35:22 -04:00
Joey Hess
dd0e45c86e
update 2024-02-10 11:24:32 -04:00
Joey Hess
6c5aaa2b0f
design document for verified relaxed urls
Sponsored-by: Graham Spencer on Patreon
2024-02-10 10:48:20 -04:00
Joey Hess
aba87a6e92
close as distributed migration meets this use case 2024-02-10 10:24:58 -04:00
Joey Hess
bca66acfd8
comment 2024-02-09 14:09:49 -04:00
yarikoptic
2a4d7d894b Added a comment 2024-02-09 13:50:21 +00:00
yarikoptic
517914c0f6 Added a comment 2024-02-08 22:08:41 +00:00
Joey Hess
c4943ae277
close 2024-02-07 16:24:39 -04:00
Joey Hess
644317e86f
Merge branch 'master' of ssh://git-annex.branchable.com 2024-02-07 16:21:31 -04:00
Joey Hess
21123ba368
assistant, undo: When committing, let the usual git commit hooks run
Was doing a Git.Branch.commit for historical reasons to do with direct
mode, which no longer apply.

Note that the preCommitAnnexHook is no longer called in commitStaged
because git-annex installs a pre-commit hook that runs the pre-commit-annex
hook. And git commit will run the pre-commit hook.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-07 16:15:35 -04:00
jstritch
d8df420b18 Added a comment 2024-02-07 15:52:18 +00:00
jstritch
e0488d74ff Added a comment 2024-02-06 16:10:56 +00:00
Joey Hess
48fe8ba23c
forgot to add this comment earlier 2024-02-05 15:49:32 -04:00
Joey Hess
792106abc3
improve special remote docs
Sponsored-by: Dartmouth College's DANDI project
2024-02-05 15:48:15 -04:00
Joey Hess
083c471ee9
really close 2024-02-05 15:20:40 -04:00
Joey Hess
6b38d0c427
addurl, importfeed: Added --raw-except option
--raw-except=web allows using yt-dlp but not any other special remotes.

Currently this option can only be used once, trying to use it repeatedly
will make option parsing fail. Perhaps it ought to support being used more
than once, but it seemed like an unlikely use case to need that.

Note that getParsed is called repeatedly when the option is used with
several urls. While implementing DeferredParseClass would avoid that
innefficiency, it didn't seem worth the added boilerplate since
getParsed only calls byNameWithUUID which does minimal work.

Sponsored-by: Dartmouth College's DANDI project
2024-02-05 15:16:25 -04:00
Joey Hess
d7419d6e65
comment 2024-02-05 14:06:03 -04:00
jstritch
593186d461 2024-02-03 18:48:36 +00:00
Joey Hess
fc32632774
followup 2024-02-02 14:32:12 -04:00
yarikoptic
5621922e08 the plea to respect prepare-commit-msg hook 2024-02-02 17:07:52 +00:00
yarikoptic
f36ba992ea todo for support of pushinsteadof 2024-02-02 16:27:20 +00:00
yarikoptic
0fd17ceeb5 Initial desire for addurl --raw-except 2024-01-31 20:35:03 +00:00
Joey Hess
3b22e61007
followup and close 2024-01-30 15:51:33 -04:00
yarikoptic
0621711c6f ask for better documentation. 2024-01-30 16:20:45 +00:00
yarikoptic
58b57ab999 error out if yt-dlp sees that video is/was there but not available 2024-01-29 22:03:47 +00:00
Joey Hess
8e9ee31621
webapp: Added --port option, and annex.port config
The getSocket comment that mentioned using ":port"
in the hostname seems to have been incorrect or be out of date.
After all, the bug report came when the user first tried doing that,
and it didn't work.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-01-25 14:08:36 -04:00
Joey Hess
d54f2ccae1
close 2024-01-25 13:28:23 -04:00
Joey Hess
3a20208ce1
confirm this todo 2024-01-25 13:25:15 -04:00
Joey Hess
2a56476ca5
close 2024-01-25 13:16:25 -04:00
Joey Hess
1120ac8272
update 2024-01-25 13:15:13 -04:00
Joey Hess
7aee4ca7c1
nack 2024-01-25 13:10:45 -04:00
Joey Hess
8646183e38
nack 2024-01-25 13:05:52 -04:00
Joey Hess
991dfcb9b8
nack 2024-01-25 13:04:35 -04:00
Joey Hess
3109447120
close 2024-01-25 12:58:16 -04:00
Joey Hess
b9e147d282
Added --expected-present file matching option 2024-01-25 12:56:41 -04:00
Joey Hess
72d2dbde5e
comment 2024-01-23 12:55:44 -04:00
Joey Hess
3ca1e036ed
open todo 2024-01-18 13:11:28 -04:00