Commit graph

1815 commits

Author SHA1 Message Date
Joey Hess
aba8ee1ca1
releasing package git-annex version 10.20250102 2025-01-02 12:32:05 -04:00
Joey Hess
b4e406952c
prep for release tomorrow and copyright year update 2025-01-01 14:24:57 -04:00
Joey Hess
da5e195597
remove i386ancient and need at least debian stable to build
* Removed the i386ancient standalone tarball build for linux, which
  was increasingly unable to support new git-annex features.
* Removed support for building with ghc older than 9.0.2,
  and with older versions of haskell libraries than are in current Debian
  stable.
* stack.yaml: Update to lts-23.2.

Note that i386ancient was targeting linux 2.6.32, which has been EOL for
over 9 years now. Any old system still using such a kernel is certainly highly
insecure. And I suspect i386ancient had its own insecurities due to haskell
libraries and C libraries not having been updated.
2025-01-01 14:15:55 -04:00
Joey Hess
29b3c7c660
annex.addunlocked support for tree imports
Honor annex.addunlocked configuration when importing a tree from a special
remote.

Note, in a --no-content import, the object file will not be populated
(usually) and so expressions that match on mime type will not match. Tested
this and it works ok, the file just ends up locked. Updated docs for the
mime expressions to mention that they can't match when the file is not
present.

Note that in Command.Sync.pullThirdPartyPopulated, recordImportTree is
called without an AddUnlockedMatcher. Since the tree generated here is not
exposed to the user and does not contain usual filenames, there is no need
of the overhead of checking it.
2024-12-19 11:43:51 -04:00
Joey Hess
7d8558548b
empty preferred content
* Document that setting preferred content to "" is the same as the
  default unset behavior.
* sync: Avoid misleading warning about future preferred content
  transition when preferred content is set to "".
2024-12-13 13:26:48 -04:00
Joey Hess
dd052dcba1
annexInsteadOf config
Added config `url.<base>.annexInsteadOf` corresponding to git's
`url.<base>.pushInsteadOf`, to configure the urls to use for accessing the
git-annex repositories on a server without needing to configure
remote.name.annexUrl in each repository.

While one use case for this would be rewriting urls to use annex+http,
I decided not to add any kind of special case for that. So while
git-annex p2phttp, when serving multiple repositories, needs an url
of eg "annex+http://example.com/git-annex/" for each of them, rewriting an
url like "https://example.com/git/foo/bar" with this config set to
"https://example.com/git/" will result in eg
"annex+http://example.com/git-annex/foo/bar", which p2phttp does not
support.

That seems better dealt with in either git-annex p2phttp or a http
middleware, rather than complicating the config with a special case for
annex+http.

Anyway, there are other use cases for this that don't involve annex+http.
2024-12-03 14:39:07 -04:00
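
The prefix rewriting described in the annexInsteadOf commit above can be
sketched as follows. This is a minimal illustration in Python, not
git-annex's implementation (which is Haskell). The function name
`rewrite_url` and the dictionary layout are hypothetical, and choosing the
longest matching prefix is an assumption borrowed from how git documents
`insteadOf`.

```python
def rewrite_url(url, instead_of):
    """Rewrite url using a mapping of <base> -> list of configured prefixes.

    Hypothetical helper: mirrors the shape of url.<base>.annexInsteadOf,
    where a url starting with a configured prefix has that prefix
    replaced by <base>.
    """
    best_base, best_prefix = None, ""
    for base, prefixes in instead_of.items():
        for prefix in prefixes:
            # Assumption: the longest matching prefix wins, as with insteadOf.
            if url.startswith(prefix) and len(prefix) > len(best_prefix):
                best_base, best_prefix = base, prefix
    if best_base is None:
        return url  # no prefix matched, url is left alone
    return best_base + url[len(best_prefix):]


# The example from the commit message.
config = {"annex+http://example.com/git-annex/": ["https://example.com/git/"]}
rewritten = rewrite_url("https://example.com/git/foo/bar", config)
```

With that config, `rewritten` is the
"annex+http://example.com/git-annex/foo/bar" url that the commit message
notes p2phttp does not support.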
Joey Hess
0c08ff3d2c
deal with git's CRLF nonsense once again
Work around git hash-object --stdin-paths's odd stripping of carriage
return from the end of the line (some windows infection), avoiding crashing
when the repo contains a filename ending in a carriage return.
2024-12-02 13:47:51 -04:00
Joey Hess
430f6bc9c7
releasing package git-annex version 10.20241202 2024-12-02 12:36:24 -04:00
Joey Hess
8663c72f1e
git-remote-annex: Fix buggy behavior when annex.stalldetection is configured
Make programPath never return "git-remote-annex" or other known multi-call
program names, which are not git-annex and won't behave like it.
If the git-annex binary gets installed under some entirely other name,
it will still return it.

This change exposed that readProgramFile actually could crash,
which happened before only if getExecutablePath was not absolute
and there was no ~/.config/git-annex/program. So fixed that to catch
exception.
2024-11-25 12:14:52 -04:00
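
A rough sketch of the programPath rule from the commit above, in Python
rather than git-annex's Haskell. The set of names is illustrative, and
falling back to plain "git-annex" is an assumption: the commit message only
states which names are never returned.

```python
import os.path

# Illustrative set of multi-call names; only "git-remote-annex" is named
# in the commit message, the others are assumptions.
MULTICALL_NAMES = {"git-remote-annex", "git-annex-shell", "git-remote-tor-annex"}


def program_path(executable_path, fallback="git-annex"):
    """Return a path that can be run as git-annex.

    A known multi-call name would not behave like git-annex, so it is
    replaced by the fallback (hypothetical choice). Any other name,
    however unusual, is returned unchanged.
    """
    if os.path.basename(executable_path) in MULTICALL_NAMES:
        return fallback
    return executable_path
```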
Joey Hess
757f93203a
Merge branch 'p2phttp-multi' 2024-11-21 15:16:06 -04:00
Joey Hess
4c785c338a
p2phttp: notice when new repositories are added to --directory
When a uuid is not known, rescan for new repositories. Easy.

When a repository is removed, it will also get removed from the server
state on the next scan. But until a new uuid is seen, there will not be
a scan. This leaves the server trying to serve a uuid whose repository
is gone. That seems buggy. While getting just fails, dropping fails the
first time, but seems to leave the server in an unusable state, so the
next drop attempt hangs. The server is still able to serve other uuids,
only the one whose repository was removed has that problem.
2024-11-21 15:09:12 -04:00
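
The rescan behavior described in the p2phttp commit above, including the
stale entry it leaves behind, can be modeled in a few lines of Python. The
`Server` class and the `scan` callback are hypothetical stand-ins for the
server state and the --directory scan.

```python
class Server:
    """Toy model of a server that maps uuids to repositories."""

    def __init__(self, scan):
        self.scan = scan          # hypothetical: returns {uuid: repo path}
        self.repos = scan()       # initial scan at startup

    def lookup(self, uuid):
        # Only an unknown uuid triggers a rescan.
        if uuid not in self.repos:
            self.repos = self.scan()
        return self.repos.get(uuid)


directory = {"uuid-a": "/srv/a"}
server = Server(lambda: dict(directory))

directory["uuid-b"] = "/srv/b"      # a new repository appears
found_new = server.lookup("uuid-b")

del directory["uuid-a"]             # a repository is removed
stale = server.lookup("uuid-a")     # still served from the old state
```

Here `stale` is still "/srv/a" because no unknown uuid has been seen since
the removal, which is the gap the commit message calls out.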
Joey Hess
31a38f8468
git-remote-annex: Require git version 2.31 or newer
Since old ones had a buggy git bundle command.

In particular, git 2.30.2 has a git bundle that supports --stdin, but does
not read from it, and so fails to create a bundle.

While not using --stdin would perhaps work, it limits the number of revs
that get included in the bundle to the command line length limit.

But the real kicker is that at the same time --stdin got fixed, a bug also
got fixed that made git bundle skip including refs when they had the same
sha as other refs it included. Which would lead to data loss. So best to
avoid that buggy thing.
2024-11-20 15:00:17 -04:00
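
A version gate like the one in the commit above could look roughly like
this. It is a Python sketch with hypothetical helper names; how git-annex
actually parses and compares the git version is not shown in the commit
message.

```python
import re

MINIMUM_GIT_VERSION = (2, 31)


def parse_git_version(output):
    """Extract the numeric version from `git --version` style output."""
    match = re.search(r"(\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("no version number in %r" % output)
    return tuple(int(part) for part in match.group(1).split("."))


def git_new_enough(output):
    # Tuples compare element by element, so (2, 30, 2) < (2, 31).
    return parse_git_version(output) >= MINIMUM_GIT_VERSION
```

Under this check "git version 2.30.2", the version with the broken
git bundle --stdin, is rejected, while 2.31 and later pass.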
Joey Hess
d7ed99a55f
document p2phttp --directory
The option is not implemented yet.
2024-11-20 13:40:38 -04:00
Joey Hess
b8a717a617
reuse http url password for p2phttp url when on same host
When remote.name.annexUrl is an annex+http(s) url, that uses the same
hostname as remote.name.url, which is itself a http(s) url, they are
assumed to share a username and password.

This avoids unnecessary duplicate password prompts.
2024-11-19 15:27:26 -04:00
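
The condition in the commit above (an annex+http(s) annexUrl on the same
hostname as a http(s) url) can be written as a small predicate. This is an
illustrative Python sketch; `shares_credentials` is a hypothetical name.

```python
from urllib.parse import urlsplit


def shares_credentials(url, annex_url):
    """True when annex_url may reuse the username and password of url."""
    u, a = urlsplit(url), urlsplit(annex_url)
    return (
        u.scheme in ("http", "https")
        and a.scheme in ("annex+http", "annex+https")
        and u.hostname is not None
        and u.hostname == a.hostname
    )
```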
Joey Hess
df29f29e0d
git-remote-annex: Fix cloning from a special remote on a crippled filesystem
Not initializing and so deleting the bundles only causes a little more work
on the first git fetch.
2024-11-19 12:43:51 -04:00
Joey Hess
dc5bf24823
use 80% less memory when importing from a versioned S3 bucket
Same idea as commit eb714c107b, but even
better, because a lot of the response is DeleteMarker, that can be garbage
collected now.
2024-11-15 14:19:17 -04:00
Joey Hess
4b87669ae2
S3 use last Key when there is no Marker element
Fix infinite loop and memory blowup when importing from an unversioned S3
bucket that is large enough to need pagination.

I don't think there actually ever will be a Marker element, since a
delimiter is not set.

Probably this code path was never tested with pagination! Also the aws
library's lack of any docs made it easy to mess up.

Versioned buckets seem to not have the same problem. The API docs for
ListObjectVersions say that NextKeyMarker will always be provided when
paginating.
2024-11-14 16:12:37 -04:00
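
The pagination fix in the commit above follows the rule that, when a
truncated listing carries no next marker, the last key of the page is used
as the marker for the next request. A minimal Python sketch, where
`list_page` and the page dictionary layout are hypothetical stand-ins for
the aws library calls:

```python
def list_all_keys(list_page):
    """Collect every key by following pages until one is not truncated."""
    keys, marker = [], None
    while True:
        page = list_page(marker)
        keys.extend(page["keys"])
        if not page["is_truncated"] or not page["keys"]:
            return keys
        # No delimiter means no next marker in the response, so fall back
        # to the last key. Reusing the old marker here would loop forever.
        marker = page.get("next_marker") or page["keys"][-1]


def make_fake_bucket(all_keys, page_size):
    """In-memory stand-in for an unversioned bucket listing."""
    all_keys = sorted(all_keys)

    def list_page(marker):
        rest = [k for k in all_keys if marker is None or k > marker]
        return {
            "keys": rest[:page_size],
            "is_truncated": len(rest) > page_size,
            "next_marker": None,  # never provided, as in the bug report
        }

    return list_page


listed = list_all_keys(make_fake_bucket(["a", "b", "c", "d", "e"], 2))
```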
Joey Hess
1e17d0ee34
Merge branch 'checkbucketversioning' 2024-11-14 13:52:19 -04:00
Joey Hess
44da423e2e
S3: Send git-annex or other configured User-Agent.
--user-agent is the only way to configure it currently

(Needs aws-0.24.3)
2024-11-13 16:10:37 -04:00
Joey Hess
b94221594b
add: When adding a dotfile as a non-large file, mention that it's a dotfile
This is to reduce user confusion when their annex.largefiles matches it,
or is not set.

Note that, when annex.dotfiles is set, but a dotfile is not matched by
annex.largefiles, the "non-large file" message will be displayed. That
makes sense because whether the file is a dotfile does not matter with that
configuration.

Also, this slightly optimised the annex.dotfiles path in passing,
by avoiding the slight slowdown caused by the check added in commit
876d5b6c6f in that case.
2024-11-13 14:09:24 -04:00
Joey Hess
876d5b6c6f
add: Consistently treat files in a dotdir as dotfiles, even when run inside that dotdir
Assistant and smudge also updated.

This does add a small amount of extra work, getting the TopFilePath.
Not enough to be concerned by.

Also improve documentation to make clear that files inside dotdirs are
treated as dotfiles.

Sponsored-by: Eve on Patreon
2024-11-13 13:43:01 -04:00
Joey Hess
a16bf4f914
S3: Support versioning=yes with a readonly bucket.
Needs aws-0.24.3.
2024-11-12 14:32:23 -04:00
Joey Hess
447e6adabd
vpop: Only update state after successful checkout
If checkout fails for some reason, they're still in a view, and should be
able to vpop again.
2024-11-11 14:15:51 -04:00
Joey Hess
700be6c38f
git-remote-annex: Fix a reversion
Introduced in version 10.20241031 that broke cloning from a special remote

retrieveKeyFile changed to use createAnnexDirectory, which means that the
path passed to it needs to be under .git

git-remote-annex is probably the only thing in git-annex where that was not
the case. And there's no real reason it cannot be the case with it either.
Just use withOtherTmp.
2024-11-11 12:42:35 -04:00
Joey Hess
80d82dba99
releasing package git-annex version 10.20241031 2024-10-31 17:20:13 -04:00
Joey Hess
ccbc5189b5
Fix hang when receiving a large file into a proxied special remote
Only indicate that we're done with the bytestring once it all gets written.
Otherwise, the end of it may get garbage collected before we can process
it, leading to a hang.

This seems to have been introduced in commit
cdc4bd7443. Which oddly was trying to fix a
very similar problem, but specific to a cluster node. In that commit,
things got out of order, with it signaling it was done with the bytestring
before it has written all of it to the file.

My test case for this bug is a directory special remote
with a file being sent to it via a proxy accessed via ssh or http.
The file was 10 mb, and it hung on the last few kb of it not being
received.

I've also tested this fix in the case of proxying to a cluster node
directory special remote over http, which was the case
cdc4bd7443 was dealing with.
2024-10-30 12:29:37 -04:00
Joey Hess
2ca6ecad58
add tip for DATA-PRESENT feature 2024-10-29 16:15:01 -04:00
Joey Hess
0117cdab11
document DATA-PRESENT in CHANGELOG
I wonder where else this could be documented? It's kind of a niche
feature, since it needs at least a partial custom implementation of the p2p
protocol or the p2phttp protocol. But it can save a lot of bandwidth and
avoid the proxy needing disk space to buffer files uploaded to a special
remote.
2024-10-29 15:07:30 -04:00
Joey Hess
8baccda98f
Merge branch 'master' into streamproxy 2024-10-22 09:49:28 -04:00
Joey Hess
bdf3a4747f
adjust: Allow any order of options when combining --hide-missing with options like --unlock.
optparse-applicative made this hard, the naive implementation this had
before didn't let --hide-missing come after --unlock. And just adding
additional <|> with --hide-missing coming after --unlock didn't work
either. So need to get some options and then combine them.
2024-10-21 16:03:39 -04:00
Joey Hess
de138c642b
p2phttp: Allow unauthenticated users to lock content by default
* p2phttp: Allow unauthenticated users to lock content by default.
* p2phttp: Added --unauth-nolocking option to prevent unauthenticated
  users from locking content.

The rationale for this is that locking is not really a write operation, so
makes sense to allow in a repository that only allows read-only access. Not
supporting locking in that situation will prevent the user from dropping
content from a special remote they control in cases where the other copy of
the content is on the p2phttp server.

Also, when p2phttp is configured to also allow authenticated access,
lockcontent was resulting in a password prompt for users who had no way to
authenticate. And there is no good way to distinguish between the two types
of users client side.

--unauth-nolocking anticipates that this might be abused, and seems better
than disabling unauthenticated access entirely if a server is being
attacked. It may be that rate limiting locking by IP address or similar
would be an effective measure in such a situation. Or just limiting the
number of locks by anonymous users that can be live at any one time. Since
the impact of such a DoS attempt is limited to preventing dropping content
from the server, it seems not a very appealing target anyway.
2024-10-21 10:02:12 -04:00
Joey Hess
82e91b380a
add GITMANIFEST to parseKeyVariety
git-remote-annex: Fix bug that prevented using it with external special
remotes, leading to protocol error messages involving "GITMANIFEST".
2024-10-19 17:12:23 -04:00
Joey Hess
8c7047fc77
Merge branch 'master' into streamproxy 2024-10-18 10:18:59 -04:00
Joey Hess
3a53c60121
Allow enabling the servant build flag with older versions of stm
This allows building with ghc 9.0.2 (debian stable).
2024-10-17 14:04:31 -04:00
Joey Hess
0629219617
p2phttp combining unauth and auth options
p2phttp: Support serving unauthenticated users while requesting
authentication for operations that need it. Eg, --unauth-readonly can be
combined with --authenv.

Drop locking currently needs authentication so it will prompt for that.
That still needs to be addressed somehow.
2024-10-17 11:10:28 -04:00
Joey Hess
d9b4bf4224
added retrieveKeyFileInOrder and ORDERED to external special remote protocol
I anticipate lots of external special remote programs will neglect
implementing this. Still, it's the right thing to do to assume that some
of them may write files out of order. Probably most external special
remotes will not be used with a proxy. When someone is using one with a
proxy, they can always get it fixed to send ORDERED.
2024-10-15 15:40:14 -04:00
Joey Hess
edaed18e4c
Sped up proxied downloads from special remotes, by streaming
Currently works for special remotes that don't use fileRetriever. Ones that
do will download to another filename and rename it into place, defeating
the streaming.

This actually benchmarks slightly slower when getting a large file from
a fast proxied special remote. However, when the proxied special remote
is slow, it will be a big win.
2024-10-15 12:25:15 -04:00
Joey Hess
fca26db22b
releasing package git-annex version 10.20240927 2024-09-30 19:15:57 -04:00
Joey Hess
dc6c0f0f1f
preparing for release later this week 2024-09-25 14:43:52 -04:00
Joey Hess
5a4bee24b8
fix sizebalanced empty size bug
Fix bug that prevented anything being stored in an empty repository whose
preferred content expression uses sizebalanced.
2024-09-23 14:30:18 -04:00
Joey Hess
52891711d2
git-annex sim command is working
Had to add Read instances to Key and NumCopies and some other similar
types. I only expect to use those in serializing a sim. Of course, this
risks that implementation changes break reading old data. For a sim,
that would not be a big problem.
2024-09-12 16:10:52 -04:00
Joey Hess
811dd95453
maxsize of 0 to disable 2024-09-09 09:32:43 -04:00
Joey Hess
340bdd0dac
treat "not present" in preferred content as invalid
Detect when a preferred content expression contains "not present", which
would lead to repeatedly getting and then dropping files, and make it never
match. This also applies to "not balanced" and "not sizebalanced".

--explain will tell the user when this happens

Note that getMatcher calls matchMrun' and does not check for unstable
negated limits. While there is no --present anyway, if there was,
it would not make sense for --not --present to complain about
instability and fail to match.
2024-09-03 13:50:06 -04:00
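
The detection described in the commit above can be illustrated with a small
expression tree walk. This Python sketch uses a hypothetical tuple-based
representation, and treating a double negation as stable is an assumption.
The reason such terms are unstable is the one the commit gives: with
"not present", a file is wanted while absent and unwanted once present, so
it is fetched and dropped repeatedly.

```python
# Terms whose negation depends on whether the content is already here.
UNSTABLE_WHEN_NEGATED = {"present", "balanced", "sizebalanced"}


def has_unstable_negation(expr, negated=False):
    """Walk ("and", ...), ("or", ...), ("not", e) and ("term", name) nodes."""
    op = expr[0]
    if op == "not":
        return has_unstable_negation(expr[1], not negated)
    if op in ("and", "or"):
        return any(has_unstable_negation(e, negated) for e in expr[1:])
    # Leaf node: unstable only when it sits under an odd number of nots.
    return negated and expr[1] in UNSTABLE_WHEN_NEGATED


bad = ("and", ("term", "include"), ("not", ("term", "present")))
good = ("or", ("term", "present"), ("not", ("term", "include")))
```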
Joey Hess
8b2bd42540
Fix --debug display of onlyingroup preferred content expression. 2024-09-03 12:38:59 -04:00
Joey Hess
b3dc656153
releasing package git-annex version 10.20240831 2024-08-31 19:50:26 -04:00
Joey Hess
d0938d730b
Merge branch 'master' into balanced 2024-08-30 11:01:39 -04:00
Joey Hess
242c525659
lookupkey: Allow using --ref in a bare repository. 2024-08-30 10:55:48 -04:00
Joey Hess
70e2fca257
Added the annex.fullybalancedthreshhold git config. 2024-08-22 07:15:55 -04:00
Joey Hess
9e87061de2
Support "sizebalanced=" and "fullysizebalanced=" too
Might want to make --rebalance turn balanced=group:N where N > 1
to fullysizebalanced=group:N. Have not yet determined if that will
improve situations enough to be worth the extra work.
2024-08-21 15:01:54 -04:00
Joey Hess
99514f9d18
maxsize overview display and --json support 2024-08-18 12:08:13 -04:00