Commit graph

1695 commits

Author SHA1 Message Date
Joey Hess
8adafdd013
avoid cpp failure on windows
Seems that while the module is not imported by anything on windows, it
still gets cpped, and MIN_VERSION_unix is not defined so it failed to
preprocess.
2023-08-02 10:08:00 -04:00
Joey Hess
68c9b08faf
fix build with unix-2.8.0
Changed the parameters to openFd. So needed to add a small wrapper
library to keep supporting older versions as well.
2023-08-01 18:41:27 -04:00
Joey Hess
63f76d0ac3
fix build with unix-2.8.0
It made UserInfo into a pattern to discourage manually constructing
them, so just to use UserInfo in a type signature of a function that
consumes them, have to import the new ByteString module.
2023-08-01 17:47:30 -04:00
Joey Hess
d76f088dc4
fix build on windows 2023-08-01 17:39:24 -04:00
Joey Hess
fb640bc2f4
support building with unix-compat 0.7
It removed System.PosixCompat.User.
2023-08-01 15:17:43 -04:00
Joey Hess
08071a1b90
improve match result display simplifier
Sponsored-by: Dartmouth College's DANDI project
2023-07-26 15:28:57 -04:00
Joey Hess
70de4a7e6d
fix bug in match result display simplifier
Sponsored-by: Dartmouth College's DANDI project
2023-07-26 15:28:49 -04:00
Joey Hess
518a51a8a0
--explain for preferred/required content matching
And annex.largefiles and annex.addunlocked.

Also git-annex matchexpression --explain explains why its input
expression matches or fails to match.

When there is no limit, avoid explaining why the lack of limit
matches. This is also done when no preferred content expression is set,
although in a few cases it defaults to a non-empty matcher, which will
be explained.

Sponsored-by: Dartmouth College's DANDI project
2023-07-26 14:50:04 -04:00
Joey Hess
f25eeedeac
initial implementation of --explain
Currently it only displays explanations of options like --in and --copies.

In the future, it should explain preferred content expression evaluation
and other decisions.

The explanations of a few things could be better. In particular,
"standard" will just appear as-is (or as "!standard" if it doesn't
match), rather than explaining why the standard preferred content expression
for the group matches or not.

Currently as implemented, it goes to stdout, and so commands like
git-annex find that have custom output will not display --explain
information. Perhaps that should change, dunno.

Sponsored-by: Dartmouth College's DANDI project
2023-07-25 16:52:57 -04:00
Joey Hess
cf40e2d4b6
Revert "use existing debug machinery for explain"
This reverts commit 409572c9e4.
2023-07-25 15:53:50 -04:00
Joey Hess
409572c9e4
use existing debug machinery for explain
explain is a kind of debug message, but not formatted in the same way.
So it makes sense to reuse the debug machinery for it, since that is
already quite optimised.

Sponsored-by: Dartmouth College's DANDI project
2023-07-25 15:47:58 -04:00
Joey Hess
fbf19338be
remove excess doubled parens in match description
Sponsored-by: Dartmouth College's DANDI project
2023-07-25 13:55:01 -04:00
Joey Hess
f280d38045
parenthesize match description as needed to avoid ambiguity
While avoiding most unncessary parens.

Once case where unncessary parens are not avoided is:

	not ( ( not foo and baz ) )

It would be good eventually to remove doubled parens like these.

Sponsored-by: Dartmouth College's DANDI project
2023-07-25 13:40:23 -04:00
Joey Hess
0f63374be3
accumulate description while matching
This is to be used to explain why something did or didn't match.

Note that this reimplements match in terms of matchMrun.
Implementing match' as a Writer and matchMrun' as a MonadWriter
resulted in nearly identical implementations, which collapsed into the
same thing thanks to Writer being WriterT Identity.

MAnd and MOr implement short circuiting. So an expression
like "not (foo and bar)" will be explained as [MatchedNot, MatchOperation "foo"]
when foo does not match; whether bar matches is irrelevant. Similarly
"foo or bar" will be explained as [MatchedOperation "foo"] when foo
matches. It seems like that will keep the explanations more
understandable. But also, matchMrun already did short circuiting, and it
could be considerably more work to check if bar matches in these cases.

Note that the type signature of matchMrun changed, but it was
over-generic before.

Note that these changes are licensed under the AGPL. Changed module
license accordingly.

Sponsored-by: Dartmouth College's DANDI project
2023-07-25 12:53:05 -04:00
Joey Hess
7298123520
build git trees using ContentIdentifier to speed up import
This gets the trees built, but it does not use them. Next step will be
to remember the tree for next time an import is done, and diff between
old and new trees to find the files that have changed.

Added --missing to the mktree parameters. That only disables a check, so
it's ok to do everywhere mktree is used. It probably also speeds up
mktree to disable the check.

Note that git fsck does not complain about the resulting tree objects
that point to shas that are not in the repository. Even with --strict.

A quick benchmark, importing 10000 files, this slowed it down
from 2:04.06 to 2:04.28. So it will more than pay for itself.

Sponsored-by: Luke Shumaker on Patreon
2023-05-31 12:46:54 -04:00
Joey Hess
0da0e2efcc
add git config debugging
(and process cwd debugging)

Sponsored-by: Dartmouth College's Datalad project
2023-05-15 15:35:29 -04:00
Joey Hess
9812d9aaec
support aeson for Map
Make unused --json use it, which is better than the doubly nested lists
it was using.

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-05-10 13:51:37 -04:00
Joey Hess
57c1b4f5e5
initremote: Avoid creating a remote that is not encrypted when gpg is broken
checksize was applied lazily, so the exception didn't happen until the
remote was set up.

Sponsored-by: k0ld on Patreon
2023-05-01 13:00:05 -04:00
Joey Hess
aff37fc208
avoid annexFileMode special case
This makes annexFileMode be just an application of setAnnexPerm',
which avoids having 2 functions that do different versions of the same
thing.

Fixes some buggy behavior for some combinations of core.sharedRepository
and umask.

Sponsored-by: Jack Hill on Patreon
2023-04-27 15:58:37 -04:00
Joey Hess
2efceba789
fix windows build 2023-04-12 19:33:19 -04:00
Joey Hess
c5bcb55a8b
add ScopedTypeVariables 2023-04-12 19:19:22 -04:00
Joey Hess
6fc999193f
avoid displaying ExitCode exceptions
Don't need to be sanitized and displaying them messes up actually
exiting with the right exit code! And broke the test suite.

Sponsored-by: Brett Eisenberg on Patreon
2023-04-12 17:04:57 -04:00
Joey Hess
2fdb6ca879
remove unused imports 2023-04-12 16:48:18 -04:00
Joey Hess
a576fc3b12
fix mojibake reversion in display of utf8
When displaying a ByteString like "💕", safeOutput operates on
individual bytes like "\240\159\146\149" and isControl '\146' = True,
so it got truncated to just "\240".

So, only treat the low control characters, and DEL, as control
characters.

Also split Utility.Terminal out of Utility.SafeOutput. The latter needs
win32, but Utility.SafeOutput is used by Control.Exception, which is
used by Setup.

Sponsored-by: Nicholas Golder-Manning on Patreon
2023-04-12 13:53:30 -04:00
Joey Hess
31111d15c8
newline and tab are safe control characters
Oops, let's let git-annex display those! Lol
2023-04-11 15:38:47 -04:00
Joey Hess
afa5b883dc
find, findkeys, examinekey: escape output to terminal when --format is not used
Note that filenames are not quoted, only escaped. This is to match the
output of --format with escaping.

Sponsored-by: Lawrence Brogan on Patreon
2023-04-11 15:27:07 -04:00
Joey Hess
df6f9f1ee8
filter out control characters and quote filenames
Searched for uses of putStr and hPutStr and changed appropriate ones to filter
out control characters and quote filenames.

This notably does not make find and findkeys quote filenames in their default
output. Because they should only do that when stdout is non a pipe.

A few commands like calckey and lookupkey seem too low-level to make sense to filter
output, so skipped those.

Also when relaying output from other commands that is not progress output,
have git-annex filter out control characters.

Sponsored-by: k0ld on Patreon
2023-04-11 14:27:22 -04:00
Joey Hess
cd544e548b
filter out control characters in error messages
giveup changed to filter out control characters. (It is too low level to
make it use StringContainingQuotedPath.)

error still does not, but it should only be used for internal errors,
where the message is not attacker-controlled.

Changed a lot of existing error to giveup when it is not strictly an
internal error.

Of course, other exceptions can still be thrown, either by code in
git-annex, or a library, that include some attacker-controlled value.
This does not guard against those.

Sponsored-by: Noam Kremen on Patreon
2023-04-10 13:50:51 -04:00
Joey Hess
81bc57322f
clean up 2023-04-07 17:20:58 -04:00
Joey Hess
d9b6be7782
convert encode_c to ByteString
This turns out to be possible after all, because the old one decomposed
a unicode Char to multiple Word8s and encoded those. It should be faster
in some places, particularly in Git.Filename.encodeAlways.

The old version encoded all unicode by default as well as ascii control
characters and also '"'. The new one only encodes ascii control
characters by default.

That old behavior was visible in Utility.Format.format, which did escape
'"' when used in eg git-annex find --format='${escaped_file}\n'
So made sure to keep that working the same. Although the man page only
says it will escape "unusual" characters, so it might be able to be
changed.

Git.Filename.encodeAlways also needs to escape '"' ; that was the
original reason that was escaped.

Types.Transferrer I judge is ok to not escape '"', because the escaped
value is sent in a line-based protocol, which is decoded at the other
end by decode_c. So old git-annex and new will be fine whether that is
escaped or not, the result will be the same.

Note that when asked to escape a double quote, it is escaped to \"
rather than to \042. That's the same behavior as git has. It's
perhaps somehow more of a special case than it needs to be.

Sponsored-by: k0ld on Patreon
2023-04-07 17:10:49 -04:00
Joey Hess
371d4f8183
decode_c converted to ByteString
This speeds up a few things, notably CmdLine.Seek using Git.Filename
which uses decode_c and this avoids a conversion to String and back,
and probably the ByteString implementation of decode_c is also faster
for simple cases at least than the string version.

encode_c cannot be converted to ByteString (or if it did, it would have
to convert right back to String in order to handle unicode).

Sponsored-by: Brock Spratlen on Patreon
2023-04-07 14:44:19 -04:00
Joey Hess
d2d7d766d8
fix comment 2023-03-28 12:40:08 -04:00
Joey Hess
cd076cd085
Windows: Support urls like "file:///c:/path"
That is a legal url, but parseUrl parses it to "/c:/path"
which is not a valid path on Windows. So as a workaround, use
parseURIPortable everywhere, which removes the leading slash when
run on windows.

Note that if an url is parsed like this and then serialized back
to a string, it will be different from the input. Which could
potentially be a problem, but is probably not in practice.

An alternative way to do it would be to have an uriPathPortable
that fixes up the path after parsing. But it would be harder to
make sure that is used everywhere, since uriPath is also used
when constructing an URI.

It's also worth noting that System.FilePath.normalize "/c:/path"
yields "c:/path". The reason I didn't use it is that it also
may change "/" to "\" in the path and I wanted to keep the url
changes minimal. Also noticed that convertToWindowsNativeNamespace
handles "/c:/path" the same as "c:/path".

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-03-27 13:38:02 -04:00
Joey Hess
79d00a3b9f
fix build warning on windows 2023-03-27 12:17:55 -04:00
Joey Hess
2b5fa091e2
annex.maxextensionlength for view
view: Support annex.maxextensionlength when generating filenames for the
view branch.

Note that refining an existing view will reuse the extension length that was
configured when initially constructing the view. This is necessarily the case
because it reuses the filenames.

Also view files used to have all extensions at the end, no matter how
many there were. Since annex.maxextensionlength's documentation includes
that it's limited to 2 extensions, I made it consistent with that.

Sponsored-by: k0ld on Patreon
2023-03-24 14:01:38 -04:00
Joey Hess
e822df2a09
fix build warnings on windows 2023-03-21 18:41:23 -04:00
Yaroslav Halchenko
84b0a3707a
Apply codespell -w throughout 2023-03-17 15:14:58 -04:00
Yaroslav Halchenko
100f5aabb6
Typo: recurrance -> recurrence 2023-03-17 15:14:54 -04:00
Yaroslav Halchenko
0ae5ff797f
Typo: sansative -> sensitive 2023-03-17 15:14:50 -04:00
Yaroslav Halchenko
e018ae1125
Fix ambigous typos 2023-03-17 15:14:47 -04:00
Joey Hess
1eb64f0249
fix OSx build better 2023-03-09 11:43:55 -04:00
Joey Hess
0dac6411d8
fix build on OSX
Breakage from 54ad1b4cfb
2023-03-08 12:17:09 -04:00
Joey Hess
ba2d51ca80
remove redundant import 2023-03-08 11:41:20 -04:00
Joey Hess
433608753a
further fix windows build
Thanks to jkniiv for this patch.
2023-03-06 12:15:53 -04:00
Joey Hess
398633c12b
fix build on windows 2023-03-03 12:58:39 -04:00
Joey Hess
33a05a1a91
small RawFilePath optimisation 2023-03-02 10:53:12 -04:00
Joey Hess
54ad1b4cfb
Windows: Support long filenames in more (possibly all) of the code
Works around this bug in unix-compat:
https://github.com/jacobstanley/unix-compat/issues/56
getFileStatus and other FilePath using functions in unix-compat do not do
UNC conversion on Windows.

Made Utility.RawFilePath use convertToWindowsNativeNamespace to do the
necessary conversion on windows to support long filenames.

Audited all imports of System.PosixCompat.Files to make sure that no
functions that operate on FilePath were imported from it. Instead, use
the equvilants from Utility.RawFilePath. In particular the
re-export of that module in Common had to be removed, which led to lots
of other changes throughout the code.

The changes to Build.Configure, Build.DesktopFile, and Build.TestConfig
make Utility.Directory not be needed to build setup. And so let it use
Utility.RawFilePath, which depends on unix, which cannot be in
setup-depends.

Sponsored-by: Dartmouth College's Datalad project
2023-03-01 15:55:58 -04:00
Joey Hess
505f1a654b
avoid using Utility.Path.AbsRel in Utility.Path.Windows
AbsRel depends on unix, but Utility.Path.Windows will be used in some
libraries that are part of the setup-depends, which cannot depend on
unix.

The only reason that AbsRel uses getWorkingDirectory on unix is that
it returns RawFilePath. getCurrentDirectory returns FilePath and so
needs a conversion to RawFilePath. Looks like a newer version of
directory will fix that, by using OsPath, so eventually AbsPath should
be able to switch to using getCurrentDirectory on unix, and then the
small code duplication in this commit won't be needed.

Sponsored-by: Dartmouth College's Datalad project
2023-03-01 14:01:33 -04:00
Joey Hess
3c08af0da1
factor out convertToWindowsNativeNamespace into its own module
Gonna use this more widely.

Sponsored-by: Dartmouth College's Datalad project
2023-03-01 13:28:39 -04:00
Joey Hess
9b1fe37818
improve adjusted branch name parsing to support adjusted view branches
An adjusted view branch has a name like
"adjusted/views/master(author=_)(unlocked)"
and so the adjustment starts at the last open paren, not the first open
paren.

Note that git-annex sync still does not do anything useful when run in
such a branch, because it does not realize that it is a view branch.
This is only groundwork for adjusted view branches.

This also fixes adjusted branches when the basis branch name contains
parens for some other reason, though that is not common in a git branch
name.

Sponsored-by: Boyd Stephen Smith Jr. on Patreon
2023-02-27 14:09:05 -04:00
Joey Hess
96d46db2d5
Support http urls that contain ":" that is not followed by a port number
The same as git does.

Sponsored-by: Dartmouth College's DANDI project
2023-02-10 13:34:47 -04:00
Joey Hess
d079fb0578
work around ldd crash on arm64 autobuilder
This does mean it has to run ldd once per executable, which is actually
quite a number of times, so will be a bit slower.

Sponsored-by: Luke Shumaker on Patreon
2023-01-28 14:26:01 -04:00
Joey Hess
a00513754b
support old libgcc1 name
Used in Debian jessie, which the i386ancient autobuilder uses.
2023-01-26 15:37:21 -04:00
Joey Hess
f316b7f105
Revert "Removed the vendored git-lfs and the GitLfs build flag"
This reverts commit efda811404.

Turns out that datalad is building git-annex against debian bullseye.
https://github.com/datalad/git-annex/issues/149
2023-01-04 17:33:29 -04:00
Joey Hess
efda811404
Removed the vendored git-lfs and the GitLfs build flag
AFAICS all git-annex builds are using the git-lfs library not the vendored
copy.

Debian stable does have a too old haskell-git-lfs package to be able to
build git-annex from source, but there is not currently a backport of a
recent git-annex to Debian stable. And if they update the backport at some
point, they should be able to backport the library too.

Sponsored-by: Svenne Krap on Patreon
2022-12-26 12:49:53 -04:00
Joey Hess
d475f82c62
Added libgcc_s.so.1 to the linux standalone build so pthread_cancel will work
In Makefile, listed additional deps of Build/Standalone. Without that,
it does not get updated for the change to Utility/LinuxMkLibs.hs when
compiling incrementally.

Sponsored-by: Dartmouth College's DANDI project
2022-12-22 15:15:25 -04:00
Joey Hess
2a3958cbe1
new metric prefixes give us quettabyte and yottabyte
There does not yet seem to be an update to IEC 80000-13, so no
robibyte or quebibyte yet.

Sponsored-by: k0ld on Patreon
2022-11-18 12:03:12 -04:00
Joey Hess
2eedf58630
replace guessed win32 version with actual version 2.13.4.0 2022-11-09 13:08:27 -04:00
Joey Hess
9187a37dec
fix build warning 2022-10-27 10:21:24 -04:00
Joey Hess
14f7a386f0
Make git-annex enable-tor work when using the linux standalone build
Clean the standalone environment before running the su command
to run "sh". Otherwise, PATH leaked through, causing it to run
git-annex.linux/bin/sh, but GIT_ANNEX_DIR was not set,
which caused that script to not work:

[2022-10-26 15:07:02.145466106] (Utility.Process) process [938146] call: pkexec ["sh","-c","cd '/home/joey/tmp/git-annex.linux/r' && '/home/joey/tmp/git-annex.linux/git-annex' 'enable-tor' '1000'"]
/home/joey/tmp/git-annex.linux/bin/sh: 4: exec: /exe/sh: not found

Changed programPath to not use GIT_ANNEX_PROGRAMPATH,
but instead run the scripts at the top of GIT_ANNEX_DIR.
That works both when the standalone environment is set up, and when it's
not.

Sponsored-by: Kevin Mueller on Patreon
2022-10-26 15:45:08 -04:00
Joey Hess
f1c85ac11b
fix breakage in wormhole's sendFile
Commit ff0927bde9 broke this, it made it
try to read all of the input before looking for the code. But, wormhole
keeps running until it sends the file, so that caused a deadlock. Oops.

Sponsored-by: Luke Shumaker on Patreon
2022-09-26 15:26:29 -04:00
Joey Hess
17129fed66
fix wormhole --appid option position
p2p: Pass wormhole the --appid option before the receive/send command, as
it does not accept that option after the command

I'm left wondering, did I get this wrong from the beginning, or did
wormhole change its option parser? I'm reminded of the change in 0.8.2
where it silently changed what FD the pairing code was output to.
But, looking at the wormhole source, it was at least putting --appid before
send in its test suite from the introduction of the option.

So I think probably this has always been broken. On 2021-12-31 the --appid
option was enabled, and it took until now for someone to try
git-annex p2p --pair and notice that flag day broke it..

Sponsored-by: Svenne Krap on Patreon
2022-09-26 15:11:38 -04:00
Joey Hess
1fe9cf7043
deal with ignoreinode config setting
Improve handling of directory special remotes with importtree=yes whose
ignoreinode setting has been changed. (By either enableremote or by
upgrading to commit 3e2f1f73cbc5fc10475745b3c3133267bd1850a7.)

When getting a file from such a remote, accept the content that would have
been accepted with the previous ignoreinode setting.

After a change to ignoreinode, importing a tree from the remote will
re-import and generate new content identifiers using the new config. So
when ignoreinode has changed to no, the inodes will be learned, and after
that point, a change in an inode will be detected as a change. Before
re-importing, a change in an inode will be ignored, as it was before the
ignoreinode change. This seems acceptble, because the user can re-import
immediately if they urgently need to add inodes. And if not, they'll
do it sometime, presumably, and the change will take effect then.

Sponsored-by: Erik Bjäreholt on Patreon
2022-09-16 14:11:25 -04:00
Joey Hess
8a4cfd4f2d
use getSymbolicLinkStatus not getFileStatus to avoid crash on broken symlink
Fix crash importing from a directory special remote that contains a broken
symlink.

The crash was in listImportableContentsM but some other places in
Remote.Directory also seemed like they could have the same problem.

Also audited for other places that have such a problem. Not all calls
to getFileStatus are bad, in some cases it's better to crash on something
unexpected. For example, `git-annex import path` when the path is a broken
symlink should crash, the same as when it does not exist. Many of the
getFileStatus calls are like that, particularly when they involve
.git/annex/objects which should never have a broken symlink in it.

Fixed a few other possible cases of the problem.

Sponsored-by: Lawrence Brogan on Patreon
2022-09-05 13:46:32 -04:00
Joey Hess
840bd50390
make it easier to use curl for unusual url schemes
Use curl when annex.security.allowed-url-schemes includes an url scheme not
supported by git-annex internally, as long as
annex.security.allowed-ip-addresses is configured to allow using curl.

Sponsored-by: Luke Shumaker on Patreon
2022-08-15 12:22:13 -04:00
Joey Hess
23c6e350cb
improve createDirectoryUnder to allow alternate top directories
This should not change the behavior of it, unless there are multiple top
directories, and then it should behave the same as if there was a single
top directory that was actually above the directory to be created.

Sponsored-by: Dartmouth College's Datalad project
2022-08-12 12:52:37 -04:00
Joey Hess
5bc70e2da5
When bup split fails, display its stderr
It seems worth noting here that I emailed bup's author about bup split
being noisy on stderr even with -q in approximately 2011. That never got
fixed. Its current repo on github only accepts pull requests, not bug
reports. Needing to add such complexity to deal with such a longstanding
unfixed issue is not fun.

Sponsored-by: Kevin Mueller on Patreon
2022-08-05 13:57:20 -04:00
Joey Hess
cd9fd6e28c
fix case of Win32 2022-08-04 12:17:27 -04:00
Joey Hess
472f5c142b
Use createFile_NoRetry from win32 2.13.3.1
Sponsored-by: Tobias Ammann on Patreon
2022-08-02 10:45:39 -04:00
Joey Hess
ef8e481ebd
clarify comment and remove broken link
There are archives of MC Knowledgebase, which google will find, I don't
want to try to keep a link to an archive working since MS is no longer
providing it.
2022-08-01 13:53:36 -04:00
Joey Hess
d905232842
use ResourcePool for hash-object handles
Avoid starting an unncessary number of git hash-object processes when
concurrency is enabled.

Sponsored-by: Dartmouth College's DANDI project
2022-07-25 17:32:39 -04:00
Joey Hess
2d65c4ff1d
avoid unix-compat's rename
On Windows, that does not support long paths
https://github.com/jacobstanley/unix-compat/issues/56

Instead, use System.Directory.renamePath, which does support long paths.

Sponsored-by: Dartmouth College's Datalad project
2022-07-12 14:55:02 -04:00
Joey Hess
02ef3d6a64
fix build with assistant disabled and webapp enabled
The webapp modules cannot build with the assistant disabled, so make the
webapp be under the assistant build flag.

Sponsored-by: Jarkko Kniivilä on Patreon
2022-06-29 14:19:18 -04:00
Joey Hess
d137bd4c29
fix typo
Recent commit got amended accidentially to include this typo. Argh. It
was fine, then I tweakd the commit message and accidentially staged this
breakage.
2022-06-23 13:54:34 -04:00
Joey Hess
9562da790f
add rent optimisation inhibitor
As outlined in my blog post, I object to Microsoft training their Copilot
model on my code, and believe it likely violates my copyright.
https://joeyh.name/blog/entry/a_bitter_pill_for_Microsoft_Copilot/

While I never push git-annex to Github, other users choose to. And since
Microsoft is now selling access to Copilot to anyone, this situation is
escalating.
2022-06-23 13:13:54 -04:00
Joey Hess
aad362e1c4
remove vendored library no longer used
http-manager-restricted is used for a while, but I forgot to delete this
file when making that change.
2022-06-23 12:35:26 -04:00
Joey Hess
debcf86029
use RawFilePath version of rename
Some small wins, almost certianly swamped by the system calls, but still
worthwhile progress on the RawFilePath conversion.

Sponsored-by: Erik Bjäreholt on Patreon
2022-06-22 16:47:34 -04:00
Joey Hess
ebb76f0486
avoid setEnv while testing gpg
setEnv is not thread safe and could cause a getEnv by another thread to
segfault, or perhaps other had behavior.

Sponsored-by: Dartmouth College's Datalad project
2022-05-18 16:05:11 -04:00
Joey Hess
8675b2b075
rename memoryUnits
It's not just used for memory sizes.
2022-05-05 15:35:11 -04:00
Joey Hess
d1cce869ed
implement dataUnits finally
Added support for "megabit" and related bandwidth units in
annex.stalldetection and everywhere else that git-annex parses data units.

Note that the short form is "Mbit" not "Mb" because that differs from "MB"
only in case, and git-annex parses units case-insensitively. It would be
horrible if two different versions of git-annex parsed the same value
differently, so I don't think "Mb" can be supported.

See comment for bonus sad story from my childhood.

Sponsored-by: Nicholas Golder-Manning
2022-05-05 15:25:11 -04:00
Joey Hess
642703c7e4
avoid using removePathForcibly everywhere, it is unsafe
If the temp directory can somehow contain a hard link, it changes the
mode, which affects all other hard linked files. So, it's too unsafe
to use everywhere in git-annex, since hard links are possible in
multiple ways and it would be very hard to prove that every place that
uses a temp directory cannot possibly put a hard link in it.

Added a call to removeDirectoryForCleanup to test_crypto, which will
fix the problem that commit 17b20a2450
was intending to fix, with a much smaller hammer.

Sponsored-by: Dartmouth College's Datalad project
2022-05-02 14:06:20 -04:00
Joey Hess
17b20a2450
Fix test failure on NFS when cleaning up gpg temp directory
Using removePathForcibly avoids concurrent removal problems.

The i386ancient build still uses an old version of ghc and directory that
do not include removePathForcibly though.

Sponsored-by: Dartmouth College's Datalad project
2022-04-19 13:33:33 -04:00
Joey Hess
150d73c268
fix quickcheck test on windows
prop_relPathDirToFileAbs_basics (TestableFilePath ":/") failed on
windows. The colon was filtered out after trying to make
the path relative, which only removed leading path separators.
So, ":/" changed to "/" which is not relative. Filtering out the colon
before hand avoids this problem.

Sponsored-by: Luke Shumaker on Patreon
2022-03-22 13:53:55 -04:00
Joey Hess
982eb7ed0d
remove vendored http-client-restricted
Removed vendored copy of http-client-restricted, and removed the
HttpClientRestricted build flag that avoided that dependency.

http-client-restricted is in Debian stable, and the i386ancient build also
uses it, so I think this vendored copy is no longer needed.

Sponsored-by: Noam Kremen on Patreon
2022-03-22 11:50:06 -04:00
Joey Hess
5b6518d4a3
fix build warning with old aeson 2022-03-07 13:29:24 -04:00
Joey Hess
cbd138e042
factor out Utility.Aeson.textKey 2022-03-02 18:24:06 -04:00
sternenseemann
ca596e7c54
allow building with aeson >= 2.0
In aeson 2.0, Text has been replaced by the Key type and HashMap by the
KeyMap interface. Accomodating this required adding some CPP in order to
still be able to compile with aeson < 2.0. The required changes were:

* Prevent Key from being re-exported by Utilities.Aeson, as it clashes
  with git-annex's own Key type.

* Fix up convertion from String/Text to Key (or Text in aeson 1.*) in a
  couple of places

* Import Data.Aeson.KeyMap instead of Data.HashMap.Strict, as they are
  mostly API-compatible. insertWith needs to be replaced by unionWith,
  however, as KeyMap lacks the former function.
2022-03-02 18:01:41 -04:00
Joey Hess
952664641a
turn of PackageImports in cabal file
This makes it easier to build eg benchmarks of individual modules.

May be that most of these PackageImports are not really necessary,
dunno.
2022-02-25 13:16:36 -04:00
Joey Hess
61b48b69ba
fix build on windows 2021-12-09 13:39:16 -04:00
Joey Hess
bba74a2b84
improve comments 2021-12-08 18:59:22 -04:00
Joey Hess
ef3ab0769e
close pid lock only once no threads use it
This fixes a FD leak when annex.pidlock is set and -J is used. Also, it
fixes bugs where the pid lock file got deleted because one thread was
done with it, while another thread was still holding it open.

The LockPool now has two distinct types of resources,
one is per-LockHandle and is used for file Handles, which get closed
when the associated LockHandle is closed. The other one is per lock
file, and gets closed when no more LockHandles use that lock file,
including other shared locks of the same file.

That latter kind is used for the pid lock file, so it's opened by the
first thread to use a lock, and closed when the last thread closes a lock.

In practice, this means that eg git-annex get of several files opens and
closes the pidlock file a few times per file. While with -J5 it will open
the pidlock file, process a number of files, until all the threads happen to
finish together, at which point the pidlock file gets closed, and then
that repeats. So in either case, another process still gets a chance to
take the pidlock.

registerPostRelease has a rather intricate dance, there are fine-grained
STM locks, a STM lock of the pidfile itself, and the actual pidlock file
on disk that are all resolved in stages by it.

Sponsored-by: Dartmouth College's Datalad project
2021-12-06 15:01:39 -04:00
Joey Hess
774c7dab2f
Merge branch 'master' into pidlockfinegrained 2021-12-06 13:00:40 -04:00
Joey Hess
ae4c56b28a
Revert "fix too early close of shared lock file"
This reverts commit 66b2536ea0.

I misunderstood commit ac56a5c2a0
and caused a FD leak when pid locking is not used.

A LockHandle contains an action that will close the underlying lock
file, and that action is run when it is closed. In the case of a shared
lock, the lock file is opened once for each LockHandle, and only
the one for the LockHandle that is being closed will be closed.
2021-12-06 12:51:28 -04:00
Joey Hess
e464ffd641
update comment to current status 2021-12-03 18:41:51 -04:00
Joey Hess
e5ca67ea1c
fine-grained locking when annex.pidlock is enabled
This locking has been missing from the beginning of annex.pidlock.
It used to be possble, when two threads are doing conflicting things,
for both to run at the same time despite using locking. Seems likely
that nothing actually had a problem, but it was possible, and this
eliminates that possible source of failure.

Sponsored-by: Dartmouth College's Datalad project
2021-12-03 17:20:21 -04:00
Joey Hess
6988c2e740
fix build on windows
broken by ed0afbc36b

Sponsored-by: Dartmouth College's Datalad project
2021-12-03 14:08:12 -04:00
Joey Hess
ed0afbc36b
avoid concurrent threads trying to take pid lock at same time
Seem there are several races that happen when 2 threads run PidLock.tryLock
at the same time. One involves checkSaneLock of the side lock file, which may
be deleted by another process that is dropping the lock, causing checkSaneLock
to fail. And even with the deletion disabled, it can still fail, Probably due
to linkToLock failing when a second thread overwrites the lock file.

The same can happen when 2 processes do, but then one process just fails
to take the lock, which is fine. But with 2 threads, some actions where failing
even though the process as a whole had the pid lock held.

Utility.LockPool.PidLock already maintains a STM lock, and since it uses
LockShared, 2 threads can hold the pidlock at the same time, and when
the first thread drops the lock, it will remain held by the second
thread, and so the pid lock file should not get deleted until the last
thread to hold it drops the lock. Which is the right behavior, and why a
LockShared STM lock is used in the first place.

The problem is that each time it takes the STM lock, it then also calls
PidLock.tryLock. So that was getting called repeatedly and concurrently.

Fixed by noticing when the shared lock is already held, and stop calling
PidLock.tryLock again, just use the pid lock that already exists then.

Also, LockFile.PidLock.tryLock was deleting the pid lock when it failed
to take the lock, which was entirely wrong. It should only drop the side
lock.

Sponsored-by: Dartmouth College's Datalad project
2021-12-01 17:14:39 -04:00
Joey Hess
66b2536ea0
fix too early close of shared lock file
This fixes a reversion introduced in commit
ac56a5c2a0.

I didn't notice there that it was handling the case of a shared lock
file that was still open elsewhere by not running the close action.

This was especially deadly when annex.pidlock is set, as it caused early
deletion of the pid lock file.

Sponsored-by: Dartmouth College's Datalad project
2021-12-01 17:06:28 -04:00
Joey Hess
a6699be79d
catch error statting pid lock file if it somehow does not exist
It ought to exist, since linkToLock has just created it. However,
Lustre seems to have a rather probabilisitic view of the contents of a
directory, so catching the error if it somehow does not exist and
running the same code path that would be ran if linkToLock failed
might avoid this fun Lustre failure.

Sponsored-by: Dartmouth College's Datalad project
2021-11-29 14:53:07 -04:00
Joey Hess
f3326b8b5a
git-lfs gitlab interoperability fix
git-lfs: Fix interoperability with gitlab's implementation of the git-lfs
protocol, which requests Content-Encoding chunked.

Sponsored-by: Dartmouth College's Datalad project
2021-11-10 13:51:11 -04:00