Commit graph

43958 commits

Author SHA1 Message Date
Joey Hess
dc02236c85
git-annex log --sizes
CSV format so it can be fed into a program to graph it.

Note that dead repositories are not yet handled so their sizes show as
nonzero after they are marked dead.

Sponsored-By: k0ld on Patreon
2023-11-13 13:07:22 -04:00
Joey Hess
385ddc3225
Merge branch 'master' of ssh://git-annex.branchable.com 2023-11-11 14:43:13 -04:00
Lukey
7df85809c8 Added a comment 2023-11-11 16:13:28 +00:00
Joey Hess
6203b8afba
the last line is for the current time 2023-11-10 17:37:55 -04:00
Joey Hess
574514545c
git-annex log --sizesof
This can take a lot of memory. I decided to violate the usual rule in
git-annex that it operate in constant memory no matter how many annexed
objects. In this case, it would be hard to be fast without using a big
map of the location logs. The main difficulty here is that there can be
many git-annex branches and it needs to display a consistent view at a
point in time, which means merging information from multiple git-annex
branches.

I have not checked if there are any laziness leaks in this code. It
takes 1 gb to run in my big repo, which is around what I estimated
before writing it.

2 options that are documented are not yet implemented.

Small bug: With eg --when=1h, it will display at 12:00 then 1:10 if the
next change after 12:59 is then. Then it waits until after 2:10 to
display the next change. It ought to wait until after 2:00.

Sponsored-by: Brock Spratlen on Patreon
2023-11-10 17:26:10 -04:00
Joey Hess
561c036664
split out generic git log parser
Sponsored-By: Jack Hill on Patreon
2023-11-10 15:40:03 -04:00
Joey Hess
ae401fae14
Merge branch 'master' of ssh://git-annex.branchable.com 2023-11-08 14:14:41 -04:00
Joey Hess
6a8672d756
todo 2023-11-08 14:14:35 -04:00
Joey Hess
11cc9f1933
info: Added calculation of combined annex size of all repositories
Factored out overLocationLogs from CmdLine.Seek, which can calculate this
pretty fast even in a large repo. In my big repo, the time to run git-annex
info went up from 1.33s to 8.5s.

Note that the "backend usage" stats are for annexed files in the working
tree only, not all annexed files. This new data source would let that be
changed, but that would be a confusing behavior change. And I cannot
retitle it either, out of fear something uses the current title (eg parsing
the json).

Also note that, while time says "402108maxresident" in my big repo now,
up from "54092maxresident", top shows the RES constant at 64mb, and it
was 48mb before. So I don't think there is a memory leak. I tried using
deepseq to force full evaluation of addKeyCopies and memory use didn't
change, which also says no memory leak. And indeed, not even calling
addKeyCopies resulted in the same memory use. Probably the increased memory
usage is buffering the stream of data from git in overLocationLogs.

Sponsored-by: Brett Eisenberg on Patreon
2023-11-08 13:35:11 -04:00
Joey Hess
8768966d97
improve comments 2023-11-08 12:06:03 -04:00
jcjgraf
215c96be95 2023-11-07 20:42:11 +00:00
mih
4d242b88eb Added a comment: What about temporary annex.private declaration? 2023-11-07 15:49:47 +00:00
oadams
9ef6257c2f Added a comment 2023-11-07 08:30:21 +00:00
oadams
f8340fc921 2023-11-07 08:26:44 +00:00
Joey Hess
323e4f5a2f
expand description 2023-11-06 11:11:50 -04:00
aurelien@f0d0a0c7da69eff6badf0464898f0a859f69114d
eb846403c9 2023-11-06 01:48:43 +00:00
unqueued
184ebb8802 Added a comment 2023-11-05 21:32:17 +00:00
nobodyinperson
b0b54b678b report 2023-11-05 13:58:25 +00:00
Joey Hess
4e35067325
windows hook scripts newlines without CR
Windows: When git-annex init is installing hook scripts, it will
avoid ending lines with CR for portability.

Existing hook scripts that do have CR line endings will not be changed.
While it would be possible to have git-annex init upgrade them, users would
need to know to use that command to do that, and it would add complexity
that does not seem warranted for the portability benefit alone.

Sponsored-by: Luke T. Shumaker on Patreon
2023-11-02 13:37:04 -04:00
Joey Hess
c41ca6c832
convert StorableCipher to ByteString
This allows getting rid of the ugly and error prone handling of
"bag of bytes" String in Remote.Helper.Encryptable.
Avoiding breakage like that dealt with by commit
9862d64bf9

And allows converting Utility.Gpg to use ByteString for IO, which is
a welcome change.

Tested the new git-annex interoperability with old, using all 3
encryption= types.

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-11-01 14:39:49 -04:00
Joey Hess
be6b56df4c
remove unused import 2023-11-01 13:14:39 -04:00
Joey Hess
9862d64bf9
bring back "bag of bytes" handling for ciphers
Fixes test suite failure with LANG=C caused by commit
3742263c99

Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
2023-11-01 13:09:42 -04:00
Joey Hess
c927a38dc5
reproduced 2023-11-01 12:19:50 -04:00
Joey Hess
b4c052a4bf
Merge branch 'master' of ssh://git-annex.branchable.com 2023-11-01 12:03:55 -04:00
Joey Hess
3d85489603
update 2023-11-01 08:43:32 -04:00
Joey Hess
caf1652796
update 2023-11-01 08:42:15 -04:00
Joey Hess
1ec3c3e541
update 2023-10-31 14:06:46 -04:00
yarikoptic
99ec795c6a initial bug report 2023-10-30 22:03:20 +00:00
jkniiv
14c38e8a25 Added a comment 2023-10-30 21:32:07 +00:00
Joey Hess
f8d35d9480
lookupkey: Sped up --batch
When the file is relative, it does not need to be passed
through git lsfiles to normalize it.

Sponsored-by: Kevin Mueller on Patreon
2023-10-30 14:59:09 -04:00
Joey Hess
8a540138b6
Merge branch 'master' of ssh://git-annex.branchable.com 2023-10-30 14:45:58 -04:00
Joey Hess
f905bd1508
comment 2023-10-30 14:45:52 -04:00
Joey Hess
39ca30e004
Windows: Consistently avoid ending output lines with CR
This matches the behavior of git on Windows, which does not end lines with
CR either.

Previously, git-annex used to always write lines with putStrLn, so would
output CR on Windows. Then parts of it changed to use ByteString.putStrLn,
which does not output CR. That left its output inconsistent, sometimes
within the same command.

The point of this commit is to get back to consistency. Having the same
behavior as git is a nice bonus. It would be much harder to make it
consistently output CR, because every place it uses ByteString.putStrLn or
similar would need to be changed.

Sponsored-by: Nicholas Golder-Manning on Patreon
2023-10-30 14:43:43 -04:00
Joey Hess
eb42935e58
Windows: Fix CRLF handling in some log files
In particular, the mergedrefs file was written with CR added to each line,
but read without CRLF handling. This resulted in each update of the file
adding CR to each line in it, growing the number of lines, while also
preventing the optimisation from working, so it remerged unncessarily.

writeFile and readFile do NewlineMode translation on Windows. But the
ByteString conversion prevented that from happening any longer.

I've audited for other cases of this, and found three more
(.git/annex/index.lck, .git/annex/ignoredrefs, and .git/annex/import/). All
of those also only prevent optimisations from working. Some other files are
currently both read and written with ByteString, but old git-annex may have
written them with NewlineMode translation. Other files are at risk for
breakage later if the reader gets converted to ByteString.

This is a minimal fix, but should be enough, as long as I remember to use
fileLines when splitting a ByteString into lines. This leaves files written
using ByteString without CR added, but that's ok because old git-annex has
no difficulty reading such files.

When the mergedrefs file has gotten lines that end with "\r\r\r\n", this
will eventually clean it up. Each update will remove a single trailing CR.

Note that S8.lines is still used in eg Command.Unused, where it is parsing
git show-ref, and similar in Git/*. git commands don't include CR in their
output so that's ok.

Sponsored-by: Joshua Antonishen on Patreon
2023-10-30 14:23:23 -04:00
Joey Hess
ea2876ae77
add PackageImports
This makes loading it in ghci work when both crypton and cryptonite are
installed.
2023-10-30 14:10:46 -04:00
mih
2299ee2519 Report on a possibly unknown slowness of lookupkey 2023-10-30 07:45:05 +00:00
Joey Hess
bef1d68d13
Merge branch 'master' of ssh://git-annex.branchable.com 2023-10-27 17:47:50 -04:00
Joey Hess
a09d2954c5
remove wrongly nested comment
This happened to prevent checkout on windows, due to filename too long.
2023-10-27 17:47:23 -04:00
jkniiv
d0b381bd42 Added a comment 2023-10-27 18:45:03 +00:00
Joey Hess
73c02e4fe3
comment 2023-10-26 15:22:22 -04:00
Joey Hess
3330d80ae2
Merge branch 'master' of ssh://git-annex.branchable.com 2023-10-26 14:23:08 -04:00
Joey Hess
8f325d67ae
display summary after concurrent output is complete
Otherwise, it could have buffered output that was displayed after the
summary. I started seeing that on my 12 core laptop, probably due to the
increased amount of concurrency.

Also seems possible, that, since the summary did exit, it might have
prevented displaying buffered output sometimes, or had other bad
results.
2023-10-26 14:19:41 -04:00
Joey Hess
7776d03355
avoid unused import 2023-10-26 14:00:21 -04:00
Joey Hess
0f3b78ec29
simplify 2023-10-26 14:00:02 -04:00
Joey Hess
e4ef688b98
RawFilePath conversion
May slightly speed up startup.
2023-10-26 13:58:49 -04:00
nobodyinperson
af6ecc9be5 Added a comment 2023-10-26 17:46:28 +00:00
Joey Hess
b0302c937d
avoid double RawFilePath conversion
Minor optimisation.
2023-10-26 13:41:17 -04:00
Joey Hess
d9fd205cbb
push RawFilePath down into Annex.ReplaceFile
Minor optimisation, but a win in every case, except for a couple where
it's a wash.

Note that replaceFile still takes a FilePath, because it needs to
operate on Chars to truncate unicode filenames properly.
2023-10-26 13:36:49 -04:00
Joey Hess
c873586e14
eliminate s2w8 and w82s
Note that the use of s2w8 in genUUIDInNameSpace made it truncate unicode
characters. Luckily, genUUIDInNameSpace is only ever used on ASCII
strings as far as I can determine. In particular, git-remote-gcrypt's
gcrypt-id is an ASCII string.
2023-10-26 13:12:57 -04:00
Joey Hess
3742263c99
simplify base64 to only use ByteString
Note the use of fromString and toString from Data.ByteString.UTF8 dated
back to commit 9b93278e8a. Back then it
was using the dataenc package for base64, which operated on Word8 and
String. But with the switch to sandi, it uses ByteString, and indeed
fromB64' and toB64' were already using ByteString without that
complication. So I think there is no risk of such an encoding related
breakage.

I also tested the case that 9b93278e8a
fixed:

	git-annex metadata -s foo='a …' x
	git-annex metadata x
	metadata x
	  foo=a …

In Remote.Helper.Encryptable, it was avoiding using Utility.Base64
because of that UTF8 conversion. Since that's no longer done, it can
just use it now.
2023-10-26 13:10:05 -04:00