Commit graph

4452 commits

Author SHA1 Message Date
Joey Hess
1bf02029f9
small problem 2024-03-05 13:45:31 -04:00
Joey Hess
3874b7364f
add todo for tracking free space in repos via git-annex branch
For balanced preferred content perhaps, or just for git-annex info
display.

Sponsored-by: unqueued on Patreon
2024-03-05 13:16:42 -04:00
Joey Hess
a6a7b8320a
Merge branch 'master' of ssh://git-annex.branchable.com 2024-03-01 16:53:13 -04:00
Joey Hess
e7652b0997
implement URL to VURL migration
This needs the content to be present in order to hash it. But it's not
possible for a module used by Backend.URL to call inAnnex because that
would entail a dependency loop. So instead, rely on the fact that
Command.Migrate calls inAnnex before performing a migration.

But, Command.ExamineKey calls fastMigrate and the key may or may not
exist, and it's not wanting to actually perform a migration in any case.
To handle that, had to add an additional value to fastMigrate to
indicate whether the content is inAnnex.

Factored generateEquivilantKey out of Remote.Web.

Note that migrateFromURLToVURL hardcodes use of the SHA256E backend.
It would have been difficult not to, given all the dependency loop
issues. But --backend and annex.backend are used to tell git-annex
migrate to use VURL in any case, so there's no config knob that
the user could expect to configure that.

Sponsored-by: Brock Spratlen on Patreon
2024-03-01 16:42:02 -04:00
Joey Hess
cb50cdcc58
todo 2024-03-01 15:14:45 -04:00
Joey Hess
def94fbff6
update 2024-03-01 13:48:51 -04:00
Joey Hess
1b0de3021a
avoid double checksum when downloading VURL from web for 1st time
Sponsored-by: Jack Hill on Patreon
2024-03-01 13:44:40 -04:00
Joey Hess
4046f17ca0
incremental verification for VURL
Sponsored-by: Brett Eisenberg on Patreon
2024-03-01 13:33:29 -04:00
yarikoptic
283e071bcb has potential in DANDI project 2024-02-29 23:31:05 +00:00
Joey Hess
62e4c9d3b8
add future todo 2024-02-29 17:52:58 -04:00
Joey Hess
c72df19784
verifyKeyContent for VURL
VURL is now fully working, though needs more testing.

Still need to implement verifyKeyContentIncrementally but it works
without it.

Sponsored-by: Luke T. Shumaker on Patreon
2024-02-29 17:44:21 -04:00
Joey Hess
cc17ac423b
implement isCryptographicallySecureKey for VURL
Considerable difficulty to work around an import cycle. Had to move the
list of backends (except for VURL) to Backend.Variety to VURL could use
it.

Sponsored-by: Kevin Mueller on Patreon
2024-02-29 17:26:35 -04:00
Joey Hess
0f7143d226
support VURL backend
Not yet implemented is recording hashes on download from web and
verifying hashes.

addurl --verifiable option added with -V short option because I
expect a lot of people will want to use this.

It seems likely that --verifiable will become the default eventually,
and possibly rather soon. While old git-annex versions don't support
VURL, that doesn't prevent using them with keys that use VURL. Of
course, they won't verify the content on transfer, and fsck will warn
that it doesn't know about VURL. So there's not much problem with
starting to use VURL even when interoperating with old versions.

Sponsored-by: Joshua Antonishen on Patreon
2024-02-29 13:48:51 -04:00
Joey Hess
8f40e0269b
comment 2024-02-27 13:36:07 -04:00
Joey Hess
90b7c2d93c
fix link 2024-02-27 13:20:24 -04:00
Joey Hess
df7230deac
comment and todo 2024-02-27 12:44:34 -04:00
Joey Hess
f7be26d4e3
close 2024-02-22 12:50:12 -04:00
Joey Hess
2b3740e7b4
comment 2024-02-22 12:45:57 -04:00
Joey Hess
891a0076a6
move misplaced bug or todo to a better place 2024-02-22 11:21:39 -04:00
Joey Hess
3475b09c3e
pre-commit: Avoid committing the git-annex branch
Except when a commit is made in a view, which changes metadata.

Make the assistant commit the git-annex branch after git commit of working
tree changes.

This allows using the annex.commitmessage-command in the assistant to
generate a commit message for the git-annex branch that relies on state
gathered during the commit of the working tree. Eg, it might reuse the
commit message.

Note that, when not using the assistant, a git-annex add still commits
the git-annex branch, so such a annex.commitmessage-command set up would
not work then. But if someone is using the assistant and wants
programmatic control over commit messages, this is useful. Someone not
using the assistant can get the same result by using annex.alwayscommit=false
during the git-annex add, and git-annex merge after they git commit.

pre-commit was never really intended to commit the git-annex branch
(except after recording changed metadata), but the assistant did sort of
rely on it. It does later commit the git-annex branch before pushing to
remotes, but I didn't want to risk building up lots of uncommitted changes
to it if that didn't happen frequently.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-12 14:42:11 -04:00
Joey Hess
68e99513f0
added annex.commitmessage-command config
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-12 14:35:22 -04:00
Joey Hess
dd0e45c86e
update 2024-02-10 11:24:32 -04:00
Joey Hess
6c5aaa2b0f
design document for verified relaxed urls
Sponsored-by: Graham Spencer on Patreon
2024-02-10 10:48:20 -04:00
Joey Hess
aba87a6e92
close as distributed migration meets this use case 2024-02-10 10:24:58 -04:00
Joey Hess
bca66acfd8
comment 2024-02-09 14:09:49 -04:00
yarikoptic
2a4d7d894b Added a comment 2024-02-09 13:50:21 +00:00
yarikoptic
517914c0f6 Added a comment 2024-02-08 22:08:41 +00:00
Joey Hess
c4943ae277
close 2024-02-07 16:24:39 -04:00
Joey Hess
644317e86f
Merge branch 'master' of ssh://git-annex.branchable.com 2024-02-07 16:21:31 -04:00
Joey Hess
21123ba368
assistant, undo: When committing, let the usual git commit hooks run
Was doing a Git.Branch.commit for historical reasons to do with direct
mode, which no longer apply.

Note that the preCommitAnnexHook is no longer called in commitStaged
because git-annex installs a pre-commit hook that runs the pre-commit-annex
hook. And git commit will run the pre-commit hook.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-02-07 16:15:35 -04:00
jstritch
d8df420b18 Added a comment 2024-02-07 15:52:18 +00:00
jstritch
e0488d74ff Added a comment 2024-02-06 16:10:56 +00:00
Joey Hess
48fe8ba23c
forgot to add this comment earlier 2024-02-05 15:49:32 -04:00
Joey Hess
792106abc3
improve special remote docs
Sponsored-by: Dartmouth College's DANDI project
2024-02-05 15:48:15 -04:00
Joey Hess
083c471ee9
really close 2024-02-05 15:20:40 -04:00
Joey Hess
6b38d0c427
addurl, importfeed: Added --raw-except option
--raw-except=web allows using yt-dlp but not any other special remotes.

Currently this option can only be used once, trying to use it repeatedly
will make option parsing fail. Perhaps it ought to support being used more
than once, but it seemed like an unlikely use case to need that.

Note that getParsed is called repeatedly when the option is used with
several urls. While implementing DeferredParseClass would avoid that
innefficiency, it didn't seem worth the added boilerplate since
getParsed only calls byNameWithUUID which does minimal work.

Sponsored-by: Dartmouth College's DANDI project
2024-02-05 15:16:25 -04:00
Joey Hess
d7419d6e65
comment 2024-02-05 14:06:03 -04:00
jstritch
593186d461 2024-02-03 18:48:36 +00:00
Joey Hess
fc32632774
followup 2024-02-02 14:32:12 -04:00
yarikoptic
5621922e08 the plea to respect prepare-commit-msg hook 2024-02-02 17:07:52 +00:00
yarikoptic
f36ba992ea todo for support of pushinsteadof 2024-02-02 16:27:20 +00:00
yarikoptic
0fd17ceeb5 Initial desire for addurl --raw-except 2024-01-31 20:35:03 +00:00
Joey Hess
3b22e61007
followup and close 2024-01-30 15:51:33 -04:00
yarikoptic
0621711c6f ask for better documentation. 2024-01-30 16:20:45 +00:00
yarikoptic
58b57ab999 error out if yt-dlp sees that video is/was there but not available 2024-01-29 22:03:47 +00:00
Joey Hess
8e9ee31621
webapp: Added --port option, and annex.port config
The getSocket comment that mentioned using ":port"
in the hostname seems to have been incorrect or be out of date.
After all, the bug report came when the user first tried doing that,
and it didn't work.

Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
2024-01-25 14:08:36 -04:00
Joey Hess
d54f2ccae1
close 2024-01-25 13:28:23 -04:00
Joey Hess
3a20208ce1
confirm this todo 2024-01-25 13:25:15 -04:00
Joey Hess
2a56476ca5
close 2024-01-25 13:16:25 -04:00
Joey Hess
1120ac8272
update 2024-01-25 13:15:13 -04:00
Joey Hess
7aee4ca7c1
nack 2024-01-25 13:10:45 -04:00
Joey Hess
8646183e38
nack 2024-01-25 13:05:52 -04:00
Joey Hess
991dfcb9b8
nack 2024-01-25 13:04:35 -04:00
Joey Hess
3109447120
close 2024-01-25 12:58:16 -04:00
Joey Hess
b9e147d282
Added --expected-present file matching option 2024-01-25 12:56:41 -04:00
Joey Hess
72d2dbde5e
comment 2024-01-23 12:55:44 -04:00
Joey Hess
3ca1e036ed
open todo 2024-01-18 13:11:28 -04:00
Joey Hess
dda4cb372c
update 2024-01-12 13:51:59 -04:00
Joey Hess
7e69063a29
support annex.shared-sop-command for encryption=shared
This works well, and it interoperates with gpg in my testing (although some
SOP commands might choose to use a profile that does not so caveat emptor).

Note that for creating the Cipher, gpg --gen-random is still used. SOP
does not have an eqivilant, and as long as the user has gpg around,
which seems likely, it doesn't matter that it uses gpg here, it's not being
used for encryption. That seemed better than implementing a second way
to get high quality entropy, at least for now.

The need for the sop command to run in an empty directory has each call
to encrypt and decrypt creating a new temporary directory. That is some
unncessary overhead, though probably swamped by the overhead of running
the sop command. This could be improved in the future by passing an
already empty directory to them, or a sufficiently empty directory
(.git/annex/tmp would probably suffice).

Sponsored-by: Brett Eisenberg on Patreon
2024-01-12 13:31:18 -04:00
Joey Hess
654f3b7e06
comments 2024-01-09 17:04:17 -04:00
Joey Hess
a496c05995
update 2024-01-09 17:04:10 -04:00
Joey Hess
db5fa267c7
sop 2024-01-09 16:57:11 -04:00
Joey Hess
2c86651180
optimise adjustTree when adding many TreeItems
The old code traversed the list of addtreeitems once per subdirectory in
the tree, so could get quite slow. Converting to Map lookups sped it up
significantly.

In my test case, git-annex import used to take about 2 minutes, when
calling adjustTree to add back excluded files to the imported tree. This
dropped it down to 6 seconds. Of which 4 seconds are the actual
enumeration of the contents of the remote, so really only 2 seconds for
this.

The path prefix map is a bit suboptimal memory-wise, since items get
stored in the map once per subdirectory on the path to the item. It
would perhaps be better to use a tree data structure.

Also it's suboptimal memory-wise that it builds two maps, as well
as retaining a reference to addtreeitems. I could not see a way around
that though.

Sponsored-by: Luke T. Shumaker on Patreon
2024-01-03 15:07:49 -04:00
Joey Hess
a6a67f79e7
todo 2024-01-02 17:00:41 -04:00
Atemu
86d3e8d31a Added a comment 2023-12-29 17:06:37 +00:00
Joey Hess
a4a5ec6366
info: Added "annex sizes of repositories" table to the overall display
Thanks to previous work in 11cc9f1933,
this is almost entirely free, it only needs to do some additional map
lookups and math.

The strictness annotations keep the memory use from blowing up.

Sponsored-by: unqueued on Patreon
2023-12-29 12:09:30 -04:00
Joey Hess
e7a550a25b
plan 2023-12-29 10:48:12 -04:00
Joey Hess
49b50dd466
todo 2023-12-29 10:36:11 -04:00
Atemu
f58d629b95 Added a comment 2023-12-25 13:37:58 +00:00
Joey Hess
9a67ed0f10
importtree: support preferred content expressions needing keys
When importing from a special remote, support preferred content expressions
that use terms that match on keys (eg "present", "copies=1"). Such terms
are ignored when importing, since the key is not known yet.

When "standard" or "groupwanted" is used, the terms in those
expressions also get pruned accordingly.

This does allow setting preferred content to "not (copies=1)" to make a
special remote into a "source" type of repository. Importing from it will
import all files. Then exporting to it will drop all files from it.

In the case of setting preferred content to "present", it's pruned on
import, so everything gets imported from it. Then on export, it's applied,
and everything in it is left on it, and no new content is exported to it.

Since the old behavior on these preferred content expressions was for
importtree to error out, there's no backwards compatability to worry about.
Except that sync/pull/etc will now import where before it errored out.
2023-12-18 16:27:59 -04:00
Joey Hess
362a2808a5
split out todo for special remotes and close the main todo 2023-12-08 14:26:08 -04:00
Joey Hess
0bd8b17b59
log migration trees to git-annex branch
This will allow distributed migration: Start a migration in one clone of
a repo, and then update other clones.

commitMigration is a bit of a bear.. There is some inversion of control
that needs some TMVars. Also streamLogFile's finalizer does not handle
recording the trees, so an interrupt at just the wrong time can cause
migration.log to be emptied but the git-annex branch not updated.

Sponsored-by: Graham Spencer on Patreon
2023-12-06 15:40:03 -04:00
Joey Hess
10964f91bc
further thoughts 2023-12-05 15:00:22 -04:00
Joey Hess
edf31a2ebc
update 2023-12-01 15:01:45 -04:00
Joey Hess
5c4ce1353e
comment 2023-12-01 14:42:55 -04:00
Joey Hess
1d020df896
git-annex branch size when storing migration information
Sponsored-by: Jack Hill on Patreon
2023-12-01 13:09:52 -04:00
Joey Hess
3e8618fed3
comment 2023-11-30 16:49:48 -04:00
NewUser
3a4883cabb Added a comment: Is annex.tune.objecthashlower=true recommended for interop with windows? 2023-11-20 04:24:35 +00:00
Joey Hess
1ddec09f7c
close 2023-11-13 17:45:37 -04:00
Joey Hess
6a8672d756
todo 2023-11-08 14:14:35 -04:00
Joey Hess
1ec3c3e541
update 2023-10-31 14:06:46 -04:00
nobodyinperson
af6ecc9be5 Added a comment 2023-10-26 17:46:28 +00:00
Joey Hess
985dd38847
add 2023-10-25 14:44:57 -04:00
Joey Hess
626622da1b
comment 2023-10-25 14:07:16 -04:00
Joey Hess
97403a4b4b
comment 2023-10-25 13:30:19 -04:00
Joey Hess
9a1e8fbabc
Merge branch 'master' of ssh://git-annex.branchable.com 2023-10-25 13:21:12 -04:00
nobodyinperson
1d1864ee5e Brainstorm (semi)automatic description updating 2023-10-25 11:27:17 +00:00
Joey Hess
aaeadc422a
comment 2023-10-24 13:54:31 -04:00
Joey Hess
0da1d40cd4
Improve memory use of --all when using annex.private
This does not improve Annex.Branch.files at all, since it still uses ++ to
combine the lists, so forcing all but the last one.

But when there are a lot of files in the private journal, it does avoid
--all (or a bare repo) from buffering the filenames in memory.

See commit 653b719472 for prior discussion of
this buffering.

Sponsored-by: Graham Spencer on Patreon
2023-10-24 13:20:55 -04:00
Joey Hess
8bde6101e3
sqlite datbase for importfeed
importfeed: Use caching database to avoid needing to list urls on every
run, and avoid using too much memory.

Benchmarking in my podcasts repo, importfeed got 1.42 seconds faster,
and memory use dropped from 203000k to 59408k.

Database.ImportFeed is Database.ContentIdentifier with the serial number
filed off. There is a bit of code duplication I would like to avoid,
particularly recordAnnexBranchTree, and getAnnexBranchTree. But these use
the persistent sqlite tables, so despite the code being the same, they
cannot be factored out.

Since this database includes the contentidentifier metadata, it will be
slightly redundant if a sqlite database is ever added for metadata. I
did consider making such a generic database and using it for this. But,
that would then need importfeed to update both the url database and the
metadata database, which is twice as much work diffing the git-annex
branch trees. Or would entagle updating two databases in a complex way.
So instead it seems better to optimise the database that
importfeed needs, and if the metadata database is used by another command,
use a little more disk space and do a little bit of redundant work to
update it.

Sponsored-by: unqueued on Patreon
2023-10-23 16:46:22 -04:00
Joey Hess
892d87efa4
comment 2023-10-14 14:33:38 -04:00
Joey Hess
4ec1694f89
comment 2023-10-09 14:47:19 -04:00
Atemu
44a7b4c973 2023-10-01 09:38:29 +00:00
Joey Hess
bda0db6f65
todo 2023-09-14 20:29:12 -04:00
anarcat
22bf65b875 Added a comment: just show start time? 2023-09-12 15:51:22 +00:00
Joey Hess
32cb2bd3fa
Fix linker optimisation in linux standalone tarballs
Was only symlinking when there is a usr/ directory, but with usr/ merge,
there are none.

Sponsored-by: Dartmouth College's Datalad project
2023-09-07 12:59:27 -04:00
Joey Hess
9563830529
tag datalad 2023-09-07 12:57:38 -04:00
yarikoptic
70e766c95b Added a comment 2023-08-31 16:48:34 +00:00
yarikoptic
07abfc3075 Added a comment 2023-08-31 13:57:21 +00:00
yarikoptic
c6f6b993bc reporting on increased number of looksup 2023-08-31 13:54:20 +00:00