Commit graph

4452 commits

Author SHA1 Message Date
Joey Hess
7aee4ca7c1
nack 2024-01-25 13:10:45 -04:00
Joey Hess
8646183e38
nack 2024-01-25 13:05:52 -04:00
Joey Hess
991dfcb9b8
nack 2024-01-25 13:04:35 -04:00
Joey Hess
3109447120
close 2024-01-25 12:58:16 -04:00
Joey Hess
b9e147d282
Added --expected-present file matching option 2024-01-25 12:56:41 -04:00
Joey Hess
72d2dbde5e
comment 2024-01-23 12:55:44 -04:00
Joey Hess
3ca1e036ed
open todo 2024-01-18 13:11:28 -04:00
Joey Hess
dda4cb372c
update 2024-01-12 13:51:59 -04:00
Joey Hess
7e69063a29
support annex.shared-sop-command for encryption=shared
This works well, and it interoperates with gpg in my testing (although some
SOP commands might choose to use a profile that does not so caveat emptor).

Note that for creating the Cipher, gpg --gen-random is still used. SOP
does not have an equivalent, and as long as the user has gpg around,
which seems likely, it doesn't matter that it uses gpg here, it's not being
used for encryption. That seemed better than implementing a second way
to get high quality entropy, at least for now.

The need for the sop command to run in an empty directory has each call
to encrypt and decrypt creating a new temporary directory. That is some
unnecessary overhead, though probably swamped by the overhead of running
the sop command. This could be improved in the future by passing an
already empty directory to them, or a sufficiently empty directory
(.git/annex/tmp would probably suffice).

Sponsored-by: Brett Eisenberg on Patreon
2024-01-12 13:31:18 -04:00
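The commit message above (7e69063a29) says each encrypt and decrypt call creates a new temporary directory so the sop command runs in an empty one. A minimal Python sketch of that pattern follows. git-annex itself is written in Haskell, so this is not its code; `run_in_empty_dir` is a hypothetical helper, and the command is passed in as a parameter so that no sop flags are assumed.

```python
import subprocess
import sys
import tempfile


def run_in_empty_dir(command, stdin_data=b""):
    """Run `command` with a freshly created, empty working directory.

    Hypothetical helper: the directory is created for each call and
    removed afterwards, which is the per-call overhead the commit
    message mentions.
    """
    with tempfile.TemporaryDirectory() as empty_dir:
        result = subprocess.run(
            command,
            input=stdin_data,
            cwd=empty_dir,
            stdout=subprocess.PIPE,
            check=True,
        )
    return result.stdout


# Stand-in for a real sop invocation: print the working directory's contents.
listing = run_in_empty_dir(
    [sys.executable, "-c", "import os; print(os.listdir('.'))"]
)
```

Reusing one already-empty directory, as the message suggests for the future, would amount to passing that directory as `cwd` instead of creating a new one per call.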
Joey Hess
654f3b7e06
comments 2024-01-09 17:04:17 -04:00
Joey Hess
a496c05995
update 2024-01-09 17:04:10 -04:00
Joey Hess
db5fa267c7
sop 2024-01-09 16:57:11 -04:00
Joey Hess
2c86651180
optimise adjustTree when adding many TreeItems
The old code traversed the list of addtreeitems once per subdirectory in
the tree, so could get quite slow. Converting to Map lookups sped it up
significantly.

In my test case, git-annex import used to take about 2 minutes, when
calling adjustTree to add back excluded files to the imported tree. This
dropped it down to 6 seconds. Of which 4 seconds are the actual
enumeration of the contents of the remote, so really only 2 seconds for
this.

The path prefix map is a bit suboptimal memory-wise, since items get
stored in the map once per subdirectory on the path to the item. It
would perhaps be better to use a tree data structure.

Also it's suboptimal memory-wise that it builds two maps, as well
as retaining a reference to addtreeitems. I could not see a way around
that though.

Sponsored-by: Luke T. Shumaker on Patreon
2024-01-03 15:07:49 -04:00
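The path prefix map described in the commit above (2c86651180) can be illustrated with a short Python sketch. This is not the Haskell implementation; `build_prefix_map` and the sample paths are hypothetical, and only one of the two maps the message mentions is shown.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def build_prefix_map(paths):
    """Map each directory to the items located anywhere beneath it.

    Each item is stored once per directory on its path, which is the
    memory cost noted in the commit message.
    """
    prefix_map = defaultdict(list)
    for path in paths:
        for parent in PurePosixPath(path).parents:
            prefix_map[str(parent)].append(path)
    return prefix_map


items = ["a/b/one.txt", "a/two.txt", "c/three.txt"]
prefix_map = build_prefix_map(items)

# One dictionary lookup per subdirectory replaces a scan of the whole list.
under_a = prefix_map["a"]
```

Here `a/b/one.txt` appears under `.`, `a`, and `a/b`, which shows why a tree structure might use less memory.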
Joey Hess
a6a67f79e7
todo 2024-01-02 17:00:41 -04:00
Atemu
86d3e8d31a Added a comment 2023-12-29 17:06:37 +00:00
Joey Hess
a4a5ec6366
info: Added "annex sizes of repositories" table to the overall display
Thanks to previous work in 11cc9f1933,
this is almost entirely free, it only needs to do some additional map
lookups and math.

The strictness annotations keep the memory use from blowing up.

Sponsored-by: unqueued on Patreon
2023-12-29 12:09:30 -04:00
Joey Hess
e7a550a25b
plan 2023-12-29 10:48:12 -04:00
Joey Hess
49b50dd466
todo 2023-12-29 10:36:11 -04:00
Atemu
f58d629b95 Added a comment 2023-12-25 13:37:58 +00:00
Joey Hess
9a67ed0f10
importtree: support preferred content expressions needing keys
When importing from a special remote, support preferred content expressions
that use terms that match on keys (eg "present", "copies=1"). Such terms
are ignored when importing, since the key is not known yet.

When "standard" or "groupwanted" is used, the terms in those
expressions also get pruned accordingly.

This does allow setting preferred content to "not (copies=1)" to make a
special remote into a "source" type of repository. Importing from it will
import all files. Then exporting to it will drop all files from it.

In the case of setting preferred content to "present", it's pruned on
import, so everything gets imported from it. Then on export, it's applied,
and everything in it is left on it, and no new content is exported to it.

Since the old behavior on these preferred content expressions was for
importtree to error out, there's no backwards compatibility to worry about.
Except that sync/pull/etc will now import where before it errored out.
2023-12-18 16:27:59 -04:00
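The pruning described in the commit above (9a67ed0f10) can be sketched in Python as follows. This is one way pruning could behave that is consistent with the two examples in the message ("not (copies=1)" and "present" both end up importing everything); the real implementation may differ. The tuple representation, `KEY_TERMS`, `prune`, and `matches` are all hypothetical names.

```python
KEY_TERMS = {"present", "copies"}  # assumption: terms that need a key


def prune(expr):
    """Drop terms that need a key; return None when nothing is left."""
    op = expr[0]
    if op == "term":
        name = expr[1]
        return None if name in KEY_TERMS else expr
    if op == "not":
        inner = prune(expr[1])
        return None if inner is None else ("not", inner)
    # "and" / "or": keep only the operands that survive pruning
    kept = [p for p in (prune(e) for e in expr[1:]) if p is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]
    return (op, *kept)


def matches(expr, filename):
    """Evaluate a pruned expression; an empty (None) expression matches all."""
    if expr is None:
        return True
    op = expr[0]
    if op == "term":
        return expr[2](filename)
    if op == "not":
        return not matches(expr[1], filename)
    results = [matches(e, filename) for e in expr[1:]]
    return all(results) if op == "and" else any(results)


# "not (copies=1)": the key term is pruned, so every file is imported.
source_only = ("not", ("term", "copies", None))
# "present": pruned on import, so every file is imported as well.
present_only = ("term", "present", None)
# A term that only needs the filename survives pruning.
mixed = ("and",
         ("term", "include", lambda f: f.endswith(".mp3")),
         ("term", "copies", None))
```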
Joey Hess
362a2808a5
split out todo for special remotes and close the main todo 2023-12-08 14:26:08 -04:00
Joey Hess
0bd8b17b59
log migration trees to git-annex branch
This will allow distributed migration: Start a migration in one clone of
a repo, and then update other clones.

commitMigration is a bit of a bear.. There is some inversion of control
that needs some TMVars. Also streamLogFile's finalizer does not handle
recording the trees, so an interrupt at just the wrong time can cause
migration.log to be emptied but the git-annex branch not updated.

Sponsored-by: Graham Spencer on Patreon
2023-12-06 15:40:03 -04:00
Joey Hess
10964f91bc
further thoughts 2023-12-05 15:00:22 -04:00
Joey Hess
edf31a2ebc
update 2023-12-01 15:01:45 -04:00
Joey Hess
5c4ce1353e
comment 2023-12-01 14:42:55 -04:00
Joey Hess
1d020df896
git-annex branch size when storing migration information
Sponsored-by: Jack Hill on Patreon
2023-12-01 13:09:52 -04:00
Joey Hess
3e8618fed3
comment 2023-11-30 16:49:48 -04:00
NewUser
3a4883cabb Added a comment: Is annex.tune.objecthashlower=true recommended for interop with windows? 2023-11-20 04:24:35 +00:00
Joey Hess
1ddec09f7c
close 2023-11-13 17:45:37 -04:00
Joey Hess
6a8672d756
todo 2023-11-08 14:14:35 -04:00
Joey Hess
1ec3c3e541
update 2023-10-31 14:06:46 -04:00
nobodyinperson
af6ecc9be5 Added a comment 2023-10-26 17:46:28 +00:00
Joey Hess
985dd38847
add 2023-10-25 14:44:57 -04:00
Joey Hess
626622da1b
comment 2023-10-25 14:07:16 -04:00
Joey Hess
97403a4b4b
comment 2023-10-25 13:30:19 -04:00
Joey Hess
9a1e8fbabc
Merge branch 'master' of ssh://git-annex.branchable.com 2023-10-25 13:21:12 -04:00
nobodyinperson
1d1864ee5e Brainstorm (semi)automatic description updating 2023-10-25 11:27:17 +00:00
Joey Hess
aaeadc422a
comment 2023-10-24 13:54:31 -04:00
Joey Hess
0da1d40cd4
Improve memory use of --all when using annex.private
This does not improve Annex.Branch.files at all, since it still uses ++ to
combine the lists, so forcing all but the last one.

But when there are a lot of files in the private journal, it does avoid
--all (or a bare repo) from buffering the filenames in memory.

See commit 653b719472 for prior discussion of
this buffering.

Sponsored-by: Graham Spencer on Patreon
2023-10-24 13:20:55 -04:00
Joey Hess
8bde6101e3
sqlite database for importfeed
importfeed: Use caching database to avoid needing to list urls on every
run, and avoid using too much memory.

Benchmarking in my podcasts repo, importfeed got 1.42 seconds faster,
and memory use dropped from 203000k to 59408k.

Database.ImportFeed is Database.ContentIdentifier with the serial number
filed off. There is a bit of code duplication I would like to avoid,
particularly recordAnnexBranchTree, and getAnnexBranchTree. But these use
the persistent sqlite tables, so despite the code being the same, they
cannot be factored out.

Since this database includes the contentidentifier metadata, it will be
slightly redundant if a sqlite database is ever added for metadata. I
did consider making such a generic database and using it for this. But,
that would then need importfeed to update both the url database and the
metadata database, which is twice as much work diffing the git-annex
branch trees. Or would entangle updating two databases in a complex way.
So instead it seems better to optimise the database that
importfeed needs, and if the metadata database is used by another command,
use a little more disk space and do a little bit of redundant work to
update it.

Sponsored-by: unqueued on Patreon
2023-10-23 16:46:22 -04:00
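The idea behind the caching database in the commit above (8bde6101e3) is to answer "has this url been seen?" with a query rather than by listing every url on each run. A minimal Python sketch under stated assumptions: the `known_urls` table and the three helper functions are hypothetical and do not reflect the schema of Database.ImportFeed.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the cache; the schema here is illustrative only."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS known_urls (url TEXT PRIMARY KEY)")
    return db


def record_url(db, url):
    """Remember a url; recording the same url twice is harmless."""
    db.execute("INSERT OR IGNORE INTO known_urls (url) VALUES (?)", (url,))
    db.commit()


def is_known(db, url):
    """Check one url without loading the full list into memory."""
    row = db.execute(
        "SELECT 1 FROM known_urls WHERE url = ? LIMIT 1", (url,)
    ).fetchone()
    return row is not None


cache = open_cache()
record_url(cache, "https://example.com/episode1.mp3")
seen = is_known(cache, "https://example.com/episode1.mp3")
unseen = is_known(cache, "https://example.com/episode2.mp3")
```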
Joey Hess
892d87efa4
comment 2023-10-14 14:33:38 -04:00
Joey Hess
4ec1694f89
comment 2023-10-09 14:47:19 -04:00
Atemu
44a7b4c973 2023-10-01 09:38:29 +00:00
Joey Hess
bda0db6f65
todo 2023-09-14 20:29:12 -04:00
anarcat
22bf65b875 Added a comment: just show start time? 2023-09-12 15:51:22 +00:00
Joey Hess
32cb2bd3fa
Fix linker optimisation in linux standalone tarballs
Was only symlinking when there is a usr/ directory, but with usr/ merge,
there are none.

Sponsored-by: Dartmouth College's Datalad project
2023-09-07 12:59:27 -04:00
Joey Hess
9563830529
tag datalad 2023-09-07 12:57:38 -04:00
yarikoptic
70e766c95b Added a comment 2023-08-31 16:48:34 +00:00
yarikoptic
07abfc3075 Added a comment 2023-08-31 13:57:21 +00:00
yarikoptic
c6f6b993bc reporting on increased number of lookups 2023-08-31 13:54:20 +00:00
Joey Hess
1e580a30be
comment (and a new example) 2023-08-22 15:10:04 -04:00
Joey Hess
47f92409f2
Merge branch 'master' of ssh://git-annex.branchable.com 2023-08-22 15:01:43 -04:00
Joey Hess
cf8b30c914
oldkeys: New command that lists the keys used by old versions of a file
The tricky thing about this turned out to be handling renames and reverts.
For that, it has to make two passes over the git log, and to avoid
buffering a possibly huge amount of logs in memory (ie the whole git log of
an entire repository!), runs git log twice.

(It might be possible to speed this up by asking git log to show a diff,
and so avoid needing to use catKey.)

Sponsored-By: Brock Spratlen on Patreon
2023-08-22 14:51:06 -04:00
nobodyinperson
1afa7dcf44 Added a comment 2023-08-22 17:57:45 +00:00
nobodyinperson
42683457d0 Added a comment: Oh yes please 🤩 2023-08-22 16:48:58 +00:00
Joey Hess
6115bced71
comment, todo 2023-08-22 12:38:00 -04:00
Joey Hess
d4ca85fd23
comment 2023-08-22 12:10:46 -04:00
Joey Hess
379d58b499
diffdriver: Added --get option
Removed the dontCheck repoExists, because running it in a repo that has not
been initialized yet would update location log with nouuid. And I guess
it's ok for it to only support running in git-annex repos.
2023-08-22 11:58:53 -04:00
Joey Hess
5126f6d002
comment 2023-08-16 13:16:49 -04:00
anarcat
4c7e2b167e 2023-08-16 14:40:17 +00:00
Joey Hess
d467c70ef7
change sync content transition plan and fine tune warning
Only display warning when git-annex sync (without --content or
--no-content) is used with repositories that have preferred content
configured.

Sponsored-by: Leon Schuermann on Patreon
2023-08-14 13:51:35 -04:00
nobodyinperson
1613f7caae Added a comment 2023-08-10 01:08:16 +00:00
anarcat
5d7eec4402 2023-08-10 00:37:15 +00:00
anarcat
6561f0049f 2023-08-10 00:36:56 +00:00
Joey Hess
a86521a396
Merge branch 'master' of ssh://git-annex.branchable.com 2023-08-09 12:43:56 -04:00
Joey Hess
3efad7f5f4
info: Added --dead-repositories option
I considered a more wide-ranging config option to make other commands
also show dead repositories. But it would be difficult to implement that
because Remote.keyLocations is used to get locations, filtering out dead
repos, and commands like get then try to use those locations. So a config
setting would make dead repos sometimes be acted on by commands.

Sponsored-by: unqueued on Patreon
2023-08-09 12:43:48 -04:00
nobodyinperson
09b894cc8f Added a comment 2023-08-09 05:31:26 +00:00
Joey Hess
27a9915a67
closing dup 2023-08-08 13:45:37 -04:00
Joey Hess
0c2e885796
fix link 2023-08-08 13:11:00 -04:00
Joey Hess
1e754dd10c
link 2023-08-08 13:08:56 -04:00
Joey Hess
fa92383993
onlyingroup
* Support "onlyingroup=" in preferred content expressions.
* Support --onlyingroup= matching option.

Sponsored-by: Jack Hill on Patreon
2023-07-31 14:43:58 -04:00
Joey Hess
a17ece1428
foo 2023-07-31 14:03:18 -04:00
Joey Hess
a5606f1c43
comment 2023-07-31 14:01:33 -04:00
Joey Hess
d2a87d4a1b
comment 2023-07-31 13:50:12 -04:00
Joey Hess
846384fc3a
--explain for numcopies checks
And closed the todo as completed.

Sponsored-by: Dartmouth College's DANDI project
2023-07-31 12:53:17 -04:00
nobodyinperson
2fb9a24b48 Added a comment: onlyingroup also useful for offline backup drives 2023-07-31 04:40:43 +00:00
nobodyinperson
33dac74dfa Suggest that git annex diffdriver fetches content to diff 2023-07-29 17:07:22 +00:00
aragilar
732532ba49 Added a comment 2023-07-28 02:47:42 +00:00
nobodyinperson
daa4dd8951 Added a comment: --explain is very helpful 👍 2023-07-27 15:22:47 +00:00
Joey Hess
588cda3833
Merge branch 'master' of ssh://git-annex.branchable.com 2023-07-26 15:44:47 -04:00
Joey Hess
499d014123
improve match explanations
Using == and != proved too hard to read, so went with [TRUE] and [FALSE]
after the term. I would kind of have liked to use emojis for a green
check and red X, but probably that is too fancy to be a good idea.

Make --explain output be inside [ ] with whitespace around them, to
avoid it ending with eg "[FALSE]]" and to make it easier to visually
find the start of it.

Sponsored-by: Dartmouth College's DANDI project
2023-07-26 15:37:03 -04:00
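The formatting choices in the commit above (499d014123) can be illustrated with a small Python sketch. The function `explain` and the sample terms are hypothetical, and the exact layout of real --explain output may differ; the sketch only shows the result marker following each term and the padded outer brackets.

```python
def explain(terms):
    """Render (term, result) pairs as a bracketed explanation string."""
    parts = [
        f"{term}[{'TRUE' if result else 'FALSE'}]" for term, result in terms
    ]
    # Whitespace inside the outer brackets keeps the line from ending
    # in "]]" and makes the start of the explanation easier to spot.
    return "[ " + " ".join(parts) + " ]"


line = explain([("include=*.mp3", True), ("copies=2", False)])
```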
nobodyinperson
e20c7ceb35 Added a comment: Workaround for listing dead remotes 2023-07-26 17:40:56 +00:00
aragilar
20a449db12 2023-07-26 12:56:21 +00:00
Joey Hess
7333104fd9
tag option_to_explain dandi
See comment, this is a continuation of the other todo that was tagged
dandi and which I didn't fully address at the time.
2023-07-25 11:08:33 -04:00
Joey Hess
7f38355860
dropunused: Support --jobs
Sponsored-by: Kevin Mueller on Patreon
2023-07-21 14:03:34 -04:00
nobodyinperson
d07cc31792 Suggest onlyingroup preferred content expression 2023-07-15 11:19:27 +00:00
nobodyinperson
0f105aaad4 Added a comment: git annex drop --unused --jobs=cpus works 2023-07-15 05:53:43 +00:00
agot
626f0f5123 2023-07-14 22:31:05 +00:00
Joey Hess
0df94132d9
add a warning and a related todo
arising from a conversation at FOSSY
2023-07-13 19:58:12 -04:00
nobodyinperson
f23993261b Added a comment: Clarification 2023-07-12 11:29:41 +00:00
ewen
e7f59c4ca4 Added a comment: Breaking change to "sync" 2023-07-12 10:21:04 +00:00
nobodyinperson
538d602c88 2023-07-11 20:39:34 +00:00
jstritch
c611b0a24b Added a comment 2023-07-11 18:56:40 +00:00
nobodyinperson
02fab1ca2a Added a comment 2023-07-06 05:52:50 +00:00
nobodyinperson
7bcb827cd3 Added a comment 2023-07-06 05:44:57 +00:00
Joey Hess
18faf9fb5b
comment 2023-07-05 17:39:26 -04:00
Joey Hess
522d032016
comment 2023-07-05 17:20:47 -04:00
Joey Hess
fc1fcc7491
Merge branch 'master' of ssh://git-annex.branchable.com 2023-07-05 16:55:10 -04:00
Joey Hess
af27da1eba
comment 2023-07-05 16:54:52 -04:00
nobodyinperson
9d305081c4 Added a comment 2023-07-05 20:46:13 +00:00
Joey Hess
3d810726af
diffdriver --text support options for diff
Sponsored-by: KDM on Patreon
2023-07-05 15:43:29 -04:00
Joey Hess
5aad0cea83
Merge branch 'master' of ssh://git-annex.branchable.com 2023-07-05 11:54:43 -04:00
nobodyinperson
d250a51ec3 2023-07-03 13:45:36 +00:00
nobodyinperson
381e7b9bce 2023-07-03 11:08:36 +00:00
nobodyinperson
aa46805061 Added a comment: Thanks for git annex satisfy, numcopies question 2023-06-30 05:37:06 +00:00
Joey Hess
18923cd0f1
idea 2023-06-29 17:00:32 -04:00
Joey Hess
06f7345558
update 2023-06-29 15:40:37 -04:00
Joey Hess
e1fc9e204e
added git-annex satisfy
This ended up having an interface like sync, rather than like get/copy/drop.
That let it be implemented in terms of sync, which took a lot less code.
Also, it lets it handle many of the edge cases that sync does, such as
getting files that are not visible in a --hide-missing branch, and sending
files to exporttree remotes.

As well as being easier to implement, `git-annex satisfy myremote` makes
sense as it satisfies the preferred content settings of the remote.
`git-annex satisfy somefile` does not form a sentence that makes sense. So
while -C can be a little bit annoying, it still makes sense to have this
syntax.

Note that, while I initially thought this would also satisfy numcopies, it
does not. Arguably it ought to. But, sync does not send files in order to
satisfy numcopies, it only sends files to satisfy preferred content. And
it's important that this transfer the same files as sync does, because
it will probably be used in a workflow where the user sometimes syncs and
sometimes satisfies, and does not expect satisfy to do things that sync
would not do.

(Also opened a new bug that also affects sync et al., not only this command.)

Sponsored-by: Nicholas Golder-Manning on Patreon
2023-06-29 15:34:53 -04:00
Joey Hess
946e52cbfd
idea 2023-06-29 13:59:00 -04:00
Joey Hess
4bf1690355
comments 2023-06-29 13:57:27 -04:00
Joey Hess
53ab91da2a
confirm 2023-06-29 13:44:52 -04:00
Joey Hess
42b381e4b2
tag todos potentially useful for datalad 2023-06-29 13:30:26 -04:00
nobodyinperson
0ccf436795 Added a comment: How about git annex transfer? 2023-06-29 05:25:50 +00:00
nobodyinperson
265afbd599 Added a comment 2023-06-29 05:20:56 +00:00
jkniiv
ac11b3ac0e for some reason I couldn't add a page comment so had to resort to an inline comment 2023-06-28 22:26:04 +00:00
nobodyinperson
720602b351 Added a comment: Awesome, thanks! 2023-06-28 21:09:51 +00:00
Joey Hess
d5c6197791
diffdriver: Added --text option for easy diffing of the contents of annexed text files
This was already possible, but it was rather hard to come up with the
complex shell command needed.

Note that the diff output starts with "diff a/... b/...".
I left off the "--git" because it's not a git format diff.
2023-06-28 15:27:16 -04:00
Joey Hess
0b84c850fa
response 2023-06-28 13:26:06 -04:00
Joey Hess
da429a609c
todo 2023-06-28 13:23:37 -04:00
Joey Hess
55fef4fb81
confirmed 2023-06-26 11:51:48 -04:00
Joey Hess
75f3045b0a
comment and confirm 2023-06-26 11:12:32 -04:00
nobodyinperson
9112f0c184 2023-06-26 14:08:32 +00:00
nobodyinperson
c7309b30e9 2023-06-26 11:43:20 +00:00
jkniiv
4c84f464a4 Added a comment 2023-06-24 22:23:09 +00:00
jkniiv
77443df1b2 add a plea for saving the old 'import /dir' interface as a renamed, reduced command 2023-06-24 22:04:35 +00:00
jkniiv
63f13d92cf add an inline comment about --fast vs. --no-content wrt. git-annex-import 2023-06-24 21:01:43 +00:00
Joey Hess
d105a6f93a
fix tag 2023-06-23 16:39:22 -04:00
Joey Hess
13b7208f65
already done 2023-06-23 16:35:49 -04:00
Joey Hess
8fc63cf156
expand and confirm 2023-06-23 16:31:54 -04:00
Joey Hess
f1714915b2
confirmed
(but I need them to tell me what errors of what special remotes need a
message-id)
2023-06-23 16:17:56 -04:00
Joey Hess
6774f0bf40
confirmed 2023-06-23 16:07:22 -04:00
Joey Hess
e42470fa60
confirmed 2023-06-23 15:53:28 -04:00
Joey Hess
5b1e8ba779
confirmed 2023-06-23 14:31:54 -04:00
Joey Hess
388804b785
thought 2023-06-23 14:30:37 -04:00
Joey Hess
941cd7cfaa
confirmed 2023-06-23 14:25:25 -04:00
Joey Hess
3b13609b93
thought this out more fully 2023-06-23 14:22:57 -04:00
Joey Hess
867a833624
todo work 2023-06-23 13:47:01 -04:00
Joey Hess
2f3a275b58
confirmed 2023-06-23 12:59:34 -04:00
Joey Hess
790e0083ed
comment 2023-06-23 12:54:55 -04:00
Joey Hess
e15ad92689
close 2023-06-23 12:41:32 -04:00
Joey Hess
5a257dba88
comment 2023-06-23 12:34:48 -04:00
Joey Hess
8d09207a2d
comment and update todo 2023-06-23 12:25:08 -04:00
Joey Hess
0da5e9730d
comment 2023-06-23 12:01:54 -04:00
Joey Hess
553962eb47
link comment to todo 2023-06-19 11:42:45 -04:00
Joey Hess
12fa697ca5
Merge branch 'master' of ssh://git-annex.branchable.com 2023-06-15 10:08:16 -04:00
Joey Hess
839fce9549
comment 2023-06-14 19:53:55 -04:00
Joey Hess
40b6155b7d
idea 2023-06-14 19:40:42 -04:00
nobodyinperson
73d400fe83 Added a comment 2023-06-12 21:12:29 +00:00
Joey Hess
6f0783d7a0
close bug report and improve docs that led to it being filed 2023-06-12 16:30:21 -04:00
Joey Hess
64738ea157
config: Added the --show-origin and --for-file options
* config: Added the --show-origin and --for-file options.
* config: Support annex.numcopies and annex.mincopies.

There is a little bit of redundancy here with other code elsewhere that
combines the various configs and selects which to use. But really only
for the special case of annex.numcopies, which is a git config that does
not override the annex branch setting and for annex.mincopies, which does
not have a git config but does have gitattributes settings as well as the
annex branch setting.

That seems small enough, and unlikely enough to grow into a mess that it was
worth supporting annex.numcopies and annex.mincopies in git-annex config
--show-origin. Because these settings are a prime thing that someone might
get confused about and want to know where they were configured.

And, it followed that git-annex config might as well support those two
for --set and --get as well. While this is redundant with the specialized
commands, it's only a little code and it makes it more consistent.

Note that --set does not have as nice output as numcopies/mincopies
commands in some special cases like setting to 0 or a negative number.
It does avoid setting to a bad value thanks to the smart
constructors (eg configuredNumCopies).

As for other git-annex branch configurations that are not set by git-annex
config, things like trust and wanted that are specific to a repository
don't map to a git config name, so don't really fit into git-annex config.
And they are only configured in the git-annex branch with no local override
(at least so far), so --show-origin would not be useful for them.

Sponsored-by: Dartmouth College's DANDI project
2023-06-12 16:24:31 -04:00