Commit graph

4079 commits

Author SHA1 Message Date
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
3007e3c177 Added a comment: it turns out I had to file this as a bug 2021-08-27 01:38:29 +00:00
Joey Hess
ab7b5a492c
--batch-keys
New --batch-keys option added to these commands:  get, drop, move, copy, whereis

git-annex-matching-options had to be reworded since some of its options
can be used to match on keys, not only files.

Sponsored-by: Luke Shumaker on Patreon
2021-08-25 14:21:12 -04:00
Joey Hess
8944569988
comment 2021-08-24 11:21:12 -04:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
20a252a129 Added a comment: git annex sync --no-commit --content takes double the time of git annex get . 2021-08-20 02:05:53 +00:00
Joey Hess
53744e132d
incremental verification for gitlfs and httpalso
And that should be all the special remotes supporting it on linux now,
except for in the odd edge case here and there.

Sponsored-by: Dartmouth College's DANDI project
2021-08-18 15:17:10 -04:00
Joey Hess
f5e09a1dbe
incremental verification for S3
Sponsored-by: Dartmouth College's DANDI project
2021-08-18 15:07:00 -04:00
Joey Hess
d154e7022e
incremental verification for web special remote
Except when configuration makes curl be used. It did not seem worth
trying to tail the file when curl is downloading.

But when an interrupted download is resumed, it does not read the whole
existing file to hash it. Same reason discussed in
commit 7eb3742e4b76d1d7a487c2c53bf25cda4ee5df43; that could take a long
time with no progress being displayed. And also there's an open http
request, which needs to be consumed; taking a long time to hash the file
might cause it to time out.

Also in passing implemented it for git and external special remotes when
downloading from the web. Several others like S3 are within striking
distance now as well.

Sponsored-by: Dartmouth College's DANDI project
2021-08-18 15:02:22 -04:00
Joey Hess
1dca3ba26a
status update 2021-08-18 12:58:27 -04:00
Joey Hess
d67da1f4db
idea 2021-08-18 12:39:03 -04:00
Joey Hess
4bbc6a25fa
comment 2021-08-17 10:28:18 -04:00
Joey Hess
ffa1f6ed30
Merge branch 'master' of ssh://git-annex.branchable.com 2021-08-16 17:30:04 -04:00
Joey Hess
f8463ad52f
status update 2021-08-16 17:29:39 -04:00
Joey Hess
b1622eb932
incremental verify for directory special remote
Added fileRetriever', which will let the remaining special remotes
eventually also support incremental verify.

Sponsored-by: Dartmouth College's DANDI project
2021-08-16 16:51:33 -04:00
Joey Hess
ec82299730
status update
I was wrong about S3 supporting tailVerify.
2021-08-16 15:15:32 -04:00
https://christian.amsuess.com/chrysn
905fef31b3 Added a comment: Another example 2021-08-15 17:42:54 +00:00
Joey Hess
751242b55e
status update 2021-08-13 16:34:18 -04:00
Joey Hess
51d59fb260
comment 2021-08-12 14:49:48 -04:00
Joey Hess
7eb3742e4b
incremental verify for chunked remotes
Simply feed each chunk in turn to the incremental verifier.

When resuming an interrupted retrieve, it does not do incremental
verification. That would need to read the file, up to the resume point,
and feed it to the incremental verifier. That seems easy to get wrong.
Also it would mean extra work done before the transfer can start. Which
would complicate displaying progress, and would perhaps not appear to the
user as if it was resuming from where it left off. Instead, in that
situation, return UnVerified, and let the verification be done in a
separate pass.

Granted, Annex.CopyFile does manage all that, but it's not complicated
by dealing with chunks too.

Sponsored-by: Dartmouth College's DANDI project
2021-08-11 14:42:49 -04:00
Joey Hess
8886ff1cff
done! 2021-08-04 12:40:25 -04:00
Joey Hess
c67b1e31a6
branch 2021-08-03 17:05:50 -04:00
Joey Hess
629e95fd8e
update 2021-08-03 14:03:25 -04:00
Joey Hess
bb56186daa
new todo.. I seem to have cracked a longstanding problem
Sponsored-by: Jochen Bartl on Patreon
2021-08-03 13:51:23 -04:00
Joey Hess
461035c6ec
close
I'm now reasonably sure I've identified both cases where this can
happen. v8 upgrades and certian filesystems eg NFS. Both are handled as
well as can be, though it may involve some extra checksumming work.
2021-07-30 15:22:22 -04:00
Joey Hess
1fc6e6a65f
close this frustrating todo due to lack of followup and/or being fixed 2021-07-26 11:16:58 -04:00
Joey Hess
2ab19be3f6
comment 2021-07-22 13:34:27 -04:00
Joey Hess
39d7479366
comment 2021-07-21 11:06:03 -04:00
neil.okamoto@6ca5b1b6563bc71ce13ec235852d23e6ac86e331
9da4a7d5b2 Added a comment 2021-07-21 07:13:02 +00:00
neil.okamoto@6ca5b1b6563bc71ce13ec235852d23e6ac86e331
24c88e8167 Added a comment 2021-07-20 23:22:29 +00:00
Joey Hess
c9b9a93509
close 2021-07-16 16:28:22 -04:00
Joey Hess
acbf711e21
comment 2021-07-16 15:16:56 -04:00
Joey Hess
aa33c928cb
Merge branch 'master' of ssh://git-annex.branchable.com 2021-07-16 14:02:41 -04:00
Joey Hess
dc4e79c582
I understand this now, marking confirmed 2021-07-16 14:02:12 -04:00
Atemu
a4947c65ec Added a comment 2021-07-16 11:21:43 +00:00
Lukey
3c1f7ee773 Added a comment 2021-07-15 18:40:13 +00:00
Lukey
ca12b01fca Clarify 2021-07-15 18:30:22 +00:00
Joey Hess
b2a7a665b2
comment 2021-07-15 12:19:49 -04:00
Atemu
eb89eeea30 Added a comment 2021-07-14 21:27:42 +00:00
Joey Hess
d6f056eca3
have whereused also check the reflog
Since the stash is part of that, it can also find stashed content.

Sponsored-by: Boyd Stephen Smith Jr. on Patreon
2021-07-14 16:05:20 -04:00
Joey Hess
980613ec8c
man page for git-annex whereused
And an implementation strategy.

Sponsored-by: Brett Eisenberg on Patreon
2021-07-14 13:43:45 -04:00
Joey Hess
6db4ee6c34
comment 2021-07-14 12:01:34 -04:00
Atemu
fc5c6c6f11 Added a comment 2021-07-13 21:13:12 +00:00
Joey Hess
011341ff1a
comment 2021-07-13 13:20:14 -04:00
Joey Hess
93d8585faf
comment 2021-07-13 13:17:27 -04:00
Joey Hess
ce6a288412
comment 2021-07-13 12:47:56 -04:00
Ilya_Shlyakhter
e371367cb0 Added a comment: (partial) filenames in keys 2021-07-13 15:13:27 +00:00
Ilya_Shlyakhter
f4a281f712 Added a comment 2021-07-13 03:05:45 +00:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
72b9db6680 Added a comment 2021-07-12 22:48:26 +00:00
Joey Hess
b4c149b05e
comment 2021-07-12 12:05:52 -04:00
Ilya_Shlyakhter
5b45deda15 Added a comment: about maxextensionlength 2021-07-12 15:38:22 +00:00
Ilya_Shlyakhter
ea3fdc43d6 Added a comment 2021-07-12 15:30:25 +00:00
Joey Hess
443f841b78
comment 2021-07-12 11:20:51 -04:00
Joey Hess
bb87ee3375
comment 2021-07-12 11:01:59 -04:00
Joey Hess
0d3da9496f
comment 2021-07-12 10:57:44 -04:00
Ilya_Shlyakhter
e85eed583f Added a comment 2021-07-09 16:04:47 +00:00
https://christian.amsuess.com/chrysn
b4d33f3358 Added a comment: +1 on supportunlocked in git-annex config 2021-07-09 15:26:13 +00:00
https://christian.amsuess.com/chrysn
4a78debbe1 Added a comment: +1 on supportunlocked in git-annex config 2021-07-09 15:25:53 +00:00
jkniiv@b330fc3a602d36a37a67b2a2d99d4bed3bb653cb
a74ea32352 Added a comment 2021-07-06 19:49:07 +00:00
Ilya_Shlyakhter
9dd962aa73 added suggestion: add maxextensionlength to git-annex-config 2021-07-06 15:05:05 +00:00
Ilya_Shlyakhter
92c0080bf4 Added a comment: about keys with extensions 2021-07-06 14:59:24 +00:00
Joey Hess
2a328dca8c
comment 2021-07-06 10:09:03 -04:00
Joey Hess
92ca28c47b
close 2021-07-06 09:46:25 -04:00
Joey Hess
c00a15dde0
close 2021-07-06 09:45:45 -04:00
Ilya_Shlyakhter
97eb79e752 Added a comment: read-only view of repo clone with symlinks as normal files 2021-07-01 17:17:20 +00:00
Joey Hess
5594d00f8c
close 2021-06-28 13:08:57 -04:00
Joey Hess
ef5fb1a853
close 2021-06-28 13:05:26 -04:00
Joey Hess
3a14648142
dropping unused marks as dead
Dropping an object with drop --unused or dropunused will mark it as
dead, preventing fsck --all from complaining about it after it's been
dropped from all repositories.

If another repository still has a copy, it won't be treated as dead
until it's also dropped from there.

The drop has to use --unused, can't be --key or something else, because
this indicates that the user has recently ran git-annex unused. If it
checked the unused log on every drop, bad things would happen when the
unused log was out of date, eg a file used to be unused but then got
re-added. Marking such a file as dead could be confusing. When the user
uses --unused/dropunused, they must consider the unused information to be
up-to-date.

The particular workflow this enables is:

	git annex add foo
	git annex unannex foo
	git annex unused
	git annex drop --unused / dropunused
	git annex fsck --all # no warnings

The docs for git-annex unannex say to use git-annex unused and dropunused,
so the user should be pointed in this direction when they want to undo an
accidental add.

Sponsored-by: Brock Spratlen on Patreon
2021-06-25 15:22:26 -04:00
Joey Hess
0c1df95dd7
comment 2021-06-25 14:15:14 -04:00
Joey Hess
172c7e22e9
comment 2021-06-25 13:52:59 -04:00
Joey Hess
f5595ea063
comment 2021-06-25 12:13:25 -04:00
Joey Hess
e2cc9bf53d
comment 2021-06-25 12:11:05 -04:00
Ilya_Shlyakhter
27df3f8e88 added suggestion re: hardening git-annex against interruptions 2021-06-24 18:16:43 +00:00
Ilya_Shlyakhter
7ad95210d1 Added a comment: resolving merge conflicts 2021-06-24 18:03:40 +00:00
Lukey
fb2e418fa2 Added a comment 2021-06-24 17:43:36 +00:00
Ilya_Shlyakhter
879ac0b713 Added a comment: thanks 2021-06-24 16:40:48 +00:00
Joey Hess
847aa88bcb
Merge branch 'master' of ssh://git-annex.branchable.com 2021-06-23 17:31:09 -04:00
Joey Hess
a3a6e64122
comment 2021-06-23 17:30:50 -04:00
Ilya_Shlyakhter
7bff1dc27c Added a comment: recovering from sqlite db corruption 2021-06-23 18:45:47 +00:00
Joey Hess
af64c184f3
comment 2021-06-23 13:19:34 -04:00
Joey Hess
b980d90b05
close 2021-06-23 12:59:00 -04:00
Joey Hess
e701f69ac4
comment and close 2021-06-23 12:53:18 -04:00
Lukey
d2ba1698c9 2021-06-22 18:40:56 +00:00
Joey Hess
4b1b9d7a83
Added annex.freezecontent-command and annex.thawcontent-command configs
Freeze first sets the file perms, and then runs
freezecontent-command. Thaw runs thawcontent-command before
restoring file permissions. This is in case the freeze command
prevents changing file perms, as eg setting a file immutable does.
Also, changing file perms tends to mess up previously set ACLs.

git-annex init's probe for crippled filesystem uses them, so if file perms
don't work, but freezecontent-command manages to prevent write to a file,
it won't treat the filesystem as crippled.

When the the filesystem has been probed as crippled, the hooks are not
used, because there seems to be no point then; git-annex won't be relying
on locking annex objects down. Also, this avoids them being run when the
file perms have not been changed, in case they somehow rely on
git-annex's setting of the file perms in order to work.

Sponsored-by: Dartmouth College's Datalad project
2021-06-21 14:40:52 -04:00
Joey Hess
f23ae9a45b
comment 2021-06-21 13:52:50 -04:00
Joey Hess
ea0835eba6
tag datalad
at yoh's request
2021-06-21 13:23:51 -04:00
Joey Hess
a6a6217322
remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were last edited or commented before 2020.

Except for ones tagged projects/* since projects like datalad want to keep
around records of old deleted bugs longer.

Command line used:

for f in $(grep -l '|done\]\]' -- ./*.mdwn); do if ! grep -q "projects/" "$f"; then d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2020 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2020 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; fi; done
for f in $(grep -l '\[\[done\]\]' -- ./*.mdwn); do if ! grep -q "projects/" "$f"; then d="$(echo "$f" | sed 's/.mdwn$//')"; if [ -z "$(git log --since=01-01-2020 --pretty=oneline -- "$f")" -a -z "$(git log --since=01-01-2020 --pretty=oneline -- "$d")" ]; then git rm -- "./$f" ; git rm -rf "./$d"; fi; fi; done
2021-06-21 13:10:13 -04:00
Ilya_Shlyakhter
bfa530e962 added suggestion to allow use of synchronous=OFF with Sqlite 2021-06-16 20:06:39 +00:00
Joey Hess
d2be68907c
drop, move, mirror: when two files have the same content, honor the max numcopies and requiredcopies
Eg, before with a .gitattributes like:

*.2 annex.numcopies=2
*.1 annex.numcopies=1

And foo.1 and foo.2 having the same content and key, git-annex drop foo.1 foo.2
would succeed, leaving just 1 copy, despite foo.2 needing 2 copies.
It dropped foo.1 first and then skipped foo.2 since its content was gone.

Now that the keys database includes locked files, this longstanding wart
can be fixed.

Sponsored-by: Noam Kremen on Patreon
2021-06-15 11:38:44 -04:00
Joey Hess
af9fdf5dba
verify associated files when checking numcopies
Most of this is just refactoring. But, handleDropsFrom
did not verify that associated files from the keys db were still
accurate, and has now been fixed to.

A minor improvement to this would be to avoid calling catKeyFile
twice on the same file, when getting the numcopies and mincopies value,
in the common case where the same file has the highest value for both.
But, it avoids checking every associated file, so it will scale well to
lots of dups already.

Sponsored-by: Kevin Mueller on Patreon
2021-06-15 11:14:52 -04:00
Joey Hess
effc9bf5dd
close 2021-06-15 10:11:14 -04:00
Joey Hess
711252331e
comment 2021-06-14 14:34:22 -04:00
Joey Hess
398f9decd4
comment 2021-06-14 14:32:38 -04:00
Joey Hess
78da00c7a6
Future proof activity log parsing
When the log has an activity that is not known, eg added by a future
version of git-annex, it used to be treated as no activity at all,
which would make git-annex expire think it should expire the repository,
despite it having some kind of recent activity.

Hopefully there will be no reason to add a new activity until enough
time has passed that this commit is in use everywhere.

Sponsored-by: Jake Vosloo on Patreon
2021-06-14 14:18:19 -04:00
Joey Hess
3ac9363c03
comment 2021-06-14 12:42:11 -04:00
Joey Hess
014dc63a55
avoid sometimes expensive operations when annex.supportunlocked = false
This will mostly just avoid a DB lookup, so things get marginally
faster. But in cases where there are many files using the same key, it
can be a more significant speedup.

Added overhead is one MVar lookup per call, which should be small
enough, since this happens after transferring or ingesting a file,
which is always a lot more work than that. It would be nice, though,
to move getGitConfig to AnnexRead, which there is an open todo about.
2021-06-14 12:40:41 -04:00
Joey Hess
6cb9113ff5
comments 2021-06-08 17:38:56 -04:00
Joey Hess
7b6deb1109
display scanning message whenever reconcileStaged has enough files to chew on
Clear visible progress bar first.

Removed showSideActionAfter because it can't be used in reconcileStaged
(import loop). Instead, it counts the number of files it
processes and displays it after it's seen a sufficient to know it's
taking a while.

Sponsored-by: Dartmouth College's Datalad project
2021-06-08 12:48:30 -04:00
Joey Hess
13b9a288d3
scanAnnexedFiles in smudge --update
This makes git checkout and git merge hooks do the work to catch up with
changes that they made to the tree. Rather than doing it at some later
point when the user is not thinking about that past operation.

Sponsored-by: Dartmouth College's Datalad project
2021-06-08 11:37:47 -04:00
Joey Hess
2467de4f9b
todo 2021-06-07 16:58:35 -04:00
Joey Hess
0f10f208a7
avoid double work in git-annex init
reconcileStaged was doing a redundant scan to scannAnnexedFiles.

It would probably make sense to move the body of scannAnnexedFiles
into reconcileStaged, the separation does not really serve any purpose.

Sponsored-by: Dartmouth College's Datalad project
2021-06-07 16:50:14 -04:00
Joey Hess
6ceb31a30a
optimise reconcileStaged with git cat-file streaming
Commit 428c91606b made it need to do more
work in situations like switching between very different branches.

Compare with seekFilteredKeys which has a similar optimisation. Might be
possible to factor out the common part from these?

Sponsored-by: Dartmouth College's Datalad project
2021-06-07 15:26:48 -04:00
Joey Hess
570e93abfd
comment 2021-06-07 13:28:36 -04:00
Joey Hess
1c35cacf8e
fix link 2021-06-07 13:06:16 -04:00
Joey Hess
b960ebf1b3
Merge branch 'master' of ssh://git-annex.branchable.com 2021-06-07 12:59:21 -04:00
Ilya_Shlyakhter
4d581ad6b4 Added a comment: deferring the keys-to-files scan 2021-06-07 16:11:01 +00:00
Joey Hess
a0bba3afad
comment 2021-06-07 11:49:28 -04:00
Ilya_Shlyakhter
5359f8bc14 added suggestion to match keys by file extension in the key 2021-06-07 15:08:51 +00:00
lucas.gautheron@09f1983993dfb0907d02ba268b3ca672f1dc3eea
b38dc11a37 Added a comment 2021-06-05 10:10:57 +00:00
Ilya_Shlyakhter
d39dfed2a7 Added a comment: "why all these wild ideas are being thrown out there" 2021-06-04 22:15:33 +00:00
Joey Hess
a2c9360905
Merge branch 'master' of ssh://git-annex.branchable.com 2021-06-04 16:45:02 -04:00
Joey Hess
8a13bbedd6
--size-limit exit 101
Sponsored-by: Mark Reidenbach on Patreon
2021-06-04 16:43:47 -04:00
Atemu
ee5f30ee6b 2021-06-04 20:26:27 +00:00
Joey Hess
771a122c9e
add --size-limit option
When this option is not used, there should be effectively no added
overhead, thanks to the optimisation in
b3cd0cc6ba.

When an action fails on a file, the size of the file still counts toward
the size limit. This was necessary to support concurrency, but also
generally seems like the right choice.

Most commands that operate on annexed files support the option.
export and import do not, and I don't know if it would make sense for
export to.. Why would you want an incomplete export? sync doesn't, and
while it would be easy to make it support it for transferring files,
it's not clear if dropping files should also take the size limit into
account. Commands like add that don't operate on annexed files don't
support the option either.

Exiting 101 not yet implemented.

Sponsored-by: Denis Dzyubenko on Patreon
2021-06-04 16:16:53 -04:00
Joey Hess
7868dbd5e0
comment 2021-06-04 13:53:24 -04:00
Joey Hess
327033c2e5
comment 2021-06-04 13:36:51 -04:00
Joey Hess
0434674c85
avoid displaying the scanning annexed files message when repo is not large
Avoids users thinking this scan is a big deal, when it's not in the
majority of repos.

showSideActionAfter has some ugly caveats, since it has to display in
the background of another action. I could not see a better way to do it
and it works fine in this particular case. It also doesn't really belong
in Annex.Concurrent, but cannot go in Messages due to an import loop.

Sponsored-by: Dartmouth College's Datalad project
2021-06-04 13:16:48 -04:00
Joey Hess
95cec1bdfe
comment 2021-06-04 13:14:29 -04:00
yarikoptic
b925ea2923 about "scanning for annexed" while in git-annex branch 2021-06-04 15:20:34 +00:00
Atemu
f70251d638 2021-06-03 13:36:20 +00:00
Ilya_Shlyakhter
de12aeb1a4 Added a comment: matching include/exclude based on file extension in the key 2021-06-02 17:02:58 +00:00
Ilya_Shlyakhter
6a2bfad192 Added a comment: keys db optimization 2021-06-02 16:53:03 +00:00
Joey Hess
6f3f972355
Merge branch 'master' of ssh://git-annex.branchable.com 2021-06-01 11:43:36 -04:00
Joey Hess
3155c0d03e
todo 2021-06-01 10:39:48 -04:00
Ilya_Shlyakhter
a7e8a630fb Added a comment: keys-to-paths db 2021-05-31 23:15:21 +00:00
Joey Hess
f00e365f41
comments 2021-05-31 17:54:17 -04:00
Ilya_Shlyakhter
2dac55978c Added a comment: startup scan for files 2021-05-31 20:50:36 +00:00
Joey Hess
8734f17bc5
comment 2021-05-31 15:15:09 -04:00
Joey Hess
988dbce27a
Merge branch 'master' of ssh://git-annex.branchable.com 2021-05-31 15:05:40 -04:00
Joey Hess
eb6f6ff9b8
speed up keys database writes
There seems to be no reason to check the time here. I think it was
inherited from code in Database.Fsck, which does have a reason to commit
every few minutes. Removing that syscall speeds up a git-annex init
in a repo with 100000 annexed files by about 3 seconds.

Sponsored-by: Dartmouth College's Datalad project
2021-05-31 15:01:00 -04:00
Atemu
6da7f26e2a 2021-05-31 18:59:15 +00:00
Atemu
ae129dc317 2021-05-31 18:42:56 +00:00
Joey Hess
0f54e5e0ae
speed up initial scanning for annexed files
Streaming through git this way speeds it up by around 25%. This is
similar to the optimisations of seeking annexed files.

Sponsored-by: Dartmouth College's Datalad project
2021-05-31 14:29:34 -04:00
Joey Hess
759e5a9903
todo 2021-05-31 10:50:22 -04:00
Joey Hess
3b7f28feca
comment 2021-05-31 10:43:59 -04:00
Joey Hess
57a0ef8d90
comment and reject todo 2021-05-27 12:19:35 -04:00
Atemu
0a0889e72e Added a comment 2021-05-26 07:11:20 +00:00
Joey Hess
13a6bfff49
comments 2021-05-25 16:37:32 -04:00
Joey Hess
f5dc06077d
Merge branch 'master' of ssh://git-annex.branchable.com 2021-05-25 13:10:34 -04:00
Joey Hess
b5f5475ed6
New matching options --excludesamecontent and --includesamecontent
The normalisation of filenames turns out to be the tricky part here,
because the associated files coming out of the keys db may look like
"./foo/bar" or "../bar". For the former to match a glob like "foo/*",
it needs to be normalised.

Note that, on windows, normalise "./foo/bar" = "foo\\bar"
which a glob like "foo/*" won't match. So the glob is matched a second
time, on the toInternalGitPath, so allowing the user to provide a glob
with the slashes in either direction. However, this still won't support
some wacky edge cases like the user providing a glob of "foo/bar\\*"

Sponsored-by: Dartmouth College's Datalad project
2021-05-25 13:08:18 -04:00
Lukey
2ccf525b7f Added a comment 2021-05-25 16:48:26 +00:00
Joey Hess
cd73fcc92c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-05-25 11:45:02 -04:00
Joey Hess
483fc4dc6b
Merge branch 'trackassociated' 2021-05-25 11:43:52 -04:00
Joey Hess
e9c95ef890
comments 2021-05-25 11:43:46 -04:00
Atemu
7ed4c4a35c 2021-05-25 14:51:21 +00:00
parhuzamos
54e1ac849a Added a comment 2021-05-24 09:33:50 +00:00
Joey Hess
428c91606b
include locked files in the keys database associated files
Before only unlocked files were included.

The initial scan now scans for locked as well as unlocked files. This
does mean it gets a little bit slower, although I optimised it as well
as I think it can be.

reconcileStaged changed to diff from the current index to the tree of
the previous index. This lets it handle deletions as well, removing
associated files for both locked and unlocked files, which did not
always happen before.

On upgrade, there will be no recorded previous tree, so it will diff
from the empty tree to current index, and so will fully populate the
associated files, as well as removing any stale associated files
that were present due to them not being removed before.

reconcileStaged now does a bit more work. Most of the time, this will
just be due to running more often, after some change is made to the
index, and since there will be few changes since the last time, it will
not be a noticable overhead. What may turn out to be a noticable
slowdown is after changing to a branch, it has to go through the diff
from the previous index to the new one, and if there are lots of
changes, that could take a long time. Also, after adding a lot of files,
or deleting a lot of files, or moving a large subdirectory, etc.

Command.Lock used removeAssociatedFile, but now that's wrong because a
newly locked file still needs to have its associated file tracked.

Command.Rekey used removeAssociatedFile when the file was unlocked.
It could remove it also when it's locked, but it is not really
necessary, because it changes the index, and so the next time git-annex
run and accesses the keys db, reconcileStaged will run and update it.

There are probably several other places that use addAssociatedFile and
don't need to any more for similar reasons. But there's no harm in
keeping them, and it probably is a good idea to, if only to support
mixing this with older versions of git-annex.

However, mixing this and older versions does risk reconcileStaged not
running, if the older version already ran it on a given index state. So
it's not a good idea to mix versions. This problem could be dealt with
by changing the name of the gitAnnexKeysDbIndexCache, but that would
leave the old file dangling, or it would need to keep trying to remove
it.
2021-05-21 16:24:37 -04:00
Joey Hess
1d9bad51d2
plan for these 2021-05-21 13:50:26 -04:00
Joey Hess
b68a40fa88
todo 2021-05-20 11:18:46 -04:00
Joey Hess
64e26287dd
comment 2021-05-19 11:07:02 -04:00
yarikoptic
674f33c139 todo for extra logging when content changed 2021-05-18 14:05:18 +00:00
Joey Hess
c525d18cf7
filter-branch: New command, useful to produce a filtered version of the git-annex branch, eg when splitting a repository 2021-05-17 14:16:46 -04:00
Joey Hess
5004eed27d
branch 2021-05-13 16:18:35 -04:00
Joey Hess
715d3d728c
new name for command 2021-05-13 16:07:30 -04:00
Joey Hess
40ade7a515
add some functions listing log files
Not used yet, will be used by copy-branch to generate the list of files
to copy.
2021-05-13 14:57:38 -04:00
Joey Hess
13a8706cda
almost have a plan 2021-05-13 14:09:06 -04:00
Joey Hess
03f46b95e6
comment 2021-05-13 12:05:24 -04:00
Atemu
d8ed6daeb3 Added a comment 2021-05-13 10:26:53 +00:00
Joey Hess
7500ba7ceb
already implemented 2021-05-12 12:24:55 -04:00
Joey Hess
8dbbbc7250
idea 2021-05-10 19:16:15 -04:00
Joey Hess
b184fc490a
split out common options to its own page and mention it on each subcommand page
Sometimes users would get confused because an option they were looking
for was not mentioned on a subcommand's man page, and they had not
noticed that the main git-annex man page had a list of common options.
This change lets each subcommand mention the common options, similarly
to how the matching options are handled.

This commit was sponsored by Svenne Krap on Patreon.
2021-05-10 15:00:13 -04:00
Joey Hess
56ccc0302e
mention --all on fsck man page, and repurpose todo 2021-05-10 11:11:50 -04:00
Joey Hess
9cc8e24727
comment 2021-05-10 10:48:44 -04:00
Atemu
c09497dfad Added a comment 2021-05-10 14:13:55 +00:00
Lukey
c2b1c730a5 Added a comment 2021-05-10 12:21:37 +00:00
Atemu
689a26d25a 2021-05-10 11:01:43 +00:00
Joey Hess
9e1a693f16
Merge branch 'master' of ssh://git-annex.branchable.com 2021-05-07 11:44:58 -04:00
Joey Hess
0caf171c63
patch review 2021-05-07 11:44:30 -04:00
Ilya_Shlyakhter
fd60824979 Added a comment: different keys for the same content 2021-05-07 15:37:46 +00:00
Atemu
e686a878ed 2021-05-05 08:52:28 +00:00
Atemu
78948270e2 2021-05-05 06:11:51 +00:00
Atemu
931a55b9a4 Added a comment 2021-05-05 06:04:49 +00:00
Atemu
857bffe388 Added a comment 2021-05-05 05:43:56 +00:00
yarikoptic
4d6e5bc6ad removed 2021-05-04 19:57:53 +00:00
yarikoptic
eac6b763d3 Added a comment 2021-05-04 17:51:09 +00:00
yarikoptic
66e175fec9 Added a comment 2021-05-04 17:50:45 +00:00
Atemu
6b23eb031f Added a comment 2021-05-04 17:37:04 +00:00
yarikoptic
1f7577839c Added a comment 2021-05-04 16:53:18 +00:00
Joey Hess
32e6d6880f
comment 2021-05-04 11:13:50 -04:00
Joey Hess
2d36fa7e17
comment 2021-05-04 10:56:27 -04:00
Joey Hess
20fee2ef04
response 2021-05-04 10:31:53 -04:00
Joey Hess
084f0e3e89
comment and todo 2021-05-04 10:14:53 -04:00
Atemu
0df9dc617f Added a comment 2021-05-04 12:46:59 +00:00
Atemu
6299fd688b 2021-05-04 02:43:14 +00:00
Joey Hess
4bd22a45e4
update 2021-05-02 15:22:22 -04:00
yarikoptic
e5bbbb5d02 initial todo asking for possibility to assign costs per URL 2021-04-30 13:54:30 +00:00
yarikoptic
349c0668be initial todo for perspective copy-key(file) command(s) 2021-04-29 22:41:56 +00:00
Ilya_Shlyakhter
5efce5078a added suggestion: support tree-ish in command args 2021-04-29 16:53:31 +00:00
Joey Hess
34c2d13dce
add note to git-annex-log man page about when information is not available 2021-04-23 14:53:38 -04:00
Joey Hess
bfa2db9222
done 2021-04-23 14:48:45 -04:00
Joey Hess
32138b8cd8
implement annex.privateremote and remote.name.private configs
The slightly unusual parsing in Types.GitConfig avoids the need to look
at the remote list to get configs of remotes. annexPrivateRepos combines
all the configs, and will only be calculated once, so it's nice and
fast.

privateUUIDsKnown and regardingPrivateUUID now need to read from the
annex mvar, so are not entirely free. But that overhead can be optimised
away, as seen in getJournalFileStale. The other call sites didn't seem
worth optimising to save a single MVar access. The feature should have
impreceptable speed overhead when not being used.
2021-04-23 14:21:57 -04:00
Joey Hess
d5a05655b4
Merge branch 'master' into hiddenannex 2021-04-23 13:06:33 -04:00
Joey Hess
0547884eb2
importfeed: fix bug while also speeding up 12x!
* Fix bug that could make git-annex importfeed not see recently recorded
  state when configured with annex.alwayscommit=false.
* importfeed: Made "checking known urls" phase run 12 times faster.

The massive speedup is because it no longer queries for metadata
accompanying each url. Instead it processes the whole git-annex branch and
checks all metadata files for feed item ids, and uses any it finds.

This could result in a behavior change, in an unlikely situation: If a feed
id is recorded in a key's metadata, but the url gets removed, the old code
would not see that item id and would re-download it if it finds an url for
it in a feed, while the new code will see the item id. I don't think
the old behavior was intentional, and it may be that the new behavior is
better. Not gonna worry about this.
2021-04-23 12:36:56 -04:00
Joey Hess
657d55c401
convert withKnownUrls to use overBranchFileContents
This only partly fixes importfeed to see journalled files, since it
separately cats metadata directly from the branch. Held off on a
changelog for a bug fix until that's dealt with.
2021-04-23 11:32:25 -04:00
Joey Hess
27a29c99fe
update 2021-04-22 12:52:32 -04:00
Joey Hess
dc37a5d1eb
update 2021-04-21 23:42:00 -04:00
Joey Hess
7cb96bc3e3
alternative 2021-04-21 17:18:47 -04:00
Joey Hess
0bb57702e1
Merge branch 'master' into hiddenannex 2021-04-21 15:45:12 -04:00
Joey Hess
653b719472
fix --all to include not yet committed files from the journal
Fix bug caused by recent optimisations that could make git-annex not see
recently recorded status information when configured with
annex.alwayscommit=false.

This does mean that --all can end up processing the same key more than once,
but before the optimisations that introduced this bug, it used to also behave
that way. So I didn't try to fix that; it's an edge case and anyway git-annex
behaves well when run on the same key repeatedly.

I am not too happy with the use of a MVar to buffer the list of files in the
journal. I guess it doesn't defeat lazy streaming of the list, if that
list is actually generated lazily, and anyway the size of the journal is
normally capped and small, so if configs are changed to make it huge and
this code path fire, git-annex using enough memory to buffer it all is not a
large problem.
2021-04-21 15:40:32 -04:00
Joey Hess
9b870e29fd
Merge branch 'master' into hiddenannex 2021-04-21 13:04:40 -04:00
Ilya_Shlyakhter
78f31022e3 Added a comment: auto-expire temp repos 2021-04-21 15:37:38 +00:00
Joey Hess
5dae95f95f
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-20 15:18:56 -04:00
Joey Hess
154fb46b24
update 2021-04-20 15:18:18 -04:00
Joey Hess
05989556a2
start implementing hidden git-annex repositories
This adds a separate journal, which does not currently get committed to
an index, but is planned to be committed to .git/annex/index-private.

Changes that are regarding a UUID that is private will get written to
this journal, and so will not be published into the git-annex branch.

All log writing should have been made to indicate the UUID it's
regarding, though I've not verified this yet.

Currently, no UUIDs are treated as private yet, a way to configure that
is needed.

The implementation is careful to not add any additional IO work when
privateUUIDsKnown is False. It will skip looking at the private journal
at all. So this should be free, or nearly so, unless the feature is
used. When it is used, all branch reads will be about twice as expensive.

It is very lucky -- or very prudent design -- that Annex.Branch.change
and maybeChange are the only ways to change a file on the branch,
and Annex.Branch.set is only internal use. That let Annex.Branch.get
always yield any private information that has been recorded, without
the risk that Annex.Branch.set might be called, with a non-private UUID,
and end up leaking the private information into the git-annex branch.

And, this relies on the way git-annex union merges the git-annex branch.
When reading a file, there can be a public and a private version, and
they are just concacenated together. That will be handled the same as if
there were two diverged git-annex branches that got union merged.
2021-04-20 15:04:53 -04:00
Atemu
2f05565db5 Added a comment 2021-04-20 18:05:27 +00:00
Joey Hess
3d9d1d1416
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-20 11:08:06 -04:00
Joey Hess
2d1cbdaba7
thoughts 2021-04-19 13:58:43 -04:00
Joey Hess
3262d6c0bc
yoh asked me to tag this datalad 2021-04-19 13:20:07 -04:00
Ilya_Shlyakhter
3cfc51343f Added a comment 2021-04-18 23:45:26 +00:00
anatoly.sayenko@880a118acc67f3244b406a2700f0556b2f10672c
4a9bb24d60 Added a comment: migration warning still present after migration 2021-04-18 09:37:10 +00:00
Ilya_Shlyakhter
d21b54b05b Added a comment: drop --not-used-elsewhere 2021-04-17 22:31:52 +00:00
Ilya_Shlyakhter
44aad24f30 added suggestion: let git-annex-matching-options query .gitattributes 2021-04-17 20:38:09 +00:00
Joey Hess
c8e607f226
comment 2021-04-16 14:45:46 -04:00
Joey Hess
e56e40c51a
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-16 14:41:51 -04:00
Joey Hess
29108a8801
thoughts 2021-04-16 14:41:12 -04:00
Lukey
c4fcfa8d33 Added a comment 2021-04-16 18:29:22 +00:00
Joey Hess
7496c86c7c
comment 2021-04-16 13:49:05 -04:00
Joey Hess
90eb649e73
idea 2021-04-16 13:30:23 -04:00
Ilya_Shlyakhter
3378f74fb0 Added a comment: lockContent for special remotes 2021-04-15 16:32:34 +00:00
yarikoptic
a72b7b8c2f initial todo/report on drop dropping a key "for all paths" 2021-04-15 02:30:33 +00:00
yarikoptic
3aa4cc9d6f Added a comment 2021-04-14 20:46:11 +00:00
Joey Hess
17646b0b31
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-14 16:20:13 -04:00
Joey Hess
58da9f74b7
directory CoW on export
Completing Cow support for directory.
2021-04-14 16:19:43 -04:00
Joey Hess
b86206b553
directory CoW on import 2021-04-14 16:10:09 -04:00
Joey Hess
a36be49b01
comment 2021-04-14 14:12:32 -04:00
yarikoptic
066c4f1efc Added a comment: comment to joey response on cp --reflink workaround 2021-04-14 18:04:53 +00:00
Joey Hess
34e959f181
tag confirmed 2021-04-14 13:45:59 -04:00
Joey Hess
cac7866bce
note 2021-04-14 13:44:43 -04:00
Joey Hess
d1478e8b40
correction 2021-04-14 13:42:37 -04:00
Joey Hess
42c8f1e5f5
comment 2021-04-14 13:41:24 -04:00
Joey Hess
799e7b3c29
update 2021-04-14 13:32:28 -04:00
Joey Hess
5978b2a35b
comment 2021-04-14 13:31:08 -04:00
Joey Hess
5783a8d081
fsck: avoid redundant checksum when transfer is Verified
When downloading content from a remote, if the content is able to be
verified during the transfer, skip checksumming it a second time.

Note that in this case, the fsck output does not include "(checksum)"
which it does when the checksumming is done separately from the download.

This commit was sponsored by Brock Spratlen on Patreon.
2021-04-14 13:22:54 -04:00
Atemu
46309994a2 2021-04-14 16:14:20 +00:00
yarikoptic
c300675051 is importtree CoW from directory? 2021-04-14 14:24:18 +00:00
Joey Hess
0bcf155e11
thoughts 2021-04-13 14:41:27 -04:00
Joey Hess
6911787042
idea 2021-04-13 13:41:36 -04:00
Joey Hess
67d91c63f7
update 2021-04-12 14:13:44 -04:00
Joey Hess
1e322c329e
update 2021-04-12 13:00:24 -04:00
Joey Hess
4c35d58bfe
comment and analysis 2021-04-12 12:54:46 -04:00
Ilya_Shlyakhter
ad2a6d45db Added a comment 2021-04-12 15:39:31 +00:00
Ilya_Shlyakhter
70991c1d65 Added a comment 2021-04-12 14:42:13 +00:00
Ilya_Shlyakhter
cf60184992 Added a comment: lockContent for special remotes w/o changing the protocol 2021-04-12 01:20:16 +00:00
Joey Hess
7b6ab0ae9a
comment 2021-04-08 13:51:43 -04:00
Atemu
351f5d753f fix url 2021-04-08 11:56:13 +00:00
Atemu
474dd1a3fc 2021-04-08 11:50:59 +00:00
Ilya_Shlyakhter
9041b2b6a4 Added a comment: running untrusted code 2021-04-07 16:52:42 +00:00
Joey Hess
da88863082
comment and close, open related todo 2021-04-06 16:51:38 -04:00
Joey Hess
98b223a71c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-04-05 15:32:08 -04:00
Joey Hess
1b645e1ace
added --debugfilter (and annex.debugfilter) 2021-04-05 15:31:10 -04:00
Atemu
45a93d7129 2021-04-04 09:23:41 +00:00
Joey Hess
3204f0bbaa
comments 2021-04-02 13:41:26 -04:00
Joey Hess
4a30fddc2a
idea 2021-04-01 15:49:30 -04:00
Joey Hess
632ae09e28
comment 2021-04-01 12:24:21 -04:00
Ilya_Shlyakhter
4dde355c79 Added a comment: dockerized special remotes: security 2021-04-01 15:20:05 +00:00
Joey Hess
24c576bfa7
Merge branch 'master' of ssh://git-annex.branchable.com 2021-03-30 12:58:34 -04:00
Lukey
a366e9d0fc Added a comment 2021-03-30 16:21:14 +00:00
Joey Hess
773752b040
comment 2021-03-30 12:06:36 -04:00
Lukey
568f1c421b Added a comment 2021-03-30 16:01:04 +00:00
Ilya_Shlyakhter
4403791c6c Added a comment: autoenabling external special remotes 2021-03-30 15:17:05 +00:00
Ilya_Shlyakhter
9b8661c327 added suggestion to have git-annex-info display the time of last interaction with repos 2021-03-30 14:31:14 +00:00
parhuzamos
72f5088d34 2021-03-26 12:34:01 +00:00
parhuzamos
2187892a81 2021-03-26 12:31:47 +00:00
Ilya_Shlyakhter
a4cc0c95b4 added suggestion for additional git-annex-config settings 2021-03-23 20:11:39 +00:00
Ilya_Shlyakhter
547a5a8ca8 Added a comment: annex.supportunlocked=false 2021-03-23 20:02:19 +00:00
Joey Hess
f19271c5d9
comment 2021-03-23 15:51:21 -04:00
Joey Hess
806b6f77b9
Merge branch 'master' of ssh://git-annex.branchable.com 2021-03-23 15:47:21 -04:00
Ilya_Shlyakhter
3925235805 Added a comment: annex.supportunlocked 2021-03-23 19:30:44 +00:00
Joey Hess
5d78cd9d08
Sped up git-annex init in a clone of an existing repository
Seems that hasOrigin was never finding origin's git-annex branch, so a new
one got created each time. And so then it later needed to merge the two
branches, which is expensive.

Added --no-track to git branch to avoid it displaying a message about
setting up tracking branches. Of course there's no reason to make the
git-annex branch a tracking branch since git-annex auto-merges it.
2021-03-23 15:23:13 -04:00
yarikoptic
ed5fd5b896 Added a comment 2021-03-23 18:43:32 +00:00
Joey Hess
798f685077
New annex.supportunlocked config
Can beet to false to avoid some expensive things needed to support unlocked
files.

See my comment for why this only controls what init sets up, and not other
behavior.

I didn't bother with making the v5 upgrade code path look at this, though
it easily could, because the docs say to run git-annex init after setting
it to make it take effect.
2021-03-23 14:04:34 -04:00
Joey Hess
af96b49145
comment 2021-03-23 13:53:32 -04:00
Joey Hess
d89a9b0f78
comments 2021-03-23 12:05:05 -04:00
Ilya_Shlyakhter
14fafeea8a Added a comment: another deduplication option 2021-03-22 13:51:35 +00:00
Ilya_Shlyakhter
800b13f4ac Added a comment: installing clean/smudge filter lazily 2021-03-19 02:30:16 +00:00
Ilya_Shlyakhter
9bde743eae Added a comment: install clean/smudge filter only when needed 2021-03-18 15:47:03 +00:00
Joey Hess
02e74c010b
todo 2021-03-16 17:18:35 -04:00
Joey Hess
7dabbe4520
comment 2021-03-16 14:56:08 -04:00
yarikoptic
d0b56f113c an idea to avoid lengthy Scanning for unlocked files (this may take some time) 2021-03-15 17:57:53 +00:00
Ilya_Shlyakhter
b4022decb8 Added a comment: re: individually hash chunks 2021-03-15 15:23:09 +00:00
0xloem@0bd8a79a57e4f0dcade8fc81d162c37eae4d6730
32f773ae8f 2021-03-15 07:08:08 +00:00
Joey Hess
37263e97c7
comment 2021-03-12 11:45:28 -04:00
meribold
96b549779a Added a comment 2021-03-12 03:58:08 +00:00
Joey Hess
22d8ec6d74
comment 2021-03-11 13:06:24 -04:00
Joey Hess
06f86379b0
todo from comment 2021-03-11 12:26:10 -04:00
Joey Hess
b05f8458fb
remove tag 2021-03-08 14:47:40 -04:00
Joey Hess
e8065ee99d
close todo 2021-03-05 14:46:09 -04:00
Joey Hess
8d983b6432
comment 2021-03-05 12:39:57 -04:00
Joey Hess
eb594c710e
unregisterurl: New command
Implemented by generalizing registerurl. Without the implicit batch mode
of registerurl since that is only a backwards compatability thing
(see commit 1d1054faa6).
2021-03-01 14:28:24 -04:00
Joey Hess
06665d733a
comment 2021-03-01 13:52:19 -04:00
Joey Hess
db2e0485ae
Merge branch 'master' of ssh://git-annex.branchable.com 2021-03-01 13:10:50 -04:00
Ilya_Shlyakhter
cd682cf227 Added a comment: issue of pushing refs to annexed files but not info on how to fetch them 2021-03-01 16:46:57 +00:00
Joey Hess
9835fa5d01
todo 2021-02-26 12:12:41 -04:00
kyle
4c48791d41 Added a comment 2021-02-25 21:25:52 +00:00
kyle
9d96f41185 Added a comment: setpresentkey 0 2021-02-25 21:18:50 +00:00
yarikoptic
e5a24a502c or just a rmurl --key? 2021-02-25 20:37:25 +00:00
yarikoptic
4c1ffaac56 TODO for unregisterurl 2021-02-25 20:27:46 +00:00
Joey Hess
62d5a73bdd
unannex, uninit: Avoid running git rm once per annexed file, for a large speedup. 2021-02-22 12:56:11 -04:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
1e4fac1046 Added a comment 2021-02-21 15:50:49 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
26c19de0d9 Add workaround 2021-02-20 19:05:42 +00:00
git-annex.branchable.com@d12f3f46c9222459d17f96bc7be04f7cd03a6732
c37bfccb63 Initial report 2021-02-20 19:04:18 +00:00
yarikoptic
a876884987 initial observation about slow uninit 2021-02-19 17:08:39 +00:00
Joey Hess
cb7bb3e4b9
comment 2021-02-10 21:49:25 -04:00
Joey Hess
e3832af5d5
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-10 16:40:16 -04:00
Joey Hess
f44d4704c6
incremental checksum for local remotes
This benchmarks only slightly faster than the old git-annex. Eg, for a 1
gb file, 14.56s vs 15.57s. (On a ram disk; there would certianly be
more of an effect if the file was written to disk and didn't stay in
cache.)

Commenting out the updateIncremental calls make the same run in 6.31s.
May be that overhead in the implementation, other than the actual
checksumming, is slowing it down. Eg, MVar access.

(I also tried using 10x larger chunks, which did not change the speed.)
2021-02-10 16:05:24 -04:00
Joey Hess
6487a75d33
comment 2021-02-10 13:15:00 -04:00
yarikoptic
f5c3eb86f9 Added a comment 2021-02-09 22:48:07 +00:00
Joey Hess
d8598dc3a0
comment 2021-02-09 17:05:56 -04:00
Joey Hess
fd51b0cd83
comment 2021-02-09 13:42:49 -04:00
Joey Hess
cbe84b62b9
close 2021-02-08 18:17:59 -04:00
Joey Hess
8f3554a7a8
close as dup 2021-02-08 14:19:43 -04:00
Joey Hess
dd39e9e255
suggest when user may want annex.stalldetection
When annex.stalldetection is not enabled, and a likely stall is detected,
display a suggestion to enable it.

Note that the progress meter display is not taken down when displaying
the message, so it will display like this:

	0%    8 B                 0 B/s
	  Transfer seems to have stalled. To handle stalling transfers, configure annex.stalldetection
	0%    10 B                0 B/s

Although of course if it's really stalled, it will never update
again after the message. Taking down the progress meter and starting
a new one doesn't seem too necessary given how unusual this is,
also this does help show the state it was at when it stalled.

Use of uninterruptibleCancel here is ok, the thread it's canceling
only does STM transactions and sleeps. The annex thread that gets
forked off is separate to avoid it being canceled, so that it
can be joined back at the end.

A module cycle required moving from dupState the precaching of the
remote list. Doing it at startConcurrency should cover all the cases
where the remote list is used in concurrent actions.

This commit was sponsored by Kevin Mueller on Patreon.
2021-02-03 15:57:19 -04:00
Joey Hess
135757d64a
automatic stall detection
annex.stalldetection can now be set to "true" to make git-annex do
automatic stall detection when it detects a remote is updating its transfer
progress consistently enough.

This commit was sponsored by Luke Shumaker on Patreon.
2021-02-03 13:33:57 -04:00
Joey Hess
904689f11b
Merge branch 'master' of ssh://git-annex.branchable.com 2021-02-02 19:39:24 -04:00
guardcat
ebb805b611 2021-02-02 21:04:21 +00:00
Joey Hess
aec2cf0abe
addon commands
Seems only fair, that, like git runs git-annex, git-annex runs
git-annex-foo.

Implementation relies on O.forwardOptions, so that any options are passed
through to the addon program. Note that this includes options before the
subcommand, eg: git-annex -cx=y foo

Unfortunately, git-annex eats the --help/-h options.
This is because it uses O.hsubparser, which injects that option into each
subcommand. Seems like this should be possible to avoid somehow, to let
commands display their own --help, instead of the dummy one git-annex
displays.

The two step searching mirrors how git works, it makes finding
git-annex-foo fast when "git annex foo" is run, but will also support fuzzy
matching, once findAllAddonCommands gets implemented.

This commit was sponsored by Dr. Land Raider on Patreon.
2021-02-02 16:32:49 -04:00
Joey Hess
d0fe0c5e10
close old todo 2021-02-02 13:25:58 -04:00
Joey Hess
696ee5d464
close 2021-02-02 13:21:34 -04:00
Joey Hess
233b2ab133
reject 2021-02-02 13:19:28 -04:00
Joey Hess
80a16a9dc8
remove no longer relevant part 2021-02-02 13:15:32 -04:00
Joey Hess
811399c8a1
meant to close this earlier 2021-02-02 13:12:47 -04:00
Joey Hess
cdbf80d338
comment 2021-02-02 13:10:07 -04:00
Joey Hess
5cad76198a
close 2021-02-02 12:52:00 -04:00
Joey Hess
d631304237
remove priority tag (unused) 2021-02-02 12:42:20 -04:00
Joey Hess
02b5bd224e
move to todo 2021-01-29 13:07:49 -04:00
Joey Hess
b372d962ae
Added GETGITREMOTENAME to extenal special remote protocol 2021-01-26 12:42:47 -04:00
Joey Hess
a11dad646c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-01-26 11:32:52 -04:00
Joey Hess
8a8491b5b9
update 2021-01-26 11:32:40 -04:00
Joey Hess
4d8ce4464f
add 2021-01-26 11:27:14 -04:00
hello@6d9a437cebceb2fc657f93c4f30a7ba859e9309c
8d3bc1d3cb Added a comment: Other benefits of proposed improvements to the smudge/clean API 2021-01-26 14:25:59 +00:00
Joey Hess
9b2711167c
Merge branch 'master' of ssh://git-annex.branchable.com 2021-01-19 13:20:22 -04:00
Joey Hess
73df633a62
omit inode from ContentIdentifier for directory special remote
Directory special remotes with importtree=yes now avoid unncessary overhead
when inodes of files have changed, as happens whenever a FAT filesystem
gets remounted.

A few unusual edge cases of modifications won't be detected and
imported. I think they're unusual enough not to be a concern. It would
be possible to add a config setting that controls whether to compare
inodes too, but does not seem worth bothering the user about currently.

I chose to continue to use the InodeCache serialization, just with the
inode zeroed. This way, if I later change my mind or make it
configurable, can parse it back to an InodeCache and operate on it. The
overhead of storing a 0 in the content identifier log seems worth it.

There is a one-time cost to this change; all directory special remotes
with importtree=yes will re-hash all files once, and will update the
content identifier logs with zeroed inodes.

This commit was sponsored by Brett Eisenberg on Patreon.
2021-01-19 13:15:07 -04:00
Lukey
b11aa063ae Added a comment 2021-01-19 16:27:07 +00:00
Joey Hess
2b458c2d68
comment and todo 2021-01-19 11:56:27 -04:00
Joey Hess
96a7a1fb71
close 2021-01-11 12:26:52 -04:00
Joey Hess
515f54bd70
idea 2021-01-10 16:32:44 -04:00
Joey Hess
4694c2bb87
Merge branch 'master' into requirednumcopies 2021-01-06 14:24:09 -04:00
Joey Hess
cc89699457
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.

Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.

This commit was sponsored by Denis Dzyubenko on Patreon.
2021-01-06 14:15:19 -04:00
Joey Hess
8d8cdbec56
branch 2021-01-05 14:28:54 -04:00
Joey Hess
5ce61c6b2a
add: Significantly speed up adding lots of non-large files to git
* add: Significantly speed up adding lots of non-large files to git,
  by disabling the annex smudge filter when running git add.
* add --force-small: Run git add rather than updating the index itself,
  so any other smudge filters than the annex one that may be enabled will
  be used.
2021-01-04 13:12:28 -04:00
Joey Hess
e7b0754171
comment and todo 2021-01-04 12:26:48 -04:00
Joey Hess
9a3998392e
close import_tree todo
Split out two todos for things that were mentioned as still open items
in there. Most of the others were already dealt with. I didn't open a
new todo for the import from readonly S3 bucket because I guess if
someone needs that, they can ask for it.
2020-12-30 13:40:49 -04:00
Joey Hess
b16e6fb4e6
borg appendonly config 2020-12-28 16:23:38 -04:00
Joey Hess
6280af2901
generate more compact git-annex branch for imports
Especially from borg, where the content identifier logs
all end up being the same identical file!

But also, for other imports, the location tracking logs can,
in some cases, be identical files.

Bonus optimisation: Avoid looking up (and parsing when set)
GIT_ANNEX_VECTOR_CLOCK env var every time a log is written to.
Although the lookup does happen at startup even when no
log will be written now.
2020-12-23 15:25:16 -04:00
Joey Hess
7916fc98a3
graft in imported tree to avoid gc
Fix a bug that could prevent getting files from an importtree=yes remote,
because the imported tree was allowed to be garbage collected.
2020-12-23 14:27:38 -04:00
Joey Hess
e3d356fe84
borg: add subdir= config
Note that, after changing it with enableremote, syncing won't rescan
known archives in the borg repo using the changed config. Probably not a
problem?

Also used File in some places where filenames that could theoretically
start with - are passed to borg, to avoid it confusing them with
options.
2020-12-23 13:12:11 -04:00
Joey Hess
1574972ba9
make sync --content get from third-party populated remotes like borg 2020-12-23 12:10:39 -04:00
Joey Hess
79729bcd76
todo 2020-12-22 16:37:19 -04:00
Joey Hess
a2fe994ebb
move unimplemented option to todo 2020-12-22 16:28:13 -04:00
Joey Hess
cd4c68924b
merged borg
Still a couple related todos, but it's basically usable now.
2020-12-22 16:22:44 -04:00
Joey Hess
2335476e1e
todo 2020-12-22 16:19:02 -04:00
Joey Hess
4254e2297d
implement retrieveExportWithContentIdentifier
Moved out an XXX to a todo

This seems about ready to merge..
2020-12-22 16:16:48 -04:00
Joey Hess
f31bdd0b19
todo 2020-12-22 15:01:07 -04:00
Joey Hess
82e43da936
todo 2020-12-22 15:00:11 -04:00
Joey Hess
f62aee0525
fix handling of importtree-only remotes
Don't want to try to use these remotes as key/value remotes, which will
surely fail. It only recently became possible for importtree to be set
w/o exporttree, so before this code was ok.

(cherry picked from commit 97599cb0f7f4115aa5a3e81a91ee3d1d6c52dc84)
2020-12-18 15:13:30 -04:00
Joey Hess
4c63cab467
todo 2020-12-17 16:30:51 -04:00
Joey Hess
2abda21123
update 2020-12-15 16:35:06 -04:00
Joey Hess
117d270bb4
comment 2020-12-15 16:34:16 -04:00
Joey Hess
74c1e0660b
propagate git-annex -c on to transferrer child process
git -c was already propagated via environment, but need this for
consistency.

Also, notice it does not use gitAnnexChildProcess to run the
transferrer. So nothing is done about avoid it taking the
pid lock. It's possible that the caller is already doing something that
took the pid lock, and if so, the transferrer will certianly fail,
since it needs to take the pid lock too. This may prevent combining
annex.stalldetection with annex.pidlock, but I have not verified it's
really a problem. If it was, it seems git-annex would have to take
the pid lock when starting a transferrer, and hold it until shutdown,
or would need to take pid lock when starting to use a transferrer,
and hold it until done with a transfer and then drop it. The latter
would require starting the transferrer with pid locking disabled for the
child process, so assumes that the transferrer does not do anyting that
needs locking when not running a transfer.
2020-12-15 11:36:25 -04:00
Joey Hess
75acf5f440
improve some edge cases around partial initialization
* Guard against running in a repo where annex.uuid is set but
  annex.version is set, or vice-versa.
* Avoid autoinit when a repo does not have annex.version or annex.uuid
  set, but has a git-annex objects directory, suggesting it was used
  by git-annex before.
2020-12-14 13:17:43 -04:00
Joey Hess
9aaab02e44
add 2020-12-13 18:50:35 -04:00
Joey Hess
3c76a31b15
response and related todo 2020-12-11 16:21:16 -04:00
Joey Hess
e6692b66f1
remove
I need to think about this some more, not clear if it's a todo
item specific to stalldetection at all. Remotes with this behavior
also show no progress when run with -J. And some other remotes don't
update any progress meters at all, eg adb is that way and so are hook
remotes and of course external remotes don't have to send progress info.
2020-12-11 15:49:39 -04:00
Joey Hess
fadf47557f
note 2020-12-11 15:48:03 -04:00
Joey Hess
90eadbce49
close 2020-12-11 15:45:55 -04:00
Joey Hess
d3f78da0ed
propagate signals to the transferrer process group
Done on unix, could not implement it on windows quite.

The signal library gets part of the way needed for windows.
But I had to open https://github.com/pmlodawski/signal/issues/1 because
it lacks raiseSignal.

Also, I don't know what the equivilant of getProcessGroupIDOf is on
windows. And System.Process does not provide a way to send any signal to
a process group except for SIGINT.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-12-11 15:32:00 -04:00
Joey Hess
79c765b727
meant to close this earlier 2020-12-11 14:25:54 -04:00
Joey Hess
095cdc7e83
extend transferrer protocol to send progress bar total size updates
New protocol is not back-compat with old one, but it's never been
released so that's ok.
2020-12-11 12:42:28 -04:00
Joey Hess
263fd1d459
ugh 2020-12-11 11:51:46 -04:00
Joey Hess
a422a056f2
make getViaTmpFrom no longer update location log
All callers adjusted to update it themselves.

In Command.ReKey, and Command.SetKey, the cleanup action already did,
so it was updating the log twice before.

This fixes a bug when annex.stalldetection is set, as now
Command.Transferrer can skip updating the location log, and let it be
updated by the calling process.
2020-12-11 11:50:13 -04:00
Joey Hess
a6ed23b82f
todo 2020-12-11 11:09:19 -04:00
Joey Hess
d3c11eac5d
2 bugs involving new stalldetection feature 2020-12-10 17:46:58 -04:00
Joey Hess
3fa2bc2eed
rename back, there were links to this 2020-12-09 13:14:54 -04:00
Joey Hess
05c0543e8e
move new interface to git-annex transfer
This is to avoid breakage when upgrading or downgrading git-annex with a
process running that uses the interface. It's better to keep the
compatability code for a few years than worry about such breakage.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-12-09 12:33:56 -04:00
Joey Hess
41f2c308ff
stall detection is working
New config annex.stalldetection, remote.name.annex-stalldetection, which
can be used to deal with remotes that stall during transfers, or are
sometimes too slow to want to use.

This commit was sponsored by Luke Shumaker on Patreon.
2020-12-08 15:22:18 -04:00
Joey Hess
09ed9f7d1f
reminder for later 2020-12-08 15:20:05 -04:00
Joey Hess
c4d489f7d4
add todo
Not going to do this yet, so remember for later.
2020-12-08 15:17:35 -04:00
Joey Hess
438d5be1f7
support prompt in message serialization
That seems to be the last thing needed for message serialization.
Although it's only used in the assistant currently, so hard to tell if I
forgot something.

At this point, it should be possible to start using transferkeys
when performing transfers, which will allow killing a transferkeys
process if a transfer times out or stalls. But that's for another day.

This commit was sponsored by Ethan Aubin.
2020-12-04 14:54:09 -04:00
Joey Hess
4d9f416949
idea 2020-12-04 00:00:40 -04:00
Joey Hess
bf76ae2c90
mention new branch 2020-12-03 16:29:22 -04:00
Joey Hess
1858b65d88
design work 2020-12-02 14:31:24 -04:00
Joey Hess
ee86972f66
thought 2020-11-30 13:27:45 -04:00
Joey Hess
e843334a40
comment 2020-11-30 13:13:56 -04:00
Lukey
36b4a253e7 2020-11-29 19:06:37 +00:00
Lukey
7e86da7701 2020-11-29 19:04:34 +00:00
Joey Hess
a6f7017eba
Merge branch 'master' of ssh://git-annex.branchable.com 2020-11-23 13:00:36 -04:00
Joey Hess
8b8ee68a9c
update 2020-11-23 13:00:20 -04:00
Cebtenzzre
8816f0d8ab Added a comment 2020-11-22 16:07:16 +00:00
Joey Hess
1be38362aa
retitle 2020-11-16 15:17:48 -04:00
Joey Hess
805af01562
bug fix
really innefficient but it does solve dropping
2020-11-16 14:57:51 -04:00
Joey Hess
0896038ba7
annex.adjustedbranchrefresh
Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to one file, then eg, `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey likewise (so sends over ssh from old versions
  of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
2020-11-16 14:27:28 -04:00
Joey Hess
26cf26caca
Merge branch 'master' into symlink-missing 2020-11-16 10:03:12 -04:00
Joey Hess
5a8d01f63e
examinekey: Added a "file" format variable
For consistency with find, and for easier scripting.
2020-11-16 09:59:11 -04:00
yarikoptic
13bab4f2cf Added a comment 2020-11-14 02:00:13 +00:00
Joey Hess
f07670a282
measurement 2020-11-13 15:57:35 -04:00
Joey Hess
56aabccda4
close 2020-11-13 15:54:33 -04:00
Joey Hess
b9351922d2
add todo 2020-11-13 15:50:35 -04:00
Joey Hess
7566aa6bc5
examinekey: Added --migrate-to-backend
Note that, the way the SeekInput parser is written to support batch mode,
it's actually possible to do git-annex examinekey
"SHA1--foo foo.tar.gz" --migrate-to-backend=SHA1E

While that might be kind of useful to support multiple migrations not using
batch mode, I have not documented it. It would be better to take pairs of
key and file in that case.
2020-11-12 14:09:14 -04:00
Joey Hess
12e32d1dee
examinekey: Added two new format variables: objectpath and objectpointer 2020-11-12 13:02:31 -04:00
Joey Hess
c5141b469a
comment 2020-11-12 12:59:27 -04:00
Joey Hess
d7da4ee00a
comment 2020-11-12 12:29:15 -04:00
yarikoptic
60a71f90cc adding a note pointing to tentative recipe 2020-11-11 20:10:33 +00:00
yarikoptic
07e9f43c63 todo/question on how to get full path to the key knowing metadata but having no file 2020-11-11 19:29:14 +00:00
Lukey
9d598265e4 2020-11-05 18:41:25 +00:00
Joey Hess
664bec4297
comment 2020-11-02 22:34:04 -04:00
yarikoptic
5b4d5f6d64 Added a comment 2020-11-02 20:37:47 +00:00
Joey Hess
40679616ed
comment 2020-11-02 15:01:17 -04:00
yarikoptic
79874325b8 a plea for more --debug output 2020-11-02 18:04:20 +00:00
michael.hanke@c60e12358aa3fc6060531bdead1f530ac4d582ec
0170b5468f Added a comment: Documentation of demand 2020-10-27 14:59:47 +00:00
yarikoptic
c24480c061 Added a comment: windows build with magic 2020-10-22 19:32:55 +00:00
Joey Hess
0133b7e5a8
move: Improve resuming a move that was interrupted after the object was transferred
In cases where numcopies checks prevented the resumed move from dropping
the object from the source repository, it now relies on a log of recent
moves to replicate the behavior of the interrupted command.

Performance: Probably noticable impact, since it has to add to the log,
check the log, and remove from the log. Seems worth it to avoid this
annoying edge case. The log functions are pretty well optimised to avoid
unncessary work.

An performance improvement to make later would be to avoid cleanup doing
anything if it's not written to the log file, and has confirmed that the
log file does not contain the log line.

This commit was sponsored by Jake Vosloo on Patreon.
2020-10-21 10:31:56 -04:00
Joey Hess
15ea0e6c0a
design done 2020-10-19 14:57:02 -04:00
Joey Hess
5009c1ce68
update 2020-10-19 14:49:16 -04:00
Joey Hess
c337b58caf
more thoughts 2020-10-19 14:19:41 -04:00
Joey Hess
3c6cfacb19
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-19 12:19:33 -04:00
Joey Hess
e408658eaa
close 2020-10-19 12:16:15 -04:00
Ilya_Shlyakhter
1ac3735ac9 Added a comment: copying directly from directory special remote to the cloud 2020-10-18 21:00:59 +00:00
Joey Hess
e4a8b7b26f
comment 2020-10-13 18:57:30 -04:00
Joey Hess
9a5cd96f0d
Fix a memory leak introduced in the last release
The problem was this line:

	cleanup = and <$> sequence (map snd v)

That caused all of v to be held onto until the end, when the cleanup action
was run.

I could not seem to find a bang pattern that avoided the leak, so I
resorted to a IORef, rather clunky, but not a performance problem because
it will only be written once per git ls-files, so typically just 1 time.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-10-13 16:31:01 -04:00
Joey Hess
d41795c8ff
blew a day investigating this and still don't understand it fully 2020-10-13 15:33:59 -04:00
Joey Hess
cf0ff9f53d
comment 2020-10-13 12:24:13 -04:00
Joey Hess
b72a5c0680
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-12 16:14:51 -04:00
Joey Hess
db1def72ee
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 16:07:30 -04:00
Joey Hess
4124862ae0
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 15:47:46 -04:00
Joey Hess
fc16057f99
profiling
Currently a bit stuck, but at least starting to get some clues on this
memory leak.

This commit was sponsored by Ethan Aubin.
2020-10-12 14:48:26 -04:00
Joey Hess
81216931c6
comment 2020-10-12 12:32:53 -04:00
Lukey
308e5b3a81 Added a comment 2020-10-11 14:17:47 +00:00
Joey Hess
46371797ec
design for finally slaying this beast
This commit was sponsored by Ethan Aubin.
2020-10-08 10:09:12 -04:00
yarikoptic
8035e2073d Added a comment 2020-10-05 23:17:08 +00:00
Joey Hess
471bcfaf37
comment 2020-10-05 14:51:19 -04:00
kyle
a39f9e8c7f Added a comment 2020-10-05 18:09:07 +00:00
yarikoptic
0bb7569e1c Added a comment 2020-10-05 18:02:51 +00:00
kyle
64ca40e11c Added a comment 2020-10-05 17:42:26 +00:00
yarikoptic
4bb980bfb9 Added a comment 2020-10-05 15:26:10 +00:00
kyle
742ce7ec87 Added a comment 2020-10-01 19:07:48 +00:00
yarikoptic
1bf55a8fc1 Added a comment 2020-10-01 17:26:10 +00:00
Joey Hess
37b1f2f2ed
response 2020-10-01 13:08:09 -04:00
yarikoptic
23103e225a Added a comment 2020-10-01 14:06:34 +00:00
Joey Hess
1610d94776
addurl: Avoid a redundant git ignores check for speed
Ensure that checkCanAdd is used everywhere a file is added to git,
so git add is run with -f, presumably avoiding the work it would usually
do to check ignores.
2020-09-29 13:00:41 -04:00
Joey Hess
d10cbaa084
comment 2020-09-29 12:25:40 -04:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
a3083436dc removed 2020-09-29 08:36:37 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
7b37f22880 Added a comment 2020-09-29 08:36:19 +00:00
lykos@d125a37d89b1cfac20829f12911656c40cb70018
ff0784dad9 Added a comment 2020-09-29 08:32:22 +00:00
Joey Hess
3eaaec3113
consistently use importKey when available
This avoids import with --no-content and with --content potentially
generating two different trees, leading to a merge conflict when run in
two different clones of a repo. And it's necessary groundwork to make
git-annex sync --no-content import from special remotes that support
importKey.

Only the directory special remote currently supports importKey, and it
generates the same key as git-annex usually does, so there is no
behavior change for it.

Future special remotes will need to take care when adding importKey,
if it generates different keys. Added some warnings about that to
comments.

This commit was sponsored by Noam Kremen on Patreon.
2020-09-28 15:27:46 -04:00
Joey Hess
15c1ee16d9
import --no-content: Check annex.largefiles
Import small files into git, the same as is done when importing with content.
Which means, for small files, --no-content does download them.

If the largefiles expression needs the file content available
(due to mimetype or mimeencoding being used), the import will fail.

This commit was sponsored by Jake Vosloo on Patreon.
2020-09-28 13:28:57 -04:00
Joey Hess
9e676f062f
split out todo 2020-09-28 10:40:13 -04:00
Joey Hess
6a41a615b9
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-09-25 13:51:20 -04:00
Joey Hess
13f9c88123
add todo 2020-09-25 13:51:04 -04:00
Lukey
12eb7a3ceb Added a comment 2020-09-25 16:33:42 +00:00
Joey Hess
ace02f41b0
seek: defer matcher check until more info is known
Sped up seeking for files to operate on, when using options like --copies
or --in, by around 20%.

Benchmark showed an increase for --copies from 155 seconds to 121
seconds, and --in remote will be similar to that.

For --in here, the speedup was less, 5-10% or so.

(both warm cache)

This commit was sponsored by Jack Hill on Patreon.
2020-09-24 17:59:12 -04:00
Joey Hess
d89984b121
sync --all avoid unncessary first pass
Sped up seeking to around twice as fast, by avoiding a pass over the
worktree files when preferred content expressions of the local repo and
remotes don't use include=/exclude=.

Thanks to Lukey for identifying the optimisation.

This commit was sponsored by Brock Spratlen on Patreon.
2020-09-24 15:12:09 -04:00
Joey Hess
c1b4d76e6b
make MatchFiles introspectable
matchNeedsFileContent is not used yet, but shows how to add information
about terminals. That one would be needed for
https://git-annex.branchable.com/todo/sync_fast_import/

Note the tricky bit in Annex.FileMatcher.call where it folds over the
included matcher to propagate the information.

This commit was sponsored by Svenne Krap on Patreon.
2020-09-24 14:01:53 -04:00