Commit graph

3400 commits

Author SHA1 Message Date
Joey Hess
73df633a62
omit inode from ContentIdentifier for directory special remote
Directory special remotes with importtree=yes now avoid unnecessary overhead
when inodes of files have changed, as happens whenever a FAT filesystem
gets remounted.

A few unusual edge cases of modifications won't be detected and
imported. I think they're unusual enough not to be a concern. It would
be possible to add a config setting that controls whether to compare
inodes too, but does not seem worth bothering the user about currently.

I chose to continue to use the InodeCache serialization, just with the
inode zeroed. This way, if I later change my mind or make it
configurable, can parse it back to an InodeCache and operate on it. The
overhead of storing a 0 in the content identifier log seems worth it.

There is a one-time cost to this change; all directory special remotes
with importtree=yes will re-hash all files once, and will update the
content identifier logs with zeroed inodes.

This commit was sponsored by Brett Eisenberg on Patreon.
2021-01-19 13:15:07 -04:00
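The serialization choice described in commit 73df633a62 above can be illustrated with a small sketch. This is not git-annex's code: the `InodeCache` record here, its field names, and the space-separated format are assumptions made for illustration. It only shows why writing a zeroed inode keeps the value parseable back into the same type while making the identifier independent of inode changes.

```haskell
-- Hypothetical stand-in for git-annex's InodeCache; the fields and the
-- space-separated text format are assumptions for this sketch.
data InodeCache = InodeCache
    { inode :: Integer
    , size :: Integer
    , mtime :: Integer
    } deriving (Eq, Show)

-- Serialize with the inode field always written as 0, so two stats of the
-- same file that differ only in inode yield the same identifier.
serializeZeroed :: InodeCache -> String
serializeZeroed c = unwords (map show [0, size c, mtime c])

-- Because the format still has an inode slot, the identifier can be
-- parsed back into an InodeCache later.
parseInodeCache :: String -> Maybe InodeCache
parseInodeCache s = case mapM readInteger (words s) of
    Just [i, sz, mt] -> Just (InodeCache i sz mt)
    _ -> Nothing
  where
    readInteger :: String -> Maybe Integer
    readInteger w = case reads w of
        [(n, "")] -> Just n
        _ -> Nothing
```

Under these assumptions, a remount that changes only the inode leaves the serialized identifier unchanged, so the file is not treated as modified.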
Lukey
b11aa063ae Added a comment 2021-01-19 16:27:07 +00:00
Joey Hess
2b458c2d68
comment and todo 2021-01-19 11:56:27 -04:00
Joey Hess
96a7a1fb71
close 2021-01-11 12:26:52 -04:00
Joey Hess
515f54bd70
idea 2021-01-10 16:32:44 -04:00
Joey Hess
4694c2bb87
Merge branch 'master' into requirednumcopies 2021-01-06 14:24:09 -04:00
Joey Hess
cc89699457
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.

Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.

This commit was sponsored by Denis Dzyubenko on Patreon.
2021-01-06 14:15:19 -04:00
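The behavior change described in commit cc89699457 above can be modelled roughly as follows. This is a deliberately simplified sketch, not the real check: `canDrop` and its arguments are hypothetical names, and the model collapses the different ways a copy can be verified into a single count.

```haskell
-- Simplified model (an assumption, not git-annex's implementation):
-- a drop is allowed only when enough other copies have been verified
-- to satisfy both numcopies and mincopies.
canDrop :: Int -> Int -> Int -> Bool
canDrop numcopies mincopies verifiedOtherCopies =
    verifiedOtherCopies >= max numcopies mincopies
```

In this model, `canDrop 0 1 0` is `False`: setting numcopies to 0 alone no longer permits dropping with no other copies. `canDrop 0 0 0` is `True`, matching the commit's note that mincopies must also be set to 0 to get the old (highly unsafe) behavior.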
Joey Hess
8d8cdbec56
branch 2021-01-05 14:28:54 -04:00
Joey Hess
5ce61c6b2a
add: Significantly speed up adding lots of non-large files to git
* add: Significantly speed up adding lots of non-large files to git,
  by disabling the annex smudge filter when running git add.
* add --force-small: Run git add rather than updating the index itself,
  so any other smudge filters than the annex one that may be enabled will
  be used.
2021-01-04 13:12:28 -04:00
Joey Hess
e7b0754171
comment and todo 2021-01-04 12:26:48 -04:00
Joey Hess
9a3998392e
close import_tree todo
Split out two todos for things that were mentioned as still open items
in there. Most of the others were already dealt with. I didn't open a
new todo for the import from readonly S3 bucket because I guess if
someone needs that, they can ask for it.
2020-12-30 13:40:49 -04:00
Joey Hess
b16e6fb4e6
borg appendonly config 2020-12-28 16:23:38 -04:00
Joey Hess
6280af2901
generate more compact git-annex branch for imports
Especially from borg, where the content identifier logs
all end up being the same identical file!

But also, for other imports, the location tracking logs can,
in some cases, be identical files.

Bonus optimisation: Avoid looking up (and parsing when set)
GIT_ANNEX_VECTOR_CLOCK env var every time a log is written to.
Although the lookup does happen at startup even when no
log will be written now.
2020-12-23 15:25:16 -04:00
Joey Hess
7916fc98a3
graft in imported tree to avoid gc
Fix a bug that could prevent getting files from an importtree=yes remote,
because the imported tree was allowed to be garbage collected.
2020-12-23 14:27:38 -04:00
Joey Hess
e3d356fe84
borg: add subdir= config
Note that, after changing it with enableremote, syncing won't rescan
known archives in the borg repo using the changed config. Probably not a
problem?

Also used File in some places where filenames that could theoretically
start with - are passed to borg, to avoid it confusing them with
options.
2020-12-23 13:12:11 -04:00
Joey Hess
1574972ba9
make sync --content get from third-party populated remotes like borg 2020-12-23 12:10:39 -04:00
Joey Hess
79729bcd76
todo 2020-12-22 16:37:19 -04:00
Joey Hess
a2fe994ebb
move unimplemented option to todo 2020-12-22 16:28:13 -04:00
Joey Hess
cd4c68924b
merged borg
Still a couple related todos, but it's basically usable now.
2020-12-22 16:22:44 -04:00
Joey Hess
2335476e1e
todo 2020-12-22 16:19:02 -04:00
Joey Hess
4254e2297d
implement retrieveExportWithContentIdentifier
Moved out an XXX to a todo

This seems about ready to merge..
2020-12-22 16:16:48 -04:00
Joey Hess
f31bdd0b19
todo 2020-12-22 15:01:07 -04:00
Joey Hess
82e43da936
todo 2020-12-22 15:00:11 -04:00
Joey Hess
f62aee0525
fix handling of importtree-only remotes
Don't want to try to use these remotes as key/value remotes, which will
surely fail. It only recently became possible for importtree to be set
w/o exporttree, so before this code was ok.

(cherry picked from commit 97599cb0f7f4115aa5a3e81a91ee3d1d6c52dc84)
2020-12-18 15:13:30 -04:00
Joey Hess
4c63cab467
todo 2020-12-17 16:30:51 -04:00
Joey Hess
2abda21123
update 2020-12-15 16:35:06 -04:00
Joey Hess
117d270bb4
comment 2020-12-15 16:34:16 -04:00
Joey Hess
74c1e0660b
propagate git-annex -c on to transferrer child process
git -c was already propagated via environment, but need this for
consistency.

Also, notice it does not use gitAnnexChildProcess to run the
transferrer. So nothing is done about avoiding it taking the
pid lock. It's possible that the caller is already doing something that
took the pid lock, and if so, the transferrer will certainly fail,
since it needs to take the pid lock too. This may prevent combining
annex.stalldetection with annex.pidlock, but I have not verified it's
really a problem. If it was, it seems git-annex would have to take
the pid lock when starting a transferrer, and hold it until shutdown,
or would need to take pid lock when starting to use a transferrer,
and hold it until done with a transfer and then drop it. The latter
would require starting the transferrer with pid locking disabled for the
child process, so assumes that the transferrer does not do anything that
needs locking when not running a transfer.
2020-12-15 11:36:25 -04:00
Joey Hess
75acf5f440
improve some edge cases around partial initialization
* Guard against running in a repo where annex.uuid is set but
  annex.version is not set, or vice-versa.
* Avoid autoinit when a repo does not have annex.version or annex.uuid
  set, but has a git-annex objects directory, suggesting it was used
  by git-annex before.
2020-12-14 13:17:43 -04:00
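The two guards listed in commit 75acf5f440 above amount to a small decision table. A minimal sketch, assuming the two config values are read as optional strings and the presence of an objects directory as a boolean; the type and function names are hypothetical and do not come from git-annex:

```haskell
-- Hypothetical classification of a repository's initialization state.
data InitState
    = Initialized          -- both annex.uuid and annex.version are set
    | Uninitialized        -- neither is set, and no sign of earlier use
    | PartiallyInitialized -- exactly one of the two is set: refuse to run
    | UsedBefore           -- neither is set, but an objects directory exists:
                           -- avoid autoinit
    deriving (Eq, Show)

-- Arguments: annex.uuid, annex.version, and whether a git-annex objects
-- directory is present.
classify :: Maybe String -> Maybe String -> Bool -> InitState
classify uuid version hasObjectsDir = case (uuid, version) of
    (Just _, Just _) -> Initialized
    (Nothing, Nothing)
        | hasObjectsDir -> UsedBefore
        | otherwise -> Uninitialized
    _ -> PartiallyInitialized
```

Only the `Uninitialized` case would be a candidate for automatic initialization under this reading of the commit message.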
Joey Hess
9aaab02e44
add 2020-12-13 18:50:35 -04:00
Joey Hess
3c76a31b15
response and related todo 2020-12-11 16:21:16 -04:00
Joey Hess
e6692b66f1
remove
I need to think about this some more, not clear if it's a todo
item specific to stalldetection at all. Remotes with this behavior
also show no progress when run with -J. And some other remotes don't
update any progress meters at all, eg adb is that way and so are hook
remotes and of course external remotes don't have to send progress info.
2020-12-11 15:49:39 -04:00
Joey Hess
fadf47557f
note 2020-12-11 15:48:03 -04:00
Joey Hess
90eadbce49
close 2020-12-11 15:45:55 -04:00
Joey Hess
d3f78da0ed
propagate signals to the transferrer process group
Done on unix, could not quite implement it on windows.

The signal library gets part of the way needed for windows.
But I had to open https://github.com/pmlodawski/signal/issues/1 because
it lacks raiseSignal.

Also, I don't know what the equivalent of getProcessGroupIDOf is on
windows. And System.Process does not provide a way to send any signal to
a process group except for SIGINT.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2020-12-11 15:32:00 -04:00
Joey Hess
79c765b727
meant to close this earlier 2020-12-11 14:25:54 -04:00
Joey Hess
095cdc7e83
extend transferrer protocol to send progress bar total size updates
New protocol is not back-compat with old one, but it's never been
released so that's ok.
2020-12-11 12:42:28 -04:00
Joey Hess
263fd1d459
ugh 2020-12-11 11:51:46 -04:00
Joey Hess
a422a056f2
make getViaTmpFrom no longer update location log
All callers adjusted to update it themselves.

In Command.ReKey, and Command.SetKey, the cleanup action already did,
so it was updating the log twice before.

This fixes a bug when annex.stalldetection is set, as now
Command.Transferrer can skip updating the location log, and let it be
updated by the calling process.
2020-12-11 11:50:13 -04:00
Joey Hess
a6ed23b82f
todo 2020-12-11 11:09:19 -04:00
Joey Hess
d3c11eac5d
2 bugs involving new stalldetection feature 2020-12-10 17:46:58 -04:00
Joey Hess
3fa2bc2eed
rename back, there were links to this 2020-12-09 13:14:54 -04:00
Joey Hess
05c0543e8e
move new interface to git-annex transfer
This is to avoid breakage when upgrading or downgrading git-annex with a
process running that uses the interface. It's better to keep the
compatibility code for a few years than worry about such breakage.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-12-09 12:33:56 -04:00
Joey Hess
41f2c308ff
stall detection is working
New config annex.stalldetection, remote.name.annex-stalldetection, which
can be used to deal with remotes that stall during transfers, or are
sometimes too slow to want to use.

This commit was sponsored by Luke Shumaker on Patreon.
2020-12-08 15:22:18 -04:00
Joey Hess
09ed9f7d1f
reminder for later 2020-12-08 15:20:05 -04:00
Joey Hess
c4d489f7d4
add todo
Not going to do this yet, so remember for later.
2020-12-08 15:17:35 -04:00
Joey Hess
438d5be1f7
support prompt in message serialization
That seems to be the last thing needed for message serialization.
Although it's only used in the assistant currently, so hard to tell if I
forgot something.

At this point, it should be possible to start using transferkeys
when performing transfers, which will allow killing a transferkeys
process if a transfer times out or stalls. But that's for another day.

This commit was sponsored by Ethan Aubin.
2020-12-04 14:54:09 -04:00
Joey Hess
4d9f416949
idea 2020-12-04 00:00:40 -04:00
Joey Hess
bf76ae2c90
mention new branch 2020-12-03 16:29:22 -04:00
Joey Hess
1858b65d88
design work 2020-12-02 14:31:24 -04:00
Joey Hess
ee86972f66
thought 2020-11-30 13:27:45 -04:00
Joey Hess
e843334a40
comment 2020-11-30 13:13:56 -04:00
Lukey
36b4a253e7 2020-11-29 19:06:37 +00:00
Lukey
7e86da7701 2020-11-29 19:04:34 +00:00
Joey Hess
a6f7017eba
Merge branch 'master' of ssh://git-annex.branchable.com 2020-11-23 13:00:36 -04:00
Joey Hess
8b8ee68a9c
update 2020-11-23 13:00:20 -04:00
Cebtenzzre
8816f0d8ab Added a comment 2020-11-22 16:07:16 +00:00
Joey Hess
1be38362aa
retitle 2020-11-16 15:17:48 -04:00
Joey Hess
805af01562
bug fix
really inefficient but it does solve dropping
2020-11-16 14:57:51 -04:00
Joey Hess
0896038ba7
annex.adjustedbranchrefresh
Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to one file, then eg, `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey likewise (so sends over ssh from old versions
  of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
2020-11-16 14:27:28 -04:00
Joey Hess
26cf26caca
Merge branch 'master' into symlink-missing 2020-11-16 10:03:12 -04:00
Joey Hess
5a8d01f63e
examinekey: Added a "file" format variable
For consistency with find, and for easier scripting.
2020-11-16 09:59:11 -04:00
yarikoptic
13bab4f2cf Added a comment 2020-11-14 02:00:13 +00:00
Joey Hess
f07670a282
measurement 2020-11-13 15:57:35 -04:00
Joey Hess
56aabccda4
close 2020-11-13 15:54:33 -04:00
Joey Hess
b9351922d2
add todo 2020-11-13 15:50:35 -04:00
Joey Hess
7566aa6bc5
examinekey: Added --migrate-to-backend
Note that, the way the SeekInput parser is written to support batch mode,
it's actually possible to do git-annex examinekey
"SHA1--foo foo.tar.gz" --migrate-to-backend=SHA1E

While that might be kind of useful to support multiple migrations not using
batch mode, I have not documented it. It would be better to take pairs of
key and file in that case.
2020-11-12 14:09:14 -04:00
Joey Hess
12e32d1dee
examinekey: Added two new format variables: objectpath and objectpointer 2020-11-12 13:02:31 -04:00
Joey Hess
c5141b469a
comment 2020-11-12 12:59:27 -04:00
Joey Hess
d7da4ee00a
comment 2020-11-12 12:29:15 -04:00
yarikoptic
60a71f90cc adding a note pointing to tentative recipe 2020-11-11 20:10:33 +00:00
yarikoptic
07e9f43c63 todo/question on how to get full path to the key knowing metadata but having no file 2020-11-11 19:29:14 +00:00
Lukey
9d598265e4 2020-11-05 18:41:25 +00:00
Joey Hess
664bec4297
comment 2020-11-02 22:34:04 -04:00
yarikoptic
5b4d5f6d64 Added a comment 2020-11-02 20:37:47 +00:00
Joey Hess
40679616ed
comment 2020-11-02 15:01:17 -04:00
yarikoptic
79874325b8 a plea for more --debug output 2020-11-02 18:04:20 +00:00
michael.hanke@c60e12358aa3fc6060531bdead1f530ac4d582ec
0170b5468f Added a comment: Documentation of demand 2020-10-27 14:59:47 +00:00
yarikoptic
c24480c061 Added a comment: windows build with magic 2020-10-22 19:32:55 +00:00
Joey Hess
0133b7e5a8
move: Improve resuming a move that was interrupted after the object was transferred
In cases where numcopies checks prevented the resumed move from dropping
the object from the source repository, it now relies on a log of recent
moves to replicate the behavior of the interrupted command.

Performance: Probably noticeable impact, since it has to add to the log,
check the log, and remove from the log. Seems worth it to avoid this
annoying edge case. The log functions are pretty well optimised to avoid
unnecessary work.

A performance improvement to make later would be to avoid cleanup doing
anything if it's not written to the log file, and has confirmed that the
log file does not contain the log line.

This commit was sponsored by Jake Vosloo on Patreon.
2020-10-21 10:31:56 -04:00
Joey Hess
15ea0e6c0a
design done 2020-10-19 14:57:02 -04:00
Joey Hess
5009c1ce68
update 2020-10-19 14:49:16 -04:00
Joey Hess
c337b58caf
more thoughts 2020-10-19 14:19:41 -04:00
Joey Hess
3c6cfacb19
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-19 12:19:33 -04:00
Joey Hess
e408658eaa
close 2020-10-19 12:16:15 -04:00
Ilya_Shlyakhter
1ac3735ac9 Added a comment: copying directly from directory special remote to the cloud 2020-10-18 21:00:59 +00:00
Joey Hess
e4a8b7b26f
comment 2020-10-13 18:57:30 -04:00
Joey Hess
9a5cd96f0d
Fix a memory leak introduced in the last release
The problem was this line:

	cleanup = and <$> sequence (map snd v)

That caused all of v to be held onto until the end, when the cleanup action
was run.

I could not seem to find a bang pattern that avoided the leak, so I
resorted to an IORef, rather clunky, but not a performance problem because
it will only be written once per git ls-files, so typically just 1 time.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-10-13 16:31:01 -04:00
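The shape of the fix described in commit 9a5cd96f0d above can be sketched as follows. This is not the actual git-annex change: `processAll` and its argument layout are hypothetical, and the sketch only illustrates the general idea of recording each cleanup result in an IORef as it becomes available, rather than building a final action that refers back to the whole list `v`.

```haskell
import Data.IORef

-- Each source pairs a batch of items with a cleanup action that reports
-- success. Instead of a combined action such as
--     and <$> sequence (map snd v)
-- which refers to all of v until it is run, each cleanup runs as soon as
-- its batch is consumed, and its result is folded into an IORef.
processAll :: [([a], IO Bool)] -> (a -> IO ()) -> IO Bool
processAll sources handle = do
    ok <- newIORef True
    mapM_ (go ok) sources
    readIORef ok
  where
    go ok (items, cleanup) = do
        mapM_ handle items
        r <- cleanup
        -- Strict update, so no chain of thunks builds up in the IORef.
        modifyIORef' ok (&& r)
```

The IORef is written once per source, which mirrors the commit's remark that the write happens once per git ls-files and so is not a performance concern.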
Joey Hess
d41795c8ff
blew a day investigating this and still don't understand it fully 2020-10-13 15:33:59 -04:00
Joey Hess
cf0ff9f53d
comment 2020-10-13 12:24:13 -04:00
Joey Hess
b72a5c0680
Merge branch 'master' of ssh://git-annex.branchable.com into master 2020-10-12 16:14:51 -04:00
Joey Hess
db1def72ee
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 16:07:30 -04:00
Joey Hess
4124862ae0
comment
This commit was sponsored by Ethan Aubin.
2020-10-12 15:47:46 -04:00
Joey Hess
fc16057f99
profiling
Currently a bit stuck, but at least starting to get some clues on this
memory leak.

This commit was sponsored by Ethan Aubin.
2020-10-12 14:48:26 -04:00
Joey Hess
81216931c6
comment 2020-10-12 12:32:53 -04:00
Lukey
308e5b3a81 Added a comment 2020-10-11 14:17:47 +00:00
Joey Hess
46371797ec
design for finally slaying this beast
This commit was sponsored by Ethan Aubin.
2020-10-08 10:09:12 -04:00
yarikoptic
8035e2073d Added a comment 2020-10-05 23:17:08 +00:00
Joey Hess
471bcfaf37
comment 2020-10-05 14:51:19 -04:00
kyle
a39f9e8c7f Added a comment 2020-10-05 18:09:07 +00:00