importfeed bug fix: When -J was used with multiple feeds, some feeds did
not get their items downloaded.
In my case, I had added a feed to the end of the list, and no items from it
were ever downloaded.
Sponsored-by: Leon Schuermann on Patreon
This causes changes to the original branch to get merged with a single
sync. Before, it took 2 syncs; the first happened to update the synced/
branch, and the second merged changes from the synced/ branch into the
ajusted branch.
Using mergeToAdjustedBranch when tomerge == origbranch is probably
overkill, but it does work fine.
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
(And allow it to be used in the --template although that seems unlikely to
be very useful.)
My use case for this is that one of the podcast feeds I subscribe to is
sometimes leaking episodes of some other podcast. The other podcast is also
very close to spam, so this may be a form of intentional spamming. I have
not been able to catch the podcast feed containing those episodes, so I
don't know which one is at fault. So putting this in the metadata will let
me eventually catch it.
Command.Add.seek starts concurrency with CommandStages. And for
Command.Sync, it needs TransferStages. So, to get both types of concurrency
for the two different parts, it either needs to change the type of
concurrency in between, or just call startConcurrency once for each.
It seems safe enough to call startConcurrency twice, because it does shut
down concurrency (mostly) at the end, and eg the old Annex.workers get
emptied.
Sponsored-by: unqueued on Patreon
This ended up having an interface like sync, rather than like get/copy/drop.
That let it be implemented in terms of sync, which took a lot less code.
Also, it lets it handle many of the edge cases that sync does, such as
getting files that are not visible in a --hide-missing branch, and sending
files to exporttree remotes.
As well as being easier to implement, `git-annex satisfy myremote` makes
sense as it satisfies the preferred content settings of the remote.
`git-annex satisfy somefile` does not form a sentence that makes sense. So
while -C can be a little bit annoying, it still makes sense to have this
syntax.
Note that, while I initially thought this would also satisfy numcopies, it
does not. Arguably it ought to. But, sync does not send files in order to
satisfy numcopies, it only sends files to satisfy preferred content. And
it's important that this transfer the same files as sync does, because
it will probably be used in a workflow where the user sometimes syncs and
sometimes satisfies, and does not expect satisfy to do things that sync
would not do.
(Also opened a new bug that also affects sync et all, not only this command.)
Sponsored-by: Nicholas Golder-Manning on Patreon
This was already possible, but it was rather hard to come up with the
complex shell command needed.
Note that the diff output starts with "diff a/... b/...".
I left off the "--git" because it's not a git format diff.
Rather than after it, which can leave one wondering what file it's
downloading.
youtubeDl should not ever return Right Nothing in normal operation,
becaause it's already asked youtube-dl if it supports the url. So it
would have to succeed at that, then not download any file, but also
exit successfully, in order for the new error message to display.
Also display the name of yt-dlp when using it.
Fixes a failure like this:
curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume.
That happens because the whole web page has already been downloaded
previously, and kept, so now addurl tries to download it, and curl asks the
server to resume from the last byte. And youtube.com can't, for whatever
stupid reason.
So, delete the temp file after determining that youtube-dl can be used.
That will fail, and it already exports whole trees.
f6dd34ca81 made it sync content with
import remotes, and if an import remote is also an export remote, that
caused this new failure mode.
Sponsored-by: Brock Spratlen on Patreon
* config: Added the --show-origin and --for-file options.
* config: Support annex.numcopies and annex.mincopies.
There is a little bit of redundancy here with other code elsewhere that
combines the various configs and selects which to use. But really only
for the special case of annex.numcopies, which is a git config that does
not override the annex branch setting and for annex.mincopies, which does
not have a git config but does have gitattributes settings as well as the
annex branch setting.
That seems small enough, and unlikely enough to grow into a mess that it was
worth supporting annex.numcopies and annex.mincopies in git-annex config
--show-origin. Because these settings are a prime thing that someone might
get confused about and want to know where they were configured.
And, it followed that git-annex config might as well support those two
for --set and --get as well. While this is redundant with the speclialized
commands, it's only a little code and it makes it more consistent.
Note that --set does not have as nice output as numcopies/mincopies
commands in some special cases like setting to 0 or a negative number.
It does avoid setting to a bad value thanks to the smart
constructors (eg configuredNumCopies).
As for other git-annex branch configurations that are not set by git-annex
config, things like trust and wanted that are specific to a repository
don't map to a git config name, so don't really fit into git-annex config.
And they are only configured in the git-annex branch with no local override
(at least so far), so --show-origin would not be useful for them.
Sponsored-by: Dartmouth College's DANDI project
This didn't used to be needed because importKeys would import all
content and so doing another pass was redundant.
But since 40017089f2 it uses
importChanges, so only new files are imported. If a file that was
already imported before was dropped, that would prevent sync --content
from gettng its content again.
Sponsored-by: Jack Hill on Patreon
Large speed up to importing trees from special remotes that contain a lot
of files, by only processing changed files.
Benchmarks:
Importing from a special remote that has 10000 files, that have all been
imported before, and 1 new file sped up from 26.06 to 2.59 seconds.
An import with no change and 10000 unchanged files sped up from 24.3 to
1.99 seconds.
Going up to 20000 files, an import with no changes sped up from
125.95 to 3.84 seconds.
Sponsored-by: k0ld on Patreon
For simplicity, I've not tried to make it handle History yet, so when
there is a history, a full import will still be done. Probably the right
way to handle history is to first diff from the current tree to the last
imported tree. Then, diff from the current tree to each of the
historical trees, and recurse through the history diffing from child tree
to parent tree.
I don't think that will need a record of the previously imported
historical trees, and so Logs.Import doesn't store them. Although I did
leave room for future expansion in that log just in case.
Next step will be to change importTree to importChanges and modify
recordImportTree et all to handle it, by using adjustTree.
Sponsored-by: Brett Eisenberg on Patreon
Consistency with sync and internal consistency is more important than
consistency with the assistant, which is not itself consistent about
what it does when run in a subdirectory.
Note that with -C, it will still commit staged changes to files outside
the directory. Like sync does. Presumably if the user is manually
staging things, then running this command, they intend to build up a
commit.
Sponsored-by: unqueued on Patreon
assist: New command, which is the same as git-annex sync but with
new files added and content transferred by default.
(Also this fixes another reversion in git-annex sync,
--commit --no-commit, and --message were not enabled, oops.)
See added comment for why git-annex assist does commit staged
changes elsewhere in the work tree, but only adds files under
the cwd.
Note that it does not support --no-commit, --no-push, --no-pull
like sync does. My thinking is, why should it? If you want that
level of control, use git commit, git annex push, git annex pull.
Sync only got those options because pull and push were not split
out.
Sponsored-by: k0ld on Patreon
When used without --content or --no-content, warn about the upcoming
transition, and suggest using one of the options, or setting
annex.synccontent.
Sponsored-by: Brett Eisenberg on Patreon
This option is not specific to sync, so it seemed it should be in either
pull or push as well as sync. Since it does modify the remote, it seems
better to have it in push; the modification of the local repo pulls in
the direction of pull, but not hard enough.
Maybe it would be better to have it in both?
Sponsored-by: Luke Shumaker on Patreon
I anticipate that if sync is transitioned to syncing content by default,
people will want a short option. And in repositories where
annex.synccontent = true, they already would. And pull and push sync
content by default, so a short option is useful with them too.
Mnemonic: -g makes only git data be synced
Also, -a makes only annex data be synced.
Would have preferred -c, which would complement -C, but it
was already taken to set git configs.
Sponsored-by: Noam Kremen on Patreon
Split out two new commands, git-annex pull and git-annex push. Those plus a
git commit are equivilant to git-annex sync.
In a sense, git-annex sync conflates 3 things, and it would have been
better to have push and pull from the beginning and not sync. Although
note that git-annex sync --content is faster than a pull followed by a
push, because it only has to walk the tree once, look at preferred
content once, etc. So there is some value in git-annex sync in speed, as
well as user convenience.
And it would be hard to split out pull and push from sync, as far as the
implementaton goes. The implementation inside sync was easy, just adjust
SyncOptions so it does the right thing.
Note that the new commands default to syncing content, unless
annex.synccontent is explicitly set to false. I'd like sync to also do
that, but that's a hard transition to make. As a start to that
transition, I added a note to git-annex-sync.mdwn that it may start to
do so in a future version of git-annex. But a real transition would
necessarily involve displaying warnings when sync is used without
--content, and time.
Sponsored-by: Kevin Mueller on Patreon
The man page is somewhat vague about this, but I do think it was a bug
that these options didn't alreay behave that way. The options are
documented to disable imports and exports, which is the same operations
just with a special remote that uses trees.
The real motivation for this is that I'm adding git-annex pull and
git-annex push, and I want these options to turn off the equivilant of
those commands. And git-annex pull will certianly download and push
upload.
Sponsored-by: Nicholas Golder-Manning on Patreon
That was added in 2011 to prevent writing to the git-annex branch on
shutdown. But, the use of saveState causes pending git-annex branch
writes to be completed before the branch is deleted. So, an unusual exit
is not needed.
Had to convert uninit to do everything that can error out inside a
CommandStart. This was harder than feels nice.
(Also, in passing, converted CommandCheck to use a data type, not a
weird number that it was not clear how it managed to be unique.)
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
I think this was just copied from another command without paying
attention to what it did, because there does not seem to be any valid
reason to want to only unannex some files when running uninit.
Seems unlikely to be too useful, but who knows.
Moved the checkSafeConfig call to happen after an action is started, so
it will be captured by --json-error-messages
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
Including special --whatelse handling.
Otherwise, it seems unlikely to be too useful, but who knows.
Refactored code to call starting before displaying error messages.
This makes the error messages be captured by --json-error-messages
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
Seems unlikely to be very useful, but trivial.
And, this completes the story that git-annex sync does not need json,
since every sub-operation is available in a command that does support json.
(Well, except for committing, but that's not a git-annex command.)
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
The input field is consistently the url of the feed, which makes sense
as that is the user input, but to differentiate multiple urls downloaded
from a feed when using --json-progress -J, need the url that is being
downloaded too.
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project
Both -J and --json needed importfeed to be refactored to use commandAction.
That was difficult, because of the interrelated nature of downloading feeds
and then downloading files from feeds, both of which needed to use
commandAction. And then checking for problems in feeds has to come after
these actions, which may be run as background jobs.
As for --json support, it's most of the way there, but still has some
warts, so I didn't enable jsonOptions yet. The warts include:
- An initial empty json record is displayed by getCache.
- Input is not populated, should be feed url
- feedProblem at end will not be captured by --json-error-messages
(see FIXME)
Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project