Commit graph

2547 commits

Author SHA1 Message Date
Joey Hess
6079b0c72c
fix reversion
add: Avoid unncessarily converting a newly unlocked file to be stored
in git when it is not modified, even when annex.largefiles does not
match it.

This fixes a reversion in version 10.20220222, where git-annex unlock
followed by git-annex add, followed by git commit file could result in
git thinking the file was modified after the commit.

I do have half a mind to remove the withUnmodifiedUnlockedPointers part
of git-annex add. It seems weird, despite that old bug report arguing
a case of consistency that it ought to behave that way. When git-annex
add surpises me, it seems likely it's wrong.. But for now, this is the
smallest possible fix.

Sponsored-by: Dartmouth College's Datalad project
2022-03-21 15:54:04 -04:00
Joey Hess
a314a8dfd0
add back lost packString
The patch that removed it did not break anything, since the strings it's
used on are all ASCII not unicode. But I like making sure to use
packString everywhere just in case the code later changes in a way that
needs it.
2022-03-02 18:22:38 -04:00
sternenseemann
ca596e7c54
allow building with aeson >= 2.0
In aeson 2.0, Text has been replaced by the Key type and HashMap by the
KeyMap interface. Accomodating this required adding some CPP in order to
still be able to compile with aeson < 2.0. The required changes were:

* Prevent Key from being re-exported by Utilities.Aeson, as it clashes
  with git-annex's own Key type.

* Fix up convertion from String/Text to Key (or Text in aeson 1.*) in a
  couple of places

* Import Data.Aeson.KeyMap instead of Data.HashMap.Strict, as they are
  mostly API-compatible. insertWith needs to be replaced by unionWith,
  however, as KeyMap lacks the former function.
2022-03-02 18:01:41 -04:00
Joey Hess
952664641a
turn of PackageImports in cabal file
This makes it easier to build eg benchmarks of individual modules.

May be that most of these PackageImports are not really necessary,
dunno.
2022-02-25 13:16:36 -04:00
Joey Hess
64ccb4734e
smudge: Warn when encountering a pointer file that has other content appended to it
It will then proceed to add the file the same as if it were any other
file containing possibly annexable content. Usually the file is one that
was annexed before, so the new, probably corrupt content will also be added
to the annex. If the file was not annexed before, the content will be added
to git.

It's not possible for the smudge filter to throw an error here, because
git then just adds the file to git anyway.

Sponsored-by: Dartmouth College's Datalad project
2022-02-23 15:17:08 -04:00
Joey Hess
5b373a9dd2
read a consistent amount from pointer file
A few places were reading the max symlink size of a pointer file,
then passing tp parseLinkTargetOrPointer. Which is fine currently, but
to support pointer files with lines of data after the pointer, enough
has to be read that parseLinkTargetOrPointer can be assured of seeing
enough of that data to know if it's correctly formatted.

Sponsored-by: Dartmouth College's Datalad project
2022-02-23 12:52:34 -04:00
Joey Hess
ce1b3a9699
info: Allow using matching options in more situations
File matching options like --include will be rejected in situations where
there is no filename to match against. (Or where there is a filename but
it's not relative to the cwd, or otherwise seemed too bothersome to match
against.)

The addition of listKeys' was necessary to avoid using more memory in the
common case of "git-annex info". Adding a filterM would have caused the
list to buffer in memory and not stream. This is an ugly hack, but listKeys
had previously run Annex operations inside unafeInterleaveIO (for direct
mode). And matching against a matcher should hopefully not change any Annex
state.

This does allow for eg `git-annex info somefile --include=*.ext`
although why someone would want to do that I don't really know. But it
seems to make sense to allow it.
But, consider: `git-annex info ./somefile --include=somefile`
This does not match, so will not display info about somefile.
If the user really wants to, they can `--include=./somefile`.

Using matching options like --copies or --in=remote seems likely to be
slower than git-annex find with those options, because unlike such
commands, info does not have optimised streaming through the matcher.

Note that `git-annex info remote` is not the same as
`git-annex info --in remote`. The former shows info about all files in
the remote. The latter shows local keys that are also in that remote.
The output should make that clear, but this still seems like a point
where users could get confused.

Sponsored-by: Jochen Bartl on Patreon
2022-02-21 14:46:07 -04:00
Joey Hess
c68f52c6a2
restage pointer file after unlock
This avoids a later git status or similar taking a long time to run
as it runs git-annex smudge once per file. While v9 repositories do
avoid that taking long when the files are small, large files can still
make git status take a very long time.

This does make unlock slower, because now git-annex smudge is being run
once per file unlocked. However, the next commit should speed that up in
many cases.

Sponsored-by: Boyd Stephen Smith Jr. on Patreon
2022-02-18 14:55:52 -04:00
Joey Hess
0edf01d7d4
registerurl,unregisterurl: rework output and support --json
* registerurl, unregisterurl: Improved output when reading from stdin
  to be more like other batch commands.
* registerurl, unregisterurl: Added --json and --json-error-messages options.

Note that this did change the --batch output in a way that could possibly
break something that expected the old output to never change. I think it's
acceptable to break that because there has never been a guarantee of
unchanging output format except with --batch for most commands. The old
output was just really weird too!

One possible wart is that "git-annex registerurl" with no options now
seems to just hang, since it's waiting for stdin input. Before, it said
"registerurl (stdin)" which was clearer about what's happenening. But this
is a deprecated mode anyway, --batch makes clear what's happening. If
anything, this problem would be a reason to eventually remove the support
for reading from stdin w/o --batch.

Sponsored-by: Dartmouth College's Datalad project
2022-02-14 13:29:20 -04:00
Joey Hess
835c50966a
reject batch options combined with non-batch options
Reject combinations of --batch (or --batch-keys) with options like --all or
--key or with filenames.

Most commands ignored the non-batch items when batch mode was enabled.

For some reason, addurl and dropkey both processed first the specified
non-batch items, followed by entering batch mode. Changed them to also
error out, for consistency.

Sponsored-by: Dartmouth College's Datalad project
2022-01-26 13:00:19 -04:00
Joey Hess
4f7b8ce09d
fix spelling of upgradeable 2022-01-19 12:14:50 -04:00
Joey Hess
3936599885
move code from Command.Fsck
Sponsored-by: Dartmouth College's Datalad project
2022-01-13 13:24:50 -04:00
Joey Hess
e416635021
renameremote: Better handling of case where there are multiple special remotes with a name
Instead of renaming one at random, error out and ask that a uuid be
specified.

Sponsored-by: Brett Eisenberg on Patreon
2022-01-05 15:24:02 -04:00
Joey Hess
58afb00f6e
enableremote: Better handling of the unusual case where multiple special remotes have been initialized with the same name
Before it would pick one at random, though preferring ones that were not
dead over dead ones.

Now, if one is dead and the other not, it will use the non-dead one. But if
both are not dead, or both dead, it will error out, suggesting the user
clarify what they want to enable.

Sponsored-by: Luke Shumaker on Patreon
2022-01-05 15:12:11 -04:00
Joey Hess
7e2f5edd68
avoid exporting non-annexed symlinks
So that importing does not replace them with plain files.

This works similarly to how the previous handling of submodules and
matchers did, except that annexed symlinks still get exported as plain
files of course, it's only non-annexed symlinks that it does not make sense
to export.

When symlinks have previously been exported, updating the export will
unexport them after upgrading to this commit.

Sponsored-by: Kevin Mueller on Patreon
2022-01-03 14:21:50 -04:00
Joey Hess
f8ebd0363b
complete the magic wormhole pairin appid transition
Started in 2017 in commit 3fe9d99f24.
Starting tomorrow, all versions of git-annex since then will provide
an appid, and so it will no longer be necessary to check the date.

Sponsored-by: Nicholas Golder-Manning on Patreon
2021-12-30 12:16:22 -04:00
Joey Hess
b1d719f9d2
handle transitions with read-only unmerged git-annex branches
Capstone to this feature. Any transitions that have been performed on an
unmerged remote ref but not on the local git-annex branch, or vice-versa
have to be applied on the fly when reading files.

Sponsored-by: Dartmouth College's Datalad project
2021-12-28 13:23:32 -04:00
Joey Hess
058193adc6
prevent git-annex log with read-only unmerged git-annex branches
It would display incomplete information, which would differ from the
information displayed with write access. So refuse to display anything.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 15:44:15 -04:00
Joey Hess
23a485498f
handle Annex.Branch.files with read-only unmerged git-annex branches
It would be difficult to make Annex.Branch.files query the unmerged
git-annex branches. Might be possible, similar to what was discussed in
7f6b2ca49c but again I decided to make it
not do anything in that situation to start with before adding such a
complicated thing.

git-annex info uses it when getting info about a repostory. The choices
were to make that fail with an error, or display the info it can, and
change the output slightly for the bits of info it cannot access. While
that is a behavior change, and I want to avoid any behavior changes due
to unmerged git-annex branches in a read-only repo, displaying a message
that is not a number seems unlikely to break anything that was consuming
a number, any worse than throwing an exception would. Probably.

Also git-annex unused --from origin is made to throw an error, but
it would fail later anyway when trying to write to the unused log files.

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 15:28:31 -04:00
Joey Hess
7f6b2ca49c
handle overBranchFileContents with read-only unmerged git-annex branches
This makes --all error out in that situation. Which is better than
ignoring information from the branches.

To really handle the branches right, overBranchFileContents would need
to both query all the branches and union merge file contents
(or perhaps not provide any file content), as well as diffing between
branches to find files that are only present in the unmerged branches.
And also, it would need to handle transitions..

Sponsored-by: Dartmouth College's Datalad project
2021-12-27 14:30:51 -04:00
Joey Hess
5ff55f622d
improve sync message in export edge case
sync: Better error message when unable to export to a remote because
remote.name.annex-tracking-branch is configured to a ref that does not
exist.

It does not suggest how to fix the problem because there are several
possible solutions: Change the git config to point to something that does
exist, git add some files, or put files on the special remote that will be
imported and so populate the ref.

I considered just silently not doing anything, which is what it does
when annex-tracking-branch = master and nothing has been committed to
master yet. But it seems better to be explicit about it, since this is a
fairly confusing situation to find yourself in.

Sponsored-By: Max Thoursie on Patreon
2021-12-23 14:45:01 -04:00
Joey Hess
567f63ba47
export: Avoid unncessarily re-exporting non-annexed files that were already exported
Commit b6e4ed9aa7 made non-annexed files
be re-uploaded every time, since they're not tracked in the location log,
and it made it check the location log. Don't do that for non-annexed files.

Sponsored-by: Brock Spratlen on Patreon
2021-11-29 14:02:38 -04:00
Joey Hess
01a5ee6998
addurl, youtube-dl: When --check-raw prevents downloading an url, still continue with any downloads that come after it, rather than erroring out
Sponsored-By: Mark Reidenbach on Patreon
2021-11-28 19:40:06 -04:00
Joey Hess
1d513540e9
Fix build with old versions of feed library 2021-11-23 16:06:51 -04:00
Joey Hess
31be0770a5
importfeed: Display url before starting youtube-dl download
It was displaying a blank line before.
2021-11-17 13:23:55 -04:00
Joey Hess
86fa460ce2
better wording 2021-11-17 12:48:28 -04:00
Joey Hess
332385a117
use parseFeedFromFile to avoid mojibake
As mentioned in commit 2bd778a46e, there
was mojibake when LANG=C.

Looking at parseFeedFromFile, it is very particular to read the file as
unicode. parseFeedString looks like it will accept any old String,
but a String that was read using the filesystem encoding will not in
fact have the right encoding.

I think this is a bug in the feed library and will file one.

Sponsored-by: Svenne Krap on Patreon
2021-11-15 15:31:02 -04:00
Joey Hess
2bd778a46e
importfeed: Fix a crash when used in a non-unicode locale
See comment for analysis.

At first I thought I'd need to convert all T.unpack in git-annex, but
luckily not -- so long as the Text is read from a file, the filesystem
encoding is applied and T.unpack is fine. It's only when using Feed
that the filesystem encoding is not applied.

While this fixes the crash, it does result in some mojibake, eg:
itemid=http://www.manager-tools.com/2014/01/choosing-a-company-work-chapter-7-���-questions/

Have not tracked that down, but it must be unrelated, because
I've verified that it roundtrips when using encodeUf8:

joey@darkstar:~/src/git-annex>LANG=C ghci  Utility/FileSystemEncoding.hs
ghci> useFileSystemEncoding
ghci> Just f <- Text.Feed.Import.parseFeedFromFile "/home/joey/tmp/career_tools_podcasts.xml"
ghci> Just (_, x) = Text.Feed.Query.getItemId (Text.Feed.Query.feedItems f !! 0)
ghci> decodeBS (Data.Text.Encoding.encodeUtf8 x)
"http://www.manager-tools.com/2014/01/choosing-a-company-work-chapter-7-\56546\56448\56467-questions/"
ghci> writeFile "foo" $ decodeBS (Data.Text.Encoding.encodeUtf8 x)
Writes a file containing the ENDASH character.

Sponsored-by: Jochen Bartl on Patreon
2021-11-15 15:04:21 -04:00
Joey Hess
889e771357
display error message if unable to run youtube-dl
This would have made the typo of the command name that was just fixed
obvious earlier, when --no-raw was used to force using it.
2021-11-13 09:07:43 -04:00
Joey Hess
51b73ea1fc
migrate: New --remove-size option
While intended for converting URL keys added by addurl --fast to be
as if added by addurl --relaxed, it can also be used to remove size
from other types of keys. Although that is not likely to be useful
for checksummed keys, I suppose it could be used for WORM or other
non-checksum keys.

Specifying the --remove-size option does not prevent other migrations
from taking effect if there's a key upgrade to perform, or if the
backend has changed. So --backend=URL needs to be used to prevent
migrating an URL key to the default backend.

Note that it's not possible to use git-annex migrate to convert from a
non-URL key to an URL key, as URL keys cannot be generated, except by
addurl. So while this can get the same effect as --relaxed would have
when addurl --fast was used, when --fast was not used, it won't work, or
if --backend=URL is not used will remove the size but not prevent
checksum verification, which is not useful. Due to this complexity, I
decided not to mention it in the git-annex addurl man page.

Sponsored-by: Jochen Bartl on Patreon
2021-11-12 13:28:28 -04:00
Joey Hess
9d3ce224e3
uninit edge cases
* uninit: Avoid error message when no commits have been made to the
  repository yet.
* uninit: Avoid error message when there is no git-annex branch.

Sponsored-by: Svenne Krap on Patreon
2021-11-08 16:47:00 -04:00
Joey Hess
68257e9076
add git-annex filter-process
filter-process: New command that can make git add/checkout faster when
there are a lot of unlocked annexed files or non-annexed files, but that
also makes git add of large annexed files slower.

Use it by running: git
config filter.annex.process 'git-annex filter-process'

Fully tested and working, but I have not benchmarked it at all.
And, incremental hashing is not done when git add uses it, so extra work is
done in that case.

Sponsored-by: Mark Reidenbach on Patreon
2021-11-04 15:02:36 -04:00
Joey Hess
07158b7cf6
shorten synopsis
This is to avoid the display being too wide.
2021-11-04 14:33:07 -04:00
Joey Hess
438e5b56aa
tighter --json parsing for metadata
metadata --batch --json: Reject input whose "fields" does not consist of
arrays of strings. Such invalid input used to be silently ignored.

Used to be that parseJSON for a JSONActionItem ran parseJSON separately
for the itemAdded, and if that failed, did not propagate the error. That
allowed different items with differently named fields to be parsed.
But it was actually only used to parse "fields" for metadata, so that
flexability is not needed.

The fix is just to parse "fields" as-is. AddJSONActionItemFields is needed
only because of the wonky way Command.MetaData adds onto the started json
object.

Note that this line got a dummy type signature added,
just because the type checker needs it to be some type.
itemFields = Nothing :: Maybe Bool
Since it's Nothing, it doesn't really matter what type it is,
and the value gets turned into json and is then thrown away.

Sponsored-by: Kevin Mueller on Patreon
2021-11-01 14:42:37 -04:00
Joey Hess
80f1354685
metadata --batch: Avoid crashing when a non-annexed file is input
Turns out that CommandStart actions do not have their exceptions caught,
which is why the giveup was causing a crash. Mostly these actions
do not do very much work on their own, but it does seem possible there
are other commands whose CommandStart also throws an exception.

So, my first attempt at a fix was to catch those exceptions. But,
--json-error-messages then causes a difficulty, because in order to output
a json error message, an action needs to have been started; that sets up
the json object that the error message will be included in a field of.

While it would be possible to output an object with just an error field,
this would be json output of a format that the user has no reason to
expect, that happens only in an exceptional circumstance. That is something
I have always wanted to avoid with the json output; while git-annex man
pages don't document what the json looks like, the output has always
been made to be self-describing. Eg, it includes "error-messages":[]
even when there's no errors.

With that ruled out, it doesn't seem a good idea to catch CommandStart
exceptions and display the error to stderr when --json-error-messages
is set. And so I don't know if it makes sense to catch exceptions from that
at all. Maybe I'd have a different opinion if --json-error-messages did not
exist though.

So instead, output a blank line like other batch commands do.
This also leaves open the possibility of implementing support for matching
object with metadata --json, which would also want to output a blank line
when the input didn't match.

Sponsored-by: Dartmouth College's DANDI project
2021-11-01 13:40:43 -04:00
Joey Hess
eb95ed4863
fix addurl concurrency issue
addurl: Support adding the same url to multiple files at the same time when
using -J with --batch --with-files.

Implementation was easier than expected, was able to reuse OnlyActionOn.

While it will download the url's content multiple times, that seems like
the best thing to do; see my comment for why.

Sponsored-by: Dartmouth College's DANDI project
2021-10-27 16:15:41 -04:00
Joey Hess
887edeb1ad
avoid warning when built with unix-compat 0.5.3
It re-exports modificationTimeHiRes, and provides a windows version.

Might be worth using that windows version eventually, but I have not
tested it.
2021-10-18 16:25:28 -04:00
Joey Hess
7bdc7350a5
remove git-annex-shell compat code
* Removed support for accessing git remotes that use versions of
  git-annex older than 6.20180312.
* git-annex-shell: Removed several commands that were only needed to
  support git-annex versions older than 6.20180312.
  (lockcontent, recvkey, sendkey, transferinfo, commit)

The P2P protocol was added in that version, and used ever since, so
this code was only needed for interop with older versions.

"git-annex-shell commit" is used by newer git-annex versions, though
unnecessarily so, because the p2pstdio command makes a single commit at
shutdown. Luckily, it was run with stderr and stdout sent to /dev/null,
and non-zero exit status or other exceptions are caught and ignored. So,
that was able to be removed from git-annex-shell too.

git-annex-shell inannex, recvkey, sendkey, and dropkey are still used by
gcrypt special remotes accessed over ssh, so those had to be kept.
It would probably be possible to convert that to using the P2P protocol,
but it would be another multi-year transition.

Some git-annex-shell fields were able to be removed. I hoped to remove
all of them, and the very concept of them, but unfortunately autoinit
is used by git-annex sync, and gcrypt uses remoteuuid.

The main win here is really in Remote.Git, removing piles of hairy fallback
code.

Sponsored-by: Luke Shumaker
2021-10-11 15:36:51 -04:00
Joey Hess
69f8e6c7c0
ImportableContentsChunkable
This improves the borg special remote memory usage, by
letting it only load one archive's worth of filenames into memory at a
time, and building up a larger tree out of the chunks.

When a borg repository has many archives, git-annex could easily OOM
before. Now, it will use only memory proportional to the number of
annexed keys in an archive.

Minor implementation wart: Each new chunk re-opens the content
identifier database, and also a new vector clock is used for each chunk.
This is a minor innefficiency only; the use of continuations makes
it hard to avoid, although putting the database handle into a Reader
monad would be one way to fix it.

It may later be possible to extend the ImportableContentsChunkable
interface to remotes that are not third-party populated. However, that
would perhaps need an interface that does not use continuations.

The ImportableContentsChunkable interface currently does not allow
populating the top of the tree with anything other than subtrees. It
would be easy to extend it to allow putting files in that tree, but borg
doesn't need that so I left it out for now.

Sponsored-by: Noam Kremen on Patreon
2021-10-08 13:15:22 -04:00
Joey Hess
19e78816f0
convert Key to ShortByteString
This adds the overhead of a copy when serializing and deserializing keys.
I have not benchmarked much, but runtimes seem barely changed at all by that.

When a lot of keys are in memory, it improves memory use.

And, it prevents keys sometimes getting PINNED in memory and failing to GC,
which is a problem ByteString has sometimes. In particular, git-annex sync
from a borg special remote had that problem and this improved its memory
use by a large amount.

Sponsored-by: Shae Erisson on Patreon
2021-10-05 20:20:08 -04:00
Joey Hess
9012fa0187
reinject: Fix crash when reinjecting a file from outside the repository
Commit 4bf7940d6b introduced this
problem, but was otherwise doing a good thing. Problem being
that fileRef "/foo" used to return ":./foo", which was actually wrong,
but as long as there was no foo in the local repository, catKey
could operate on it without crashing. After that fix though, fileRef
would return eg "../../foo", resulting in fileRef returning
":./../../foo", which will make git cat-file crash since that's
not a valid path in the repo.

Fix is simply to make fileRef detect paths outside the repo and return
Nothing. Then catKey can be skipped. This needed several bugfixes to
dirContains as well, in previous commits.

In Command.Smudge, this led to needing to check for Nothing. That case
should actually never happen, because the fileoutsiderepo check will
detect it earlier.

Sponsored-by: Brock Spratlen on Patreon
2021-10-01 14:06:34 -04:00
Joey Hess
b9a1cc512d
avoid uncessary call to inAnnex
sync --content: Avoid a redundant checksum of a file that was
incrementally verified, when used on NTFS and perhaps other filesystems.

When sync has just gotten the content, it does not need to check inAnnex a
second time. On NTFS, for some reason the write of the inode cache after
it gets the content is not immediately able to be read, and with an
empty/non-matching inode cache due to that stale data, inAnnex falls back
to hashing the whole object to determine if it's present.

Sponsored-by: Brock Spratlen on Patreon
2021-10-01 12:02:35 -04:00
Joey Hess
9ea8106bb0
sped up git-annex smudge --clean by 25%
Disabling git-annex branch update for this command is
ok, because it does not use any information from the branch,
but only logs the location when it adds a key.

Sponsored-by: Dartmouth College's Datalad project
2021-09-24 14:15:20 -04:00
Joey Hess
18e00500ce
bwlimit
Added annex.bwlimit and remote.name.annex-bwlimit config that works for git
remotes and many but not all special remotes.

This nearly works, at least for a git remote on the same disk. With it set
to 100kb/1s, the meter displays an actual bandwidth of 128 kb/s, with
occasional spikes to 160 kb/s. So it needs to delay just a bit longer...
I'm unsure why.

However, at the beginning a lot of data flows before it determines the
right bandwidth limit. A granularity of less than 1s would probably improve
that.

And, I don't know yet if it makes sense to have it be 100ks/1s rather than
100kb/s. Is there a situation where the user would want a larger
granularity? Does granulatity need to be configurable at all? I only used that
format for the config really in order to reuse an existing parser.

This can't support for external special remotes, or for ones that
themselves shell out to an external command. (Well, it could, but it
would involve pausing and resuming the child process tree, which seems
very hard to implement and very strange besides.) There could also be some
built-in special remotes that it still doesn't work for, due to them not
having a progress meter whose displays blocks the bandwidth using thread.
But I don't think there are actually any that run a separate thread for
downloads than the thread that displays the progress meter.

Sponsored-by: Graham Spencer on Patreon
2021-09-21 16:58:10 -04:00
Joey Hess
ec12537774
defer write permissions checking in import until after copy to repo
This should complete the fix started in
6329997ac4, fixing the actual cause of the
test suite failure this time.

Sponsored-by: Dartmouth College's Datalad project
2021-09-02 13:45:21 -04:00
Joey Hess
4f42292b13
improve url download failure display
* When downloading urls fail, explain which urls failed for which
  reasons.
* web: Avoid displaying a warning when downloading one url failed
  but another url later succeeded.

Some other uses of downloadUrl use urls that are effectively internal use,
and should not all be displayed to the user on failure. Eg, Remote.Git
tries different urls where content could be located depending on how the
remote repo is set up. Exposing those urls to the user would lead to wild
goose chases. So had to parameterize it to control whether it displays urls
or not.

A side effect of this change is that when there are some youtube urls
and some regular urls, it will try regular urls first, even if the
youtube urls are listed first. This seems like an improvement if
anything, but in any case there's no defined order of urls that it's
supposed to use.

Sponsored-by: Dartmouth College's Datalad project
2021-09-01 15:33:38 -04:00
Joey Hess
a99a84f342
add: Detect when xattrs or perhaps ACLs prevent locking down a file's content
And fail with an informative message.

I don't think ACLs can prevent removing the write bit, but I'm not sure,
so kept it mentioning them as a possibility.

Should git-annex lock also check if the write bits are able to be removed?
Maybe, but the case I know about with xattrs involves cp -a copying NFS
xattrs, and it's the copy of the file that is the problem. So when locking
a file, I guess it will not be the copy.

Sponsored-by: Dartmouth College's Datalad project
2021-08-27 14:33:01 -04:00
Joey Hess
ab7b5a492c
--batch-keys
New --batch-keys option added to these commands:  get, drop, move, copy, whereis

git-annex-matching-options had to be reworded since some of its options
can be used to match on keys, not only files.

Sponsored-by: Luke Shumaker on Patreon
2021-08-25 14:21:12 -04:00
Joey Hess
f9b92c81f6
unused: Skip the refs/annex/last-index ref that git-annex recently started creating
This was unlikely to cause any problem, but it is unsightly to mention
normally hidden refs, and it might have done a bit of unnecessary work to
check that ref.

Sponsored-by: Noam Kremen on Patreon
2021-08-24 12:58:14 -04:00
Joey Hess
d154e7022e
incremental verification for web special remote
Except when configuration makes curl be used. It did not seem worth
trying to tail the file when curl is downloading.

But when an interrupted download is resumed, it does not read the whole
existing file to hash it. Same reason discussed in
commit 7eb3742e4b76d1d7a487c2c53bf25cda4ee5df43; that could take a long
time with no progress being displayed. And also there's an open http
request, which needs to be consumed; taking a long time to hash the file
might cause it to time out.

Also in passing implemented it for git and external special remotes when
downloading from the web. Several others like S3 are within striking
distance now as well.

Sponsored-by: Dartmouth College's DANDI project
2021-08-18 15:02:22 -04:00