Commit graph

945 commits

Author SHA1 Message Date
Joey Hess
9f3c2dfeda
stop using remote.name.annex-readonly for two distinct things 2020-04-23 14:56:03 -04:00
Joey Hess
cd1676d604
fix bug involving local git remote and out of date location log
get --from, move --from: When used with a local git remote, these used to
silently skip files that the location log thought were present on the
remote, when the remote actually no longer contained them. Since that
behavior could be surprising, now instead display a warning.

I got very confused when I encountered this behavior, since it was silently
skipping a file I needed that whereis said was on the remote.

get without --from already displayed a "unable to access these remotes"
message, which while a bit misleading in that the remote is likely
accessible, but just doesn't contain the file, at least indicated something
went wrong.

Having get --from display a warning makes it in line with get
w/o --from, so seems certianly ok. It might be there are situations where
move --from is used, on eg a whole directory, and the user only wants to
move whatever is present in the remote, and is perfectly ok with files
that are not present being skipped. So I'm less sure about the new warning
being ok there. OTOH, only local git remotes avoiding displaying a warning
in that case too, so this just brings them into line with other remotes.

(Also note that this makes it a little bit faster when dealing with a lot of
files, since it avoids a redundant stat of the file.)
2020-04-21 12:36:58 -04:00
Joey Hess
04352ed9c5
check-ignore resource pool
Much like check-attr before.
2020-04-21 11:25:28 -04:00
Joey Hess
45fb7af21c
check-attr resource pool
Limited to min of -JN or number of CPU cores, because it will often be
CPU bound, once it's read the gitignore file for a directory.

In some situations it's more disk bound, but in any case it's unlikely
to be the main bottleneck that -J is used to avoid. Eg, when dropping,
this is used for numcopies checks, but the main bottleneck will be
accessing the remotes to verify presence. So the user might decide to
-J32 that, but having 32 check-attr processes would just waste however
many filehandles they open, and probably worsen their performance due to
CPU contention.

Note that, I first tried just letting up to the -JN be started. However,
even when it's no bottleneck at all, that still results in all of them
being started. Why? Well, all the worker threads start up nearly
simulantaneously, so there's a thundering herd..
2020-04-21 11:05:57 -04:00
Joey Hess
cee6b344b4
cat-file resource pool
Avoid running a large number of git cat-file child processes when run with
a large -J value.

This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
2020-04-20 15:19:31 -04:00
Joey Hess
529f488ec4
fix a thundering herd problem
Avoid repeatedly opening keys db when accessing a local git remote and -J
is used.

What was happening was that Remote.Git.onLocal created a new annex state
as each thread started up. The way the MVar was used did not prevent that.
And that, in turn, led to repeated opening of the keys db, as well as
probably other extra work or resource use.

Also managed to get rid of Annex.remoteannexstate, and it turned out there
was an unncessary Maybe in the keysdbhandle, since the handle starts out
closed.
2020-04-17 17:09:29 -04:00
Joey Hess
957a87b437
fix absolute filenames fed into --batch and git-annex info 2020-04-15 16:04:05 -04:00
Joey Hess
5a62e8132d
When parsing git configs, support all the documented ways to write true and false, including "yes", "on", "1", etc.
This change does impact git-annex config
eg "git annex config --set annex.addunlocked on"
will store "on" and new git-annex will understand that value, while
old git-annex will error:
git-annex: bad annex.addunlocked configuration in git annex config:
Parse failure: near "on"
That seems acceptable.

Not special remote configs that are only documented as =true or =false
however. Having git-annex support other values for those would break
backwards compatability when used with old versions of git-annex. And
older versions ignore invalid special remote configs.. That would not
be a good combination.
2020-04-13 14:05:30 -04:00
Joey Hess
9cb69dbb76
support boolean git configs that are represented by the name of the setting with no value
Eg"core.bare" is the same as "core.bare = true".

Note that git treats "core.bare =" the same as "core.bare = false", so the
code had to become more complicated in order to treat the absense of a
value differently than an empty value. Ugh.
2020-04-13 13:35:22 -04:00
Joey Hess
ca9c6c5f60
Fix a potential failure to parse git config
Git has an obnoxious special case in git config, a line "foo" is the same
as "foo = true". That means there is no way to examine the output of
git config and tell if it was run with --null or not, since a "foo"
in the first line could be such a boolean, or could be followed by its
value on the next line if --null were used.

So, rather than trying to do such a detection, track the style of config
at all the points where it's generated.
2020-04-13 13:05:41 -04:00
Joey Hess
86426036a0
optimise catfile interface with ByteString and Attoparsec
Around 3% total speedup.

Profiling git annex find --not --in web, it's now bytestring end-to-end,
and there is only a little added overhead in eg accessing the Annex
state MVar (3%). The rest of the runtime is spent reading symlinks, and
in attoparsec.

This feels like the end of the optimisation road, without a major change
like caching information for faster queries.
2020-04-10 14:18:52 -04:00
Joey Hess
2caf579718
cache annex index filename for 1.5% speedup to queries 2020-04-10 13:37:04 -04:00
Joey Hess
aeca7c2207
Sped up query commands that read the git-annex branch by around 5%
The only price paid is one additional MVar read per write to the journal.
Presumably writing a journal file dominiates over a MVar read time by
several orders of magnitude.

--batch does not get the speedup because then it needs to notice when
another process has made a change. Also made the assistant and other damon
modes bypass the optimisation, which would not help them anyway.
2020-04-09 13:54:43 -04:00
Joey Hess
541803bb52
changelog for ByteString Ref 2020-04-08 14:10:29 -04:00
Joey Hess
7ebc118776
adb: Better messages when the adb command is not installed
After a user completely ignored the display of the exception probably
because it didn't make sense..

This does make it a little bit slower since it checks adb is in path each
time before running it. Also, it might display a lot of warnings about it
not being installed.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-04-02 10:48:28 -04:00
Joey Hess
87d5583a91
use programPath consistently, not readProgramFile
Improve git-annex's ability to find the path to its program, especially
when it needs to run itself in another repo to upgrade it.

Some parts of the code used readProgramFile, probably because I forgot that
programPath exists.

I noticed this when a git-annex auto-upgrade failed because it was running
git-annex upgrade --autoonly, but the code to run git-annex used
readProgramFile, which happened to point to an older build of git-annex.
2020-03-30 16:06:27 -04:00
Joey Hess
dbad6c5c39
releasing package git-annex version 8.20200330 2020-03-30 13:46:24 -04:00
Joey Hess
c9ac7aa338
patch applied 2020-03-26 15:18:47 -04:00
Joey Hess
173465592f
changelog for Kyle's other fix, and close bug 2020-03-26 13:18:41 -04:00
Joey Hess
c0ceb969e6
changelog for Kyle's fix 2020-03-26 13:12:49 -04:00
Joey Hess
bbe977a2b6
fsck: Fix reversion in 8.20200226 that made it incorrectly warn that hashed keys with an extension should be upgraded. 2020-03-20 13:09:16 -04:00
Joey Hess
4b92bbe8d7
webdav: Made exporttree remotes faster by caching connection to the server
Followed example of Remote.S3.
2020-03-20 12:48:43 -04:00
Joey Hess
c8fec6ab03
Fix a minor bug that caused options provided with -c to be passed multiple times to git. 2020-03-16 13:06:44 -04:00
Joey Hess
14a4a9f4cd
releasing package git-annex version 8.20200309 2020-03-09 17:08:16 -04:00
Joey Hess
4ce518998a
Fix upgrade failure when a file has been deleted from the working tree 2020-03-09 16:59:18 -04:00
Joey Hess
afe72d04ff
fix problems with upgrade of local remotes
Upgrade other repos than the current one by running git-annex upgrade
inside them, which avoids problems with upgrade code making assumptions
that the cwd will be inside the repo being upgraded.

In particular, this fixes a problem where upgrading a v7 repo to v8 caused
an ugly git error message.

I actually could not find a way to make Upgrade.V7 work properly
without changing directory to the remote. Once I got git ls-files to work,
the git cat-file failed because :path can only be used in the current git
repo.
2020-03-09 16:49:28 -04:00
Joey Hess
d930a2035c
Avoid converting .git file in a worktree or submodule to a symlink when the repository is not a git-annex repository.
This means it will still be a .git file when git-annex init runs. That's
ok, the repo probably contains no annexed objects yet, and even if it does,
git-annex init does not care if symlinks in the worktree don't point to the
objects.

I made init, at the end, run the conversion code. Not really necessary
because the next git-annex command could do it just as well. But, this
avoids commands that don't normally write to the repo needing to write to
it, which might avoid some problem or other, and seems worth avoiding
generally.
2020-03-09 14:54:14 -04:00
Joey Hess
c0a981cb0e
update comment 2020-03-09 14:31:28 -04:00
Joey Hess
8a58f67170
remove changelog for problem I've not fixed yet 2020-03-09 14:24:40 -04:00
Joey Hess
1978a24207
Fix bug that caused unlocked annexed dotfiles to be added to git by the smudge filter when annex.dotfiles was not set. 2020-03-09 14:20:02 -04:00
Joey Hess
70d24c0302
add a comment about CWD
While git ls-files can actually be used on a repo that is not in the
cwd, it works inconsistently. For example, this fails:

git --git-dir=../foo/.git --work-tree=../foo ls-files ../foo

But change some of the paths to absolute and it will succeed. That seems
like a bug in git.

OTOH, this succeeds:

git --git-dir=../foo/.git --work-tree=../foo ls-files

But, that lists paths relative to the top of the --work-tree,
rather than the usual listing them relative to the cwd. Because the cwd
is not in the repo. And so anything parsing the ls-files output of that
is likely to operate on files in the wrong location. Indeed, there is
code in Upgrade/ that has this problem!
2020-03-09 13:37:01 -04:00
Joey Hess
6a91471923
GETCONFIG name fix
Fix regression that prevented external special remotes from using GETCONFIG
to query values like "name". (Introduced in version 7.20200202.7.)
2020-03-09 12:38:04 -04:00
Joey Hess
093fde5abd
completed the createDirectoryIfMissing conversion
Remaining calls in the assistant and Annex.Ssh have been audited and are ok.
2020-03-06 12:55:03 -04:00
Joey Hess
88f721549d
Linux standalone: Use md5sum to shorten paths in .cache/git-annex/locales
md5sum is part of busybox, so is probably available unless it were compiled
out. If md5sum (or cut for that matter) is not available, it will
still use the whole path to $base, otherwise hash it.

Of course it's possible for md5sum to be available sometimes and not others
on the same system; in such an event the locales would be built twice for
the same bundle. The cleanup code will delete both sets once that
version of the bundle is upgraded.
2020-03-04 13:04:56 -04:00
Joey Hess
2099e40664
improve wording 2020-03-02 16:27:29 -04:00
Joey Hess
ccd8c43dc8
git-annex config: guard against non-repo-global configs
git-annex config: Only allow configs be set that are ones git-annex
actually supports reading from repo-global config, to avoid confused users
trying to set other configs with this.
2020-03-02 15:54:18 -04:00
Joey Hess
c578e8ebfd
close already fixed bug
and document it in the changelog for the release where it was fixed
2020-03-02 13:47:47 -04:00
Joey Hess
f9a116f056
stack.yaml: Updated to lts-14.27. 2020-03-02 13:25:51 -04:00
Joey Hess
f6d629e483
changelog and minor style 2020-02-28 12:57:55 -04:00
Joey Hess
2366e7fb84
catch whereisKey exception and provide error messages when external programs neglect to
* whereis: If a remote fails to report on urls where a key
  is located, display a warning, rather than giving up and not displaying
  any information.
* When external special remotes fail but neglect to provide an error
  message, say what request failed, which is better than displaying an
  empty error message to the user.
2020-02-27 14:09:18 -04:00
Joey Hess
c089f395b0
Bugfix: Don't ignore --debug when it is followed by -c 2020-02-27 00:52:37 -04:00
Joey Hess
8bc393a63b
releasing package git-annex version 8.20200226 2020-02-26 19:09:23 -04:00
Joey Hess
81e3faf810
Merge branch 'v7' 2020-02-26 18:15:18 -04:00
Joey Hess
d37975357d
Bugfix: export --tracking (a deprecated option) set annex-annex-tracking-branch, instead of annex-tracking-branch.
(cherry picked from commit a3a674d15b)
2020-02-26 18:08:04 -04:00
Joey Hess
e535da621c
Bugfix to getting content from an export remote with -J, when the export database was not yet populated.
(cherry picked from commit e520341500)
2020-02-26 18:07:20 -04:00
Joey Hess
8af6d2c3c5
fix encryption of content to gcrypt and git-lfs
Fix serious regression in gcrypt and encrypted git-lfs remotes.
Since version 7.20200202.7, git-annex incorrectly stored content
on those remotes without encrypting it.

Problem was, Remote.Git enumerates all git remotes, including git-lfs
and gcrypt. It then dispatches to those. So, Remote.List used the
RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt,
and that parser does not know about encryption fields, so did not
include them in the ParsedRemoteConfig. (Also didn't include other
fields specific to those remotes, perhaps chunking etc also didn't
get through.)

To fix, had to move RemoteConfig parsing down into the generate methods
of each remote, rather than doing it in Remote.List.

And a consequence of that was that ParsedRemoteConfig had to change to
include the RemoteConfig that got parsed, so that testremote can
generate a new remote based on an existing remote.

(I would have rather fixed this just inside Remote.Git, but that was not
practical, at least not w/o re-doing work that Remote.List already did.
Big ugly mostly mechanical patch seemed preferable to making git-annex
slower.)
2020-02-26 18:05:36 -04:00
Joey Hess
9050788b66
info: Fix display of the encryption value. (Some debugging junk had crept in.) 2020-02-26 15:02:23 -04:00
Joey Hess
e520341500
Bugfix to getting content from an export remote with -J, when the export database was not yet populated. 2020-02-26 14:57:29 -04:00
Joey Hess
9659f1c30f
annex.security.allowed-ip-addresses ports syntax
Extended annex.security.allowed-ip-addresses to let specific ports of an IP
address to be used, while denying use of other ports.
2020-02-25 15:45:52 -04:00
Joey Hess
1bb32098d6
jump right to v8, don't stop part way
* init --version: When the version given is one that automatically
  upgrades to a newer version, use the newer version instead.
* Auto upgrades from older repo versions, like v5, now jump right to v8.
2020-02-24 13:21:00 -04:00
Joey Hess
09df58c4ea
handle keys with extensions consistently in all locales
Fix some cases where handling of keys with extensions varied depending on
the locale.

A filename with a unicode extension would before generate a key with an
extension in a unicode locale, but not in LANG=C, because the extension
was not all alphanumeric. Also the the length of the extension could be
counted differently depending on the locale.

In a non-unicode locale, git-annex migrate would see that the extension
was not all alphanumeric and want to "upgrade" it. Now that doesn't happen.

As far as backwards compatability, this does mean that unicode
extensions are counted by the number of bytes, not number of characters.
So, if someone is using unicode extensions, they may find git-annex
stops using them when adding files, because their extensions are too
long. Keys already in their repo with the "too long" extensions will
still work though, so this only prevents adding the same content with
the same extension generating the same key. Documented this by
documenting that annex.maxextensionlength is a number of bytes.

Also, if a filename has an extension that is not valid utf-8 and the
locale is utf-8, the extension will be allowed now, and an old
git-annex, in the same locale would not, and would also want to
"upgrade" that.
2020-02-20 17:30:25 -04:00
Joey Hess
3d00a8a1dc
Makefile: Support newer versions of cabal that use the new-build system.
Unfortunately, cabal puts the binary in a very complicated path
and does not provide any good way to get it out, leaving no good choice
except to use find.

It may be possible to use cabal (new-)install --symlink-bindir,
and ask it to symlink to pwd, but with my older version of cabal,
that does not work.

The stack branch will probably also break once it uses a newer cabal,
didn't try to deal with that.
2020-02-20 14:13:21 -04:00
Joey Hess
c1bfe86f6a
mention speedup 2020-02-19 15:09:00 -04:00
Joey Hess
0f3ecbe7e8
tighten up description of v8 upgrade 2020-02-19 15:07:48 -04:00
Joey Hess
d431dba1a8
merged v8 into master 2020-02-19 14:37:56 -04:00
Joey Hess
029c883713
Merge branch 'master' into v8 2020-02-19 14:32:11 -04:00
Joey Hess
a22ed03d0f
tighten up docs of dotfiles changes 2020-02-19 14:29:50 -04:00
Joey Hess
79a0435b77
automate remote.name.skipFetchAll
initremote, enableremote: Set remote.name.skipFetchAll when the remote
cannot be fetched from by git, so git fetch --all will not try to use it.
2020-02-19 13:58:26 -04:00
Joey Hess
a3a674d15b
Bugfix: export --tracking (a deprecated option) set annex-annex-tracking-branch, instead of annex-tracking-branch. 2020-02-19 13:34:24 -04:00
Joey Hess
cd8a208b8c
releasing package git-annex version 7.20200219 2020-02-19 12:45:30 -04:00
Joey Hess
a78eb6dd58
sync --only-annex and annex.synconlyannex
* Added sync --only-annex, which syncs the git-annex branch and annexed
  content but leaves managing the other git branches up to you.
* Added annex.synconlyannex git config setting, which can also be set with
  git-annex config to configure sync in all clones of the repo.

Use case is then the user has their own git workflow, and wants to use
git-annex without disrupting that, so they sync --only-annex to get the
git-annex stuff in sync in addition to their usual git workflow.

When annex.synconlyannex is set, --not-only-annex can be used to override
it.

It's not entirely clear what --only-annex --commit or --only-annex
--push should do, and I left that combination not documented because I
don't know if I might want to change the current behavior, which is that
such options do not override the --only-annex. My gut feeling is that
there is no good reasons to use such combinations; if you want to use
your own git workflow, you'll be doing your own committing and pulling
and pushing.

A subtle question is, how should import/export special remotes be handled?
Importing updates their remote tracking branch and merges it into master.
If --only-annex prevented that git branch stuff, then it would prevent
exporting to the special remote, in the case where it has changes that
were not imported yet, because there would be a unresolved conflict.

I decided that it's best to treat the fact that there's a remote tracking
branch for import/export as an implementation detail in this case. The more
important thing is that an import/export special remote is entirely annexed
content, and so it makes a lot of sense that --only-annex will still sync
with it.
2020-02-17 16:33:10 -04:00
Joey Hess
879f52a116
annex.tune.branchhash1=true bugfix
Fix support for repositories tuned with annex.tune.branchhash1=true,
including --all not working and git-annex log not displaying anything for
annexed files.
2020-02-14 15:22:48 -04:00
Joey Hess
352963690a
fsck --from remote -J concurrency bug
fsck --from remote: Fix a concurrency bug that could make it incorrectly
detect that content in the remote is corrupt, and remove it, resulting in
data loss.
2020-02-14 14:52:15 -04:00
Joey Hess
399319ccbc
Avoid throwing fatal errors when asked to write to a readonly git remote on http
Test suite found one of them, looking for giveup turned up several more.
2020-02-14 14:38:13 -04:00
Joey Hess
a490947068
annex.sshcaching warning improvement and allow overridding build time default
* When git-annex is built with a ssh that does not support ssh connection
  caching, default annex.sshcaching to false, but let the user override it.
* Improve warning messages further when ssh connection caching cannot
  be used, to clearly state why.
2020-02-14 14:21:03 -04:00
Joey Hess
46bf2a259b
releasing package git-annex version 7.20200204 2020-02-04 14:33:03 -04:00
Joey Hess
c9357bdc0e
ifdef persistent-template 2.8.0 fixes
The i386ancient build has a ghc too old for these extensions.

Build with persistent-template 2.8.0 tested.
2020-02-04 13:53:00 -04:00
Joey Hess
ee718fb35d
Makefile: Really move the fish completion to the vendor_completions.d directory. 2020-02-04 12:10:09 -04:00
Joey Hess
4920df6573
Fix build with newest version of persistent-template.
This is untested because of rain, also I am operating from truncated
copiler error messages in a bug report that also doesn't mention what the
library version is. Still, it should work.

May break builds with old ghc, in particular DerivingStrategies is
I think fairly new? The pragmas could be ifdefed if necessary. Works with
ghc 8.6.5.
2020-02-04 12:03:30 -04:00
Joey Hess
467cc50bb4
releasing package git-annex version 7.20200202.7 2020-02-02 16:55:38 -04:00
Joey Hess
5c3d06b070
Makefile: Move the fish completion to the vendor_completions.d directory. 2020-01-23 16:42:08 -04:00
Joey Hess
5c3636037b
Display a warning when concurrency is enabled but ssh connection caching is not enabled or won't work due to a crippled filesystem
A warning message is unsatisfying. But erroring out is too hard a failure,
especially since it may well work fine if the user has enabled passwordless
ssh.

I did think about falling back to one ssh connection at a time in this
case, but it would have needed a rework of every ssh call, which
seems far overboard for such a niche problem. There's no single place where
git-annex runs ssh, so no one place that it could block a concurrent call
on a semaphore. And, even if it did fall back to one ssh connection at a
time, it seems to me that doing so without warning the user about the
problem just invites bug reports like "git-annex is ignoring my -J2 and
only doing one download at a time". So a warning is needed, and I suppose
is good enough.
2020-01-23 12:35:46 -04:00
Joey Hess
1883f7ef8f
support git remotes that need http basic auth
using git credential to get the password

One thing this doesn't do is wrap the password prompting inside the prompt
action. So with -J, the output can be a bit garbled.
2020-01-22 16:16:19 -04:00
Joey Hess
d227093002
avoid ugly error message
Http remotes that do expose a git config file, but are not initialized
resulted in an ugly and unncessary error message, now sqelched.

When git-annex-shell configlist is run w/o the autoinit field, it may
not generate a uuid for the repository. So in that case, it's not
unexpected for the config it does list to not include a UUID, and
dumping out the config in a warning message is not needed.

If configlist is asked to autoinit and we don't get back a config with a
UUID in it, that suggests some problem, and what we got back may not be
a config at all but some diagnostic message, so it does make sense to
output it then.
2020-01-22 11:57:20 -04:00
Joey Hess
5c6bf1be97
--whatelse is a better name than --describe-other-params
The use case is basically the user having forgotten, so --help would be
best, but it would be quite hard to include this in --help, since it may
even have to spin up an external special remote program.

I also considered --umm but typoed it the first time I tried it as
--uum, and while memorable, it's too cutesy. --whatelse is good because
it explicitly asks, what other params, besides the ones I've given?
2020-01-20 17:04:45 -04:00
Joey Hess
aa949bbb7d
initremote --describe-other-params
Does not yet include descriptions from external special remote programs.
2020-01-20 16:05:51 -04:00
Joey Hess
99cb3e75f1
add LISTCONFIGS to external special remote protocol
Special remote programs that use GETCONFIG/SETCONFIG are recommended
to implement it.

The description is not yet used, but will be useful later when adding a way
to make initremote list all accepted configs.

configParser now takes a RemoteConfig parameter. Normally, that's not
needed, because configParser returns a parter, it does not parse it
itself. But, it's needed to look at externaltype and work out what
external remote program to run for LISTCONFIGS.

Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported
used to use NoUUID. The code that now checks for Nothing used to behave
in some undefined way if the external program made requests that
triggered it.

Also, note that in externalSetup, once it generates external,
it parses the RemoteConfig strictly. That generates a
ParsedRemoteConfig, which is thrown away. The reason it's ok to throw
that away, is that, if the strict parse succeeded, the result must be
the same as the earlier, lenient parse.

initremote of an external special remote now runs the program three
times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again
LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least
one of those, and it should be possible to only run the program once.
2020-01-17 16:07:17 -04:00
Joey Hess
9c45eca37d
update 2020-01-15 14:08:44 -04:00
Joey Hess
71ecfbfccf
be stricter about rejecting invalid configurations for remotes
This is a first step toward that goal, using the ProposedAccepted type
in RemoteConfig lets initremote/enableremote reject bad parameters that
were passed in a remote's configuration, while avoiding enableremote
rejecting bad parameters that have already been stored in remote.log

This does not eliminate every place where a remote config is parsed and a
default value is used if the parse false. But, I did fix several
things that expected foo=yes/no and so confusingly accepted foo=true but
treated it like foo=no. There are still some fields that are parsed with
yesNo but not not checked when initializing a remote, and there are other
fields that are parsed in other ways and not checked when initializing a
remote.

This also lays groundwork for rejecting unknown/typoed config keys.
2020-01-10 14:52:48 -04:00
Joey Hess
5e4deb3620
support sha256 git repos
Git will eventually switch to sha2 and there will not be one single
shaSize anymore, but two (40 and 64).

Changed all parsers for git plumbing output to support both sizes of
shas.

One potential problem this does not deal with is, if somewhere in
git-annex it reads two shas from different sources, and compares them
to see if they're the same sha, it would fail if they're sha1 and sha256
of the same value. I don't know if that will really be a concern.
2020-01-07 12:22:19 -04:00
Joey Hess
2de3dddfd2
reinject --known: Fix bug that prevented it from working in a bare repo.
ifAnnexed in a bare repo passes to git cat-file :./filename , which it
refuses to do since the repo is bare.

Note that, reinject somefile someannexedfile in a bare repo silently does
nothing, because someannexedfile is never actually an annexed worktree
file, because the repo is bare.
2020-01-06 14:22:22 -04:00
Joey Hess
2cea674d1e
Merge branch 'master' into v8 2020-01-01 14:26:43 -04:00
Joey Hess
503788238c
add --force-annex/--force-git
options make it easier to override annex.largefiles configuration
(and potentially safer as it avoids bugs like the smudge bug fixed
in the last release)

Deleted some old comments that were posted to the man page discussing such
options.

Updated docs that used -c annex.largefiles to use the options.

Note that addSmallOverridden was needed to avoid the clean filter running
on the file. It would be possible to make addFile also update the index
directly, rather than going via git add. However, it was not necessary,
and I want to avoid breaking on some edge case, particularly if the code in
addSmallOverridden has some oversight.

Also, when annex.addunlocked is set and annex.largefiles does not match a file,
git annex add --force-large works, but git status will then show the file
as added, with a unstaged modification. The unstaged modification adds the
file to git. This is identical behavior to using -c annex.largefiles=nothing
when annex.addunlocked is set. This does not prevent committing what was
intended to be added. I have not gotten to the bottom of why git thinks
the file is modified and runs it through the clean filter in this case.
2020-01-01 14:03:06 -04:00
Joey Hess
985373f8e7
releasing package git-annex version 7.20191230 2019-12-30 14:49:31 -04:00
Joey Hess
ea3cb7d277
fix a case where file tracked by git unexpectedly becomes annex pointer file
smudge: When annex.largefiles=anything, files that were already stored in
git, and have not been modified could sometimes be converted to being
stored in the annex. Changes in 7.20191024 made this more of a problem.
This case is now detected and prevented.
2019-12-27 15:08:03 -04:00
Joey Hess
3cd3757236
annex.dotfiles
The git add behavior changes could be avoided if it turns out to be
really annoying, but then it would need to behave the old way when
annex.dotfiles=false and the new way when annex.dotfiles=true. I'd
rather not have the config option result in such divergent behavior as
`git annex add .` skipping a dotfile (old) vs adding to annex (new).

Note that the assistant always adds dotfiles to the annex.
This is surprising, but not new behavior. Might be worth making it also
honor annex.dotfiles, but I wonder if perhaps some user somewhere uses
it and keeps large files in a directory that happens to begin with a
dot. Since dotfiles and dotdirs are a unix culture thing, and the
assistant users may not be part of that culture, it seems best to keep
its current behavior for now.
2019-12-26 16:33:39 -04:00
Joey Hess
2b821eb225
Merge branch 'master' into sqlite 2019-12-26 15:15:42 -04:00
Joey Hess
444d5591ee
Improve file ordering behavior when one parameter is "." and other parameters are other directories
eg, `git-annex get . ..` used to order the files strangly, because it
did not realize that when git ls-files output eg "foo", that should be
grouped with the first set of files and not the second set.

Fixed by making            dirContains "." "./foo" = True
which makes sense, because dirContains ".." "../foo" = True
2019-12-20 18:01:29 -04:00
Joey Hess
37467a008f
annex.addunlocked expressions
* annex.addunlocked can be set to an expression with the same format used by
  annex.largefiles, in case you want to default to unlocking some files but
  not others.
* annex.addunlocked can be configured by git-annex config.

Added a git-annex-matching-expression man page, broken out from
tips/largefiles.

A tricky consequence of this is that git-annex add --relaxed
honors annex.addunlocked, but an expression might want to know the size
or content of an url, which it's not going to download. I decided it was
better not to fail, and just dummy up some plausible data in that case.

Performance impact should be negligible. The global config is already
loaded for annex.largefiles. The expression only has to be parsed once,
and in the simple true/false case, it should not do any additional work
matching it.
2019-12-20 15:56:25 -04:00
Joey Hess
5591622731
git-annex-config --set/--unset: No longer change the local git config setting
e53070c1f quietly made it set the local git config too, but that was never
documented anywhere, and it had surprising results. If I set
annex.largefiles globally in a repo, I would expect to be able to change it
in another repo, and the original repo would get the change and use it,
rather than being stuck on the old value set there.

And, if I have a local annex.largefiles and set a different global default,
I'd be surprised to have my local setting overwritten.

annex.securehashesonly does need to be set locally, since it's a security
feature and the global is only a default until it gets set locally. So
special cased.
2019-12-20 13:17:28 -04:00
Joey Hess
4acbb40112
git-annex config annex.largefiles
annex.largefiles can be configured by git-annex config, to more easily set
a default that will also be used by clones, without needing to shoehorn the
expression into the gitattributes file. The git config and gitattributes
override that.

Whenever something is added to git-annex config, we have to consider what
happens if a user puts a purposfully bad value in there. Or, if a new
git-annex adds some new value that an old git-annex can't parse.
In this case, a global annex.largefiles that can't be parsed currently
makes an error be thrown. That might not be ideal, but the gitattribute
behaves the same, and is almost equally repo-global.

Performance notes:

git-annex add and addurl construct a matcher once
and uses it for every file, so the added time penalty for reading the global
config log is minor. If the gitattributes annex.largefiles were deprecated,
git-annex add would get around 2% faster (excluding hashing), because
looking that up for each file is not fast. So this new way of setting
it is progress toward speeding up add.

git-annex smudge does need to load the log every time. As well as checking
the git attribute. Not ideal. Setting annex.gitaddtoannex=false avoids
both overheads.
2019-12-20 13:01:41 -04:00
Joey Hess
ce3fb0b2e5
fixed an oversight that had always prevented annex.resolvemerge from being honored, when it was configured by git-annex config
forgot to add it to the merge function
2019-12-20 11:00:08 -04:00
Joey Hess
f6c18f6940
Merge branch 'bs' into sqlite-bs 2019-12-18 15:14:44 -04:00
Joey Hess
7d9dff5b05
Merge branch 'master' into bs
and update changelog
2019-12-18 15:13:30 -04:00
Joey Hess
d5628a16b8
Merge branch 'bs' into sqlite-bs 2019-12-18 14:51:03 -04:00
Joey Hess
7fd5376334
inprogress: Support --key 2019-12-18 14:14:16 -04:00
Joey Hess
1bc7055a21
add back changelog entry 2019-12-18 13:53:10 -04:00
Joey Hess
c19211774f
use filepath-bytestring for annex object manipulations
git-annex find is now RawFilePath end to end, no string conversions.
So is git-annex get when it does not need to get anything.
So this is a major milestone on optimisation.

Benchmarks indicate around 30% speedup in both commands.

Probably many other performance improvements. All or nearly all places
where a file is statted use RawFilePath now.
2019-12-11 15:25:07 -04:00
Joey Hess
2f9a80d803
merging sqlite and bs branches
Since the sqlite branch uses blobs extensively, there are some
performance benefits, ByteStrings now get stored and retrieved w/o
conversion in some cases like in Database.Export.
2019-12-06 15:30:45 -04:00
Joey Hess
718fa83da6
mention optimisations 2019-12-05 11:46:55 -04:00