876 commits

Author SHA1 Message Date
Joey Hess
467cc50bb4
releasing package git-annex version 7.20200202.7 2020-02-02 16:55:38 -04:00
Joey Hess
5c3d06b070
Makefile: Move the fish completion to the vendor_completions.d directory. 2020-01-23 16:42:08 -04:00
Joey Hess
5c3636037b
Display a warning when concurrency is enabled but ssh connection caching is not enabled or won't work due to a crippled filesystem
A warning message is unsatisfying. But erroring out is too hard a failure,
especially since it may well work fine if the user has enabled passwordless
ssh.

I did think about falling back to one ssh connection at a time in this
case, but it would have needed a rework of every ssh call, which
seems far overboard for such a niche problem. There's no single place where
git-annex runs ssh, so no one place that it could block a concurrent call
on a semaphore. And, even if it did fall back to one ssh connection at a
time, it seems to me that doing so without warning the user about the
problem just invites bug reports like "git-annex is ignoring my -J2 and
only doing one download at a time". So a warning is needed, and I suppose
is good enough.
2020-01-23 12:35:46 -04:00
Joey Hess
1883f7ef8f
support git remotes that need http basic auth
using git credential to get the password

One thing this doesn't do is wrap the password prompting inside the prompt
action. So with -J, the output can be a bit garbled.
2020-01-22 16:16:19 -04:00
Joey Hess
d227093002
avoid ugly error message
Http remotes that do expose a git config file, but are not initialized
resulted in an ugly and unnecessary error message, now squelched.

When git-annex-shell configlist is run w/o the autoinit field, it may
not generate a uuid for the repository. So in that case, it's not
unexpected for the config it does list to not include a UUID, and
dumping out the config in a warning message is not needed.

If configlist is asked to autoinit and we don't get back a config with a
UUID in it, that suggests some problem, and what we got back may not be
a config at all but some diagnostic message, so it does make sense to
output it then.
2020-01-22 11:57:20 -04:00
Joey Hess
5c6bf1be97
--whatelse is a better name than --describe-other-params
The use case is basically the user having forgotten, so --help would be
best, but it would be quite hard to include this in --help, since it may
even have to spin up an external special remote program.

I also considered --umm but typoed it the first time I tried it as
--uum, and while memorable, it's too cutesy. --whatelse is good because
it explicitly asks, what other params, besides the ones I've given?
2020-01-20 17:04:45 -04:00
Joey Hess
aa949bbb7d
initremote --describe-other-params
Does not yet include descriptions from external special remote programs.
2020-01-20 16:05:51 -04:00
Joey Hess
99cb3e75f1
add LISTCONFIGS to external special remote protocol
Special remote programs that use GETCONFIG/SETCONFIG are recommended
to implement it.

The description is not yet used, but will be useful later when adding a way
to make initremote list all accepted configs.

configParser now takes a RemoteConfig parameter. Normally, that's not
needed, because configParser returns a parser, it does not parse it
itself. But, it's needed to look at externaltype and work out what
external remote program to run for LISTCONFIGS.

Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported
used to use NoUUID. The code that now checks for Nothing used to behave
in some undefined way if the external program made requests that
triggered it.

Also, note that in externalSetup, once it generates external,
it parses the RemoteConfig strictly. That generates a
ParsedRemoteConfig, which is thrown away. The reason it's ok to throw
that away, is that, if the strict parse succeeded, the result must be
the same as the earlier, lenient parse.

initremote of an external special remote now runs the program three
times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again
LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least
one of those, and it should be possible to only run the program once.
2020-01-17 16:07:17 -04:00
Joey Hess
9c45eca37d
update 2020-01-15 14:08:44 -04:00
Joey Hess
71ecfbfccf
be stricter about rejecting invalid configurations for remotes
This is a first step toward that goal, using the ProposedAccepted type
in RemoteConfig lets initremote/enableremote reject bad parameters that
were passed in a remote's configuration, while avoiding enableremote
rejecting bad parameters that have already been stored in remote.log

This does not eliminate every place where a remote config is parsed and a
default value is used if the parse fails. But, I did fix several
things that expected foo=yes/no and so confusingly accepted foo=true but
treated it like foo=no. There are still some fields that are parsed with
yesNo but not checked when initializing a remote, and there are other
fields that are parsed in other ways and not checked when initializing a
remote.
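
The foo=true problem described above can be illustrated with a small sketch. This is Python rather than git-annex's Haskell, and the function names (`yes_no`, `lenient_flag`, `strict_flag`) are hypothetical, chosen only to contrast a parse that silently falls back to a default with one that rejects bad input:

```python
def yes_no(value):
    """Strict parse: True for "yes", False for "no", None for anything else."""
    return {"yes": True, "no": False}.get(value)


def lenient_flag(value, default=False):
    """Old-style behaviour: a failed parse silently becomes the default."""
    parsed = yes_no(value)
    return default if parsed is None else parsed


def strict_flag(name, value):
    """Stricter behaviour: a failed parse is reported instead of hidden."""
    parsed = yes_no(value)
    if parsed is None:
        raise ValueError(f"bad value for {name}: {value!r} (expected yes or no)")
    return parsed
```

With the lenient version, `lenient_flag("true")` comes back as False, which is the confusing case; the strict version raises an error for the same input.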

This also lays groundwork for rejecting unknown/typoed config keys.
2020-01-10 14:52:48 -04:00
Joey Hess
5e4deb3620
support sha256 git repos
Git will eventually switch to sha2 and there will not be one single
shaSize anymore, but two (40 and 64).

Changed all parsers for git plumbing output to support both sizes of
shas.

One potential problem this does not deal with is, if somewhere in
git-annex it reads two shas from different sources, and compares them
to see if they're the same sha, it would fail if they're sha1 and sha256
of the same value. I don't know if that will really be a concern.
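
As a rough illustration of accepting both sizes, here is a Python sketch (not the actual Haskell parsers; `parse_sha` and the constants are names made up for this example) that accepts a hex string of either length and rejects everything else:

```python
import string

SHA1_HEX_LEN = 40     # hex digits in a sha1 object name
SHA256_HEX_LEN = 64   # hex digits in a sha256 object name
_HEX_DIGITS = set(string.digits + "abcdef")


def parse_sha(text):
    """Return the sha if text is 40 or 64 lowercase hex digits, else None."""
    candidate = text.strip()
    if len(candidate) not in (SHA1_HEX_LEN, SHA256_HEX_LEN):
        return None
    if not all(c in _HEX_DIGITS for c in candidate):
        return None
    return candidate
```

Note that this only validates the shape of a sha. Comparing a 40-character and a 64-character value as strings will always report them as different, which is the potential problem mentioned above.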
2020-01-07 12:22:19 -04:00
Joey Hess
2de3dddfd2
reinject --known: Fix bug that prevented it from working in a bare repo.
ifAnnexed in a bare repo passes to git cat-file :./filename , which it
refuses to do since the repo is bare.

Note that, reinject somefile someannexedfile in a bare repo silently does
nothing, because someannexedfile is never actually an annexed worktree
file, because the repo is bare.
2020-01-06 14:22:22 -04:00
Joey Hess
2cea674d1e
Merge branch 'master' into v8 2020-01-01 14:26:43 -04:00
Joey Hess
503788238c
add --force-annex/--force-git
options make it easier to override annex.largefiles configuration
(and potentially safer as it avoids bugs like the smudge bug fixed
in the last release)

Deleted some old comments that were posted to the man page discussing such
options.

Updated docs that used -c annex.largefiles to use the options.

Note that addSmallOverridden was needed to avoid the clean filter running
on the file. It would be possible to make addFile also update the index
directly, rather than going via git add. However, it was not necessary,
and I want to avoid breaking on some edge case, particularly if the code in
addSmallOverridden has some oversight.

Also, when annex.addunlocked is set and annex.largefiles does not match a file,
git annex add --force-large works, but git status will then show the file
as added, with an unstaged modification. The unstaged modification adds the
file to git. This is identical behavior to using -c annex.largefiles=nothing
when annex.addunlocked is set. This does not prevent committing what was
intended to be added. I have not gotten to the bottom of why git thinks
the file is modified and runs it through the clean filter in this case.
2020-01-01 14:03:06 -04:00
Joey Hess
985373f8e7
releasing package git-annex version 7.20191230 2019-12-30 14:49:31 -04:00
Joey Hess
ea3cb7d277
fix a case where file tracked by git unexpectedly becomes annex pointer file
smudge: When annex.largefiles=anything, files that were already stored in
git, and have not been modified could sometimes be converted to being
stored in the annex. Changes in 7.20191024 made this more of a problem.
This case is now detected and prevented.
2019-12-27 15:08:03 -04:00
Joey Hess
3cd3757236
annex.dotfiles
The git add behavior changes could be avoided if it turns out to be
really annoying, but then it would need to behave the old way when
annex.dotfiles=false and the new way when annex.dotfiles=true. I'd
rather not have the config option result in such divergent behavior as
`git annex add .` skipping a dotfile (old) vs adding to annex (new).

Note that the assistant always adds dotfiles to the annex.
This is surprising, but not new behavior. Might be worth making it also
honor annex.dotfiles, but I wonder if perhaps some user somewhere uses
it and keeps large files in a directory that happens to begin with a
dot. Since dotfiles and dotdirs are a unix culture thing, and the
assistant users may not be part of that culture, it seems best to keep
its current behavior for now.
2019-12-26 16:33:39 -04:00
Joey Hess
2b821eb225
Merge branch 'master' into sqlite 2019-12-26 15:15:42 -04:00
Joey Hess
444d5591ee
Improve file ordering behavior when one parameter is "." and other parameters are other directories
eg, `git-annex get . ..` used to order the files strangely, because it
did not realize that when git ls-files output eg "foo", that should be
grouped with the first set of files and not the second set.

Fixed by making dirContains "." "./foo" = True
which makes sense, because dirContains ".." "../foo" = True
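
A purely lexical containment check with this property could look like the following Python sketch. It is an illustration only, not the Haskell dirContains; the name `dir_contains` and the use of `posixpath.normpath` are assumptions of the example:

```python
import posixpath


def dir_contains(parent, child):
    """Lexically test whether child lies inside (or equals) parent."""
    p = posixpath.normpath(parent)
    c = posixpath.normpath(child)   # "./foo" normalises to "foo"
    if p == c:
        return True
    if p == ".":
        # Anything relative that does not climb out via ".." is inside ".".
        return not posixpath.isabs(c) and c != ".." and not c.startswith("../")
    return c.startswith(p.rstrip("/") + "/")
```

Here `dir_contains(".", "./foo")` and `dir_contains("..", "../foo")` both hold, while `dir_contains("..", "foo")` does not, so "foo" is grouped with "." rather than with "..".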
2019-12-20 18:01:29 -04:00
Joey Hess
37467a008f
annex.addunlocked expressions
* annex.addunlocked can be set to an expression with the same format used by
  annex.largefiles, in case you want to default to unlocking some files but
  not others.
* annex.addunlocked can be configured by git-annex config.

Added a git-annex-matching-expression man page, broken out from
tips/largefiles.

A tricky consequence of this is that git-annex add --relaxed
honors annex.addunlocked, but an expression might want to know the size
or content of an url, which it's not going to download. I decided it was
better not to fail, and just dummy up some plausible data in that case.

Performance impact should be negligible. The global config is already
loaded for annex.largefiles. The expression only has to be parsed once,
and in the simple true/false case, it should not do any additional work
matching it.
2019-12-20 15:56:25 -04:00
Joey Hess
5591622731
git-annex-config --set/--unset: No longer change the local git config setting
e53070c1f quietly made it set the local git config too, but that was never
documented anywhere, and it had surprising results. If I set
annex.largefiles globally in a repo, I would expect to be able to change it
in another repo, and the original repo would get the change and use it,
rather than being stuck on the old value set there.

And, if I have a local annex.largefiles and set a different global default,
I'd be surprised to have my local setting overwritten.

annex.securehashesonly does need to be set locally, since it's a security
feature and the global is only a default until it gets set locally. So
special cased.
2019-12-20 13:17:28 -04:00
Joey Hess
4acbb40112
git-annex config annex.largefiles
annex.largefiles can be configured by git-annex config, to more easily set
a default that will also be used by clones, without needing to shoehorn the
expression into the gitattributes file. The git config and gitattributes
override that.

Whenever something is added to git-annex config, we have to consider what
happens if a user puts a purposefully bad value in there. Or, if a new
git-annex adds some new value that an old git-annex can't parse.
In this case, a global annex.largefiles that can't be parsed currently
makes an error be thrown. That might not be ideal, but the gitattribute
behaves the same, and is almost equally repo-global.

Performance notes:

git-annex add and addurl construct a matcher once
and use it for every file, so the added time penalty for reading the global
config log is minor. If the gitattributes annex.largefiles were deprecated,
git-annex add would get around 2% faster (excluding hashing), because
looking that up for each file is not fast. So this new way of setting
it is progress toward speeding up add.

git-annex smudge does need to load the log every time. As well as checking
the git attribute. Not ideal. Setting annex.gitaddtoannex=false avoids
both overheads.
2019-12-20 13:01:41 -04:00
Joey Hess
ce3fb0b2e5
fixed an oversight that had always prevented annex.resolvemerge from being honored, when it was configured by git-annex config
forgot to add it to the merge function
2019-12-20 11:00:08 -04:00
Joey Hess
f6c18f6940
Merge branch 'bs' into sqlite-bs 2019-12-18 15:14:44 -04:00
Joey Hess
7d9dff5b05
Merge branch 'master' into bs
and update changelog
2019-12-18 15:13:30 -04:00
Joey Hess
d5628a16b8
Merge branch 'bs' into sqlite-bs 2019-12-18 14:51:03 -04:00
Joey Hess
7fd5376334
inprogress: Support --key 2019-12-18 14:14:16 -04:00
Joey Hess
1bc7055a21
add back changelog entry 2019-12-18 13:53:10 -04:00
Joey Hess
c19211774f
use filepath-bytestring for annex object manipulations
git-annex find is now RawFilePath end to end, no string conversions.
So is git-annex get when it does not need to get anything.
So this is a major milestone on optimisation.

Benchmarks indicate around 30% speedup in both commands.

Probably many other performance improvements. All or nearly all places
where a file is statted use RawFilePath now.
2019-12-11 15:25:07 -04:00
Joey Hess
2f9a80d803
merging sqlite and bs branches
Since the sqlite branch uses blobs extensively, there are some
performance benefits, ByteStrings now get stored and retrieved w/o
conversion in some cases like in Database.Export.
2019-12-06 15:30:45 -04:00
Joey Hess
718fa83da6
mention optimisations 2019-12-05 11:46:55 -04:00
Joey Hess
960f62a564
typo 2019-11-22 19:48:34 -04:00
Joey Hess
81d402216d
cache the serialization of a Key
This will speed up the common case where a Key is deserialized from
disk, but is then serialized to build eg, the path to the annex object.

Previously attempted in 4536c93bb2
and reverted in 96aba8eff7.
The problems mentioned in the latter commit are addressed now:

Read/Show of KeyData is backwards-compatible with Read/Show of Key from before
this change, so Types.Distribution will keep working.

The Eq instance is fixed.

Also, Key has smart constructors, avoiding needing to remember to update
the cached serialization.
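
The idea of caching the serialization while keeping equality correct can be sketched in Python as below. This is not the Haskell Key type: the field names and the simplified `BACKEND-sSIZE--NAME` layout are assumptions of the example. The constructor always fills in the cache, playing the role of a smart constructor, and the cache is excluded from comparison:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Key:
    backend: str
    name: str
    size: int
    # Cached serialized form; computed once and ignored by == and hash().
    serialized: str = field(init=False, compare=False, repr=False)

    def __post_init__(self):
        # The dataclass is frozen, so set the cache via object.__setattr__.
        cached = f"{self.backend}-s{self.size}--{self.name}"
        object.__setattr__(self, "serialized", cached)
```

Because every Key is built through the same constructor, there is no code path that can create a Key with a stale or missing cache.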

Used git-annex benchmark:
  find is 7% faster
  whereis is 3% faster
  get when all files are already present is 5% faster
Generally, the benchmarks are running 0.1 seconds faster per 2000 files,
on a ram disk in my laptop.
2019-11-22 17:49:16 -04:00
Joey Hess
7263aafd2b
Merge branch 'master' into sqlite 2019-11-22 12:49:35 -04:00
Joey Hess
92e1bb250b
simplify the name of the test cases 2019-11-21 17:38:58 -04:00
Joey Hess
58a8005441
Merge branch 'master' into sqlite 2019-11-21 17:28:27 -04:00
Joey Hess
a9888f6151
Windows: Fix handling of changes to time zone.
Used to work but was broken in version 7.20181031, specifically commit
5ab0f48ffb.

That this was not noticed over at least one daylight saving time
change makes me wonder if the TSDelta stuff is still needed.
Perhaps the mtime on Windows no longer changes when the time zone is changed?

(cherry picked from commit 09ee6b0ccb)
2019-11-21 17:28:18 -04:00
Joey Hess
d4661959de
Merge branch 'master' into sqlite 2019-11-21 17:26:50 -04:00
Joey Hess
25ba8156bc
improve benchmark --databases
* benchmark: Changed --databases to take a parameter specifying the size
  of the database to benchmark.
* benchmark --databases: Display size of the populated database.
* benchmark --databases: Improve the "addAssociatedFile to (new)"
  benchmark to really add new values, not overwriting old values.
2019-11-21 17:25:20 -04:00
Joey Hess
43f19ef00a
Fix bug that made bare repos be treated as non-bare when --git-dir was used.
Eg:

git clone url --bare r
git --git-dir r annex init

This resulted in worktree = Just "." and so several things that check
worktree to determine when the repo is bare ran code paths intended for
non-bare. One such code path[1] ran git checkout with --worktree=. which
actually makes it ignore core.bare config, and so the current directory
got populated with a checkout of the master branch in this example. There
was probably also other breakage.

The fix is a bit complicated because whether the repo is bare is not
known until after Git.Config reads the config, but Git.Config handles
setting the RepoLocation's worktree when core.worktree is set. So have
to assume the worktree is the cwd, let core.worktree override that,
and then if the repo turns out to be bare, it's set back to Nothing.
(And then GIT_WORK_TREE can still override all of that.)

[1] switchHEADBack, which runs even when the clone is not from a bare repo.
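
The order of overrides described in the fix can be written down as a small Python sketch. It follows the wording above literally and is not the Git.Config code; `resolve_worktree` and its parameter names are hypothetical:

```python
def resolve_worktree(cwd, core_worktree=None, core_bare=False, env_work_tree=None):
    """Return the worktree path, or None when the repo has no worktree."""
    worktree = cwd                    # 1. start by assuming the cwd
    if core_worktree is not None:     # 2. core.worktree overrides that
        worktree = core_worktree
    if core_bare:                     # 3. a bare repo has no worktree
        worktree = None
    if env_work_tree is not None:     # 4. GIT_WORK_TREE overrides everything
        worktree = env_work_tree
    return worktree
```

For the example above, `resolve_worktree(".", core_bare=True)` yields None rather than ".", so code that checks the worktree to detect a bare repo takes the bare code path.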
2019-11-21 13:26:02 -04:00
Joey Hess
b207d944f3
sync, assistant: Pull and push from git-lfs remotes.
Oversight, forgot to add it to gitSyncableRemote
2019-11-18 16:13:21 -04:00
Joey Hess
5877de5e80
git-lfs: remember urls, and autoenable remotes using known urls
* git-lfs: The url provided to initremote/enableremote will now be
  stored in the git-annex branch, allowing enableremote to be used without
  an url. initremote --sameas can be used to add additional urls.
* git-lfs: When there's a git remote with an url that's known to be
  used for git-lfs, automatically enable the special remote.
2019-11-18 16:09:09 -04:00
Joey Hess
cee14f147a
stop displaying rsync progress, and use git-annex's own progress display for local-to-local repo transfers
Reasons to do this include:

1. I've gotten pretty used to git-annex's own progress display, which is
   used for all transfers over ssh (except to old git-annex-shell),
   and for most special remote transfers. It's getting to seem weird to see
   the rsync progress display instead.
2. When -J was used, the rsync output could not be shown, and so there was
   no progress display. Now there will be.

Progress will also be displayed now when cp CoW is used. But I'd expect a CoW
copy to typically run so fast that the progress display will barely be
noticeable.

This commit was sponsored by Peter on Patreon.
2019-11-15 13:21:06 -04:00
Joey Hess
a95efcbc55
releasing package git-annex version 7.20191114 2019-11-14 21:58:23 -04:00
Joey Hess
b321526473
OSX link libs into git-core directory
So that binaries in that directory can find the library next to them,
where they get modified to look.

This is a hack; it would be better for OSXMkLibs to build a list of what
libraries are needed where.

Unsure if this is needed due to a recent reversion, or is an older
problem, so updated changelog accordingly.
2019-11-14 18:31:58 -04:00
Joey Hess
f037ad92ec
OSX git-annex.app: Fix a regression that broke git-remote-https, git-remote-http, and git-shell
Putting the binaries in bundle/git-core/bin didn't work on OSX,
linker can't find the libraries next to those binaries where it expects to.
So instead put the binaries in the progDir.
2019-11-14 16:15:42 -04:00
Joey Hess
842449b086
linuxstandalone: Fix a regression that broke git-remote-https. 2019-11-14 15:08:23 -04:00
Joey Hess
667d38a8f1
Fix a crash (STM deadlock) when -J is used with multiple files that point to the same key
See the comment for a trace of the deadlock.

Added a new StartStage. New worker threads begin in the StartStage.
Once a thread is ready to do work, it moves away from the StartStage,
and no thread will ever transition back to it.

A thread that blocks waiting on another thread that is processing
the same key will block while in the StartStage. That other thread
will never switch back to the StartStage, and so the deadlock is avoided.
2019-11-14 13:51:09 -04:00
Joey Hess
890330f0fe
make --json-error-messages capture url download errors
Convert Utility.Url to return Either String so the error message can be
displayed in the annex monad and so captured.

(When curl is used, its errors are still not caught.)
2019-11-12 13:52:38 -04:00
Joey Hess
3b34d123ed
Added annex.allowsign option.
This commit was sponsored by Ilya Shlyakhter on Patreon.
2019-11-11 16:28:56 -04:00
Joey Hess
aa010108cd
Merge branch 'master' into sqlite 2019-11-07 13:20:04 -04:00
Joey Hess
09ee6b0ccb
Windows: Fix handling of changes to time zone.
Used to work but was broken in version 7.20181031, specifically commit
5ab0f48ffb.

That this was not noticed over at least one daylight saving time
change makes me wonder if the TSDelta stuff is still needed.
Perhaps the mtime on Windows no longer changes when the time zone is changed?
2019-11-06 14:36:49 -04:00
Joey Hess
73e928fcfb
prep release 2019-11-06 12:21:02 -04:00
Joey Hess
6147130e86
Merge branch 'master' into sqlite 2019-11-05 12:59:28 -04:00
Joey Hess
e2d4c133f5
init: fix data loss bug
Fix bug that lost modifications to unlocked files when init is re-run in an
already initialized repo.

In retrospect needing scanUnlockedFiles False in the direct mode upgrade
path was a good hint that it was unsafe when used with True.

However, this bug did not affect upgrade from v5. In such an upgrade, an
unlocked file that is modified is left as-is. The only place
scanUnlockedFiles True did overwrite modified unlocked files is during a
git-annex init of a repo that was already initialized by git-annex.

(I also tried a scenario where the repo had not been initialized by
git-annex yet, but was cloned from a v7 repo with an unlocked file, and the
pointer file replaced with some other content, and the data loss did not
occur in that situation.)

Since the fixed scanUnlockedFiles avoids overwriting non-pointer files,
it should be safe to run in any situation, so there's no need any longer
for the parameter.
2019-11-05 12:41:15 -04:00
Joey Hess
09c7cbbaa8
update for things already fixed in this branch 2019-10-30 13:57:22 -04:00
Joey Hess
25f912de5b
benchmark: Add --databases to benchmark sqlite databases
Rescued from commit 11d6e2e260 which removed
db benchmarks in favor of benchmarking arbitrary git-annex commands. Which
is nice and general, but microbenchmarks are useful too.
2019-10-29 16:59:27 -04:00
Joey Hess
fd96408c67
releasing package git-annex version 7.20191024 2019-10-25 13:07:58 -04:00
Joey Hess
59b8294b2b
prep release 2019-10-24 14:40:36 -04:00
Joey Hess
31a5b58b2c
documentation for making git add only annex when configured by annex.largefiles
Code change should be trivial, but not yet implemented. This
significantly complicated the task of documenting how git-annex works.

I'm not sure how useful the annex.gitaddtoannex configuration is after
this change; seems that if a user has an annex.largefiles they will want
it applied consistently. But the last thing I want to hear is more
complaining from users about git add doing something they don't want it
to.

There's a pretty high risk users who got used to the git add behavior
and don't have annex.largefiles configured will miss the NEWS and
complain bitterly about their suddenly bloated repositories. Oh well.

Removed outdated comments about the old behavior to avoid confusion.
I don't know if I've found all the places that griping spread to.
2019-10-24 14:01:54 -04:00
Joey Hess
bd197be3ad
annex.gitaddtoannex configuration
Added annex.gitaddtoannex configuration. Setting it to false prevents
git add from usually adding files to the annex.
(Unless the file was annexed before, or a renamed annexed file is detected.)

Currently left at true; some users are encouraging it be set to false.
2019-10-23 15:29:46 -04:00
Joey Hess
bbdeb1a1a8
sync: Fix crash when there are submodules and an adjusted branch is checked out
Reverse adjusting the branch uses treeItemToTreeContent, which was missed
when adding submodule support earlier.
2019-10-23 11:52:56 -04:00
Joey Hess
9a5d9019ba
Deal with pkexec changing to root's home directory when running a command.
Wow, that's not documented anywhere, and seems like a major gotcha in
pkexec.

Broke enable-tor.
2019-10-21 12:39:19 -04:00
Joey Hess
5db79339a1
init: Fix a failure when used in a submodule on a crippled filesystem.
When the submodule's parent repo has an adjusted unlocked branch,
it gets cloned by git, but git checks out master. git annex init then
fails because it wants to enter the adjusted branch, but:

  adjusted branch adjusted/master(unlocked) already exists.

  Aborting because that branch may have changes that have not yet reached master

Note that init actually then exits 0, leaving master checked out.

This could also happen, absent submodules, if the parent repo has
an adjusted unlocked branch, but it is not checked out. In the more common
case where that branch is checked out, the clone uses the same branch,
so no problem.

The choices to fix this:

* Init could delete the existing adjusted branch, and re-adjust.
  But then running init inside an adjusted branch on a crippled filesystem
  would lose any changes that have not been synced back to master.
* Init could sync any changes back to master, but that would be very surprising
  behavior for it.
* Init could simply check out the existing adjusted branch. If the branch
  is diverged from master, well, sync will sort that out later.
  This mirrors the behavior of cloning a repo that has an adjusted branch
  checked out that has not yet been synced back to master.
  Picked this choice.
2019-10-21 11:41:15 -04:00
Joey Hess
f60e8f2c93
releasing package git-annex version 7.20191017 2019-10-17 18:19:47 -04:00
Joey Hess
904b175707
Fix build with persistent-2.10.
Added an additional constraint that persistent needs.
This also builds with persistent-2.9.2 without needing any cpp.
2019-10-17 11:58:31 -04:00
Joey Hess
5463f97ca2
OSX: Deal with symbolic link problem that caused git to not be included in the git-annex.dmg
Homebrew now has eg:

datalads-imac:~ joey$ ls -l /Users/joey/homebrew/Cellar/git/2.23.0/libexec/git-core
total 36776
lrwxr-xr-x   1 joey  staff       13 Aug 29 13:38 git -> ../../bin/git
lrwxr-xr-x   1 joey  staff       13 Aug 29 13:38 git-add -> ../../bin/git

So the target of the symlink also needs to be installed now.

Doing it in shell code was too hairy for my dentistry-addled brain, so
reimplemented in haskell. Also using it for building linuxstandalone.
2019-10-17 11:01:41 -04:00
Joey Hess
4306dfbe68
remove empty log files in transition
forget --drop-dead: Remove several classes of git-annex log files when they
become empty, further reducing the size of the git-annex branch.

Noticed while testing sameas uuid removal, but it could happen other times
too.

An empty log file is always treated by git-annex the same as no file
being present, and when the files are per-key, it can be a sizable space
saving to exclude them from the tree.
2019-10-14 16:04:15 -04:00
Joey Hess
9828f45d85
add RemoteStateHandle
This solves the problem of sameas remotes trampling over per-remote
state. Used for:

* per-remote state, of course
* per-remote metadata, also of course
* per-remote content identifiers, because two remote implementations
  could in theory generate the same content identifier for two different
  pieces of content

While chunk logs are per-remote data, they don't use this, because the
number and size of chunks stored is a common property across sameas
remotes.

External special remote had a complication, where it was theoretically
possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or
EXPORTSUPPORTED. Since the uuid of the remote is typically generated in
Remote.setup, it would only be possible to pass a Maybe
RemoteStateHandle into it, and it would otherwise have to construct its
own. Rather than go that route, I decided to send an ERROR in this case.
It seems unlikely that any existing external special remote will be
affected. They would have to make up a git-annex key, and set state for
some reason during INITREMOTE. I can imagine such a hack, but it doesn't
seem worth complicating the code in such an ugly way to support it.

Unfortunately, both TestRemote and Annex.Import needed the Remote
to have a new field added that holds its RemoteStateHandle.
2019-10-14 13:51:42 -04:00
Joey Hess
37f725a9f7
Merge branch 'master' into sameas 2019-10-11 15:56:00 -04:00
Joey Hess
8131451c35
releasing package git-annex version 7.20191009 2019-10-09 12:33:09 -04:00
Joey Hess
f4dd7d5191
work around windows having infected git's plumbing
Work around git cat-file --batch's odd stripping of carriage return from
the end of the line (some windows infection), avoiding crashing when the
repo contains a filename ending in a carriage return.
2019-10-08 15:27:05 -04:00
Joey Hess
8966ba2cff
git-annex-standalone.rpm: Fix the git-annex-shell symlink 2019-10-08 14:43:28 -04:00
Joey Hess
53da7f1cf8
update uninit to handle all the v7 stuff
* uninit: Remove several git hooks that git-annex init sets up.
* uninit: Remove the smudge and clean filters that git-annex init sets up.
2019-10-08 14:34:00 -04:00
Joey Hess
1113caa53e
preserve unlocked file mtime when dropping
When dropping an unlocked file, preserve its mtime, which avoids git status
unnecessarily running the clean filter on the file.

If the index file has close to the same mtime as a work tree file, git will
not trust the index to be up-to-date, and re-runs the clean filter
unnecessarily. Preserving the mtime when depopulating a pointer file avoids
git status doing a little (or maybe a lot) of unnecessary work.

There are other places that the mtime could be preserved, including other
places where pointer files are written perhaps, but also
populatePointerFile. But, I don't know of cases where those lead to git
status doing unnecessary work, so I just fixed the one I'm aware of for now.
2019-10-08 14:01:12 -04:00
Joey Hess
2e6fd5de71
fix flipped diffUTCTime
fsck --incremental/--more: Fix bug that prevented the incremental fsck
information from being updated every 5 minutes as it was supposed to be; it
was only updated after 1000 files were checked, which may be more files
than are possible to fsck in a given fsck time window.

Thanks to Peter Simons for help with analysis of this bug.

Auditing for other cases of the same mistake, the keys db also had it
backwards. This seems unlikely to really have been a problem;
it would need associated files updates etc to be coming in slowly for some
reason and then be interrupted to cause any problem.

IIRC the design of the keys db assumes that any interrupted
operation will be restarted, and so it can lose any buffered database
updates safely.
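
To show why flipped operands stop a time-based check from ever firing, here is a Python sketch using `datetime` in place of Haskell's diffUTCTime (where `diffUTCTime a b` is `a - b`). The function names are made up for the example; only the 5 minute interval comes from the description above:

```python
from datetime import datetime, timedelta

SAVE_INTERVAL = timedelta(minutes=5)


def should_save(now, last_saved):
    """Correct order: elapsed time is now minus the last save."""
    return now - last_saved >= SAVE_INTERVAL


def should_save_flipped(now, last_saved):
    """Flipped order: the difference is negative once time has passed,
    so it never reaches the interval."""
    return last_saved - now >= SAVE_INTERVAL
```

With the flipped version the time check is never satisfied, leaving only a count-based trigger such as the 1000 files mentioned above.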
2019-10-03 09:54:19 -04:00
Joey Hess
61b384d2b7
add --sameas option, not yet used 2019-10-01 12:36:25 -04:00
Joey Hess
3066bdb1fb
fix annex.largefiles largerthan/smallerthan bug
Fix bug in handling of annex.largefiles that use largerthan/smallerthan.
When adding a modified file, it incorrectly used the file size of the old
version of the file, not the current size.

That was the only largefiles limit that didn't directly look at the file on
disk already. Added a new type to keep straight the two different ways such
a limit can be matched. I kind of wanted to extend MatchingFile or FileInfo
to indicate that the matcher is supposed to operate on files from disk or
annex, but it turned out to be too complex to implement it that way.

This also changes the LimitAnnexFiles case when lookupFileKey does not find
a key. It used to fall back to statting the file, now it always returns
False. I doubt the old code could really get to that point, but if it
somehow does, it's better for preferred content matching to be consistent.
2019-09-30 17:15:08 -04:00
Joey Hess
b90ddbc383
enable-tor: Use pkexec to run command as root when gksu and kdesu are not available.
gksu is no longer in debian, even stable

kdesu in debian is not installed in PATH any longer, though the executable
is still present under /usr/lib

pkexec is polkit's replacement for those older commands.
2019-09-30 15:19:01 -04:00
Joey Hess
f2737a5fbe
enable-tor: Run kdesu with -c option. 2019-09-30 15:14:05 -04:00
Joey Hess
2b55a2b882
remotedaemon: Don't list --stop in help since it's not supported.
Also, move out of plumbing section. When using tor, the remotedaemon is
part of the user's workflow, as it runs the tor hidden service.
2019-09-30 14:40:46 -04:00
Joey Hess
090898a138
adjust --lock: This enters an adjusted branch where files are locked.
Straightforward, except for the issue of how to reverse LockAdjustment.

With --unlock, a commit that modifies/adds unlocked files gets reverse
adjusted to use locked files. That's fairly reasonable, I think.

But reversing --lock by unlocking all modified files feels wrong. Maybe
that's just because repositories typically seem to still have mostly
locked files in them (unless one is in an adjusted unlocked branch of
course!)

It may be that eventually how to reverse both will need to be configurable,
I don't know.
2019-09-27 14:23:25 -04:00
Joey Hess
9628ae2e67
Close sqlite databases more robustly.
Had a report of close throwing ErrorBusy on CIFS.

Retrying for up to 16 seconds is a balance between hopefully waiting long
enough for the problem to clear up and waiting so long that git-annex seems
to hang.

The new dependency is free; persistent depends on unliftio-core.
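
A rough sketch of a bounded retry like the one described, in Python. The
Busy exception stands in for sqlite's ErrorBusy, and close_with_retry, the
0.5 second first delay and the doubling are assumptions; only the 16 second
ceiling comes from this commit.

```python
import time
from typing import Callable


class Busy(Exception):
    """Stand-in for sqlite's "database is busy" error (hypothetical name)."""


def close_with_retry(close: Callable[[], None],
                     max_wait: float = 16.0,
                     first_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> None:
    """Call close(), retrying on Busy until max_wait seconds have been slept."""
    waited = 0.0
    delay = first_delay
    while True:
        try:
            close()
            return
        except Busy:
            if waited >= max_wait:
                raise  # give up rather than appear to hang forever
            # Never sleep past the overall budget.
            step = min(delay, max_wait - waited)
            sleep(step)
            waited += step
            delay *= 2
```

Doubling the delay keeps the common case fast when the busy state clears
quickly, while the cap bounds the worst case.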
2019-09-26 12:25:21 -04:00
Joey Hess
8af791d769
Test: Use more robust directory removal method.
I just had a test that crashed at cleanup on linux with:

.t/gpgtest/12/S.gpg-agent.browser: removeDirectoryRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:getSymbolicLinkStatus: does not exist (No such file or directory)
sleeping 10 seconds and will retry directory cleanup
git-annex: .t/gpgtest/14/S.gpg-agent.browser: removeDirectoryRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:removeContentsRecursive:removePathRecursive:getSymbolicLinkStatus: does not exist (No such file or directory)

removePathForcibly is supposed to be more robust to things in the directory vanishing while it's running, etc.
Will probably avoid such crashes.

It was added to directory-1.2.7, which comes with ghc since 8.0.2.
Since base >= 4.11.1.0 means ghc 8.4.4, I expect all builds will have it,
but I ifdefed it to be sure.
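
For illustration, here is a loose Python analogue of a removal that
tolerates entries vanishing while it runs. It is not how removePathForcibly
is implemented, and remove_tree_tolerant is a hypothetical name.

```python
import os


def remove_tree_tolerant(path: str) -> None:
    """Remove a file or directory tree, ignoring entries that vanish meanwhile."""
    try:
        if os.path.isdir(path) and not os.path.islink(path):
            for name in os.listdir(path):
                remove_tree_tolerant(os.path.join(path, name))
            os.rmdir(path)
        else:
            os.unlink(path)
    except FileNotFoundError:
        # Another process removed it first; the goal is already met.
        pass
```

The point is that "does not exist" is treated as success at every level of
the recursion, instead of aborting the whole cleanup.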
2019-09-24 16:59:37 -04:00
Joey Hess
6ae0a44c64
git-lfs: Added support for http basic auth 2019-09-24 14:46:20 -04:00
Joey Hess
de564df8b3
git-lfs: Only do endpoint discovery once when concurrency is enabled
This avoids some extra work, but I don't think it was possible for two ssh
endpoint discoveries running concurrently to both prompt for the ssh password;
Annex.Ssh itself deals with concurrency.

This is mostly groundwork for http password prompting.
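
The "only once under concurrency" idea can be sketched as a lock-guarded
cache. This is a Python illustration with a hypothetical Once class, not the
git-lfs remote's actual code.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class Once(Generic[T]):
    """Run an expensive action at most once, even when called from many threads."""

    def __init__(self, action: Callable[[], T]) -> None:
        self._action = action
        self._lock = threading.Lock()
        self._done = False
        self._value: Optional[T] = None

    def get(self) -> T:
        with self._lock:  # later callers block here until the first one finishes
            if not self._done:
                self._value = self._action()
                self._done = True
            return self._value  # type: ignore[return-value]
```

Because every caller waits on the same lock, anything the action does
interactively, such as prompting for a password, happens a single time.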
2019-09-24 13:01:51 -04:00
Joey Hess
b13a350556
added --unlocked and --locked 2019-09-19 12:33:13 -04:00
Joey Hess
fda1bdd679
Added --mimetype and --mimeencoding file matching options.
Already had these for largefiles matching, but I forgot to add them as
command-line options.
2019-09-19 12:09:59 -04:00
Joey Hess
ab739242a3
releasing package git-annex version 7.20190912 2019-09-13 12:53:40 -04:00
Joey Hess
a8fea1644d
docs for git-annex-standalone rpm 2019-09-13 12:18:36 -04:00
Joey Hess
4508198507
building a standalone rpm from the standalone tarball
This allows the rpm to be built anywhere the necessary build deps are
available (including on debian) and the resulting package will work on as
broad a range of rpm distributions as the libc/kernel supports.

The DistributionUpdate changes to use the new script have not yet been
tested.
2019-09-13 11:53:17 -04:00
Joey Hess
4a4e08e123
release prep 2019-09-12 13:53:22 -04:00
Joey Hess
fef3cd055d
Removed support for git versions older than 2.1
debian oldoldstable has 2.1, and that's what i386ancient uses. It would be
better to require git 2.2, which is needed to use adjusted branches, but
can't do that w/o losing support for some old linux kernels or a
complicated git backport.
2019-09-11 16:14:43 -04:00
Joey Hess
061231621e
Merge branch 'master' into v7-default 2019-09-10 16:06:43 -04:00
Joey Hess
94c75d2bd9
init: Fix a reversion that broke initialization on systems that need to use pid locking
This brings back .git/annex/misctmp, but only for init. If an init
is interrupted while probing using that temp directory, the files it left
will get deleted 1 week later by a subsequent git-annex run.
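
A sketch of that kind of age-based cleanup, in Python. clean_stale and its
parameters are hypothetical, and using the modification time to judge age is
an assumption; only the one week threshold comes from this commit.

```python
import os
import time
from typing import List, Optional

ONE_WEEK = 7 * 24 * 60 * 60  # seconds


def clean_stale(tmpdir: str, now: Optional[float] = None,
                max_age: float = ONE_WEEK) -> List[str]:
    """Delete regular files in tmpdir last modified more than max_age ago."""
    if now is None:
        now = time.time()
    removed: List[str] = []
    try:
        names = os.listdir(tmpdir)
    except FileNotFoundError:
        return removed  # nothing was ever left behind
    for name in sorted(names):
        path = os.path.join(tmpdir, name)
        try:
            if os.path.isfile(path) and now - os.path.getmtime(path) > max_age:
                os.unlink(path)
                removed.append(name)
        except FileNotFoundError:
            pass  # a concurrent run already cleaned it up
    return removed
```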
2019-09-10 13:37:07 -04:00
Joey Hess
0af7ebdc2a
info: Display trust level when getting info on a uuid, same as on a remote. 2019-09-01 16:48:46 -04:00
Joey Hess
f845195354
Added annex.autoupgraderepository configuration
Can be set to false to prevent any automatic repository upgrades.

Also, removed direct mode specific upgrade code in Annex.Init, and made
needsUpgrade always include the name/path of the repo, so if
there's a problem it's clear what repo has the problem.

And, made needsUpgrade catch any exceptions that might occur during the
upgrade, so it can display a more useful error message than just the
exception.
2019-09-01 13:42:26 -04:00
Joey Hess
3f0eef4baa
v7 for all repositories
* Default to v7 for new repositories.
* Automatically upgrade v5 repositories to v7.
2019-08-30 14:09:14 -04:00
Joey Hess
1558e03014
Refuse to upgrade direct mode repositories when git is older than 2.22
git 2.22 fixed a memory leak that could cause an OOM during the upgrade.

Most git-annex builds have a new enough git already.
OSX git was upgraded with brew.

Linux i386ancient build's git was too old. Upgrading it to a fixed
git didn't work (due to the newer git not working with the old ssh,
https://bugs.chromium.org/p/git/issues/detail?id=7 )

Choices to deal with that were:

* Somehow make direct mode upgrade work with the old git, avoiding its
  OOM problem. One way would be to switch the repo to indirect mode
  first, and so upgrade to a repo with locked files. Not good when
  the filesystem does not support symlinks.
* backport the OOM fix from git 2.22
  (And do what about the version number so git-annex knows it's fixed?)
* backport openssh (and possibly more stuff)
* move the i386ancient build to at least Debian stretch (still backporting git)
  But this will make it no longer work with some of the ancient kernels it
  targets.

Of those, backporting the OOM fix seemed the best approach. Put "oomfix"
in the git version number to indicate it.
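
How a version check might take the "oomfix" marker into account can be
sketched like this. It is Python and purely illustrative: parse_git_version
and safe_for_direct_mode_upgrade are hypothetical names, and how git-annex
actually inspects the version string is not shown in this commit message.

```python
from typing import Tuple


def parse_git_version(version: str) -> Tuple[Tuple[int, ...], bool]:
    """Split e.g. "2.1.4.oomfix" into ((2, 1, 4), True)."""
    parts = version.strip().split(".")
    numbers = []
    for part in parts:
        if not part.isdigit():
            break  # stop at the first non-numeric component
        numbers.append(int(part))
    return tuple(numbers), "oomfix" in parts


def safe_for_direct_mode_upgrade(version: str) -> bool:
    """True for git >= 2.22, or an older git marked as carrying the backport."""
    numbers, has_backport = parse_git_version(version)
    return has_backport or numbers >= (2, 22)
```

Comparing integer tuples rather than strings matters here, since as strings
"2.9" would sort after "2.22".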

I have not automated building the git backport, so here's the patch I
used:

diff -ur orig/git-2.1.4/convert.c git-2.1.4/convert.c
--- orig/git-2.1.4/convert.c	2014-12-18 18:42:18.000000000 +0000
+++ git-2.1.4/convert.c	2019-08-29 20:05:04.371872338 +0100
@@ -404,7 +404,7 @@
 	if (start_async(&async))
 		return 0;	/* error was already reported */

-	if (strbuf_read(&nbuf, async.out, len) < 0) {
+	if (strbuf_read(&nbuf, async.out, 0) < 0) {
 		error("read from external filter %s failed", cmd);
 		ret = 0;
 	}
diff -ur orig/git-2.1.4/GIT-VERSION-GEN git-2.1.4/GIT-VERSION-GEN
--- orig/git-2.1.4/GIT-VERSION-GEN	2014-12-18 18:42:18.000000000 +0000
+++ git-2.1.4/GIT-VERSION-GEN	2019-08-29 20:06:39.132743228 +0100
@@ -1,7 +1,7 @@
 #!/bin/sh

 GVF=GIT-VERSION-FILE
-DEF_VER=v2.1.4
+DEF_VER=v2.1.4.oomfix

 LF='
 '
diff -ur orig/git-2.1.4/configure git-2.1.4/configure
--- orig/git-2.1.4/configure	2014-12-18 18:42:19.000000000 +0000
+++ git-2.1.4/configure	2019-08-29 20:27:45.896380015 +0100
@@ -580,8 +580,8 @@
 # Identity of this package.
 PACKAGE_NAME='git'
 PACKAGE_TARNAME='git'
-PACKAGE_VERSION='2.1.4'
-PACKAGE_STRING='git 2.1.4'
+PACKAGE_VERSION='2.1.4.oomfix'
+PACKAGE_STRING='git 2.1.4.oomfix'
 PACKAGE_BUGREPORT='git@vger.kernel.org'
 PACKAGE_URL=''

diff -ur orig/git-2.1.4/version git-2.1.4/version
--- orig/git-2.1.4/version	2014-12-18 18:42:19.000000000 +0000
+++ git-2.1.4/version	2019-08-29 20:06:17.572545210 +0100
@@ -1 +1 @@
-2.1.4
+2.1.4.oomfix
2019-08-29 15:24:41 -04:00
Joey Hess
4f59ac05b6
info: remove "repository mode"
info: Removed the "repository mode" from its output (including the --json
output) since with the removal of direct mode, there is no repository mode.
2019-08-29 14:12:22 -04:00