Commit graph

1085 commits

Author SHA1 Message Date
Joey Hess
e726800dda
add upper bounds on Cabal version
hackage now rejects packages without this. My bet is any version of
cabal is going to work, I'm using the public API. Annoying.
2023-01-26 15:31:29 -04:00
Joey Hess
65167463aa
releasing package git-annex version 10.20230126 2023-01-26 15:27:32 -04:00
Joey Hess
579d9b60c1
improve concurrency of move/copy --from --to
Use separate stages for download and upload. In the common case where
it downloads the file from one remote and then uploads to the other,
those are by far the most expensive operations, and there's a decent
chance the two remotes bottleneck on different resources.

Suppose it's being run with -J2 and a bunch of 10 mb files. Two threads
will be started both downloading from the src remote. They will probably
finish at the same time. Then two threads will be started uploading to
the dst remote. They will probably take the same time as well. Before
this change, it would alternate back and forth, bottlenecking on src and dst.
With this change, as soon as the two threads start uploading to dst, two
more threads are able to start, downloading from src. So bandwidth to
both remotes is saturated more often.

Other commands that use transferStages only send in one direction at a
time. So the worker threads for the other direction will sit idle, and
there will be no change in their behavior.

Sponsored-by: Dartmouth College's DANDI project
2023-01-24 13:59:39 -04:00
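As a rough illustration of the staged transfer described in commit 579d9b60c1 above, here is a minimal Python sketch (not git-annex's Haskell implementation). It uses one worker pool per stage, so that a worker that has finished downloading can hand the upload to the other pool and start the next download. The names `transfer_all`, `download`, and `upload` are hypothetical stand-ins for the real remote operations.

```python
from concurrent.futures import ThreadPoolExecutor

def transfer_all(keys, download, upload, jobs=2):
    """Move each key from src to dst using a separate worker pool per stage.

    download(key) and upload(key, data) are hypothetical callables.
    """
    with ThreadPoolExecutor(max_workers=jobs) as uploaders:
        with ThreadPoolExecutor(max_workers=jobs) as downloaders:
            def stage(key):
                data = download(key)
                # Hand the upload to the other pool; this download worker
                # is then free to fetch the next key right away.
                return uploaders.submit(upload, key, data)
            upload_futures = list(downloaders.map(stage, keys))
        # Wait for the uploads and return their results in key order.
        return [future.result() for future in upload_futures]
```

With `jobs=2` this corresponds to the -J2 case in the message: up to two downloads and two uploads can be in flight at the same time, rather than alternating between src and dst.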
Joey Hess
f8bc208e89
findkeys: New command, very similar to git-annex find but operating on keys
I've long been asked for `git-annex find --all` or something like that,
but pushed back on it because I feel that the command is analogous to
find(1) and so it would be surprising for it to list keys rather than
files. So instead, add a new findkeys subcommand.

Note that the use of withKeyOptions is rather strange because usually
that is used to fall back to --all rather than listing files, but here
it's made to default to --all like behavior and never list files.

A performance thing that could be improved is that withKeyOptions
always reads and caches location logs. But findkeys with no options does
not need them, so it could be made faster. That caching does speed up
options like --in though. This is really just a subset of a more general
performance thing that --all reads location logs sometimes unnecessarily.
Anyway, it needs to read the location log in order to checkDead,
and it seems good that findkeys does skip dead keys.

Also, cleaned up comments on git-annex-find man page asking for --all
option.

Sponsored-by: Dartmouth College's DANDI project
2023-01-17 14:51:57 -04:00
Joey Hess
f316b7f105
Revert "Removed the vendored git-lfs and the GitLfs build flag"
This reverts commit efda811404.

Turns out that datalad is building git-annex against debian bullseye.
https://github.com/datalad/git-annex/issues/149
2023-01-04 17:33:29 -04:00
Joey Hess
efda811404
Removed the vendored git-lfs and the GitLfs build flag
AFAICS all git-annex builds are using the git-lfs library not the vendored
copy.

Debian stable does have a too old haskell-git-lfs package to be able to
build git-annex from source, but there is not currently a backport of a
recent git-annex to Debian stable. And if they update the backport at some
point, they should be able to backport the library too.

Sponsored-by: Svenne Krap on Patreon
2022-12-26 12:49:53 -04:00
Joey Hess
ab11fd70e2
releasing package git-annex version 10.20221212 2022-12-12 12:51:59 -04:00
Joey Hess
2eedf58630
replace guessed win32 version with actual version 2.13.4.0 2022-11-09 13:08:27 -04:00
Joey Hess
d22bd53310
releasing package git-annex version 10.20221103 2022-11-03 14:07:53 -04:00
Joey Hess
6fbd337e34
avoid unnecessary keys db writes; doubled speed!
When running eg git-annex get, for each file it has to read from and
write to the keys database. But it's reading exclusively from one table,
and writing to a different table. So, it is not necessary to flush the
write to the database before reading. This avoids writing the database
once per file, instead it will buffer 1000 changes before writing.

Benchmarking getting 1000 small files from a local origin,
git-annex get now takes 13.62s, down from 22.41s!
git-annex drop now takes 9.07s, down from 18.63s!
Wowowowowowowow!

(It would perhaps have been better if there were separate databases for
the two tables. At least it would have avoided this complexity. Ah well,
this is better than splitting the table in an annex.version upgrade.)

Sponsored-by: Dartmouth College's Datalad project
2022-10-12 15:33:16 -04:00
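The buffering idea in commit 6fbd337e34 above can be sketched in a few lines of Python. This is only an illustration of batching writes, not the keys database code itself; `BufferedWriter` and its `commit` callable are hypothetical, and the default limit of 1000 is taken from the commit message.

```python
class BufferedWriter:
    """Queue changes and write them in batches instead of once per file."""

    def __init__(self, commit, limit=1000):
        self.commit = commit   # hypothetical callable that writes one batch
        self.limit = limit     # the commit message mentions 1000 changes
        self.pending = []

    def queue(self, change):
        self.pending.append(change)
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        # Write whatever is buffered; also needed once at shutdown.
        if self.pending:
            self.commit(self.pending)
            self.pending = []
```

Skipping the flush before each read is only safe under the condition the message states: reads go exclusively to one table and writes to a different one.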
Joey Hess
32a44c3813
releasing package git-annex version 10.20221003 2022-10-03 13:24:21 -04:00
Joey Hess
7059322a6c
Support "inbackend" in preferred content expressions
Well, actually, fix a typo that has always been in the implementation of
that. "inbacked" used to work, but let's not tell users about that; they
might try to use it and expect git-annex to keep supporting the typo..

Sponsored-by: Jack Hill on Patreon
2022-09-26 16:06:49 -04:00
Joey Hess
dcc2957d9c
improve documentation about backends
I noticed that, using just the man pages, there is no real description
of what backends are, or what ones are available. Except for some
examples.

Added a git-annex-backends man page, that is just a stub, but at least
describes what they basically are, and tells how to find the supported
ones, and links to the backends web page.

Sponsored-by: Brett Eisenberg on Patreon
2022-09-26 15:59:10 -04:00
Joey Hess
2478e9e03a
restage: New git-annex command, handles restaging unlocked files
This is much easier and less failure-prone than having the user run
git update-index --refresh themselves.

Sponsored-by: Dartmouth College's DANDI project
2022-09-23 16:29:59 -04:00
Joey Hess
6a3bd283b8
add restage log
When pointer files need to be restaged, they're first written to the
log, and then when the restage operation runs, it reads the log. This
way, if the git-annex process is interrupted before it can do the
restaging, a later git-annex process can do it.

Currently, this lets a git-annex get/drop command be interrupted and
then re-run, and as long as it gets/drops additional files, it will
clean up after the interrupted command. But more changes are
needed to make it easier to restage after an interrupted process.

Kept using the git queue to run the restage action, even though the
list of files that it builds up for that action is not actually used by
the action. This could perhaps be simplified to make restaging a cleanup
action that gets registered, rather than using the git queue for it. But
I wasn't sure if that would cause visible behavior changes, when eg
dropping a large number of files, currently the git queue flushes
periodically, and so it restages incrementally, rather than all at the
end.

In restagePointerFiles, it reads the restage log twice, once to get
the number of files and size, and a second time to process it.
This seemed better than reading the whole file into memory, since
potentially a huge number of files could be in there. Probably the OS
will cache the file in memory and there will not be much performance
impact. It might be better to keep running tallies in another file
though. But updating that atomically with the log seems hard.

Also note that it's possible for calcRestageLog to see a different file
than streamRestageLog does. More files may be added to the log in
between. That is ok, it will only cause the filterprocessfaster heuristic to
operate with slightly out of date information, so it may make the wrong
choice for the files that got added and be a little slower than ideal.

Sponsored-by: Dartmouth College's DANDI project
2022-09-23 15:47:24 -04:00
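A minimal Python sketch of the log-then-process pattern from commit 6a3bd283b8 above, including the two-pass read. The one-filename-per-line format and the function names are assumptions made for illustration; the real restage log format is not shown in the message, and this sketch only counts entries rather than also totalling sizes.

```python
def append_restage_log(path, filename):
    # Record the file before restaging, so that an interrupted run leaves
    # a trace that a later process can pick up.
    with open(path, "a", encoding="utf-8") as log:
        log.write(filename + "\n")

def calc_restage_log(path):
    # First pass: count entries without holding them all in memory.
    count = 0
    with open(path, encoding="utf-8") as log:
        for _ in log:
            count += 1
    return count

def stream_restage_log(path, restage):
    # Second pass: hand entries to the restage action one at a time.
    with open(path, encoding="utf-8") as log:
        for line in log:
            restage(line.rstrip("\n"))
```

As the message notes, entries appended between the two passes mean the count can be slightly out of date by the time the log is streamed.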
Joey Hess
78440ca37d
move assistant and webapp build-depends into main build-depends
For some reason, cabal 3.4.1.0 builds w/o the assistant and webapp,
even when the flag is explicitly turned on. Moving the build-depends from
inside the if flag section to the main build-depends somehow fixes this.

Since the webapp build deps are thus always available, there is no reason
not to build the webapp when building the assistant. So, got rid of the
webapp build flag. Kept the assistant build flag for now, since building
without it does at least still speed up the build.

Sponsored-by: Brock Spratlen on Patreon
2022-08-29 15:23:49 -04:00
Joey Hess
e801634875
prep release 2022-08-22 12:02:04 -04:00
Joey Hess
472f5c142b
Use createFile_NoRetry from win32 2.13.3.1
Sponsored-by: Tobias Ammann on Patreon
2022-08-02 10:45:39 -04:00
Joey Hess
a0e788c94a
releasing package git-annex version 10.20220724 2022-07-25 14:07:20 -04:00
Joey Hess
4b520e0683
increase cabal-version to work with recent cabal
It started complaining about custom setup needing too old a version of
cabal, a very confusing error message.

1.12 is the version of Cabal on the i386ancient builder.

Sponsored-by: Jack Hill on Patreon
2022-07-16 14:57:29 -04:00
Joey Hess
ba13c1e2ac
depend on version of Win32 that exports c_createFile 2022-07-12 16:28:01 -04:00
Joey Hess
2d65c4ff1d
avoid unix-compat's rename
On Windows, that does not support long paths
https://github.com/jacobstanley/unix-compat/issues/56

Instead, use System.Directory.renamePath, which does support long paths.

Sponsored-by: Dartmouth College's Datalad project
2022-07-12 14:55:02 -04:00
Joey Hess
02ef3d6a64
fix build with assistant disabled and webapp enabled
The webapp modules cannot build with the assistant disabled, so make the
webapp be under the assistant build flag.

Sponsored-by: Jarkko Kniivilä on Patreon
2022-06-29 14:19:18 -04:00
Joey Hess
b223988e22
remove --backend from global options
--backend is no longer a global option, and is only accepted by commands
that actually need it.

Three commands that used to support backend but don't any longer are
watch, webapp, and assistant. It would be possible to make them support it,
but I doubt anyone used the option with these. And in the case of webapp
and assistant, the option was handled inconsistently, only taking effect
when the command is run with an existing git-annex repo, not when it
creates a new one.

Also, renamed GlobalOption etc to AnnexOption. Because there are many
options of this type that are not actually global (any more) and get
added to commands that need them.

Sponsored-by: Kevin Mueller on Patreon
2022-06-29 13:33:25 -04:00
Joey Hess
c1b9ea2759
The 23 never happened release.
It's 24 somewhere..
2022-06-23 13:55:54 -04:00
Joey Hess
57d088e9c2
fix release version 2022-06-23 13:35:14 -04:00
Joey Hess
bea665a4d7
enable --as-needed on freebsd
based on doc/bugs/FreeBSD_patches.mdwn which indicates it works, though
sadly without anything more than a patch.

If this breaks anything it will be reverted.
2022-05-31 13:04:52 -04:00
Joey Hess
b60d85c4c0
releasing package git-annex version 10.20220525 2022-05-25 14:01:31 -04:00
Joey Hess
4e4c44ed8e
hah, I mean 0504 of course 2022-05-04 11:47:40 -04:00
Joey Hess
cb0e89bf77
releasing package git-annex version 10.20220404 2022-05-04 11:46:56 -04:00
Joey Hess
959beeea9f
releasing package git-annex version 10.20220322 2022-03-22 13:56:45 -04:00
Joey Hess
a460aa8b70
Removed the NetworkBSD build flag
Debian stable and the i386ancient build both have a new enough network
to not need this flag any longer.

Sponsored-by: Svenne Krap on Patreon
2022-03-22 11:52:52 -04:00
Joey Hess
982eb7ed0d
remove vendored http-client-restricted
Removed vendored copy of http-client-restricted, and removed the
HttpClientRestricted build flag that avoided that dependency.

http-client-restricted is in Debian stable, and the i386ancient build also
uses it, so I think this vendored copy is no longer needed.

Sponsored-by: Noam Kremen on Patreon
2022-03-22 11:50:06 -04:00
Joey Hess
8bbd683f31
relax enough i386ancient deps to allow new tasty
The new ansi-terminal was needed for test concurrency, and the new
concurrent-output fixes several bugs. And it turns out this is all
that's needed to use the new tasty.

Sponsored-by: Kevin Mueller on Patreon
2022-03-22 10:59:22 -04:00
Joey Hess
a33f1a0815
re-relax tasty dependency version for i386ancient build
Dependency issues were looking difficult to support tasty-1.2 with that
build. Not using `after` only affects rerunning and limiting tests,
since tasty's concurrency is not used, so this build will just not
support that.

We are probably nearing end of life on this build; it also doesn't
support git-lfs or http-client-restricted. The 2.6.32 kernel it supports
is at this point 13 years old, and stopped being supported by linux LTS
developers 10 years ago. It was supported by RHEL 6.10 through November
2020. At this point, no new hardware should be shipping with this
kernel, but that probably does not stop certain embedded vendors from
shipping it. And there is certainly some hardware still using it. But
the returns from supporting it are diminishing, and the quality of the
build for it is also diminishing.

Sponsored-by: Nicholas Golder-Manning on Patreon
2022-03-22 10:31:10 -04:00
Joey Hess
be31a8a3d2
cleaner test dependencies
This improves the display of tests.

tasty-1.2 is in debian stable.

Sponsored-by: Dartmouth College's Datalad project
2022-03-16 12:53:08 -04:00
Joey Hess
d3b7c6705c
clean up concurrent output of tests
Using concurrent-output this is easy. Just have to check if tasty has
color enabled, and propagate it into the worker processes, some of which
will be run without a controlling console.

Also added a call to installSignalHandlers; I noticed that interrupting
the test suite could leave the console in a bad state and this fixes
that.

The ansi-terminal dependency is free, since tasty also depends on it.

Sponsored-by: Dartmouth College's Datalad project
2022-03-16 12:41:28 -04:00
Joey Hess
952664641a
turn off PackageImports in cabal file
This makes it easier to build eg benchmarks of individual modules.

May be that most of these PackageImports are not really necessary,
dunno.
2022-02-25 13:16:36 -04:00
Joey Hess
1c4b0b4c2b
releasing package git-annex version 10.20220222 2022-02-22 13:33:45 -04:00
Joey Hess
e6e60b644b
releasing package git-annex version 10.20220127 2022-01-27 14:53:22 -04:00
Joey Hess
7e7a7140ce
update for v10
Sponsored-by: Dartmouth College's Datalad project
2022-01-21 12:32:44 -04:00
Joey Hess
9d5db6a09a
add upgrade.log
The upgrade from V9 uses this to avoid an automatic upgrade until 1 year
after the V9 update. It can also be used in future such situations.

Sponsored-by: Dartmouth College's Datalad project
2022-01-19 15:52:29 -04:00
Joey Hess
856ce5cf5f
split upgrade into v9 and v10
v10 will run 1 year after the upgrade to v9, to give time for any v8
processes to die. Until that point, the v10 upgrade will be tried by
every process but deferred, so added support for deferring upgrades.

The upgrade prevention lock file that will be used by v10 is not yet
implemented, so it does not yet defer.

Sponsored-by: Dartmouth College's Datalad project
2022-01-19 13:09:33 -04:00
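The deferral described in the two upgrade commits above (9d5db6a09a and 856ce5cf5f) amounts to a timestamp comparison. A minimal Python sketch, assuming the time of the v9 upgrade has already been read from somewhere such as upgrade.log; the function name and the 365-day reading of "1 year" are assumptions.

```python
from datetime import datetime, timedelta

DEFER_PERIOD = timedelta(days=365)  # "1 year"; the exact length is assumed

def v10_upgrade_due(v9_upgrade_time, now):
    """Return True once the deferral period since the v9 upgrade has passed."""
    return now - v9_upgrade_time >= DEFER_PERIOD
```

Until this returns True, every process would attempt the v10 upgrade and defer it, as the message describes.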
Joey Hess
ff570ad363
add v9 annex.version, not yet the default
This is the start of v9, but it's currently identical to v8, and v8 is
not upgraded to it. git-annex upgrade will upgrade to v9 with this
change.

Sponsored-by: Dartmouth College's Datalad project
2022-01-11 14:59:39 -04:00
Joey Hess
479ec0d533
releasing package git-annex version 8.20211231 2021-12-31 15:11:50 -04:00
Joey Hess
b1d719f9d2
handle transitions with read-only unmerged git-annex branches
Capstone to this feature. Any transitions that have been performed on an
unmerged remote ref but not on the local git-annex branch, or vice-versa
have to be applied on the fly when reading files.

Sponsored-by: Dartmouth College's Datalad project
2021-12-28 13:23:32 -04:00
Joey Hess
74fcc389d8
releasing package git-annex version 8.20211123 2021-11-23 15:20:24 -04:00
Joey Hess
c3af94eff4
releasing package git-annex version 8.20211117 2021-11-17 12:20:29 -04:00
Joey Hess
f3326b8b5a
git-lfs gitlab interoperability fix
git-lfs: Fix interoperability with gitlab's implementation of the git-lfs
protocol, which requests Content-Encoding chunked.

Sponsored-by: Dartmouth College's Datalad project
2021-11-10 13:51:11 -04:00
Joey Hess
68257e9076
add git-annex filter-process
filter-process: New command that can make git add/checkout faster when
there are a lot of unlocked annexed files or non-annexed files, but that
also makes git add of large annexed files slower.

Use it by running:
git config filter.annex.process 'git-annex filter-process'

Fully tested and working, but I have not benchmarked it at all.
And, incremental hashing is not done when git add uses it, so extra work is
done in that case.

Sponsored-by: Mark Reidenbach on Patreon
2021-11-04 15:02:36 -04:00
Joey Hess
c260833a6b
releasing package git-annex version 8.20211028 2021-10-28 12:00:56 -04:00
Joey Hess
7bdc7350a5
remove git-annex-shell compat code
* Removed support for accessing git remotes that use versions of
  git-annex older than 6.20180312.
* git-annex-shell: Removed several commands that were only needed to
  support git-annex versions older than 6.20180312.
  (lockcontent, recvkey, sendkey, transferinfo, commit)

The P2P protocol was added in that version, and used ever since, so
this code was only needed for interop with older versions.

"git-annex-shell commit" is used by newer git-annex versions, though
unnecessarily so, because the p2pstdio command makes a single commit at
shutdown. Luckily, it was run with stderr and stdout sent to /dev/null,
and non-zero exit status or other exceptions are caught and ignored. So,
that was able to be removed from git-annex-shell too.

git-annex-shell inannex, recvkey, sendkey, and dropkey are still used by
gcrypt special remotes accessed over ssh, so those had to be kept.
It would probably be possible to convert that to using the P2P protocol,
but it would be another multi-year transition.

Some git-annex-shell fields were able to be removed. I hoped to remove
all of them, and the very concept of them, but unfortunately autoinit
is used by git-annex sync, and gcrypt uses remoteuuid.

The main win here is really in Remote.Git, removing piles of hairy fallback
code.

Sponsored-by: Luke Shumaker
2021-10-11 15:36:51 -04:00
Joey Hess
e28cf82b45
releasing package git-annex version 8.20211011 2021-10-11 12:53:17 -04:00
Joey Hess
1a586e473b
releasing package git-annex version 8.20210903 2021-09-03 12:01:12 -04:00
Joey Hess
9cae7c5bbf
releasing package git-annex version 8.20210803 2021-08-03 12:20:45 -04:00
Joey Hess
73e0cbbb19
fix problem populating pointer files
This is a result of an audit of every use of getInodeCaches,
to find places that misbehave when the annex object is not in the inode
cache, despite pointer files for the same key being in the inode cache.

Unfortunately, that is the case for objects that were in v7 repos that
upgraded to v8. Added a note about this gotcha to getInodeCaches.

Database.Keys.reconcileStaged, when annex.thin is set, would fail to
populate pointer files in this situation. Changed it to check if the
annex object is unmodified the same way inAnnex does, falling back to a
checksum if the inode cache is not recorded.

Sponsored-by: Dartmouth College's Datalad project
2021-07-27 14:26:49 -04:00
Joey Hess
47d3dccf19
whereused implemented
except --historical

Sponsored-by: Jack Hill on Patreon
2021-07-14 14:27:21 -04:00
Joey Hess
12e48fcebe
add git-annex-filter-branch man page to cabal tarball
forgot to do this earlier
2021-07-14 13:49:20 -04:00
Joey Hess
065db484e0
releasing package git-annex version 8.20210714 2021-07-14 12:23:24 -04:00
Joey Hess
fd99ce6c95
releasing package git-annex version 8.20210630 2021-06-30 11:48:33 -04:00
Joey Hess
a6e281e008
releasing package git-annex version 8.20210621 2021-06-21 12:17:46 -04:00
Joey Hess
a58c90ccf4
skeleton of filter-branch command, with option parser 2021-05-14 10:59:48 -04:00
Atemu
cab398c945
Cabal: Use -O0 for development builds
Can be configured with `--flags=-production`

Time for full build on my machine: 2m -> 45s
2021-05-05 07:53:52 +02:00
Joey Hess
27e5f3cd52
releasing package git-annex version 8.20210428 2021-04-28 12:16:45 -04:00
Joey Hess
441f65c2cf
split out Annex.CopyFile
Goal is to use it in Remote.Directory, but also it's nice to shrink Remote.Git.
2021-04-14 14:06:43 -04:00
Joey Hess
a10cc80997
split out Logs.Export.Pure
This will allow Annex.Branch to use it, in transitions code.

This commit was sponsored by Luke Shumaker on Patreon.
2021-04-13 14:10:23 -04:00
Joey Hess
542a0018cd
add 2 missing man page files to cabal file
Only useful when using cabal install to install it with man pages.
It's hard to remember to list new man page files here.
2021-04-08 14:39:52 -04:00
Joey Hess
d16d739ce2
implement fastDebug
Most of the changes here involve global option parsing: GlobalSetter
changed so it can both run an Annex action to set state, but can also
change the AnnexRead value, which is immutable once the Annex monad is
running.

That allowed a debugselector value to be added to AnnexRead, seeded
from the git config. The --debugfilter option's GlobalSetter then updates
the AnnexRead.

This improved GlobalSetter can later be used to move more stuff to
AnnexRead. Things that don't involve a git config will be easier to
move, and probably a *lot* of things can be moved eventually.

fastDebug, while implemented, is not used anywhere yet. But it should be
fast..
2021-04-06 15:24:28 -04:00
Joey Hess
aaba83795b
switch from hslogger to purpose-built Utility.Debug
This uses a DebugSelector, rather than debug levels, which will allow
for a later option like --debug-from=Process to only
see debugging about running processes.

The module name that contains the thing being debugged is used as the
DebugSelector (in most cases; does not need to be a hard and fast rule).
Debug calls were changed to add that. hslogger did not display
that first parameter to debugM, but the DebugSelector does get
displayed.

Also fastDebug will allow doing debugging in places that are used in
tight loops, with the DebugSelector coming from the Annex Reader
essentially for free. Not done yet.
2021-04-05 13:40:31 -04:00
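To illustrate the selector idea from commit aaba83795b above, here is a small Python sketch. It is not the Utility.Debug API; `make_debug` and its parameters are hypothetical. The selector is a predicate over the source name (typically a module name), and the source is included in the output.

```python
def make_debug(selector, out):
    """Build a debug function that only emits messages the selector accepts.

    selector: predicate over the source name
    out: callable that receives each formatted message
    """
    def debug(source, message):
        if selector(source):
            out(f"[{source}] {message}")
    return debug

lines = []
debug = make_debug(lambda source: source == "Process", lines.append)
debug("Process", "running git")    # accepted by the selector
debug("Remote.Git", "connecting")  # filtered out
```

This is roughly what an option like --debug-from=Process, mentioned in the message, would select.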
Joey Hess
8868a3a4c7
Fix build with persistent-2.12.0.1
persistent stopped using askLogFunc, and the thing to use is askLoggerIO
from monad-logger. Bumped the dep to the first version that contained that.

Note that the i386ancient build uses a newer monad-logger than 0.3.10,
so the new versioned dep should not break it, and presumably nothing else
either.

This commit was sponsored by Noam Kremen on Patreon.
2021-04-01 12:21:02 -04:00
Joey Hess
315a81e3c6
releasing package git-annex version 8.20210330 2021-03-30 14:33:28 -04:00
Joey Hess
41d9148c72
fix attoparsec lower bound
needed for parseOnly
2021-03-24 11:54:16 -04:00
Joey Hess
a343ea76c8
releasing package git-annex version 8.20210310 2021-03-10 13:59:00 -04:00
Joey Hess
cbf94fd13d
prep for fixing find --branch --unlocked
Added LinkType to ProvidedInfo, and unified MatchingKey with
ProvidedInfo. They're both used in the same way, so there was no real
reason to keep separate.

Note that addLocked and addUnlocked still set matchNeedsFileName,
because to handle MatchingFile, they do need it. However, they
don't use it when MatchingInfo is provided. This should be ok,
the --branch case will be able to skip checking matchNeedsFileName,
since it will provide a filename in any case.
2021-03-02 13:39:31 -04:00
Joey Hess
eb594c710e
unregisterurl: New command
Implemented by generalizing registerurl. Without the implicit batch mode
of registerurl since that is only a backwards compatibility thing
(see commit 1d1054faa6).
2021-03-01 14:28:24 -04:00
Joey Hess
d670346b22
releasing package git-annex version 8.20210223 2021-02-23 14:40:45 -04:00
Joey Hess
62e152f210
incremental checksum on download from ssh or p2p
Checksum as content is received from a remote git-annex repository, rather
than doing it in a second pass.

Not tested at all yet, but I imagine it will work!

Not implemented for any special remotes, and also not implemented for
copies from local remotes. It may be that, for local remotes, it will
suffice to use rsync, rely on its checksumming, and simply return Verified.
(It would still make a checksumming pass when cp is used for COW, I guess.)
2021-02-09 17:03:27 -04:00
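Incremental checksumming, as described in commit 62e152f210 above, means feeding each chunk to the hash as it arrives instead of re-reading the file afterwards. A minimal Python sketch using SHA-256; `receive_verified` and its parameters are hypothetical, and the real code works with git-annex's own backends and transfer protocol.

```python
import hashlib

def receive_verified(chunks, expected_sha256, sink):
    """Write chunks to sink while hashing them; True if the digest matches."""
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)  # hash as the data arrives
        sink.write(chunk)
    return hasher.hexdigest() == expected_sha256
```

The content is read only once, so the separate verification pass after the download is no longer needed.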
Joey Hess
135757d64a
automatic stall detection
annex.stalldetection can now be set to "true" to make git-annex do
automatic stall detection when it detects a remote is updating its transfer
progress consistently enough.

This commit was sponsored by Luke Shumaker on Patreon.
2021-02-03 13:33:57 -04:00
Joey Hess
58216ef39d
Include libkqueue.h file needed to build the assistant on BSDs
I suspect this is a bug in cabal sdist, because with
Includes: Utility/libkqueue.h
the file is not included, but putting it in extra-files does
get it into the tarball.
2021-02-01 12:00:56 -04:00
Joey Hess
a82aca67b8
releasing package git-annex version 8.20210127 2021-01-27 11:13:25 -04:00
Joey Hess
e2ba8ae4a6
update copyright year 2021-01-22 14:00:36 -04:00
Joey Hess
cc89699457
mincopies
This is conceptually very simple, just making a 1 that was hard coded be
exposed as a config option. The hard part was plumbing all that, and
dealing with complexities like reading it from git attributes at the
same time that numcopies is read.

Behavior change: When numcopies is set to 0, git-annex used to drop
content without requiring any copies. Now to get that (highly unsafe)
behavior, mincopies also needs to be set to 0. It seemed better to
remove that edge case, than complicate mincopies by ignoring it when
numcopies is 0.

This commit was sponsored by Denis Dzyubenko on Patreon.
2021-01-06 14:15:19 -04:00
Joey Hess
7d843e909d
releasing package git-annex version 8.20201129 2020-12-29 13:51:40 -04:00
Joey Hess
6280af2901
generate more compact git-annex branch for imports
Especially from borg, where the content identifier logs
all end up being the same identical file!

But also, for other imports, the location tracking logs can,
in some cases, be identical files.

Bonus optimisation: Avoid looking up (and parsing when set)
GIT_ANNEX_VECTOR_CLOCK env var every time a log is written to.
Although the lookup does happen at startup even when no
log will be written now.
2020-12-23 15:25:16 -04:00
Joey Hess
bcd55b365c
import from borg is basically working
Still some issues to deal with, see TODO and XXX.

Here's what gets logged, for each key:

cid log:
1608582045.832799227s 6720ebad-b20e-4460-a8f2-2477361aea75 !MjAyMC0xMi0yMVQxMTozMzoxNw==:!MjAyMC0xMi0yMVQxMzowNzoyNg==

The "!Mj" are base64 encoded borg archive names, since mine were
dates and contained some characters not allowed in cid logs unescaped.
There were archives that each contained the key. This list will grow as
more borg backups are done and learned about.

tree generated:
120000 blob 5ef6a4615c084819b44cd4e3a31657664ddf643b	x/dotgit/annex/objects/06/mv/SHA256E-s30--a5d8532e64ec28f5491e25e7a6c1cb68f80507c1be6c1b35f8ec53d25413e5da/SHA256E-s30--a5d8532e64ec28f5491e25e7a6c1cb68f80507c1be6c1b35f8ec53d25413e5da
120000 blob 063a139d3021c8db60f5c576d29fada2b824d91c	x/dotgit/annex/objects/72/PP/SHA256E-s30--e80b09a854b4e4d99a76caaa6983b34272480e0b4fdb95d04234a54b4849b893/SHA256E-s30--e80b09a854b4e4d99a76caaa6983b34272480e0b4fdb95d04234a54b4849b893
120000 blob b53b54916fd6abf21fedf796deca08d5ac7a75af	x/dotgit/annex/objects/Ww/pk/SHA256E-s30--6aac072a8ebf02a5807c4f15e77ed585a6c87b3b333ba625a3c8d6b4dc50a9f2/SHA256E-s30--6aac072a8ebf02a5807c4f15e77ed585a6c87b3b333ba625a3c8d6b4dc50a9f2

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-12-21 16:37:55 -04:00
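The cid log line quoted in commit bcd55b365c above can be decoded with a few lines of Python. This sketch assumes, based only on that example, that entries are separated by ":" and that a leading "!" marks a base64-encoded name; `decode_archive_names` is a hypothetical helper.

```python
import base64

def decode_archive_names(cid_field):
    """Split a cid field on ':' and base64-decode each '!'-prefixed entry."""
    names = []
    for part in cid_field.split(":"):
        if part.startswith("!"):
            names.append(base64.b64decode(part[1:]).decode("utf-8"))
        else:
            names.append(part)  # assumed to be stored unescaped
    return names

field = "!MjAyMC0xMi0yMVQxMTozMzoxNw==:!MjAyMC0xMi0yMVQxMzowNzoyNg=="
print(decode_archive_names(field))
# → ['2020-12-21T11:33:17', '2020-12-21T13:07:26']
```

The decoded names are dates, which matches the message: the colons in them are presumably the characters that could not appear unescaped in a cid log.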
Joey Hess
ca31d7e54f
refactor
That code was not borg specific, and I can see making more remotes for
other backup software.
2020-12-18 17:08:44 -04:00
Joey Hess
3207e8293b
start borg special remote
Compiles, but unusable so far.
2020-12-18 16:03:51 -04:00
Joey Hess
004a4f5fb1
factor out Types.Transferrer 2020-12-09 13:28:49 -04:00
Joey Hess
677003a6df
rename helper
More consistent name with TransferrerPool
2020-12-09 13:24:24 -04:00
Joey Hess
05c0543e8e
move new interface to git-annex transfer
This is to avoid breakage when upgrading or downgrading git-annex with a
process running that uses the interface. It's better to keep the
compatibility code for a few years than worry about such breakage.

This commit was sponsored by Brett Eisenberg on Patreon.
2020-12-09 12:33:56 -04:00
Joey Hess
41f2c308ff
stall detection is working
New config annex.stalldetection, remote.name.annex-stalldetection, which
can be used to deal with remotes that stall during transfers, or are
sometimes too slow to want to use.

This commit was sponsored by Luke Shumaker on Patreon.
2020-12-08 15:22:18 -04:00
Joey Hess
47016fc656
move TransferrerPool from Assistant state to Annex state
This commit was sponsored by Graham Spencer on Patreon.
2020-12-07 13:21:35 -04:00
Joey Hess
72e5764a87
move TransferrerPool from assistant
This old code will now be useful for git-annex beyond the assistant.

git-annex won't use the CheckTransferrer part, and won't run transferkeys
as a batch process, and will want withTransferrer to not shut down
transferkeys processes. Still, the rest of this is a good fit for what I
need now.

Also removed some dead code, and simplified a little bit.

This commit was sponsored by Mark Reidenbach on Patreon.
2020-12-07 12:50:48 -04:00
Joey Hess
31e417f351
finish message serialization of progress meters
Any given transfer can only display 1 progress meter at a time, or so
this code assumes. In some cases, there are progress meters for
different stages of a transfer, perhaps, and that is supported by this.

This commit was sponsored by Ethan Aubin.
2020-12-04 13:50:46 -04:00
Joey Hess
dad8442572
releasing package git-annex version 8.20201127 2020-11-27 12:57:02 -04:00
Joey Hess
0896038ba7
annex.adjustedbranchrefresh
Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to one file, then eg, `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey likewise (so sends over ssh from old versions
  of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
2020-11-16 14:27:28 -04:00
Joey Hess
af6af35228
split out Annex.Content.Presence
This will let a module that Annex.Content imports use inAnnex.
Unsure yet if I will need that, but this split still seems to make
sense, and Annex.Content was way too long so splitting it is good.
2020-11-16 11:24:57 -04:00
Joey Hess
864af53a2d
releasing package git-annex version 8.20201116 2020-11-16 09:38:29 -04:00
Joey Hess
c6604406b9
bump dep on filepath-bytestring
needed for makeRelative
2020-11-10 11:15:14 -04:00
Joey Hess
885974be99
add newtypes for QuickCheck to avoid LANG=C issues
All properties changed to use them, except for
prop_encode_c_decode_c_roundtrip, which already filtered to ascii
for other reasons.

A few modules had to be split out, because Setup does not build-depend
on QuickCheck.
2020-11-09 20:21:18 -04:00
Joey Hess
2c8cf06e75
more RawFilePath conversion
Converted file mode setting to it, and follow-on changes.

Compiles up through 369/646.

This commit was sponsored by Ethan Aubin.
2020-11-05 18:45:37 -04:00
Joey Hess
f9fc26f05a
Merge branch 'master' into rawfilepath 2020-11-04 14:21:44 -04:00
Joey Hess
2dabd4cc2d
releasing package git-annex version 8.20201103 2020-11-03 11:53:11 -04:00
Joey Hess
d6e94a6b2e
got configure working after Utility.Path ByteString conversion
Had to split out some modules because getWorkingDirectory needs unix,
which is not a build-dep of configure.

This commit was sponsored by Brock Spratlen on Patreon.
2020-10-28 15:01:19 -04:00
Joey Hess
aed64428d5
allow magicmime on windows
John Thorvald Wodder II got it working using
https://github.com/datalad/file-windows so don't hard-disable it.

The stack.yaml still disables this build flag, because it needs an extra
C library to be installed, which stack cannot automate.
2020-10-26 13:34:27 -04:00
Joey Hess
2dd38b6403
switch to Haskell2010
When I put in Haskell98 this spring, I was under the mistaken
apprehension that ghc defaulted to that. But actually its default
is a third mode, which is closer to Haskell2010 but with some differences.
The manual says "By default, GHC mainly aims to behave (mostly) like a
Haskell 2010 compiler"

Fixed two cases where the Haskell98 do indentation flexibility let
wrongly indented code build. That is one of the places where
ghc does not behave like Haskell2010 by default.

The other place that I think I was concerned about, is GHC manual
section 19.1.1.3. Expressions and patterns. But that only seems to
affect code using bottoms, so would only affect pure functions throwing
an error, which I don't think git-annex does in many places as it's
pretty horrid style. And it would only affect rare cases like shown in
that section. If it did happen, it would mean that the error was not
thrown before specifying Haskell98, and then was. Haskell2010 behaves
the same as Haskell98.

This commit was sponsored by Denis Dzyubenko on Patreon.
2020-10-19 11:26:16 -04:00
Joey Hess
cf33be21ac
releasing package git-annex version 8.20201007 2020-10-07 14:10:56 -04:00
Joey Hess
5555697ae6
Enable building with git-annex benchmark by default
Only turning it off when the criterion library is not installed.

Not enabled for osx or i386ancient yet since that will need some
investigation to update their respective stack.yaml files.
2020-10-02 13:57:10 -04:00
Joey Hess
1a785d05c0
releasing package git-annex version 8.20200908 2020-09-08 14:20:47 -04:00
Joey Hess
6ea511beb4
Removed the S3 and WebDAV build flags
So these special remotes are always supported.

IIRC these build flags were added because the dep chains were a bit too
long, or perhaps because the libraries were not available in Debian stable,
or something like that. That was long ago, those reasons no longer apply,
and users get confused when builtin special remotes are not available, so
it seems best to remove the build flags now.

If this does cause a problem it can be reverted of course.

This commit was sponsored by Jochen Bartl on Patreon.
2020-09-08 12:42:59 -04:00
Joey Hess
8656afd3e1
rename http special remote to httpalso
"http" was too generic and easy to confuse with web. The new name makes
clear it's used in addition to some other remote. And other protocols
can use the same naming scheme.
2020-09-02 10:41:53 -04:00
Joey Hess
571ec900ac
Added http special remote, which is useful for accessing other remotes that publish content stored in them via http/https.
With automatic layout learning!
2020-09-01 15:16:35 -04:00
Joey Hess
b24ba92231
refactor out Annex.PidLock 2020-08-26 12:29:13 -04:00
Joey Hess
06a4ab39fa
wip external remote async protocol extension 2020-08-12 15:17:53 -04:00
Joey Hess
bcbdada8bf
fixed 2020-08-10 13:12:55 -04:00
Joey Hess
f75be32166
external backends wip
It's able to start them up, the only thing not implemented is generating
and verifying keys. And, the key translation for HasExt.
2020-07-29 15:23:18 -04:00
Joey Hess
555fe669e1
refactoring in preparation for external backends 2020-07-29 12:00:27 -04:00
Joey Hess
abd56fb019
Fix a bug in find --batch in the previous version. 2020-07-20 19:50:53 -04:00
Joey Hess
af901d1366
releasing package git-annex version 8.20200720 2020-07-20 14:41:12 -04:00
Joey Hess
087b7ee66a
Revert "data type that starts off using a set but converts to a bloom filter when large"
This reverts commit 7e2c4ed216.

I was not able to use this in the end.
See comment in the previous commit.
2020-07-01 20:12:19 -04:00
Joey Hess
7e2c4ed216
data type that starts off using a set but converts to a bloom filter when large
This adds a dep on hashable, but it's a free dependency, since
unordered-containers already pulled it in.

Using unordered-containers for the set seems to make sense, since it
hashes and bloom filter hashes too. (Though different hashes.)
I dunno, never quite know if I should use unordered-containers or containers.
2020-07-01 14:06:12 -04:00
Joey Hess
104b3a9c6a
Build with the http-client-restricted library when available
Otherwise use the vendored copy as before.

The library is in Debian testing but not stable. Once it reaches
stable, the vendored copy can be removed.

Did not add it to debian/control because IIRC that's used to build
git-annex on stable too, possibly. However, the Debian maintainer will
probably want to make the package depend on libghc-http-client-restricted-dev.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-06-22 11:31:31 -04:00
Joey Hess
01eb863a14
Build with the git-lfs library when available
Otherwise use the vendored copy as before.

The library is in Debian testing but not stable. Once it reaches
stable, the vendored copy can be removed.

Did not add it to debian/control because IIRC that's used to build
git-annex on stable too, possibly. However, the Debian maintainer will
probably want to make the package depend on libghc-git-lfs-dev.

This commit was sponsored by Ilya Shlyakhter on Patreon.
2020-06-22 11:21:25 -04:00
Joey Hess
6ef62cb3c7
fix unused import warning
Network.HTTP.Client exports makeConnection since 0.5.3.

Debian stable has a newer version than 0.5.3, so bumping the
min version seems better than adding an ifdef.
2020-06-22 10:55:37 -04:00
Joey Hess
48a88d822d
releasing package git-annex version 8.20200617 2020-06-17 15:59:34 -04:00
Joey Hess
660d8d3a87
simpler way to do this
Remove old code that can be trivially implemented using async in a much
nicer way (that is async exception safe).

I've audited all forkOS calls (except for ones in the assistant),
and this was the last remaining one that is not async exception safe.
The rest look ok to me.
2020-06-05 14:18:06 -04:00
Joey Hess
c429bbf2bd
remove workaround for old versions of process
ghc 8.4.4 has process 1.6.3, which was the first version to include
getPid.
2020-06-03 16:03:08 -04:00
Joey Hess
156e728b56
bump process version
Want to use eg withCreateProcess.

The base constraint already implied a ghc version bundled with process
1.6 or newer.
2020-06-03 12:09:41 -04:00
Joey Hess
484a74f073
auto-init autoenable=yes
Try to enable special remotes configured with autoenable=yes when git-annex
auto-initialization happens in a new clone of an existing repo. Previously,
git-annex init had to be explicitly run to enable them. That was a bit of a
wart of a special case for users to need to keep in mind.

Special remotes cannot display anything when autoenabled this way, to avoid
interfering with the output of git-annex query commands.

Any error messages will be hidden, and if it fails, nothing is displayed.
The user will realize the remote isn't enabled when they try to use it,
and can run git-annex init manually then to try the autoenable again and
see what failed.

That seems like a reasonable approach, and it's less complicated than
communicating something across a pipe in order to display it as a side
message. Another reason not to do that is that, if the first command the
user runs is one like git-annex find that has machine readable output,
any message about autoenable failing would need to not be displayed anyway.
So better to not display a failure message ever, for consistency.

(Had to split out Remote.List.Util to avoid an import cycle.)
2020-05-27 12:40:35 -04:00
Joey Hess
01513da127
releasing package git-annex version 8.20200522 2020-05-22 12:07:59 -04:00
Joey Hess
6952060665
addurl --preserve-filename and a few related changes
* addurl --preserve-filename: New option, uses server-provided filename
  without any sanitization, but with some security checking.

  Not yet implemented for remotes other than the web.

* addurl, importfeed: Avoid adding filenames with leading '.', instead
  it will be replaced with '_'.

  This might be considered a security fix, but a CVE seems unwarranted.
  It was possible for addurl to create a dotfile, which could change
  behavior of some program. It was also possible for a web server to say
  the file name was ".git" or "foo/.git". That would not overwrite the
  .git directory, but would cause addurl to fail; of course git won't
  add "foo/.git".

sanitizeFilePath is too opinionated to remain in Utility, so moved it.

The changes to mkSafeFilePath are because it used sanitizeFilePath.
In particular:

	isDrive will never succeed, because "c:" gets munged to "c_"
	".." gets sanitized now
	".git" gets sanitized now
	It will never be null, because sanitizeFilePath keeps the length
	the same, and splitDirectories never returns a null path.

Also, on the off chance a web server suggests a filename of "",
ignore that, rather than trying to save to such a filename, which would
fail in some way.
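
The filename handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `escapeLeadingDot` and `suggestedName` are hypothetical names, and the real sanitizeFilePath does more than this. It only shows the two rules stated in the message, a leading '.' becoming '_' with the length unchanged, and an empty suggested name being ignored.

```haskell
module Main where

import Data.Maybe (mapMaybe)

-- | Hypothetical helper: swap a leading '.' for '_', keeping the length
-- of the name unchanged, so the result is never a dotfile.
escapeLeadingDot :: FilePath -> FilePath
escapeLeadingDot ('.' : rest) = '_' : rest
escapeLeadingDot name = name

-- | Hypothetical helper: ignore an empty suggestion entirely, rather
-- than trying to save to a file with no name.
suggestedName :: String -> Maybe FilePath
suggestedName "" = Nothing
suggestedName name = Just (escapeLeadingDot name)

main :: IO ()
main = mapM_ putStrLn (mapMaybe suggestedName [".git", "", "report.pdf"])
```

Running it prints `_git` and `report.pdf`; the empty suggestion is dropped.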
2020-05-08 16:22:55 -04:00
Joey Hess
2d51dd2e8c
changes required by cabal-version 1.10
Extensions got renamed.

Default-Language is required. I had to put Haskell98 because there are
subtle differences between 98 and 2010 and git-annex has always been
built with the default, which was 98 when there was a default. I don't
know how to establish that git-annex will behave the same under 2010.
2020-05-04 15:49:14 -04:00
Joey Hess
a5830c3f6e
bump cabal-version
hackage now requires 1.10 or newer
2020-05-04 15:45:10 -04:00
Joey Hess
e72ab75633
releasing package git-annex version 8.20200501 2020-05-01 17:41:36 -04:00
Joey Hess
04352ed9c5
check-ignore resource pool
Much like check-attr before.
2020-04-21 11:25:28 -04:00
Joey Hess
cee6b344b4
cat-file resource pool
Avoid running a large number of git cat-file child processes when run with
a large -J value.

This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
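
As a rough illustration of the resource pool idea, here is a minimal sketch using only base. It is an assumption-laden stand-in, not git-annex's implementation: `Pool`, `newPool` and `withPooled` are hypothetical names, the pooled resource is a plain counter value rather than a git cat-file process, and the no-overhead path for running without -J is not modelled.

```haskell
module Main where

import Control.Concurrent.MVar (MVar, modifyMVar, modifyMVar_, newMVar)
import Control.Exception (bracket)
import Control.Monad (replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- | Idle resources waiting to be reused (hypothetical type).
newtype Pool a = Pool (MVar [a])

newPool :: IO (Pool a)
newPool = Pool <$> newMVar []

-- | Run an action with a pooled resource. A new resource is created
-- only when none is idle; it is returned to the pool afterwards, even
-- if the action throws.
withPooled :: Pool a -> IO a -> (a -> IO b) -> IO b
withPooled (Pool v) create = bracket acquire release
  where
    acquire = do
        idle <- modifyMVar v $ \rs -> return $ case rs of
            (r : rest) -> (rest, Just r)
            [] -> ([], Nothing)
        maybe create return idle
    release r = modifyMVar_ v (return . (r :))

main :: IO ()
main = do
    created <- newIORef (0 :: Int)
    pool <- newPool
    -- Stand-in for starting an expensive child process.
    let create = atomicModifyIORef' created (\n -> (n + 1, n + 1))
    replicateM_ 5 (withPooled pool create (\_ -> return ()))
    readIORef created >>= print
```

With the five uses happening one after another, only one resource is ever created, so the program prints `1`. With concurrent users, at most one resource per simultaneous user would be created.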
2020-04-20 15:19:31 -04:00
Joey Hess
fe9cf1256e
move remoteList into dupState
This does mean that RemoteDaemon.Transport.Tor's call runs it, otherwise
no change, but this is groundwork for doing more such expensive actions
in dupState.
2020-04-17 14:36:45 -04:00
Joey Hess
bcc0ec5b99
fix runtime crash on incomplete pattern match in lambda
This was very surprising to me that it was not caught by -Wall, so I
enabled -Wincomplete-uni-patterns to catch such things. It found a
second one just lines above, but no others anywhere.
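
For readers unfamiliar with this class of bug, a minimal sketch follows. The functions are made up for illustration and are not the code that was fixed. `firstsPartial` uses a lambda whose pattern only matches non-empty lists, which is the kind of thing -Wincomplete-uni-patterns reports; `firstsTotal` is one possible total rewrite.

```haskell
{-# OPTIONS_GHC -Wall -Wincomplete-uni-patterns #-}
module Main where

-- | Partial: the lambda pattern only matches a non-empty list, so an
-- empty inner list crashes at runtime. The warning flag above reports
-- this lambda at compile time.
firstsPartial :: [[Int]] -> [Int]
firstsPartial = map (\(x : _) -> x)

-- | Total: empty inner lists are skipped instead of crashing.
firstsTotal :: [[Int]] -> [Int]
firstsTotal = concatMap (take 1)

main :: IO ()
main = do
    print (firstsPartial [[1, 2], [3]])
    print (firstsTotal [[1, 2], [], [3]])
```

Both calls print `[1,3]`. Passing `[[1, 2], [], [3]]` to `firstsPartial` instead would throw a pattern match failure once the empty list is reached.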
2020-04-13 16:03:21 -04:00
Joey Hess
2caf579718
cache annex index filename for 1.5% speedup to queries 2020-04-10 13:37:04 -04:00
Joey Hess
dbad6c5c39
releasing package git-annex version 8.20200330 2020-03-30 13:46:24 -04:00
Joey Hess
14a4a9f4cd
releasing package git-annex version 8.20200309 2020-03-09 17:08:16 -04:00
Joey Hess
bfa015ae4e
Merge branch 'v7' 2020-02-26 18:49:36 -04:00
Joey Hess
f2a50a9944
version 2020-02-26 18:42:34 -04:00
Joey Hess
029c883713
Merge branch 'master' into v8 2020-02-19 14:32:11 -04:00
Joey Hess
cd8a208b8c
releasing package git-annex version 7.20200219 2020-02-19 12:45:30 -04:00
Joey Hess
46bf2a259b
releasing package git-annex version 7.20200204 2020-02-04 14:33:03 -04:00
Joey Hess
818d140748
oh I lost my version bump here
gonna update the release tag due to dumb joke
2020-02-02 17:03:00 -04:00
Joey Hess
c20fe23079
remove deleted module 2020-01-15 13:08:39 -04:00
Joey Hess
71f78fe45d
wip separate RemoteConfig parsing
Remote now contains a ParsedRemoteConfig. The parsing happens when the
Remote is constructed, rather than when individual configs are used.

This is more efficient, and it lets initremote/enableremote
reject configs that have unknown fields or unparsable values.

It also allows for improved type safety, as shown in
Remote.Helper.Encryptable where things that used to match on string
configs now match on data types.

This is a work in progress, it does not build yet.

The main risk in this conversion is forgetting to add a field to
RemoteConfigParser. That will prevent using that field with
initremote/enableremote, and will prevent remotes that already are set
up from seeing that configuration. So will need to check carefully that
every field that getRemoteConfigValue is called on has been added to
RemoteConfigParser.

(One such case I need to remember is that credPairRemoteField needs to be
included in the RemoteConfigParser.)
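
The idea of parsing a remote's configuration once, up front, can be sketched like this. Everything here is a simplified assumption for illustration: `RawConfig`, `ParsedConfig`, `knownFields` and the `url`/`chunksize` fields are hypothetical and do not mirror git-annex's actual RemoteConfigParser. The sketch shows the two benefits named above: unknown fields and unparsable values are rejected at construction time, and later code works with typed fields instead of strings.

```haskell
module Main where

import Text.Read (readMaybe)

-- | Hypothetical raw key/value configuration, as the user typed it.
type RawConfig = [(String, String)]

-- | Hypothetical parsed form with typed fields.
data ParsedConfig = ParsedConfig
    { cfgUrl :: String
    , cfgChunkSize :: Maybe Int
    } deriving (Show, Eq)

-- | Every field the parser knows about. Forgetting to list a field
-- here makes it unusable, which is the risk the message points out.
knownFields :: [String]
knownFields = ["url", "chunksize"]

parseConfig :: RawConfig -> Either String ParsedConfig
parseConfig raw = do
    case filter (`notElem` knownFields) (map fst raw) of
        [] -> Right ()
        unknown -> Left ("unknown fields: " ++ unwords unknown)
    url <- maybe (Left "missing field: url") Right (lookup "url" raw)
    chunk <- case lookup "chunksize" raw of
        Nothing -> Right Nothing
        Just s -> maybe (Left ("unparsable chunksize: " ++ s)) (Right . Just) (readMaybe s)
    Right (ParsedConfig url chunk)

main :: IO ()
main = do
    print (parseConfig [("url", "http://example.com/"), ("chunksize", "1024")])
    -- A typoed key is rejected instead of being silently ignored.
    print (parseConfig [("url", "http://example.com/"), ("chunksise", "1024")])
```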
2020-01-13 12:39:21 -04:00
Joey Hess
71ecfbfccf
be stricter about rejecting invalid configurations for remotes
This is a first step toward that goal, using the ProposedAccepted type
in RemoteConfig lets initremote/enableremote reject bad parameters that
were passed in a remote's configuration, while avoiding enableremote
rejecting bad parameters that have already been stored in remote.log.

This does not eliminate every place where a remote config is parsed and a
default value is used if the parse fails. But, I did fix several
things that expected foo=yes/no and so confusingly accepted foo=true but
treated it like foo=no. There are still some fields that are parsed with
yesNo but not checked when initializing a remote, and there are other
fields that are parsed in other ways and not checked when initializing a
remote.

This also lays groundwork for rejecting unknown/typoed config keys.
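
A small sketch of the proposed/accepted distinction, under stated assumptions: the `ProposedAccepted` type below is a simplified stand-in modelled on the description in this message, and `parseYesNo` and `checkYesNo` are hypothetical helpers. A newly proposed value must parse, so `foo=true` is rejected, while a value that was already accepted and stored keeps working by falling back to a default.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- | Simplified stand-in: was the value just passed in by the user, or
-- was it already stored from an earlier run?
data ProposedAccepted t = Proposed t | Accepted t

-- | Hypothetical strict parser: only "yes" and "no" are valid.
parseYesNo :: String -> Maybe Bool
parseYesNo "yes" = Just True
parseYesNo "no" = Just False
parseYesNo _ = Nothing

-- | Proposed values must parse; accepted values that do not parse fall
-- back to the given default so existing setups are not broken.
checkYesNo :: Bool -> ProposedAccepted String -> Either String Bool
checkYesNo _ (Proposed s) =
    maybe (Left ("expected yes or no, got " ++ s)) Right (parseYesNo s)
checkYesNo def (Accepted s) = Right (fromMaybe def (parseYesNo s))

main :: IO ()
main = do
    print (checkYesNo False (Proposed "yes"))
    print (checkYesNo False (Proposed "true"))
    print (checkYesNo False (Accepted "true"))
```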
2020-01-10 14:52:48 -04:00