Commit graph

51 commits

Author SHA1 Message Date
Joey Hess
904be4e6be
add --branch option to git-annex find and mildly deprecate findref in favor of it
No deprecation warning at run time, just one on the man page.

One thing findref remains able to do that find cannot is to run in a bare
repo. Find was made to refuse to run in a bare repo because it seemed
confusing for it to not list any files ever in that situation. It would be
better for find --branch to work in a bare repo but not without --branch
but I don't currently have a way to do that.

Probably a better solution would be to make git-annex in a bare repo
default to --branch master or something like that instead of --all.

This commit was sponsored by Denis Dzyubenko on Patreon.
2018-12-09 14:10:37 -04:00
Joey Hess
029ae8d4db
support findred and --branch with file matching options
* findref: Support file matching options: --include, --exclude,
  --want-get, --want-drop, --largerthan, --smallerthan, --accessedwithin
* Commands supporting --branch now apply file matching options --include,
  --exclude, --want-get, --want-drop to filenames from the branch.
  Previously, combining --branch with those would fail to match anything.
* add, import, findref: Support --time-limit.

This commit was sponsored by Jake Vosloo on Patreon.
2018-12-09 13:38:35 -04:00
Joey Hess
4a788fbb3b
sync --content now supports --hide-missing adjusted branches
This relies on git ls-files --with-tree, which I'm using in a way that
its man page does not document. Hm. I emailed the git list to try to get
the docs improved, but at least the git test suite does test the same
kind of use case I'm using here.

Performance impact when not in an adjusted branch is limited to some
additional MVar accesses, and a single git call to determine the name of
the current branch. So very minimal.

When in an adjusted branch, the performance impact is
in Annex.WorkTree.lookupFile, which starts doing an equal amount of work
for files that didn't exist as it already did for files that were
unlocked.

This commit was sponsored by Jochen Bartl on Patreon.
2018-10-19 17:51:25 -04:00
Joey Hess
53526136e8
move commandAction out of CmdLine.Seek
This is groundwork for nested seek loops, eg seeking over all files and
then performing commandActions on a list of remotes, which can be done
concurrently.

This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
2018-10-01 14:12:06 -04:00
Joey Hess
50217f62a1
avoid duplicate add action for v6 unlocked modified file
The new second pass sees the file as type changed because the first
pass's changes have typically not reached git yet. So, have to
explicitly check for unmodified files in the second pass.

Note that, if the file has been touched but not really modified,
the first pass will handle it, and so the second pass does nothing.

This commit was sponsored by Jochen Bartl on Patreon.
2018-09-12 15:20:34 -04:00
Joey Hess
2743224658
change v6 git-annex add of staged unmodified unlocked file
v6: When a file is unlocked but has not been modified, and the unlocking is
only staged, git-annex add did not lock it. Now it will, for consistency
with how modified files are handled and with v5.

Note the removal of the sameInodeCache check. Otherwise it would see
that the unmodified file is unmodified and stop there. That check seems to have
been copied from the direct mode branch. But, direct mode had a specific
reason to check for unmodified content, that does not apply to v6.

The second pass means there is potential for a race, eg the unlocked
file could be modified in between the first and second passes.
No problem with that, since both passes do the same thing.

This commit was sponsored by Jake Vosloo on Patreon.
2018-09-12 14:00:05 -04:00
Joey Hess
bea0ad220a
avoid --all buffering list of all keys
In Annex.Branch.branch, the (++) was killing laziness.
Rewrote so it streams lazily.

filterM also kills laziness, so made loggedKeys use a Unchecked type,
and check if the key is dead in the seek loop.

Note that loggedKeysFor still buffers, so git-annex info <remote> and
git-annex unused --from remote still use more memory than necessary.

Also removed some unused functions from Annex.Journal.
2018-04-26 16:00:20 -04:00
Joey Hess
bfa26661d1
import: Avoid buffering all filenames to be imported in memory.
Test case is 24 directories each containing files named 1..10000.
The concat and filterM destroyed what laziness there is in
dirContentsRecursive, making it buffer all the filenames. Memory
use was around 300 mb (possibly growing slightly as it progressed).
After this fix, memory use drops to a constant 59 mb.

Note that dirContentsRecursive still buffers the entire content of a
directory (not subdirectories) so this is still not optimal.
2018-04-26 12:06:12 -04:00
Joey Hess
fc845e6530
more lambda-case conversion 2017-12-05 15:00:50 -04:00
Joey Hess
85ed38a574
Avoid repeated checking that files passed on the command line exist.
git annex add, git annex lock etc make multiple seek passes,
and each seek pass checked that files existed. That was unncessary
redundant work.

Fixed by adding a new WorkTreeItem type, make seek actions use it,
and check that the files exist when constructing it.

This commit was supported by the NSF-funded DataLad project.
2017-10-16 14:10:20 -04:00
Joey Hess
e662aceeac
improve type 2017-08-31 12:47:08 -04:00
Joey Hess
bb060f000f
error when metadata set is used with file that does not exist
When setting metadata of a file that did not exist, no error message was
displayed, unlike getting metadata and most other git-annex commands. Fixed
this oversight.

Note that, if the file exists but is not annexed, there's no error.
This is the same behavior as other git-annex commands.

This commit was supported by the NSF-funded DataLad project.
2017-06-01 11:40:47 -04:00
Joey Hess
34db79e1a5
Bugfix: Passing a command a filename that does not exist sometimes did not display an error, when a path to a directory was also passed.
It was relying on segmentPaths to work correctly, so when it didn't,
sometimes the file that did not exist got matched up with a non-null
list of results. Fixed by always checking if each parameter exists.

There are two reason segmentPaths might not work correctly.

For one, it assumes that when the original list of paths
has more than 100 paths, it's not worth paying the CPU cost to
preserve input orders.

And then, it fails when a directory such as "." or ".." or
/path/to/repo is in the input list, and the list of found paths
does not start with that same thing. It should probably not be using
dirContains, but something else.

But, it's not clear how to handle this fully. Consider
when [".", "subdir"] has been expanded by git ls-files to
["subdir/1", "subdir/2"]
-- Both of the inputs contained those results, so there's
no one right answer for segmentPaths. All these would be equally valid:
	[["subdir/1", "subdir/2"], []]
	[[], ["subdir/1", "subdir/2"]]
	[["subdir/1"], [""subdir/2"]]

So I've not tried to improve segmentPaths.
2017-03-02 13:06:20 -04:00
Joey Hess
0a4479b8ec
Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.

Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.

This commit was sponsored by Ethan Aubin.
2016-11-15 21:29:54 -04:00
Joey Hess
1a0e2c9901
get, move, copy, mirror: Added --failed switch which retries failed copies/moves
Note that get --from foo --failed will get things that a previous get --from bar
tried and failed to get, etc. I considered making --failed only retry
transfers from the same remote, but it was easier, and seems more useful,
to not have the same remote requirement.

Noisy due to some refactoring into Types/
2016-08-03 12:37:12 -04:00
Joey Hess
d13194b230
--branch, stage 2
Show branch:file that is being operated on.

I had to make ActionItem a type and not a type class because
withKeyOptions' passed two different types of values when using the type
class, and I could not get the type checker to accept that.
2016-07-20 15:23:43 -04:00
Joey Hess
4236dc83a8
propigate error 2016-07-20 13:44:50 -04:00
Joey Hess
bf8bf14e8e
--branch, stage 1
Added --branch option to copy, drop, fsck, get, metadata, mirror, move, and
whereis commands. This option makes git-annex operate on files that are
included in a specified branch (or other treeish).

The names of the files from the branch that are being operated on are not
displayed yet; only the keys. Displaying the filenames will need changes
to every affected command.

Also, note that --branch can be specified repeatedly. This is not really
documented, but seemed worth supporting, especially since we may later want
the ability to operate on all branches matching a refspec. However, when
operating on two branches that contain the same key, that key will be
operated on twice.
2016-07-20 12:05:26 -04:00
Joey Hess
737e45156e
remove 163 lines of code without changing anything except imports 2016-01-20 16:36:33 -04:00
Joey Hess
ec28151722
improve data type 2016-01-01 15:56:24 -04:00
Joey Hess
f7256842cc
wait for git lstree to exit 2016-01-01 15:51:29 -04:00
Joey Hess
a983a3a7a2
rename stuff for v5 unlocked files to indicate it's old 2015-12-15 14:08:07 -04:00
Joey Hess
e7183d83d3
fsck for v6 unlocked files
This only adds 1 stat to each file fscked for locked files, so
added overhead is minimal.

For unlocked files it has to access the database to see if a file
is modified.
2015-12-11 16:07:54 -04:00
Joey Hess
751120c171
avoid pre-commit hook messing up new-style unlocked files in v6 repo 2015-12-09 15:18:54 -04:00
Joey Hess
664cc987e8
support pointer files
Backend.lookupFile is changed to always fall back to catKey when
operating on a file that's not a symlink.

catKey is changed to understand pointer files, as well as annex symlinks.

Before, catKey needed a file mode witness, to be sure it was looking at a
symlink. That was complicated stuff. Now, it doesn't actually care if a
file in git is a symlink or not; in either case asking git for the content
of the file will get the pointer to the key.

This does mean that git-annex will treat a link
foo -> WORM--bar as a git-annex file, and also treats
a regular file containing annex/objects/WORM--bar as a git-annex file.

Calling catKey could make git-annex commands need to do more work than
before. This would especially be the case if a repo contained many regular
files, and only a few annexed files, as now git-annex will need to ask
git about the contents of the regular files.
2015-12-07 15:35:36 -04:00
Joey Hess
0f66f766b0 metadata: Fix reversion introduced in 5.20150727 that caused display of metadata to not work. 2015-08-11 13:19:01 -04:00
Joey Hess
160d4b9fe0 convert Unused, and remove some dead code for old style option parsing 2015-07-10 16:05:56 -04:00
Joey Hess
032e6485fa use Alternative for parsing KeyOptions 2015-07-09 12:44:03 -04:00
Joey Hess
60806dd191 wip 2015-07-08 17:59:06 -04:00
Joey Hess
a2ba701056 started converting to use optparse-applicative
This is a work in progress. It compiles and is able to do basic command
dispatch, including git autocorrection, while using optparse-applicative
for the core commandline parsing.

* Many commands are temporarily disabled before conversion.
* Options are not wired in yet.
* cmdnorepo actions don't work yet.

Also, removed the [Command] list, which was only used in one place.
2015-07-08 13:36:25 -04:00
Joey Hess
29c03145e6 sync: Add support for --all and --unused. 2015-06-16 16:50:03 -04:00
Joey Hess
d28e8fbfd5 get --incomplete: New option to resume any interrupted downloads. 2015-06-02 14:20:38 -04:00
Joey Hess
e27b97d364 Merge branch 'master' into concurrentprogress
Conflicts:
	Command/Fsck.hs
	Messages.hs
	Remote/Directory.hs
	Remote/Git.hs
	Remote/Helper/Special.hs
	Types/Remote.hs
	debian/changelog
	git-annex.cabal
2015-05-12 13:23:22 -04:00
Joey Hess
efb37e7c78 Improve behavior when a git-annex command is told to operate on a file that doesn't exist. It will now continue to other files specified after that on the command line, and only error out at the end. 2015-04-30 15:28:17 -04:00
Joey Hess
8077ccbd54 get, move, copy, mirror: Concurrent downloads and uploads are now supported!
This works, and seems fairly robust. Clean get of 20 files at -J3. At -J10,
there are some messages about ssh multiplexing, probably due to a race
spinning up the ssh connection cacher. But, it manages to get all the files
ok regardless.

The progress bars are a scrambled mess though, due to bugs in
ascii-progress, which I've already filed. Particularly this one:
https://github.com/yamadapc/haskell-ascii-progress/issues/8
2015-04-10 17:08:07 -04:00
Joey Hess
8aa6b5f2a6 Fix truncation of parameters that could occur when using xargs git-annex.
This will only ever result in a few more git-ls-files being run than were run
before. (Only 1 more is really needed, but around 10 more are currently run
for a max length command line.)

So, no need to worry about the extra zombie, or lost laziness due to concat.
2015-04-02 01:20:07 -04:00
Joey Hess
cd6b62f35e --auto is no longer a global option; only get, drop, and copy accept it.
Not a behavior change unless you were passing it to a command that ignored it.
2015-03-25 17:06:14 -04:00
Joey Hess
f153079152 metadata: When setting metadata, do not recurse into directories by default, since that can be surprising behavior and difficult to recover from. The old behavior is available by using --force. 2015-02-10 16:06:53 -04:00
Joey Hess
b94eb9b22c relFile does not have to be relative; rename to currFile 2015-02-06 16:03:02 -04:00
Joey Hess
dfab5e6ff4 import: Support file matching options such as --exclude, --include, --smallerthan, --largerthan 2015-02-06 15:58:06 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
3bab5dfb1d revert parentDir change
Reverts 965e106f24

Unfortunately, this caused breakage on Windows, and possibly elsewhere,
because parentDir and takeDirectory do not behave the same when there is a
trailing directory separator.
2015-01-09 13:11:56 -04:00
Joey Hess
965e106f24 made parentDir return a Maybe FilePath; removed most uses of it
parentDir is less safe than takeDirectory, especially when working
with relative FilePaths. It's really only useful in loops that
want to terminate at /

This commit was sponsored by Audric SCHILTKNECHT.
2015-01-06 18:55:56 -04:00
Joey Hess
adc5ca70a8 pre-commit: Block partial commit of unlocked annexed file, since that left a typechange staged in index
I had hoped that the git devs could change git's handling of partial
commits to not use a false index file, but seems not.

So, this relies on some git internals to detect that case. The test suite
has a test case added to catch it if changes to git break it.

This commit was sponsored by Paul Tagliamonte.
2014-11-10 15:36:24 -04:00
Joey Hess
7b50b3c057 fix some mixed space+tab indentation
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.

Done as a background task while listening to episode 2 of the Type Theory
podcast.
2014-10-09 15:09:11 -04:00
Joey Hess
9ed63d1545 Promote file not found warning message to an error. 2014-09-11 13:36:28 -04:00
Joey Hess
ecc3dc8433 findref: New command, like find but shows files in a specified git ref. 2014-04-17 18:41:24 -04:00
Joey Hess
2f538dd65c add --include-dotfiles: New option, perhaps useful for backups. 2014-03-26 14:52:07 -04:00
Joey Hess
1669e80e85 Windows: Avoid using unix-compat's rename, which refuses to rename directories.
Opened a bug about this: https://github.com/jystic/unix-compat/issues/10
2014-01-29 15:19:03 -04:00
Joey Hess
cce69eee4d avoid using function named that conflicts with name used in newer version of process library 2014-01-29 13:44:53 -04:00