# NAME
git-annex fsck - check for problems
# SYNOPSIS
git annex fsck `[path ...]`
# DESCRIPTION
With no parameters, this command checks the whole annex for consistency,
and warns about or fixes any problems found. This is a good complement to
`git fsck`.
With parameters, only the specified files are checked.
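Since this command complements `git fsck`, the two can be run together. The following is a minimal sketch, not part of git-annex: the function name `check_all` is hypothetical, and it assumes that a failing `git fsck` should stop the run before the annex is checked.

```shell
# Hypothetical helper (not provided by git-annex): check the git
# repository itself first, then the annex.
# With no arguments the whole annex is checked; with arguments, only
# the specified files are checked.
check_all() {
    git fsck && git annex fsck "$@"
}
```

For example, `check_all` checks everything, while `check_all photos/` limits the annex check to a (hypothetical) `photos/` directory.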
# OPTIONS
* `--from=remote`
Check a remote, rather than the local repository.
Note that by default, files will be copied from the remote to check
their contents. To avoid this expensive transfer, and only
verify that the remote still has the files that are expected to be on it,
add the `--fast` option.
* `--fast`
Avoids expensive checksum calculations (and expensive transfers when
fscking a remote).
* `--incremental`
Start a new incremental fsck pass. An incremental fsck can be interrupted
at any time, for example with ctrl-c.
* `--more`
Continue the last incremental fsck pass, where it left off.
* `--incremental-schedule=time`
Start a new incremental fsck pass only once the specified time period
has passed since the last incremental fsck was started.
The time is in the form "10d" or "300h".
Maybe you'd like to run a fsck for 5 hours at night, picking up each
night where it left off. You'd like this to continue until all files
have been fscked. And once it's done, you'd like a new fsck pass to start,
but no more often than once a month. Then put this in a nightly cron job:
`git annex fsck --incremental-schedule 30d --time-limit 5h`
* `--distributed`
Normally, fsck only fixes the git-annex location logs when an inconsistency
is detected. In distributed mode, each file that is checked will result
in a location log update noting the time that it was present.
This is useful in situations where repositories cannot be trusted to
continue to exist. By running a periodic distributed fsck, those
repositories can verify that they still exist and that the information
about their contents is still accurate.
This is not the default mode, because each distributed fsck increases
the size of the git-annex branch. While it takes care to log identical
location tracking lines for all keys, which will delta-compress well,
there is still overhead in committing the changes. If this causes
the git-annex branch to grow too big, it can be pruned using
[[git-annex-forget]](1).
* `--expire="[repository:]time ..."`
This option makes the fsck check for location logs of the specified
repository that have not been updated by a distributed fsck within the
specified time period. Such stale location logs are then thrown out, so
git-annex will no longer think that a repository contains data, if it is
not participating in distributed fscking.
The repository can be specified using the name of a remote,
or the description or uuid of the repository. If a time is specified
without a repository, it is used as the default value for all
repositories. Note that location logs for the current repository are
never expired, since they can be verified directly.
The time is in the form "60d" or "1y". A time of "never" will disable
expiration.
Note that a remote can always run `fsck` later on to re-update the
location log if it was expired in error.
* `--numcopies=N`
Override the normally configured number of copies.
To verify data integrity only, while disregarding the required number of
copies, use `--numcopies=1`.
* `--all`
Normally only the files in the currently checked out branch
are fscked. This option causes all versions of all files to be fscked.
This is the default behavior when running git-annex in a bare repository.
* `--unused`
Operate on files found by the last run of `git annex unused`.
* `--key=keyname`
Use this option to fsck a specified key.
* file matching options
The [[git-annex-matching-options]](1)
can be used to specify files to fsck.
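# EXAMPLES

The options described above can be combined into small wrappers. The following is a sketch only: the function names, the remote name `backupserver`, and the time values are hypothetical, and only options documented on this page are used.

```shell
# Sketch only: function names are hypothetical, not part of git-annex.

# Cheaply verify that a remote still has the files it is expected to
# have, without transferring their contents.
fsck_remote_fast() {
    git annex fsck --from="$1" --fast
}

# Nightly job: work on the incremental pass for at most 5 hours, and
# start a new pass no more often than every 30 days.
fsck_nightly() {
    git annex fsck --incremental-schedule 30d --time-limit 5h
}

# Distributed fsck: update the location log of each checked file.
fsck_distributed() {
    git annex fsck --distributed
}

# Expire location logs not updated by a distributed fsck within the
# given time, for example "60d" or "somerepo:1y".
fsck_expire() {
    git annex fsck --expire="$1"
}

# Verify data integrity only, disregarding the required number of copies.
fsck_integrity_only() {
    git annex fsck --numcopies=1 "$@"
}
```

For example, `fsck_remote_fast backupserver` checks the hypothetical remote `backupserver`, and `fsck_nightly` is the command suited to a nightly cron job.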
# SEE ALSO
[[git-annex]](1)
# AUTHOR
Joey Hess <id@joeyh.name>
Warning: Automatically converted into a man page by mdwn2man. Edit with care.