Commit graph

204 commits

Author SHA1 Message Date
Joey Hess
a18eae9a0f
nice git ack space optimisation when setting the same metadata value for multiple files 2014-02-13 01:57:43 -04:00
Joey Hess
361aee0470
avoid churning in git to no benefit when optimising metadata log
I think this is now optimal.
2014-02-12 23:24:04 -04:00
Joey Hess
8076530284 improve simplifier 2014-02-12 22:50:41 -04:00
Joey Hess
a05ac13e92 fix metadata log simplifier and additional quickcheck tests 2014-02-12 22:27:55 -04:00
Joey Hess
9f7e76130e add metadata command to get/set metadata
Adds metadata log, and command.

Note that unsetting field values seems to currently be broken.
And in general this has had all of 2 minutes worth of testing.

This commit was sponsored by Julien Lefrique.
2014-02-12 21:30:33 -04:00
Joey Hess
c390e896d1 fix windows build (and make --stop work on windows, incidentally)
The Utility.PID will clean up other code soon.
2014-02-11 15:25:59 -04:00
Joey Hess
4f7e72b51a fix parsing of unused log; keys can contain spaces 2014-02-08 15:27:11 -04:00
Joey Hess
a44e01c29c --in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}" 2014-02-06 12:43:56 -04:00
Joey Hess
1572c460e8 avoid using openFile when withFile can be used
Potentially fixes some FD leak if an action on an opened file handle fails
for some reason. There have been some hard to reproduce reports of
git-annex leaking FDs, and this may solve them.
2014-02-03 10:19:06 -04:00
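
The difference between the two styles in the commit above can be sketched as follows. This is a minimal illustration, not code from git-annex; the function names firstLineManual and firstLineSafe are hypothetical. Only openFile, withFile, hGetLine and hClose are real System.IO functions.

```haskell
import System.IO

-- Manual style: if hGetLine throws (for example on an empty file),
-- hClose is never reached and the handle stays open until it is
-- garbage collected.
firstLineManual :: FilePath -> IO String
firstLineManual f = do
    h <- openFile f ReadMode
    l <- hGetLine h
    hClose h
    return l

-- withFile closes the handle when the action finishes, whether it
-- returns normally or throws an exception.
firstLineSafe :: FilePath -> IO String
firstLineSafe f = withFile f ReadMode hGetLine
```

Both functions return the same result on success; they differ only in what happens to the handle when the action fails.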
Joey Hess
32f1f68dc9 typo 2014-01-28 17:17:21 -04:00
Joey Hess
f0dfac4d96 fix build with old ghc that used old-time type 2014-01-28 17:14:43 -04:00
Joey Hess
eefda291c6 fix warning 2014-01-28 14:43:20 -04:00
Joey Hess
891c85cd88 use locking on Windows
This is all the easy cases, where there was already a separate lock file.
2014-01-28 14:42:03 -04:00
Joey Hess
3518c586cf fix transfers of key with no associated file
Several places assumed this would not happen, and when the AssociatedFile
was Nothing, did nothing.

As part of this, preferred content checks pass the Key around.

Note that checkMatcher is sometimes now called with Just Key and Just File.
It currently constructs a FileMatcher, ignoring the Key. However, if it
constructed a FileKeyMatcher, which contained both, then it might be
possible to speed up parts of Limit, which currently call the somewhat
expensive lookupFileKey to get the Key.

I have not made this optimisation yet, because I am not sure if the key is
always the same. Will need some significant checking to satisfy myself
that's the case.
2014-01-23 16:44:02 -04:00
Joey Hess
e0bd088f08 add webapp UI to manage unused files 2014-01-23 15:09:43 -04:00
Joey Hess
3da0064657 assistant unused file handling
Make sanity checker run git annex unused daily, and queue up transfers
of unused files to any remotes that will have them. The transfer retrying
code works for us here, so eg when a backup disk remote is plugged in,
any transfers to it are done. Once the unused files reach a remote,
they'll be removed locally as unwanted.

If the setup does not cause unused files to go to a remote, they'll pile
up, and the sanity checker detects this using some heuristics that are
pretty good -- 1000 unused files, or 10% of disk used by unused files,
or more disk wasted by unused files than is left free. Once it detects
this, it pops up an alert in the webapp, with a button to take action.

TODO: Webapp UI to configure this, and also the ability to launch an
immediate cleanup of all unused files.

This commit was sponsored by Simon Michael.
2014-01-22 22:53:18 -04:00
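
The three heuristics named in the commit above could be combined roughly like this. It is a sketch under stated assumptions: the record type, its field names and the function name are hypothetical, and "10% of disk" is read here as 10% of the total disk size, which the commit message does not spell out.

```haskell
-- Hypothetical summary of what the sanity checker would measure.
data UnusedStats = UnusedStats
    { unusedCount    :: Integer  -- number of unused files
    , unusedBytes    :: Integer  -- disk space taken by unused files
    , diskTotalBytes :: Integer  -- total size of the disk (assumed meaning)
    , diskFreeBytes  :: Integer  -- free space left on the disk
    }

-- True when any one of the three heuristics from the commit message fires.
tooManyUnused :: UnusedStats -> Bool
tooManyUnused s = or
    [ unusedCount s >= 1000
    -- unused files take up at least 10% of the disk
    , diskTotalBytes s > 0 && unusedBytes s * 10 >= diskTotalBytes s
    -- more space wasted by unused files than is left free
    , unusedBytes s > diskFreeBytes s
    ]
```

Integer arithmetic is used for the 10% check so that no rounding is involved.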
Joey Hess
4b55afe9e9 add "unused" preferred content expression
With a really nice optimisation that keeps it from having any overhead
in normal operation!

This commit was sponsored by Ulises Vitulli.
2014-01-22 16:35:32 -04:00
Joey Hess
ae3cd632bd add timestamps to unused log files
This will be used in expiring old unused objects. The timestamp records when
the object was first noticed to be unused.

Backwards compatibility: It supports reading old format unused log files.
The old version of git-annex will ignore lines in log files written by the
new version, so the worst interop problem would be git annex dropunused not
knowing some numbers that git-annex unused reported.
2014-01-22 15:33:02 -04:00
Joey Hess
f7cdc40f7b reorg 2014-01-21 18:08:56 -04:00
Joey Hess
0ef282a116 numcopies cleanup, part 2
This includes several bug fixes.
2014-01-21 17:25:39 -04:00
Joey Hess
b40df4f0d0 reorganize numcopies code (no behavior changes)
Move stuff into Logs.NumCopies. Add a NumCopies newtype.

Better names for various serialization classes that are specific to one
thing or another.
2014-01-21 16:08:59 -04:00
Joey Hess
d66535f065 global numcopies setting
* numcopies: New command, sets global numcopies value that is seen by all
  clones of a repository.
* The annex.numcopies git config setting is deprecated. Once the numcopies
  command is used to set the global number of copies, any annex.numcopies
  git configs will be ignored.
* assistant: Make the prefs page set the global numcopies.

This global numcopies setting is needed to let preferred content
expressions operate on numcopies.

It's also convenient, because typically if you want git-annex to preserve N
copies of files in a repo, you want it to do that no matter which repo it's
running in. Making it global avoids needing to warn the user about gotchas
involving inconsistent annex.numcopies settings.
(See changes to doc/numcopies.mdwn.)

Added a new variety of git-annex branch log file, that holds only 1 value.
Will probably be useful for other stuff later.

This commit was sponsored by Nicolas Pouillard.
2014-01-20 16:47:56 -04:00
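
The precedence described in the commit above (a global value, once set, overrides any annex.numcopies git config) can be sketched as below. The NumCopies newtype is named in the neighbouring commit; the function name, its arguments and the fallback value of 1 are assumptions made for illustration, not taken from the git-annex source.

```haskell
import Control.Applicative ((<|>))
import Data.Maybe (fromMaybe)

-- A newtype keeps numcopies values from being mixed up with other Ints.
newtype NumCopies = NumCopies Int
    deriving (Eq, Ord, Show)

-- Hypothetical resolution: the global setting wins when present,
-- otherwise the per-repository git config is used, otherwise an
-- assumed default of 1.
effectiveNumCopies :: Maybe NumCopies -> Maybe NumCopies -> NumCopies
effectiveNumCopies global gitConfig =
    fromMaybe (NumCopies 1) (global <|> gitConfig)
```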
Joey Hess
93161d0dea copyright year 2014-01-08 16:29:15 -04:00
Joey Hess
3e68c1c2fd add remote state logs
This allows a remote to store a piece of arbitrary state associated with a
key. This is needed to support Tahoe, where the file-cap is calculated from
the data stored in it, and used to retrieve a key later. Glacier also would
be much improved by using this.

GETSTATE and SETSTATE are added to the external special remote protocol.

Note that the state is left as-is even when a key is removed from a remote.
It's up to the remote to decide when it wants to clear the state.

The remote state log, $KEY.log.rmt, is a UUID-based log. However,
rather than using the old UUID-based log format, I created a new variant
of that format. The new variant is more space efficient (since it lacks the
"timestamp=" hack), is easier to parse (and the parser doesn't mess with
whitespace in the value), and avoids compatibility cruft in the old one.

This seemed worth cleaning up for these new files, since there could be a
lot of them, while before UUID-based logs were only used for a few log
files at the top of the git-annex branch. The transition code has also
been updated to handle these new UUID-based logs.

This commit was sponsored by Daniel Hofer.
2014-01-03 16:35:57 -04:00
Joey Hess
8e3032df2d added GETWANTED, SETWANTED for Tobias's flickr remote
This was unexpectedly difficult because of a dependency cycle. To parse a
preferred content expression involves several things that need to operate
on the list of remotes. Which needs Remote.External. The only way to avoid
this cycle (I tried breaking it at several points) was to skip parsing the
expression in SETWANTED.

That's sorta ok, because git-annex already has to deal with unparsable
preferred content expressions being stored, in order to handle eg,
upgrades. But I'm still not very happy that I cannot check it.

I feel this is a strong indication that I need to beware of further
bloating the special remote protocol interface.
2014-01-01 20:12:20 -04:00
Joey Hess
f0a6de1ca2 add PreferredContentExpression type 2014-01-01 19:58:02 -04:00
Richard Hartmann
974fe009bf Another round of s/amoung/among/ 2013-12-19 12:30:53 -04:00
Joey Hess
f931272681 syntax 2013-12-11 00:18:58 -04:00
Joey Hess
011b8bc7ec pull in Win32-extras, to be able to get current process id in Windows
Fixed up a number of things that had worked around there not being a way to
get that.

Most notably, transfer info files on windows now include the process id,
since no locking is currently done. This means the file format varies
between windows and unix.
2013-12-11 00:15:10 -04:00
Joey Hess
ecd42aef8e different PID types for Unix and Windows
Windows has a larger (unsigned) PID space, so cannot use the unix CInt
there.

Note that TransferInfo does not yet ever get the TransferPid populated,
as there is missing locking.
2013-12-10 23:48:42 -04:00
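
One way to give each platform its own PID type, as the commit above describes, is a type alias selected with CPP. This is a sketch only: the alias name PID, the helper pidToInteger and the choice of Word32 for Windows are assumptions (the commit says only that the Windows PID space is larger and unsigned).

```haskell
{-# LANGUAGE CPP #-}

#ifdef mingw32_HOST_OS
import Data.Word (Word32)

-- Assumed Windows representation: an unsigned 32-bit value.
type PID = Word32
#else
import System.Posix.Types (CPid)

-- On Unix, CPid wraps the platform's C process id type.
type PID = CPid
#endif

-- Hypothetical helper that works on either platform, since both
-- representations are Integral.
pidToInteger :: PID -> Integer
pidToInteger = fromIntegral
```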
Joey Hess
6edac746f0 merge improved fsck types from git-repair and some associated changes 2013-11-30 14:29:11 -04:00
Joey Hess
53ab737723 clean up cruft left in log by bug 2013-11-09 14:30:26 -04:00
Joey Hess
8e1b8af6e7 fix crash on empty description
Caused by bug fixed in 46cf00ffd8
2013-11-09 13:50:44 -04:00
Joey Hess
049e80e865 refactor 2013-10-28 14:05:55 -04:00
Joey Hess
d345e5b52f add git fsck to cronner, and UI for repository repair (not yet wired up) 2013-10-22 16:02:52 -04:00
Joey Hess
92d5452a19 write via temp file 2013-10-14 16:15:38 -04:00
Joey Hess
296e21b381 add schedule command
Mostly because it gives me an excuse and a hook to document the schedule
expression format.
2013-10-13 15:40:38 -04:00
Joey Hess
88ec6eff15 add/remove/edit schedule UI working
Once I built the basic widget, it turned out to be rather easy to replicate
it once per scheduled activity and wire it all up to a fully working UI.

This does abuse yesod's form handling a bit, but I think it's ok.
And it would be nice to have it all ajax-y, so that saving one modified
form won't lose any modifications to other forms. But for now, a nice
simple 115-line implementation is a win.

This late night hack session commit was sponsored by Andrea Rota.
2013-10-11 03:04:11 -04:00
Joey Hess
af5e1d0494 half way complete cronner thread to run scheduled activities 2013-10-08 11:48:28 -04:00
Joey Hess
b9375acb18 add schedule to vicfg 2013-10-07 17:11:13 -04:00
Joey Hess
29ca49dad4 add a log file for scheduled activities 2013-10-07 16:06:34 -04:00
Joey Hess
57d49a6d04 remove *>=> and >=*> ; use <$$> instead
I forgot I had <$$> hidden away in Utility.Applicative.
It allows doing the same kind of currying as does >=*>
and I found using it made the code more readable for me.

(*>=> was not used)
2013-09-27 19:58:48 -04:00
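
For readers unfamiliar with the operator, here is a sketch of how something like <$$> can be used. The definition is an assumption modelled on the description in the commit above (mapping a pure function over the result of a function that returns a functor value). The lookupName and shout functions are hypothetical examples.

```haskell
import Data.Char (toUpper)

-- Assumed definition: apply f to the result inside the functor that
-- v returns, leaving v's argument still to be supplied.
infixr 4 <$$>
(<$$>) :: Functor f => (a -> b) -> (c -> f a) -> c -> f b
f <$$> v = fmap f . v

-- Hypothetical lookup that may fail.
lookupName :: Int -> Maybe String
lookupName 1 = Just "annex"
lookupName _ = Nothing

-- Written point-free with <$$>; equivalent to
-- \n -> fmap (map toUpper) (lookupName n)
shout :: Int -> Maybe String
shout = map toUpper <$$> lookupName
```

With these definitions, shout 1 evaluates to Just "ANNEX" and shout 2 to Nothing.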
Joey Hess
c1990702e9 hlint 2013-09-25 23:19:01 -04:00
Joey Hess
4dc4a9a385 assistant: Clear the list of failed transfers when doing a full transfer scan. This prevents repeated retries to download files that are not available, or are not referenced by the current git tree.
This is motivated by a user report that the assistant was repeatedly
retrying transfers of files that had been deleted (in direct mode, so
removing the only copy).

Note that the glacier code retries failed transfers after a while to retry
downloads that have aged long enough to be available. This is ok; if we're
doing a full transfer scan we'll retry on every file that is still in the
git tree.

Also note that this makes the assistant less likely to get every file
referenced by old revs of the git tree. Not something the assistant tries
to ensure anyway, so I feel this is acceptable.
2013-09-25 11:46:17 -04:00
Joey Hess
eb42bde19a sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory. 2013-09-19 14:48:42 -04:00
Joey Hess
51ce7fcaf1 fix warning 2013-09-04 21:37:13 -04:00
Joey Hess
0831e18372 forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch.
Wrote nice pure transition calculator, and ugly code to stage its results
into the git-annex branch. Also had to split up several Log modules
that Annex.Branch needed to use, but that themselves used Annex.Branch.

The transition calculator is limited to looking at and changing one file at
a time. While this made the implementation relatively easy, it precludes
transitions that do stuff like deleting old url log files for keys that are
being removed because they are no longer present anywhere.
2013-08-31 17:51:13 -04:00
Joey Hess
62beaa1a86 refactor git-annex branch log filename code into central location
Having one module that knows about all the filenames used on the branch
allows working back from an arbitrary filename to enough information about
it to implement dropping dead remotes and doing other log file compacting
as part of a forget transition.
2013-08-29 19:13:00 -04:00
Joey Hess
4a915cd3cd add forget command
Works, more or less. --dead is not implemented, and so far a new branch
is made, but keys no longer present anywhere are not scrubbed.

git annex sync fails to push the synced/git-annex branch after a forget,
because it's not a fast-forward of the existing synced branch. Could be
fixed by making git-annex sync use assistant-style sync branches.
2013-08-28 16:41:13 -04:00
Joey Hess
fcd5c167ef untested transition detection on merging, and transition running code 2013-08-28 15:57:42 -04:00