promote forum post to feature request; add design
This commit is contained in:
parent
e14e74b467
commit
05ead99cc2
2 changed files with 89 additions and 22 deletions
89
doc/todo/Limit_file_revision_history.mdwn
Normal file
89
doc/todo/Limit_file_revision_history.mdwn
Normal file
|
@ -0,0 +1,89 @@
|
|||
Hi, I am assuming to use git-annex-assistant for two usecases, but I would like to ask about the options or planed roadmap for dropped/removed files from the repository.
|
||||
|
||||
Usecases:
|
||||
|
||||
1. sync working directory between laptop, home computer, work komputer
|
||||
2. archive functionality for my photograps
|
||||
|
||||
Both usecases have one common factor. Some files might become obsolate and
|
||||
in long time frame nobody is interested to keep their revisions. Let's
|
||||
assume photographs. Usuall workflow I take is to import all photograps to
|
||||
filesystem, then assess (select) the good ones I want to keep and then
|
||||
process them what ever way.
|
||||
|
||||
Problem with git-annex(-assistant) I have is that it start to revision all
|
||||
of the files at the time they are added to directory. This is welcome at
|
||||
first but might be an issue if you are used to put 80% of the size of your
|
||||
imported files to trash.
|
||||
|
||||
I am aware of what git-annex is not. I have been reading documentation for
|
||||
"git-annex drop" and "unused" options including forums. I do understand
|
||||
that I am actually able to delete all revisions of the file if I will drop
|
||||
it, remove it and if I will run git annex unused 1..###. (on all synced
|
||||
repositories).
|
||||
|
||||
I actually miss the option to have above process automated/replicated to the other synced repositories.
|
||||
|
||||
I would formulate the 'use case' requirements for git-annex as:
|
||||
|
||||
* command to drop an file including revisions from all annex repositories?
|
||||
(for example like moving a file to /trash folder) that will schedulle
|
||||
it's deletition)
|
||||
* option to keep like max. 10 last revisions of the file?
|
||||
* option to keep only previous revisions if younger than 6 months from now?
|
||||
|
||||
Finally, how to specify a feature request for git-annex?
|
||||
|
||||
> By moving it here ;-) --[[Joey]]
|
||||
|
||||
> So, let's spec out a design.
|
||||
>
|
||||
> * Add preferred content terminal to configure whether a repository wants
|
||||
> to hang on to unused content.
|
||||
> Something like "unused=true" I suppose, because not having a parameter
|
||||
> would complicate preferred content parsing, and I cannot think
|
||||
> of a useful parameter.
|
||||
> * In order to quickly match that terminal, the Annex monad will need
|
||||
> to keep a Set of unused Keys. This should only be loaded on demand.
|
||||
> NB: There is some potential for a great many unused Keys to cause
|
||||
> memory usage to balloon.
|
||||
> * Client repositories will end their preferred content with
|
||||
> `and unused=false`. Transfer repositories too, because typically
|
||||
> only client repos connect to them, and so otherwise unused files
|
||||
> would build up there. Backup repos would want unused files. I
|
||||
> think that archive repos would too.
|
||||
> * Make the assistant check for unused files periodically. Exactly
|
||||
> how often may need to be tuned, but once per day seems reasonable
|
||||
> for most repos. Note that the assistant could also notice on the
|
||||
> fly when files are removed and mark their keys as unused if that was
|
||||
> the last associated file. (Only currently possible in direct mode.)
|
||||
> * It makes sense for the
|
||||
> assistant to queue transfers of unused files to any remotes that
|
||||
> do want them (eg, backup remotes). If the files can successfully be
|
||||
> sent to a remote, that will lead to them being dropped locally as
|
||||
> they're not wanted.
|
||||
> * Add a git config setting like annex.expireunused=7d. This causes
|
||||
> *deletion* of unused files after the specified time period if they are
|
||||
> not able to be moved to a repo that wants them.
|
||||
> (The default should be annex.expireunused=false.)
|
||||
> * How to detect how long a file has been unused? We can't look at the
|
||||
> time stamp of the object; we could use the mtime of the .map file,
|
||||
> that that's direct mode only and may be replaced with a database
|
||||
> later. Seems best to just keep a unused log file with timestamps.
|
||||
> * After the assistant scans for unused files, if annex.expireunused
|
||||
> is not set, and there is some significant quantity of unused files
|
||||
> (eg, more than 1000, or more than 1 gb, or more than the amount of
|
||||
> remaining free disk space),
|
||||
> it can pop up a webapp alert asking to configure it.
|
||||
>
|
||||
> This does not cover every use case that was requested.
|
||||
> But I don't see a cheap way to ensure it keeps eg the past 10 versions of
|
||||
> a file. I guess that if you care about that, you leave
|
||||
> annex.expireunused=false, and set up a backup repository where the unused
|
||||
> files will be moved to.
|
||||
>
|
||||
> Note that since the assistant uses direct mode by default, old versions
|
||||
> of modififed files are not guaranteed to be retained. But they very well
|
||||
> might be. For example, if a file is replicated to 2 clients, and one
|
||||
> client directly edits it, or deletes it, it loses the old version,
|
||||
> but the other client will still be storing that old version.
|
Loading…
Add table
Add a link
Reference in a new issue