smudge design

2015-11-23 16:53:05 -04:00
commit 33fb0de1a3
parent 59c2001d2f
2 changed files with 224 additions and 33 deletions
--- a/doc/devblog/day_339_smudging_out_direct_mode.mdwn
+++ b/doc/devblog/day_339_smudging_out_direct_mode.mdwn
@@ -0,0 +1,56 @@
+I'm considering ways to get rid of direct mode, replacing it with something
+better implemented using [[todo/smudge]] filters.
+
+## git-lfs
+
+I started by trying out git-lfs, to see what I can learn from it. My
+feeling is that git-lfs brings an admirable simplicity to using git with
+large files. For example, it uses a pre-push hook to automatically
+upload file contents before pushing a branch.
+
+But its simplicity comes at the cost of being centralized. You can't make a
+git-lfs repository locally and clone it onto another drive and have the local
+repositories interoperate to pass file contents around. Everything has to
+go back through a centralized server. I'm willing to pay complexity costs
+for decentralization.
+
+Its simplicity also means that the user doesn't have much control over what
+files are present in their checkout of a repository. git-lfs downloads
+all the files in the work tree. It doesn't have facilities for dropping
+files to free up space, or for configuring a repository to only want to get
+a subset of files in the first place. Some of this could be added to it,
+I suppose.
+
+## replacing direct mode
+
+Anyway, as smudge/clean filters stand now, they can't be used to set up
+git-annex symlinks; their interface doesn't allow it. But, I was able to
+think up a design that uses smudge/clean filters to cover the same use
+cases that direct mode covers now.
+
+Thanks to the clean filter, adding a file with `git add` would check in a
+small file that points to the git-annex object. When a file has been added
+this way, the file in the work tree remains the only copy of the object
+until you use git-annex to copy it to another repository. So if you modify
+the work tree file, you can lose the old version of the object.
+
+This is analogous to how direct mode works now, and it avoids needing to
+store 2 copies of every file in the local repository.
+
+In the same repository, you could also use `git annex add` to check
+in a git-annex symlink, which would protect the object from modification,
+in the good old indirect mode way. `git annex lock` and `git annex unlock` 
+could switch a file between those two modes.
+
+So this allows mixing directly writable annexed files and locked down
+annexed files in the same repository. All regular git commands and all
+git-annex commands can be used on both sorts of files.
+
+That's much more flexible than the current direct mode, and I think it can
+be implemented in a simpler, more scalable, and more robust way too.
+I can lose the direct mode merge code, and remove hundreds of lines of
+other special cases for direct mode.
+
+The downside, perhaps, is that for a repository to be usable on a crippled
+filesystem, all the files in it will need to be unlocked. A file can't
+easily be unlocked in one checkout and locked in another checkout.