update; showstopper issue with current git
developed a patch for git, we'll see if they like it..
parent
d1154d0837
commit
cd199e442f
1 changed files with 32 additions and 17 deletions
|
@@ -41,18 +41,17 @@ efficiently using the existing code in `Branch.catFile`.
|
||||||
|
|
||||||
### efficiency
|
### efficiency
|
||||||
|
|
||||||
|
#### clean
|
||||||
|
|
||||||
The trick is doing it efficiently. Since git a2b665d, v1.7.4.1,
|
The trick is doing it efficiently. Since git a2b665d, v1.7.4.1,
|
||||||
something like this works to provide a filename to the clean script:
|
something like this works to provide a filename to the clean script:
|
||||||
|
|
||||||
git config --global filter.huge.clean huge-clean %f
|
git config --global filter.huge.clean huge-clean %f
|
||||||
|
|
||||||
This avoids it needing to read all the current file content from stdin
|
This could avoid it needing to read all the current file content from stdin
|
||||||
when doing eg, a git status or git commit. Instead it is passed the
|
when doing eg, a git status or git commit. Instead it is passed the
|
||||||
filename that git is operating on, in the working directory.
|
filename that git is operating on, in the working directory.
|
||||||
|
|
||||||
(The smudge script can also be provided a filename with %f, but it
|
|
||||||
cannot directly write to the file or git gets unhappy.)
|
|
||||||
|
|
||||||
So, WORM could just look at that file and easily tell if it is one
|
So, WORM could just look at that file and easily tell if it is one
|
||||||
it already knows (same mtime and size). If so, it can short-circuit and
|
it already knows (same mtime and size). If so, it can short-circuit and
|
||||||
do nothing, file content is already cached.
|
do nothing, file content is already cached.
|
||||||
|
@@ -61,6 +60,21 @@ SHA1 has a harder job. Would not want to re-sha1 the file every time,
|
||||||
probably. So it'd need a local cache of file stat info, mapped to known
|
probably. So it'd need a local cache of file stat info, mapped to known
|
||||||
objects.
|
objects.
|
||||||
|
|
||||||
|
But: Even with %f, git actually passes the full file content to the clean
|
||||||
|
filter, and if it fails to consume it all, it will crash (may only happen
|
||||||
|
if the file is larger than some chunk size; tried with a 500 MB file and
|
||||||
|
saw a SIGPIPE.) This means unnecessary work needs to be done,
|
||||||
|
and it slows down *everything*, from `git status` to `git commit`.
|
||||||
|
**showstopper** I have sent a patch to the git mailing list to address
|
||||||
|
this.
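
Until git changes, a clean filter could combine both points above: always drain stdin so git does not hit the SIGPIPE, and use a stat cache so the sha1 is only computed when size or mtime changed. A minimal sketch, not the real thing: `huge_clean` and `CACHE_DIR` are made-up names, GNU `stat` is assumed, and mtime is only compared at one-second resolution, so a same-size edit within the same second would be missed.

```sh
#!/bin/sh
# Sketch only: huge_clean and CACHE_DIR are hypothetical names.
CACHE_DIR="${CACHE_DIR:-$HOME/.huge-cache}"

huge_clean() {
    file="$1"

    # git pipes the whole file content to the filter even when %f is used,
    # so read and discard it; not consuming it is what triggered the SIGPIPE.
    cat >/dev/null

    # Size and mtime (in seconds) of the working tree file. Assumes GNU stat.
    sig=$(stat -c '%s %Y' "$file") || return 1

    # One cache entry per path, named by a checksum of the path string.
    key=$(printf '%s' "$file" | cksum | cut -d' ' -f1)
    entry="$CACHE_DIR/$key"

    # Cache hit: same size and mtime as last time, so reuse the stored sha1.
    if [ -f "$entry" ] && [ "$(sed -n 1p "$entry")" = "$sig" ]; then
        sed -n 2p "$entry"
        return 0
    fi

    # Cache miss: hash the file once and remember the result.
    sha1=$(sha1sum "$file" | cut -d' ' -f1)
    mkdir -p "$CACHE_DIR"
    printf '%s\n%s\n' "$sig" "$sha1" >"$entry"
    printf '%s\n' "$sha1"
}
```

A filter script would end with `huge_clean "$1"` and be configured with `%f` as shown earlier. The drain still costs a full read of the pipe, so this only saves the hashing, not the I/O that the git patch is meant to avoid.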
|
||||||
|
|
||||||
|
#### smudge
|
||||||
|
|
||||||
|
The smudge script can also be provided a filename with %f, but it
|
||||||
|
cannot directly write to the file or git gets unhappy.
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
### dealing with partial content availability
|
### dealing with partial content availability
|
||||||
|
|
||||||
The smudge filter cannot be allowed to fail, that leaves the tree and
|
The smudge filter cannot be allowed to fail, that leaves the tree and
|
||||||
|
@@ -82,13 +96,13 @@ huge-smudge:
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
#!/bin/sh
|
#!/bin/sh
|
||||||
read sha1
|
read f
|
||||||
file="$1"
|
file="$1"
|
||||||
echo "smudging $sha1" >&2
|
echo "smudging $f" >&2
|
||||||
if [ -e ~/$sha1 ]; then
|
if [ -e ~/$f ]; then
|
||||||
cat ~/$sha1 # possibly expensive copy here
|
cat ~/$f # possibly expensive copy here
|
||||||
else
|
else
|
||||||
echo "$sha1 not available"
|
echo "$f not available"
|
||||||
fi
|
fi
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
|
@@ -96,16 +110,17 @@ huge-clean:
|
||||||
|
|
||||||
<pre>
|
<pre>
|
||||||
#!/bin/sh
|
#!/bin/sh
|
||||||
temp="$1"
|
file="$1"
|
||||||
if grep -q 'not available' "$temp"; then
|
# in real life, this should be done more efficiently, not trying to read
|
||||||
awk '{print $1}' "$temp" # provide what we would if the content were avail!
|
# the whole file content!
|
||||||
|
if grep -q 'not available' "$file"; then
|
||||||
|
awk '{print $1}' "$file" # provide what we would if the content were avail!
|
||||||
exit 0
|
exit 0
|
||||||
fi
|
fi
|
||||||
sha1=`sha1sum "$temp" | cut -d' ' -f1`
|
echo "cleaning $file" >&2
|
||||||
echo "cleaning $sha1" >&2
|
ls -l "$file" >&2
|
||||||
ls -l "$temp" >&2
|
ln -f "$file" ~/$file # can't delete temp file
|
||||||
ln -f "$temp" ~/$sha1 # can't delete temp file
|
echo $file
|
||||||
echo $sha1
|
|
||||||
</pre>
|
</pre>
|
||||||
|
|
||||||
.gitattributes:
|
.gitattributes:
|
||||||
|
|