update for filter-branch
parent
c525d18cf7
commit
7d57866c3e
2 changed files with 76 additions and 5 deletions
@@ -1,13 +1,78 @@
[[!meta title="Splitting a git-annex repository"]]

Note: this is the reverse of [[migrating two seperate disconnected directories to git annex]].

I have a [git annex](https://git-annex.branchable.com/) repo for all my media
that has grown to 57866 files and git operations are getting slow, especially
on external spinning hard drives, so I decided to split it into separate
repositories.

This is how I did it, with some help from `#git-annex`. Suppose the old big repo is at `~/oldrepo`:
Here is how to split out a repository that contains a subset of the files
in the larger repository. The larger repository is left as-is, but similar
methods can be used to remove the files from it. Or, it can be deleted
once it gets split up into several smaller repositories.

(This is the reverse of [[migrating two seperate disconnected directories
to git annex]].)

Suppose the old big repo is at `~/oldrepo`, and you want to split out
photos from it, and those are all located inside `~/oldrepo/photos`.

First, let's create a new empty repo.

    mkdir ~/photos
    cd ~/photos
    git init
Now to populate the new repo with the files we want from the old repo. We
can use `git filter-branch` to create a git branch that contains only the
history of the files in `photos`. That command has a *lot* of options and
ways to use it, but here is one simple way:

    cd ~/oldrepo

    # filter a branch so it contains only the files wanted by the new repository
    git branch split-master master
    git filter-branch --prune-empty --subdirectory-filter photos split-master

    # replace the new repo's master branch with the filtered branch
    git push ~/photos split-master
    git branch -D split-master
    cd ~/photos
    git reset --hard split-master
    git branch -d split-master
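To see what `--subdirectory-filter` does before running it on a real repository, the filtering step can be tried on a throwaway repository. This is only a sketch: the scratch directory, the file names `docs/readme.txt` and `photos/a.jpg`, and the committer identity are made up for the demonstration. `FILTER_BRANCH_SQUELCH_WARNING` only silences the notice that newer git versions print before `filter-branch` starts.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository standing in for ~/oldrepo (hypothetical contents)
scratch=$(mktemp -d)
git init -q "$scratch/oldrepo"
cd "$scratch/oldrepo"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example"
git config user.email "example@example.invalid"

mkdir docs photos
echo "notes" > docs/readme.txt
git add docs
git commit -q -m "add docs"
echo "pretend image" > photos/a.jpg
git add photos
git commit -q -m "add a photo"

# Same filtering step as in the tip, run on the scratch repository
git branch split-master master
git filter-branch --prune-empty --subdirectory-filter photos split-master >/dev/null 2>&1

# The filtered branch has photos/ as its root; master is untouched
split_files=$(git ls-tree -r --name-only split-master)
master_files=$(git ls-tree -r --name-only master)
echo "split-master: $split_files"
echo "master: $master_files"
```

In this sketch the filtered branch lists only `a.jpg`, without the `photos/` prefix, while `master` still lists both files.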
Next, the git-annex branch needs to be filtered to include only
the files in `photos`, and that filtered branch sent to the new repository.
That can be done with the [[git-annex-filter-branch]](1) command.

    cd ~/oldrepo
    annexrev=$(git annex filter-branch photos --include-all-key-information --include-all-repo-config --include-global-config)
    git push ~/photos $annexrev:refs/heads/git-annex
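The push of `$annexrev` to `refs/heads/git-annex` relies on plain git behaviour: a commit id can be pushed to a named ref in another repository, even when no local branch points at it. A minimal sketch of just that mechanism, using two scratch repositories and an empty commit as a stand-in for the filtered git-annex branch (git-annex itself is not involved here, and all paths and names are illustrative):

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/src"
git init -q "$scratch/dst"

cd "$scratch/src"
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "stand-in for the filtered branch"
rev=$(git rev-parse HEAD)

# Push the bare commit id; the refspec names the ref to create on the other side
git push -q "$scratch/dst" "$rev:refs/heads/git-annex"

pushed=$(git -C "$scratch/dst" rev-parse refs/heads/git-annex)
echo "source commit: $rev"
echo "pushed ref:    $pushed"
```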
Next, initialize git-annex on the new repository. This uses
the same annex.uuid as was in the old repository. That's ok, because
the repository that's been split off will never have the old repository
as a remote.

    cd ~/photos
    git annex reinit $(git config --file ~/oldrepo/.git/config annex.uuid)
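The command substitution in the `reinit` line only reads one value out of another repository's `.git/config`. A small sketch of that lookup, using a scratch config file and a made-up UUID rather than a real repository:

```shell
set -e
scratch=$(mktemp -d)
cfg="$scratch/config"

# Hypothetical UUID, stored under the same key that the reinit line reads
git config --file "$cfg" annex.uuid "00000000-0000-0000-0000-000000000001"

# Same lookup as in the reinit line, pointed at the scratch file
uuid=$(git config --file "$cfg" annex.uuid)
echo "would run: git annex reinit $uuid"
```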
Finally the annexed file contents need to be copied to the new repository:

    cd ~/photos

    # Hardlink all the annexed data from the old repo
    cp -rl ~/oldrepo/.git/annex/objects .git/annex/

    # Remove unneeded hard links
    git annex unused --quiet
    git annex drop --unused --force

    # Fix up annex links to content and make sure it's all ok.
    git annex fsck
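`cp -rl` (GNU coreutils) recreates the directory tree but hard-links the files instead of copying their data, so the step needs both repositories to be on the same filesystem. A sketch with scratch directories and a made-up object name; the `-ef` test is true when two paths refer to the same file:

```shell
set -e
scratch=$(mktemp -d)
mkdir -p "$scratch/old/objects/aa" "$scratch/new"
echo "file content" > "$scratch/old/objects/aa/key1"

# Recreate the tree under new/, hard-linking files instead of copying them
cp -rl "$scratch/old/objects" "$scratch/new/"

if [ "$scratch/old/objects/aa/key1" -ef "$scratch/new/objects/aa/key1" ]; then
    linked=yes
else
    linked=no
fi
echo "hard linked: $linked"
```

Removing a hard link in one tree leaves the file in the other tree intact, which is why unneeded links can be dropped from the new repository without touching the old one.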
# alternative older method

Here is another way to do it. Suppose the old big repo is at `~/oldrepo`:

```
# Create a new repo for photos only
@@ -3,8 +3,14 @@
subject="""comment 1"""
date="2017-05-11T16:28:32Z"
content="""
This is a simple way to split a repository, but the resulting split git
repository will be larger than is really necessary.
2021 update: The new [[git-annex-filter-branch]] command
can be used to produce a filtered version of the git-annex branch that only
includes information for the files you want. I have updated the tip to
show how to do it that way, and kept the old way as an alternative.

The old, alternative way is a simple way to split a repository, but the
resulting split git repository will be larger than is really necessary.
(The new method avoids this problem.)

When you `dropunused` all the hard links that are not present in the
repository, git-annex will commit a log to the git-annex branch saying "I