Merge branch 'master' of ssh://git-annex.branchable.com
commit 1b4425bbc2
5 changed files with 154 additions and 0 deletions
@ -0,0 +1,88 @@
Hi,

I'm trying to get my head around groups, wanted, etc. for a particular use case.

**Problem:** I can't work out how to get a source(?) repository to automatically drop files when they hit a transfer repository.

I have a machine (`Machine 1`) that is used for data acquisition, but it is behind a strict firewall (both physical and virtual). I usually carry a USB drive over and set up an rsync (ssh -> local USB drive) from the one machine (`Machine 2`) that is able to connect to `Machine 1` over the network. As it is a pain to lug the drive over, I only do this rsync maybe weekly, so it takes many hours (~24) to complete. Then (when I remember) I visit again and carry the USB drive back... Naturally, this slows down my work process.

What I was hoping to do was set up git-annex with the assistant to help me. I am able to run the assistant, but not the webapp, on `Machines 1 and 2`. :-(

My thought was - as these have to be disconnected network transfers...

- `Repository 1 -> Repository 2` (when space permits)
- `Repository 2 -> Repository 3` (when space permits) `-> Repository 4` (USB drive(s))

Another limitation is that `Repos/Machines 2 & 3` have limited storage space.

As a test case I can set up (`Repo1 -> Repo2`) and (`Repo2 -> Repo3`) (on other machines, but the commands should be the same...)

After reading a bit I changed the [preferred content](/preferred_content/standard_groups/) for a transfer repo to:

```
not (inallgroup=client and copies=client:1) and ($client)
```

i.e. I changed `copies=client:2` to `copies=client:1`.

---
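As a rough sketch of how the groups discussed above might be assigned from the command line: `git annex group` and `git annex wanted` are the commands for this, but the remote name `repo2` and the helper `plan_repo` are hypothetical. It uses the stock `standard` expressions rather than the modified one above, and whether the `source` standard group gives the dropping behaviour wanted here is an assumption, not something tested. The helper only prints the commands so they can be reviewed before being run by hand.

```shell
# Hypothetical helper: print (do not run) the git-annex commands that would
# put a repository into a standard group and make it use that group's
# standard preferred content.
plan_repo() {
    repo=$1    # repository name as git-annex knows it, or 'here'
    group=$2   # a standard group name, e.g. source or transfer
    echo "git annex group $repo $group"
    echo "git annex wanted $repo standard"
}

plan_repo here source      # assumption: run inside Repo1
plan_repo repo2 transfer   # assumption: Repo2 is a remote named repo2
```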
Finally...The question
----------------------

**BUT** I can't work out how to get `Repo1` (the source) to automatically drop the files when they hit `Repo2` (what I'm guessing should be a transfer repository).

Can anyone suggest how to automagically do this with the assistant?

---

If it would help I can share the git-annex commands I've been using, but as I'm only testing at the moment, I'm happy to start from scratch if there is an RTFM page out there. :-)

I've put some details about my thoughts on the repositories and restrictions below.

Thanks - Olaf


Repository 1
------------
- Type: source (Data collection)
- Human readable directory structure
- Physically: Machine 1
- Strict firewall: only accepts incoming network connections from Machine 2
- Storage: 50GB

Repository 2
------------
- Type: transfer
- Physically: Machine 2
- Reasonably relaxed firewall, can talk to Repository 3
- Limited storage: 10GB

Repository 3
------------
- Type: transfer
- Physically: Machine 3
- Reasonably relaxed firewall, can talk to Repository 2
- Limited storage: 10GB
- Connected to USB drive(s)

Repository 4, 5, ...
--------------------
- Type: ? Client ?
- Human readable directory structure
- Physically: USB drive
- Usually (but not always) connected to Machine 3
- Large storage (2TB) + additional drives
@ -0,0 +1,14 @@
[[!comment format=mdwn
username="CandyAngel"
avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
subject="comment 4"
date="2017-05-10T09:21:34Z"
content="""
> And I doubt CandyAngel was counting only the sizes of symlinks and not git repos or at least directory inodes to hold all the symlinks.)

That repository contains only top-level directories (no subdirectories), and each directory holds only symlinks (up to 8000 of them). The directories are created with **mkdir $(uuidgen -r)**, hence the wildcard for du.

It would include the size of the directories holding all the inodes, but it definitely *isn't counting .git*, as this annex spans 3 drives with 6TB of content so far. Well, 6 drives because of \"numcopies 2\" :P

I will calculate this a different way and only count symlinks when I have access to it again.
"""]]
@ -0,0 +1,25 @@
[[!comment format=mdwn
username="CandyAngel"
avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
subject="comment 5"
date="2017-05-10T12:44:08Z"
content="""
    $ find -name .git -prune -o -type l | wc -l
    1034886

Just over a million symlinks.. very convenient :)

    $ find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'
    195.9 # 195MB actual size
    $ find -name .git -prune -o -type l -print0 | du -ch --files0-from=- | tail -n1
    4.0G total # 4GB disk usage

And in comparison to my earlier comment 2 weeks ago:

    $ du -shc *-* | tail -n3
    33M fd79bbd4-d41e-4ea8-acc8-86437c5eed7c
    33M ffbd042e-f6d9-4450-9a57-8ed1086f587c
    4.1G total

So directory inode sizes are dwarfed by the symlinks themselves, which take 4K of disk each but only ~198 bytes of actual data (~95% wasted space?).
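
The per-symlink figures above can be rechecked with a little arithmetic. A minimal sketch using the numbers reported in this comment (1034886 symlinks, 195.9MB in total), and assuming each symlink occupies one 4096-byte block on disk:

```shell
# Figures taken from the output above; the 4096-byte block size is an assumption.
symlinks=1034886
total_mib=195.9
block=4096

# Average actual size of one symlink, in whole bytes.
avg=$(awk -v mib=$total_mib -v n=$symlinks 'BEGIN { print int(mib * 1024 * 1024 / n) }')

# Share of each block left unused, as a whole percentage.
wasted=$(awk -v avg=$avg -v block=$block 'BEGIN { print int(100 - 100 * avg / block) }')

echo $avg bytes per symlink, roughly $wasted% of each block unused
```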
"""]]
@ -0,0 +1,16 @@
[[!comment format=mdwn
username="CandyAngel"
avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
subject="comment 6"
date="2017-05-10T12:45:59Z"
content="""
Oops,

    find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'

should have been

    find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**2}'

That'll teach me to prematurely copy it :P
"""]]
@ -0,0 +1,11 @@
[[!comment format=mdwn
username="https://launchpad.net/~barthelemy"
nickname="barthelemy"
avatar="http://cdn.libravatar.org/avatar/e99cb15f6029de3225721b3ebdd0233905eb69698e9b229a8c4cc510a4135438"
subject="comment 3"
date="2017-05-09T23:38:27Z"
content="""
Hi Joel,
Thank you for the clarification (and for git-annex, and for all the rest!)
Cheers
"""]]