Merge branch 'master' into import-from-s3

Joey Hess 2019-04-23 15:34:26 -04:00
commit 48d30d8753
GPG key ID: DB12DB0FF05F8F38
9 changed files with 123 additions and 18 deletions

@@ -77,13 +77,15 @@ perform :: ImportFeedOptions -> Cache -> URLString -> CommandPerform
 perform opts cache url = go =<< downloadFeed url
   where
     go Nothing = next $ feedProblem url "downloading the feed failed"
-    go (Just f) = case findDownloads url f of
-        [] -> next $
-            feedProblem url "bad feed content; no enclosures to download"
-        l -> do
-            showOutput
-            ok <- and <$> mapM (performDownload opts cache) l
-            next $ cleanup url ok
+    go (Just feedcontent) = case parseFeedString feedcontent of
+        Nothing -> next $ feedProblem url "parsing the feed failed"
+        Just f -> case findDownloads url f of
+            [] -> next $
+                feedProblem url "bad feed content; no enclosures to download"
+            l -> do
+                showOutput
+                ok <- and <$> mapM (performDownload opts cache) l
+                next $ cleanup url ok

 cleanup :: URLString -> Bool -> CommandCleanup
 cleanup url True = do
@@ -142,14 +144,14 @@ findDownloads u f = catMaybes $ map mk (feedItems f)
             Nothing -> Nothing

 {- Feeds change, so a feed download cannot be resumed. -}
-downloadFeed :: URLString -> Annex (Maybe Feed)
+downloadFeed :: URLString -> Annex (Maybe String)
 downloadFeed url
     | Url.parseURIRelaxed url == Nothing = giveup "invalid feed url"
     | otherwise = Url.withUrlOptions $ \uo ->
         liftIO $ withTmpFile "feed" $ \f h -> do
             hClose h
             ifM (Url.download nullMeterUpdate url f uo)
-                ( parseFeedString <$> readFileStrict f
+                ( Just <$> readFileStrict f
                 , return Nothing
                 )
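
The effect of this change can be sketched in isolation: once the download step hands back the raw feed text instead of a parsed feed, the caller can tell a failed download apart from a feed that downloaded but would not parse. The following is a minimal standalone sketch, not git-annex's code. `Feed`, `Outcome`, `classify` and the toy `parseFeed` are hypothetical names made up for illustration, and the "feed" header line format is an assumption of the toy parser only.

```haskell
-- Hypothetical stand-in for a parsed feed: just a list of enclosure URLs.
newtype Feed = Feed [String]
  deriving (Eq, Show)

-- The outcomes the caller can tell apart after the change.
data Outcome
  = DownloadFailed
  | ParseFailed
  | NoEnclosures
  | Enclosures [String]
  deriving (Eq, Show)

-- Toy parser (an assumption, not a real feed parser): the content must
-- start with a "feed" header line; every later non-empty line is treated
-- as an enclosure URL.
parseFeed :: String -> Maybe Feed
parseFeed content = case lines content of
  ("feed" : rest) -> Just (Feed (filter (not . null) rest))
  _ -> Nothing

-- Same shape as the new 'go': Nothing means the download failed, Just
-- holds raw text that still has to be parsed.
classify :: Maybe String -> Outcome
classify Nothing = DownloadFailed
classify (Just content) = case parseFeed content of
  Nothing -> ParseFailed
  Just (Feed []) -> NoEnclosures
  Just (Feed urls) -> Enclosures urls

main :: IO ()
main = mapM_ (print . classify)
  [ Nothing
  , Just "not a feed"
  , Just "feed\n"
  , Just "feed\nhttp://example.com/a.mp3\n"
  ]
```

Before the change, the first two cases were indistinguishable, because the download step returned `Nothing` for both.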

@ -0,0 +1,45 @@
### Please describe the problem.
I am following these instructions:
<https://git-annex.branchable.com/tips/android_sync_with_adb/>
but encounter this issue:
```
$ git annex initremote android type=adb androiddirectory=/sdcard/DCIM encryption=none exporttree=yes importtree=yes
initremote android
git-annex: importtree is not supported by this special remote
failed
git-annex: initremote: 1 failed
```
### What steps will reproduce the problem?
```
mkdir testAdbRemote
cd testAdbRemote
git init
git annex init
git annex initremote android type=adb androiddirectory=/sdcard/DCIM encryption=none exporttree=yes importtree=yes
```
### What version of git-annex are you using? On what operating system?
Latest conda one in Ubuntu 18.04 LTS
```
$ git annex version
git-annex version: 7.20190322-g7e5502b
build flags: Assistant Webapp Pairing S3(multipartupload)(storageclasses) WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.21.1 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.3 feed-1.0.1.0 ghc-8.4.2 http-client-0.5.14 persistent-sqlite-2.9.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar hook external
operating system: linux x86_64
supported repository versions: 5 7
upgrade supported from repository versions: 0 1 2 3 4 5 6
local repository version: 5
```
### Please provide any additional information below.
Not needed
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
git-annex has worked wonderfully well for me in the past two years of use and having the described workflow to sync with my Android would be icing on the cake! Moreover, Joey is an awesome dev!
> [[done]] --[[Joey]]

@ -0,0 +1,7 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2019-04-23T16:41:43Z"
content="""
This feature is new and has not been in a release of git-annex yet.
"""]]

@ -0,0 +1,27 @@
Started today on `git annex import` from S3, in the "import-from-s3"
branch.
It looks like I'm going to support both versioned and unversioned buckets;
the latter will need --force to initialize since it can lose data.
One thought I had about that is: It's probably better for git-annex to be
able to import data from an unversioned S3 bucket with caveats about
avoiding unsafe operations (export) that could lose data, than it is for
git-annex to not be able to import from the bucket at all, guaranteeing
that past versions of modified files will be lost. (Rationalization is a
powerful drug.)
To support unversioned buckets, some kind of stable content identifier is
needed other than the S3 version id. Luckily, S3 has etags, which are
md5sum of the content, so will work great. But, the `aws` haskell library
needs one small change to return an etag, so this will be
blocked on that change.
I've gotten listing importable contents from S3 working for unversioned
buckets, including dealing with S3's 1000 item limit by paging.
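
Paging past a per-request item limit can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the branch: `Page`, `listAll` and `fakeFetch` are hypothetical names, and the fake backend stands in for S3 by serving a sorted key list in pages, using the last key of each page as the marker for the next request.

```haskell
import Data.Functor.Identity (Identity (..))

-- One page of a listing, plus the marker to request the next page with
-- (Nothing once the listing is complete).
data Page a = Page
  { pageItems :: [a]
  , nextMarker :: Maybe String
  }

-- Keep requesting pages until the backend stops returning a marker.
listAll :: Monad m => (Maybe String -> m (Page a)) -> m [a]
listAll fetch = go Nothing
  where
    go marker = do
      page <- fetch marker
      case nextMarker page of
        Nothing -> return (pageItems page)
        Just m -> (pageItems page ++) <$> go (Just m)

-- Fake backend for illustration only. Assumes 'keys' is sorted and
-- 'limit' is at least 1. Returns the keys that sort after the marker.
fakeFetch :: Int -> [String] -> Maybe String -> Identity (Page String)
fakeFetch limit keys marker = Identity (Page page next)
  where
    remaining = maybe keys (\m -> dropWhile (<= m) keys) marker
    (page, rest) = splitAt limit remaining
    next = if null rest then Nothing else Just (last page)

main :: IO ()
main = do
  -- 2500 four-digit keys, so string order matches numeric order.
  let keys = map show [1000 .. 3499 :: Int]
      listed = runIdentity (listAll (fakeFetch 1000 keys))
  print (length listed)
  print (listed == keys)
```

Taking the fetch action as a parameter keeps the loop independent of the backend, so the same `listAll` could be driven by a real request in `IO` or by a fake one in tests.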
Listing importable contents from versioned buckets is harder, because
it needs to synthesize a git version history from the information that S3
provides. I think I have a method for doing this that will generate the
trees that users will expect to see, and also will generate the same past
trees every time, avoiding a proliferation of git trees. Next step:
Converting my prose description of how to do that into haskell.
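
The post does not spell out the method, so the following is only one possible way to get reproducible trees, and not necessarily what git-annex ends up doing. It replays object versions in a fixed order and records the tree after each one. `Version`, `Tree` and `snapshots` are hypothetical names, the integer timestamps are an assumption, and deletions are ignored to keep the sketch short.

```haskell
import Data.List (sortOn)

-- One stored version of an object (all fields are illustrative).
data Version = Version
  { vKey :: String  -- object key (file name)
  , vId :: String   -- version id
  , vTime :: Int    -- modification time
  } deriving (Eq, Show)

-- A tree maps each file name to the version id current at that point,
-- kept sorted by file name.
type Tree = [(String, String)]

-- Replay versions oldest first; each version yields the tree as it
-- looked right after that version was written. Sorting on a total order
-- makes the result independent of the order the listing arrived in.
snapshots :: [Version] -> [Tree]
snapshots = drop 1 . scanl apply [] . sortOn order
  where
    order v = (vTime v, vKey v, vId v)
    apply tree v =
      sortOn fst ((vKey v, vId v) : filter ((/= vKey v) . fst) tree)

main :: IO ()
main = mapM_ print (snapshots
  [ Version "a" "v1" 1
  , Version "b" "v1" 2
  , Version "a" "v2" 3
  ])
```

Because the output depends only on the set of versions, running it again over the same bucket contents produces identical trees, which is the property that avoids a proliferation of git trees.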

@ -0,0 +1,9 @@
Despite struggling with a keyboard controller that's increasingly prone to
flaking out and not registering some key presses while doubling others, I
managed to finish implementing import from versioned S3 buckets. It's quite
nice to see it download past versions of files and construct a git history.
Still enough unimplemented stuff and bugs to need to work on this for
probably one more day.
(Imagine me here struggling for a full minute to :wq)

@ -0,0 +1,9 @@
My main use for git-annex is to manage a photo/video archive of my extended family.
I regularly use `git annex import --clean-duplicates` to clean up old copies before going through the effort of manually putting data into the global directory structure. I only care about seeing how much data was removed, i.e. `git annex import --clean-duplicates foo | grep -v 'not duplicate; skipping'`. It would be a lot nicer to suppress this natively with `--quiet`. Ideally, in this case, only files which are being deleted (i.e. destructive actions), errors, and final stats (how many files were not deleted and how many were deleted) would be printed.
Along similar lines, I don't care about everything which goes right in an `fsck`; I care about anything going wrong. A `--quiet` should print errors and final stats, nothing more.
Best,
Richard

@ -1,9 +0,0 @@
[[!comment format=mdwn
username="maryjil2596"
avatar="http://cdn.libravatar.org/avatar/2ce6b78d907f10b244c92330a4f0bd00"
subject="samsung printer error code u1-2320"
date="2019-04-18T10:10:14Z"
content="""
Those who are facing an issue with their Samsung printer follow this link <a href=\"https://errorcode0x.com/fixed-samsung-printer-error-code-u1-2320/\">samsung printer error code u1-2320</a> and immediately eliminate all the technical errors.
"""]]

@ -0,0 +1,12 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="indeed a useful use case"
date="2019-04-19T13:26:05Z"
content="""
Indeed, it would be nice to have an easy way to split a git-annex repository into smaller ones, with the smaller ones also obtaining all the git-annex branch availability/metadata information about the files they inherit. The situation comes up quite frequently whenever it is desirable to modularize bigger repositories. The simplest use case is turning a specific subdirectory into a git/git-annex submodule. Is there a way/recipe to easily accomplish this while also moving all the git-annex branch metadata? The original repository should also get those files removed from its git tree.
One possible way we see is to clone the original repository, remove all other files, move the subdirectory files \"up\" the needed number of directories, rewrite git history to forget the removed files, and then use `annex forget`. But `annex forget` wouldn't \"forget\" information about files which are not in the current tree, so this would also require some manual trimming of the `git-annex` branch before `annex forget`.
But maybe there is a better way?
"""]]

@ -0,0 +1,3 @@
Currently, git-annex-migrate leads to content (and metadata) being stored under both old and new keys. git-annex-unused can drop the contents under the old key, but then you can't access the content if you check out an older commit. Maybe an option could be added to migrate keys using [git-replace](https://git-scm.com/docs/git-replace)? You'd git-replace the blob .git/annex/objects/old_key with the blob .git/annex/objects/new_key, the blob ../.git/annex/objects/old_key with the blob ../.git/annex/objects/new_key, etc. You could then also have a setting to auto-migrate non-checksum keys to checksum keys whenever the content gets downloaded.
More generally, git-annex-replace could be implemented this way, doing what git-replace does, but for git-annex keys rather than git hashes. [[git-annex-pre-commit]] might need to be changed to implement replacement of keys added later.