git-annex/doc/design
Joey Hess 8bde6101e3
sqlite database for importfeed
importfeed: Use caching database to avoid needing to list urls on every
run, and avoid using too much memory.

Benchmarking in my podcasts repo, importfeed got 1.42 seconds faster,
and memory use dropped from 203000k to 59408k.

Database.ImportFeed is Database.ContentIdentifier with the serial number
filed off. There is a bit of code duplication I would like to avoid,
particularly recordAnnexBranchTree and getAnnexBranchTree. But these use
the persistent sqlite tables, so despite the code being the same, they
cannot be factored out.

Since this database includes the contentidentifier metadata, it will be
slightly redundant if a sqlite database is ever added for metadata. I
did consider making such a generic database and using it for this. But
that would then need importfeed to update both the url database and the
metadata database, which is twice as much work diffing the git-annex
branch trees, or would entangle updating the two databases in a complex way.
So instead it seems better to optimise the database that
importfeed needs, and if the metadata database is used by another command,
use a little more disk space and do a little bit of redundant work to
update it.

Sponsored-by: unqueued on Patreon
2023-10-23 16:46:22 -04:00
..
adjusted_branches
assistant Typo fix unncessary -> unnecessary. 2022-08-20 09:40:19 -04:00
balanced_preferred_content Added a comment 2023-07-24 13:10:09 +00:00
encryption
exporting_trees_to_special_remotes
external_backend_protocol Added a comment: xxHash as the backend 2022-12-12 08:21:35 +00:00
external_special_remote_protocol comment and update todo 2023-06-23 12:25:08 -04:00
git-remote-daemon
iabackup
metadata
new_repo_versions
p2p_protocol
requests_routing Added a comment: Friendly bump to keep on the radar 2019-10-24 09:26:23 +00:00
adjusted_branches.mdwn
assistant.mdwn
balanced_preferred_content.mdwn
caching_database.mdwn sqlite database for importfeed 2023-10-23 16:46:22 -04:00
encryption.mdwn Fix typos "=yet" -> "=yes" 2023-03-10 18:07:20 +01:00
exporting_trees_to_special_remotes.mdwn comment 2022-05-02 14:45:45 -04:00
external_backend_protocol.mdwn this protocol is not draft for some time 2020-10-22 19:55:29 -04:00
external_special_remote_protocol.mdwn let Remote.availability return Unavilable 2023-08-16 14:31:31 -04:00
gcrypt.mdwn
git-remote-daemon.mdwn
iabackup.mdwn
importing_trees_from_special_remotes.mdwn
metadata.mdwn
new_repo_versions.mdwn Typo: sansative -> sensitive 2023-03-17 15:14:50 -04:00
p2p_protocol.mdwn typo 2021-08-09 12:44:20 -04:00
preferred_content.mdwn
requests_routing.mdwn
roadmap.mdwn avoid truncating the list of confirmed items 2023-06-23 16:20:00 -04:00