From ffeef759179e58810481d94bbaab083dec251327 Mon Sep 17 00:00:00 2001
From: thk
Date: Sun, 19 Apr 2020 08:15:47 +0000
Subject: [PATCH]

---
 doc/bugs/find_by_metadata_is_slow.mdwn | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 doc/bugs/find_by_metadata_is_slow.mdwn

diff --git a/doc/bugs/find_by_metadata_is_slow.mdwn b/doc/bugs/find_by_metadata_is_slow.mdwn
new file mode 100644
index 0000000000..7cdec7d63b
--- /dev/null
+++ b/doc/bugs/find_by_metadata_is_slow.mdwn
@@ -0,0 +1,19 @@
+### Please describe the problem.
+
+Finding files by metadata is possible with
+
+    git annex find --metadata some-key="something*"
+
+From looking at the code, inspecting .git/annex and running this multiple times, it seems to me that this does not use any cache.
+Accordingly, it takes ~25s on my large repo.
+
+This is too slow for many use cases.
+
+I was not sure whether to add this here as a bug or in the todo section.
+The design notes for [[design/caching_database]] gave me the impression that some metadata caching is already planned?
+
+A metadata cache could track the git-annex commit sha-1 that it was built for. Then git-annex only needs to check, whether any change to a metadata file happened in a commit between the current HEAD of the git-annex branch and the commit sha-1 of the cache.
+
+### What version of git-annex are you using? On what operating system?
+
+8.20200227-gf56dfe791
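
The invalidation check proposed in the report (remember the git-annex branch commit the cache was built for, then look for metadata changes since that commit) could be sketched roughly as follows. This is only an illustration and not how git-annex implements anything. The function names are hypothetical, and the sketch assumes that per-key metadata logs on the `git-annex` branch are the files ending in `.log.met`. It compares the cached commit against the branch tip with `git diff --name-only`, which reports the net difference rather than walking every intermediate commit. That should be enough to decide whether the cache is still valid.

```python
import subprocess

# Assumption: per-key metadata logs on the git-annex branch end in ".log.met".
METADATA_SUFFIX = ".log.met"


def metadata_paths(changed_paths):
    """Keep only the changed paths that look like metadata logs."""
    return [p for p in changed_paths if p.endswith(METADATA_SUFFIX)]


def changed_since(cached_sha, branch="git-annex", repo="."):
    """List files that differ between the cached commit and the branch tip."""
    out = subprocess.run(
        # -z gives NUL-separated, unquoted paths.
        ["git", "-C", repo, "diff", "--name-only", "-z", cached_sha, branch],
        check=True, capture_output=True, text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def cache_is_stale(cached_sha, list_changes=changed_since):
    """True if any metadata log changed since the cache was built.

    list_changes is injectable so the check can be exercised without a repo.
    """
    return bool(metadata_paths(list_changes(cached_sha)))
```

A hypothetical cache would store the sha-1 next to its data, call `cache_is_stale(stored_sha)` before answering a `find --metadata` query, and rebuild (or update only the changed keys) when it returns true.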