* Disable flaky test
* Add helper for storing test artifacts
* Refactor screen capture tests
We have a pattern for inspecting a screen capture, so this refactor codifies that pattern into a helper. This gives us shorter test code, consistency (previously, the display in test code and the display captured could theoretically be different), and better debugging/observability on failure.