While objects in the sync queue that fail to save should remain in the
queue, objects that just don't exist remotely need to be removed, or
else they'll be retried forever.
If an object exists locally but not remotely and the local version has a
version number, that's an error. I don't think that should ever happen,
but it can if things somehow get out of sync due to other bugs.
To address, reprocess the API delete log during a full sync and then
reset the version number of all remaining local objects that don't exist
remotely (not just unmodified objects, as was the case previously) to 0
for uploading.
When remote deletions are reprocessed, delete local objects that haven't
been modified and show the conflict resolution window for any local
items that have.
Also:
- Clean up checking of last remote library version during download
syncs
- Add Zotero.DataObjects.getAllKeys()
If the item was deleted on one side and moved to the trash on the other,
just delete the item on the trash side. Since trash emptying happens
automatically, this would otherwise result in a conflict even if the
user carefully avoided making changes before a manual sync.
If the API returns a modified item after an upload (e.g., to strip invalid
characters), don't update the Date Modified field when saving those changes to
the local version (though it would still be good to avoid API-side changes as
much as possible).
And omit in ZFS file sync requests
The API previously didn't allow these properties to be set for group items,
because they were set atomically during the file upload process, but 1) that's
not really necessary (makes a little sense for 'filename', but not really a big
deal if an old file is renamed on another computer before the new file is
synced down) and 2) skipping them results in the properties getting erased
after items are uploaded and the empty values returned by the server overwrite
the local values.
- Use custom exception for user-initiated sync cancellations, which can bubble
up to the sync runner -- this should help with a sync stop button (#915)
- Separate out deletions-downloading code
- Refactor delay generator handling on library version mismatch
- Clearer variable names
Previously, objects were first downloaded and saved to the sync cache,
which was then processed separately to create/update local objects. This
meant that a server bug could result in invalid data in the sync cache
that would never be processed. Now, objects are saved as they're
downloaded and only added to the sync cache after being successfully
saved. The keys of objects that fail are added to a queue, and those
objects are refetched and retried on a backoff schedule or when a new
client version is installed (in case of a client bug or a client with
outdated data model support).
An alternative would be to save to the sync cache first and evict
objects that fail and add them to the queue, but that requires more
complicated logic, and it probably makes more sense just to buffer a few
downloads ahead so that processing is never waiting for downloads to
finish.
While trying to get translation and citing working with asynchronously
generated data, we realized that drag-and-drop support was going to
be...problematic. Firefox only supports synchronous methods for
providing drag data (unlike, it seems, the DataTransferItem interface
supported by Chrome), which means that we'd need to preload all relevant
data on item selection (bounded by export.quickCopy.dragLimit) and keep
the translate/cite methods synchronous (or maintain two separate
versions).
What we're trying instead is doing what I said in #518 we weren't going
to do: loading most object data on startup and leaving many more
functions synchronous. Essentially, this takes the various load*()
methods described in #518, moves them to startup, and makes them operate
on entire libraries rather than individual objects.
The obvious downside here (other than undoing much of the work of the
last many months) is that it increases startup time, potentially quite a
lot for larger libraries. On my laptop, with a 3,000-item library, this
adds about 3 seconds to startup time. I haven't yet tested with larger
libraries. But I'm hoping that we can optimize this further to reduce
that delay. Among other things, this is loading data for all libraries,
when it should be able to load data only for the library being viewed.
But this is also fundamentally just doing some SELECT queries and
storing the results, so it really shouldn't need to be that slow (though
performance may be bounded a bit here by XPCOM overhead).
If we can make this fast enough, it means that third-party plugins
should be able to remain much closer to their current designs. (Some
things, including saving, will still need to be made asynchronous.)
Also:
- Remove last-sync-time mechanism for both WebDAV and ZFS, since it can
be determined by storage properties (mtime/md5) in data sync
- Add option to include synced storage properties in item toJSON()
instead of local file properties
- Set "Fake-Server-Match" header in setHTTPResponse() test support
function, which can be used for request count assertions -- see
resetRequestCount() and assertRequestCount() in webdavTest.js
- Allow string (e.g., 'to_download') instead of constant in
Zotero.Sync.Data.Local.setSyncState()
- Misc storage tweaks
This mostly gets ZFS file syncing and file conflict resolution working
with the API sync process. WebDAV will need to be updated separately.
Known issues:
- File sync progress is temporarily gone
- File uploads can result in an unnecessary 412 loop on the next data
sync
- This causes Firefox to crash on one of my computers during tests,
which would be easier to debug if it produced a crash log.
Also:
- Adds httpd.js for use in tests when FakeXMLHttpRequest can't be used
(e.g., saveURI()).
- Adds some additional test data files for attachment tests
This will appear much less frequently, since non-conflicting field changes on
both sides can be resolved automatically, but genuine field conflicts still
require manual conflict resolution.
The merge pane is no longer editable, since the itembox code to do that is
async and can't run in a modal window, but it's not really necessary,
particularly with conflicts happening less frequently.
TODO:
- Remote item deletions
- File conflicts
- Maybe handle some edge cases where the conflicted items fail to save
Save uploaded data to cache, and update local object if necessary (which
it mostly shouldn't be except for invalid characters and HTML filtering
in notes)
Also add some upload and JSON tests
There's a lot more to do, and this isn't ready for actual usage, but the
basic functionality is mostly in place and has decent test coverage. It
can successfully upgrade a library last used with classic syncing and
pull down changes via the API. Uploading mostly works but is currently
disabled for safety until it has better test coverage.
Downloaded JSON is first saved to a cache table, which is then used to
populate other tables and later for generating PATCH requests and
automatically resolving conflicts (since it shows what was changed
locally and what was changed remotely). Objects with unmet dependencies
or unknown fields are skipped for now but don't block the rest of the
sync.
Some of the bigger remaining to-dos:
- Tests for uploading
- Re-do the preferences to get an API key
- File sync integration
- Full-text syncing integration
- Manual conflict resolution (though this already includes much smarter
conflict handling that automatically resolves many conflicts)