While trying to get translation and citing working with asynchronously
generated data, we realized that drag-and-drop support was going to
be...problematic. Firefox only supports synchronous methods for
providing drag data (unlike, it seems, the DataTransferItem interface
supported by Chrome), which means that we'd need to preload all relevant
data on item selection (bounded by export.quickCopy.dragLimit) and keep
the translate/cite methods synchronous (or maintain two separate
versions).
What we're trying instead is doing what I said in #518 we weren't going
to do: loading most object data on startup and leaving many more
functions synchronous. Essentially, this takes the various load*()
methods described in #518, moves them to startup, and makes them operate
on entire libraries rather than individual objects.
The obvious downside here (other than undoing much of the work of the
last many months) is that it increases startup time, potentially quite a
lot for larger libraries. On my laptop, with a 3,000-item library, this
adds about 3 seconds to startup time. I haven't yet tested with larger
libraries. But I'm hoping that we can optimize this further to reduce
that delay. Among other things, this is loading data for all libraries,
when it should be able to load data only for the library being viewed.
But this is also fundamentally just doing some SELECT queries and
storing the results, so it really shouldn't need to be that slow (though
performance may be bounded a bit here by XPCOM overhead).
If we can make this fast enough, it means that third-party plugins
should be able to remain much closer to their current designs. (Some
things, including saving, will still need to be made asynchronous.)
This mostly gets ZFS file syncing and file conflict resolution working
with the API sync process. WebDAV will need to be updated separately.
Known issues:
- File sync progress is temporarily gone
- File uploads can result in an unnecessary 412 loop on the next data
sync
- This causes Firefox to crash on one of my computers during tests,
which would be easier to debug if it produced a crash log.
Also:
- Adds httpd.js for use in tests when FakeXMLHttpRequest can't be used
(e.g., saveURI()).
- Adds some additional test data files for attachment tests
Also:
* _finalizeErase in Zotero.DataObject is now inheritable
* Call _initErase before starting a DB transaction
* removes Zotero.Libraries.add and Zotero.Libraries.remove (doesn't seem like this is used any more)
There's a lot more to do, and this isn't ready for actual usage, but the
basic functionality is mostly in place and has decent test coverage. It
can successfully upgrade a library last used with classic syncing and
pull down changes via the API. Uploading mostly works but is currently
disabled for safety until it has better test coverage.
Downloaded JSON is first saved to a cache table, which is then used to
populate other tables and later for generating PATCH requests and
automatically resolving conflicts (since it shows what was changed
locally and what was changed remotely). Objects with unmet dependencies
or unknown fields are skipped for now but don't block the rest of the
sync.
Some of the bigger remaining to-dos:
- Tests for uploading
- Re-do the preferences to get an API key
- File sync integration
- Full-text syncing integration
- Manual conflict resolution (though this already includes much smarter
conflict handling that automatically resolves many conflicts)
And use it in resetDB() test support function, mainly to allow
skipBundledFiles for resetDB calls. Translator installation and
initialization can take a long time, but tests that need a clean DB
don't necessarily rely on translators. Without this, running resetDB()
in beforeEach() for many tests is prohibitively slow.
Since modal windows (e.g., the Create Bib window and the Quick Copy site
editor window) can't use yield, style retrieval
(Zotero.Styles.getVisible()/getAll()) is now synchronous, depending on a
previous async Zotero.Styles.init(). The translator list is generated in
the prefs window and passed into the Quick Copy site editor, but it's
possible the translators API should be changed to make getTranslators()
synchronous with a prior init() as well.
- Simplified schema
- Tags are now added without reloading entire tag selector
- On my system, adding 400 tags to an item (separately, with the tag
selector updating each time) went from 59 seconds to 42. (Given that
it takes only 13 seconds with the tag selector closed, though,
there's clearly more work to be done.)
- Tag selector now uses HTML flexbox (in identical fashion, for now, but
with the possibility of fancier changes later, and with streamlined
logic thanks to the flexbox 'order' property)
- Various async fixes
- Tests
Get rid of data_access.js, at long last. Existing calls to
Zotero.getCollections() will need to be replaced with
Zotero.Collections.getByLibrary() or .getByParent().
Also removes Zotero.Collection::getCollections(), which is redundant
with Zotero.Collections.getByLibrary(), and Zotero.Collections.add().
The latter didn't didn't include a libraryID anyway, so code might as
well just use 'new Zotero.Collection' instead.
Groups were already being loaded for the collections list, so we might
as well just store them initially and let Zotero.Libraries.getName() be
a synchronous call.
'b' for *b*undled files
Translators and styles take a long time to install and initialize in
source installations, and they're unnecessary for many tests.
This shaves about 10 seconds off each test run for me on one system (and
that's with some help from filesystem caching).
0 allowed for some accidental behavior due to old code expecting NULL,
and it prevented easy checks (``if (!libraryID)``) for a passed
libraryID. Code now uses Zotero.Libraries.userLibraryID instead of a
hard-coded value (except in schema.js). Functions can still make
libraryID optional, but they should then use
Zotero.Libraries.userLibraryID if that's to mean the user library.
There might be some code that still expects 0 that I missed.
- Protocol handler extensions can now handle promises and can also make
data available as it's ready instead of all at once (e.g., reports now
output one entry at a time)
- zotero:// URL syntaxes are now more consistent and closer to the web
API (old URLs should work, but some may currently be broken)
Also:
- Code to generate server API, currently available for testing via
zotero://data URLs but eventually moving to HTTP -- zotero://data URLs match
web API URLs, with a different prefix for the personal library (/library vs.
/users/12345)
- Miscellaneous fixes to data objects
Under the hood:
- Extensions now return an AsyncChannel, which is an nsIChannel implementation
that takes a promise-yielding generator that returns a string,
nsIAsyncInputStream, or file that will be used for the channel's data
- New function Zotero.Utilities.Internal.getAsyncInputStream() takes a
generator that yields either promises or strings and returns an async input
stream filled with the yielded strings
- Zotero.Router parsers URLs and extract parameters
- Zotero.Item.toResponseJSON()
This required doing additional caching at startup (e.g., item types and fields)
so that various methods can remain synchronous.
This lets us switch back to using the current Sqlite.jsm. Previously we were
bundling the Fx24 version, which avoided freezes with locking_mode=EXCLUSIVE
with both sync and async queries.
Known broken things:
- Autocomplete
- Database backup
- UDFs (e.g., REGEXP function used in Zotero.DB.getNextName())
Promise-based rewrite of most of the codebase, with asynchronous database and file access -- see https://github.com/zotero/zotero/issues/518 for details.
WARNING: This includes backwards-incompatible schema changes.
An incomplete list of other changes:
- Schema overhaul
- Replace main tables with new versions with updated schema
- Enable real foreign key support and remove previous triggers
- Don't use NULLs for local libraryID, which broke the UNIQUE index
preventing object key duplication. All code (Zotero and third-party)
using NULL for the local library will need to be updated to use 0
instead (already done for Zotero code)
- Add 'compatibility' DB version that can be incremented manually to break DB
compatibility with previous versions. 'userdata' upgrades will no longer
automatically break compatibility.
- Demote creators and tags from first-class objects to item properties
- New API syncing properties
- 'synced'/'version' properties to data objects
- 'etag' to groups
- 'version' to libraries
- Create Zotero.DataObject that other objects inherit from
- Consolidate data object loading into Zotero.DataObjects
- Change object reloading so that only the loaded and changed parts of objects are reloaded, instead of reloading all data from the database (with some exceptions, including item primary data)
- Items and collections now have .parentItem and .parentKey properties, replacing item.getSource() and item.getSourceKey()
- New function Zotero.serial(fn), to wrap an async function such that all calls are run serially
- New function Zotero.Utilities.Internal.forEachChunkAsync(arr, chunkSize, func)
- Add tag selector loading message
- Various API and name changes, since everything was breaking anyway
Known broken things:
- Syncing (will be completely rewritten for API syncing)
- Translation architecture (needs promise-based rewrite)
- Duplicates view
- DB integrity check (from schema changes)
- Dragging (may be difficult to fix)
Lots of other big and little things are certainly broken, particularly with the UI, which can be affected by async code in all sorts of subtle ways.
Cmd on OS X, Shift on Windows/Linux
How do I not get to close a ticket for this?
Unfortunately on Windows it doesn't seem possible to set the cursor
effect to arbitrary states (see note in libraryTreeView.js::
_setDropEffect() for the gory details), so this just uses the default
cursor there. On OS X and Linux the cursor reflects the requested
action.
This should make it much easier to debug startup errors, particularly in
Standalone.
This also adds a general mechanism to set Zotero initialization options via
command-line flags.
- Close Zotero pane before database is closed prior to reload, instead
of waiting until reload is complete
- Show an error message if Zotero Standalone is not accessible when it
should be
Cmd on OS X, Shift on Windows/Linux
How do I not get to close a ticket for this?
Unfortunately on Windows it doesn't seem possible to set the cursor
effect to arbitrary states (see note in libraryTreeView.js::
_setDropEffect() for the gory details), so this just uses the default
cursor there. On OS X and Linux the cursor reflects the requested
action.
Other changes:
- Factored out Zotero.Translators from Zotero.Translator. The latter
should be usable in the bookmarklet and connectors without changes.
- configOptions, displayOptions, and hiddenPrefs no longer copy on
read. I don't think this actually affects any existing code.
- Zotero.Translate._loadTranslator() now returns a promise
This change should improve Firefox startup time. If the Zotero pane is
opened before Zotero has initialized, the pane will be blocked with a
progress bar until it's done. Standalone currently does the same, but it
should be changed to just delay opening the window if possible.
The upgrade wizard has been disabled for schema upgrades, in the hope
that upgrades can be done in a way that won't break compatibility with
earlier versions (or that can at least be done in a way such that we can
put out point releases of the last major version that provide
compatibility during beta/post-upgrade periods).
This patch likely breaks many things, and definitely breaks connector
mode.
This change also removes upgrade steps for databases from Zotero 2.1b2
and earlier. Users with such databases will need to upgrade via Zotero
4.0.x first or delete their data directories and start anew. This should
only affect users who haven't opened Zotero since Nov. 2010.
- New tag colors support, with the ability to assign colors to up to 6
tags per library. Tags with colors assigned will show up at the top of
the tag selector and can be added to (and removed from) selected items
by pressing the 1-6 keys on the keyboard. The tags will show up as
color swatches before an item's title in the items list.
- Synced settings, with Notifier triggers when they change and
accessible via the API (currently restricted on the server to
'tagColors', but available for other things upon request)
- Silent DB upgrades for backwards-compatible changes. We'll do
something fancier with async DB queries in 4.0, but this will work for
changes that can be made without breaking compatibility with older
clients, like the creation of new tables. The 'userdata' value is
capped at 76, while further increments go to 'userdata2'.
TODO:
- Try to avoid jitter when redrawing swatches
- Optimize tag color images for retina displays
- Redo attachment dots in flat style?
- Clear all colors from an item with 0 (as in Thunderbird), but I don't
think we can do this without undo
- New promise-based architecture
- Library-specific file sync queues, allowing other libraries to
continue if there's an error in one library
- Library-specific sync errors, with error icons next to each library
- Changed file uploading in on-demand download mode, which had been missing
- On-demand download progress indicator in middle pane
- More accurate progress indicator
- Various tweaks and bug fixes
- Various future tweaks and bug fixes
Adds three new preferences controlling whether a new collection is created upon import:
- extensions.zotero.import.createNewCollection.fromFile
Controls whether a new collection is created by import from cog menu
- extensions.zotero.import.createNewCollection.fromClipboard
Controls whether a new collection is created by import from clipboard
- extensions.zotero.import.createNewCollection.fromFileOpenHandler
Controls whether a new collection is created when import is initiated by
double-clicking a file or by dragging it over Zotero. This preference applies only to
Zotero Standalone, and is configurable in the dialog that appears asking the user
to confirm the import.
Closes#19, [papercuts] (26) Clipboard import into an open collection
Can choose to download files "at sync time" or "as needed"
On-demand defaults to on, but remains off for existing users
To-do:
- Handling of local and remote file changes on on-demand download
(currently if a file exists it isn't downloaded, which means a
remotely modified file won't be redownloaded in on-demand mode)
- Additional control over file downloading and retention
Other changes:
- Overhauled entire file syncing architecture
- Replaced numAttachments column with Note and Attachment columns with
dynamic icons to indicate status
- Double-clicking a parent with a missing best attachment and on-demand
downloading off no longer loads the parent URL
- Bugs
- Adds a per-library "Duplicate Items" virtual search to the source list -- shows up by default for "My Library" but can be added to and removed from all libraries
- Current matching algorithm is very basic: finds exact title matches (after normalizing case/diacritics/punctuation/spacing) and DOI/ISBN matches (untested)
- In duplicates view, sets are selected automatically; in other views, duplicate items can be selected manually and the merge interface can be brought up with "Merge Items" in the context menu
- Can select a master item and individual fields to merge from other versions
- Word processor integration code will automatically find mapped replacements and update documents with new item keys
Possible future improvements:
- Improved detection algorithms
- UI tweaks
- Currently if any items differ, all available versions will be shown as master item options, even if only one item is different; probably the earliest equivalent item should be shown for each distinct version
- Caching of results for performance
- Confidence scale
- Creator version selection (currently the creators from the chosen master item are kept)
- Merging of matching child items
- Better sorting of duplicates if not clustered together by the selected sort column
- Relation path compression when merging items that are already mapped to previously removed duplicates
Other changes in this commit:
- Don't show Trash in word processor integration windows
- Consider items in trash to be missing in word processor documents
- Selection of special views (Trash, Unfiled, Duplicates) is now restored properly in new windows
- Disabled field transform context menu when item isn't editable
- Left/right arrow now expands/collapses all selected items instead of just the last-selected row
- Relation deletions are now synced
- The same items row is now reselected after item deletion
- (dev) Zotero.Item.getNotes(), Zotero.Item.getAttachments(), and Zotero.Item.getTags() now return empty arrays rather than FALSE if no matches -- tests on those return values in third-party code will need to be changed
- (dev) New function Zotero.Utilities.removeDiacritics(str, lowercaseOnly) -- could be used to generate ASCII BibTeX keys
- (dev) New 'tempTable' search condition can take a table to join against -- useful for implementing virtual source lists
- (dev) Significant UI code cleanup
- (dev) Moved all item pane content into itemPane.xul
- Probably various other things
Needless to say, this needs testing.
- Closes#1831, Connectors should be able to save via API in the absence of Zotero Standalone
- Fixes Zotero.Utilities.deepCopy() for arrays
- Fixes some circumstances where an error would not be saved for future error reporting
- Fixes connector status checking
- Implement connector for Firefox (should switch in/out of connector mode automatically when Standalone is launched or closed, although this has only been tested extensively on OS X)
- Share core translation code between Zotero and connectors
Still to be done:
- Run translators in non-Fx connectors (this works in theory, but it's not currently enabled for any translators)
- Show translation results in non-Fx connectors
- Ability to translate to server when Zotero Standalone is not running
From 1.0.82:
Fixes to cs:if and cs:else-if logic. Was failing to set the match
attribute on singleton nodes, among other things.
Treat greek letters as part of the "romanesque" character set, for
name formatting purposes.
If cs:number input has numbers separated by spaces, but also
extraneous non-comma, non-ampersand, non-hyphen characters, then
render as literal.
From 1.0.83:
Use en-dash rather than hyphen in formatted page range joins.
From 1.0.84:
Preserve explicit double spaces in HTML output.
From 1.0.85:
Pass full suite of double space suppression tests.
Language switching within a style works, for both localized dates and
terms. (This functionality is not yet valid in CSL.)
From 1.0.86:
Address bug that caused unreasonable delays with large namesets.
From 1.0.87:
Where et-al-min, et-al-subsequent-min or names-min are set to some
value larger than zero, and the number of names in a nameset exceeds
an arbitrary threshold of 50, truncate author lists that are longer
than the *-min value (plus a buffer value of 2), to raise performance
to a reasonable level with large namesets.
From 1.0.88:
Adjustments to language condition logic, to use only the base language
name for matching, while applying the full locale specified in the
conditional to node children.
Refinements to the recently introduced handling of brace-enclosed
leading characters as initialized form of names.
Adjustments to print statement arbitration, with a view to playing
nice with various systems.
- Translators now use configOptions and displayOptions properties in their metadata instead of Zotero.configure() and Zotero.addOption() to specify interfaces.
- Replaced now broken Zotero.Utilities.inArray() appearances in MODS.js with proper indexOf() calls
- Zotero.Utilities is now a singleton
- Zotero.Utilities.HTTP is now Zotero.HTTP
- Zotero.Utilities.md5 and Zotero.Utilities.Base64 are now located under Zotero.Utilities.Internal
- Zotero.Utilities.AutoComplete has been eliminated
This needs testing to make sure there is no associated breakage.
closes#1650: suppress author does not work for multiple sources
closes#1505: Edit Biblography Button Strips Year Disambiguation
closes#1503: Editing a bibliography resets all reference numbers to 1 (new)
closes#1262: Broken pluralization with et al. + other issues
closes#1238: Localize quotation marks
closes#1191: Harmonize 'plural/pluralize' label attribute with CSL schema
closes#1154: Only one works page numbers are added to the citation are when citing multiple works by the same author
closes#1097: Disambiguation issues
closes#1083: Defect in IEEE CSL with Multiple Citations
closes#993: more sophisticated subsequent-author-substitute
closes#833: text-transform doesn't work with name
- Fix search for January dates
- Support 'yesterday'/'today'/'tomorrow' and localized equivalents (case-insensitive) in date searches (e.g., [Date Added] [is] ['yesterday'])
- Group file sync via Zotero File Storage
- Split file syncing into separate modules for ZFS and WebDAV
- Dragging items between libraries copies child notes, snapshots/files, and links based on checkboxes for each (enabled by default) in the Zotero preferences
- Sync errors now trigger an exclamation/error icon separate from the sync icon, with a popup window displaying the error and an option to report it
- Various errors that could cause perpetual sync icon spinning now stop the sync properly
- Zotero.Utilities.md5(str) is now md5(strOrFile, base64)
- doPost(), doHead(), and retrieveSource() now takes a headers parameter instead of requestContentType
- doHead() can now accept an nsIURI (with login credentials), is a background request, and isn't cached
- When library access or file writing access is denied during sync, display a warning and then reset local group to server version
- Perform additional steps (e.g., removing local groups) when switching sync users to prevent errors
- Compare hash as well as mod time when checking for modified local files
- Don't trigger notifications when removing groups from the client
- Clear relation links to items in removed groups
- Zotero.Item.attachmentHash property to get file MD5
- importFromFile() now takes libraryID as a third parameter
- Zotero.Attachments.getNumFiles() returns the number of files in the attachment directory
- Zotero.Attachments.copyAttachmentToLibrary() copies an attachment item, including files, to another library
- Removed Zotero.File.getFileHash() in favor of updated Zotero.Utilities.md5()
- Zotero.File.copyDirectory(dir, newDir) copies all files from dir into newDir
- Preferences shuffling: OpenURL to Advanced, import/export character set options to Export, "Include URLs of paper articles in references" to Styles
- Other stuff I don't remember
Suffice it to say, this could use testing.
Closes#884, final period missing when a citation is first added in note styles
Closes#1298, issues with footnotes and citations in OOo
Closes#1069, Use async HTTP calls for integration requests
Closes#1027, User-customizable integration port number
Closes#698, Migration away from VBA
Closes#1085, Migrate VBA plug-in to new XML-based API
Closes#792, Auto-updating of OO plugins
- dropping Zotero items into a bucket puts them in that IA bucket
- double clicking a bucket takes you to that IA bucket
In order to enable Zotero Commons:
1) Get an access key and secret key at http://www.archive.org/account/s3.php
2) Go to about:config
3) Search "commons" (no quotes)
4) Set "extensions.zotero.commons.enabled" to true
5) Enter your S3 access key into "extensions.zotero.commons.accessKey"
6) Enter your S3 secret key into "extensions.zotero.commons.secretKey"
7) Enter your buckets into "extensions.zotero.commons.buckets" as a comma separated list
Note: Steps 4-7 take effect in new windows
Known issue: Window title isn't set correctly
Says Ben:
===========================================
Some uses for this feature include:
- people who don't want to use Firefox as their main browser could set their Firefox homepage to zotero://fullscreen, making Zotero act as a standalone application
- people who do use Firefox could add the link to their Bookmarks Toolbar, and simply Shift-click it to have a Zotero "standalone application"
===========================================
- Add Zotero.Error(message, error) constructor to create a throwable error object with an error code
- Allow only one automatic client reset between manual syncs
- Fix "Source item for keyed source doesn't exist in Zotero.Item.getSource()" error
- Object produced by item.serialize() now contains .sourceItemKey instead of .sourceItemID
- Better error logging for missing XPCOM files
- Support for group libraries
- General support for multiple libraries of different types
- Streamlined sync support
- Using solely libraryID and key rather than itemID, and removed all itemID-changing code
- Combined two requests for increased performance and decreased server load
- Added warning on user account change
- Provide explicit error message on SSL failure
- Removed snapshot and link toolbar buttons and changed browser context menu options and drags to create parent items + snapshots
- Closes#786, Add numPages field
- Fixes#1063, Duplicate item with tags broken in Sync Preview
- Added better purging of deleted tags
- Added local user key before first sync
- Add clientDateModified to all objects for more flexibility in syncing
- Added new triples-based Relation object type, currently used to store links between items copied between local and group libraries
- Updated zotero.org translator for groups
- Additional trigger-based consistency checks
- Fixed broken URL drag in Firefox 3.5
- Disabled zeroconf menu option (no longer functional)
Developer-specific changes:
- Overhauled data layer
- Data object constructors no longer take arguments (return to 1.0-like API)
- Existing objects can be retrieved by setting id or library/key properties
- id/library/key must be set for new objects before other fields
- New methods:
- ZoteroPane.getSelectedLibraryID()
- ZoteroPane.getSelectedGroup(asID)
- ZoteroPane.addItemFromDocument(doc, itemType, saveSnapshot)
- ZoteroPane.addItemFromURL(url, itemType)
- ZoteroPane.canEdit()
- Zotero.CollectionTreeView.selectLibrary(libraryID)
- New Zotero.URI methods
- Changed methods
- Many data object methods now take a libraryID
- ZoteroPane.addAttachmentFromPage(link, itemID)
- Removed saveItem and saveAttachments parameters from Zotero.Translate constructor
- translate() now takes a libraryID, null for local library, or false to not save items (previously on constructor)
- saveAttachments is now a translate() parameter
- Zotero.flattenArguments() better handles passed objects
- Zotero.File.getFileHash() (not currently used)
- allow deletion of multiple styles simultaneously
- split Zotero.Styles/Zotero.Style and Zotero.CSL into style.js and csl.js respectively
- add Zotero.File.getBinaryContents for binary-safe file reading
- add Zotero.MIMETypeHandler to provide a unified interface for registering observers and capturing MIME types with Zotero
- Still experimental and incomplete, with no lock support and not much error handling
Also:
- New expiry date for sync functions
- Attachment character set was being dropped during syncing
- Possibly improves sizing issues with preferences window
- Fixes problems with attachment filenames with extended characters
- Fixes some problem with tags that I don't remember
- Makes XMLHTTPRequest calls are now background requests (no auth windows or other prompts)
- Z.U.HTTP.doOptions() now takes an nsIURI instead of a URL spec
- New methods:
- Zotero.Utilities.rand(min, max)
- Zotero.Utilities.probability(x)
- Zotero.Utilities.Base64.encode(str) and decode(str)
- Zotero.getTempDirectory()
- Zotero.Date.dateToISO(date) - convert JS Date object to ISO 8601 UTC date/time
- Zotero.Date.isoToDate(isoDate) - convert an ISO 8601 UTC date/time to a JS Date object
closes#831, transparent EZProxy support
adds a proxy pane to the preferences
asks before saving proxies to the DB (to avoid the potential phishing risk #831 would otherwise pose)
closes#704, EndNote to Zotero style converter (won't actually convert styles due to copyright concerns, but will load them into the DB)
also adds CSL style manager
Also fixed to not display tags twice if both manual and automatic and to not display automatic tags if manual versions of the same tags are already linked to the current item
- Inspired by Dan Chudnov's Python/MODS-based Zeroconf demo at THATcamp
- Enabled by extensions.zotero.zeroconf.enabled (off by default)
- Currently supports only OS X (tested on Leopard, not sure about earlier versions)
- Uses Apple's dns-sd and mDNS command-client clients, but should be able to be extended to other clients, though a native library would be far superior
- Discovery is on-demand for now via Actions menu ("Search for Shared Libraries")
- Includes rudimentary web server (code copied from integration.js) that serves items as sync XML -- no authentication yet!
- Only supports top-level items
- Remote libraries show up in left pane (under remote computer name, for now)
- Items can be dragged into collections (but not the library yet, for some reason)
- On first run, might cause a long pause and the "This file was downloaded from the Internet" message on Leopard -- can't manage to get around the quarantine for the script file that we need to access stdout from Firefox
- Needs a lot of work, and without a real JS (or otherwise Mozilla-native) Zeroconf library we can't do proper discovery without intermittent polling
- But it works, at least for me
Also includes some data/sync-layer changes that I needed along the way (and that we'll need for shared collections of any type)
Apologies for the massive (and, due to data_access.js splitting, difficult-to-follow) commit. Please note that external code that accesses the data layer may need to be tweaked for compatibility. Here's a comprehensive-as-possible changelog:
- Added server sync functionality (incomplete)
- Overhaul of data layer
- Split data_access.js into separate files (item.js, items.js, creator.js, etc.)
- Made creators and collections first-class objects, similar to items
- Constructors now take id as first parameter, e.g. new Zotero.Item(1234, 'book'), to allow explicit id setting and id changing
- Made various data layer operations (including attachment fields) require a save() rather than making direct DB changes
- Better handling of unsaved objects
- Item.setCreator() now takes creator objects instead of creator ids, and Item.save() will auto-save unsaved creators
- clone() now works on unsaved objects
- Newly created object instances are now disabled after save() to force refetch of globally accessible instance using Zotero.(Items|Creators|etc.).get()
- Added secondary lookup key to data objects
- Deprecated getID() and getItemType() methods in favor of .id and .itemTypeID properties
- toArray() deprecated in favor of serialize(), which has a somewhat modified format
- Added support for multiple creators with identical data -- currently unimplemented in interface and most of data layer
- Added Item.diff() for comparing item metadata
- Database changes
- Added SQLite triggers to enforce foreign key constraints
- Added Zotero.DB.transactionVacuum flag to run a VACUUM after a transaction
- Added Zotero.DB.transactionDate, .transactionDateTime, and transactionTimestamp to retrieve consistent timestamps for entire transaction
- Properly store 64-bit integers
- Set PRAGMA locking_mode=EXCLUSIVE on database
- Set SQLite page size to 4096 on new databases
- Set SQLite page cache to 8MB
- Do some database cleanup and integrity checking on migration from 1.0 branch
- Removed IF NOT EXISTS from userdata.sql CREATE statements -- userdata.sql is now processed only on DB initialization
- Removed itemNoteTitles table and moved titles into itemNotes
- Abstracted metadata edit box and note box into flexible XBL bindings with various modes, including read-only states
- Massive speed-up of item tree view
- Several fixes from 1.0 branch for Fx3 compatibility
- Added Notifier observer to log delete events for syncing
- Zotero.Utilities changes
- New methods getSQLDataType() and md5()
- Removed onError from Zotero.Utilities.HTTP.doGet()
- Don't display more than 1024 characters in doPost() debug output
- Don't display passwords in doPost() debug output
- Added Zotero.Notifier.untrigger() -- currently unused
- Added Zotero.reloadDataObjects() to reset all in-memory objects
- Added |chars| parameter to Zotero.randomString(len, chars)
- Added Zotero.Date.getUnixTimestamp() and Date.toUnixTimestamp(JSDate)
- Adjusted zotero-service.js to simplify file inclusion
Various things (such as tags) are temporarily broken.
Including in the DB, which it turns out isn't really all that bad (thanks, among other things, to SQLite's ability to DROP tables within transactions without autocommitting (which MySQL can't do))
In other words, show both "Shakespeare" and "Shakespeare, William" in the drop-down, and if the latter is chosen, save both fields
One issue is that since the autocomplete is by default limited to the width of the textbox, longer entries get truncated (though you can see them with a mouseover), and that may not be easy to fix.
Changed "Scholar" to "Zotero", everywhere
Apologies to anyone with working copy changes, but there are probably the fewer at this moment than there will be again.
Hopefully this won't break anything, though existing prefs will be lost. I avoided scholar.google.com--if you know any other legitimate "scholar"s in the code, be sure to fix them once I'm done here.
This is a multi-commit change--there's at least one more coming. *Do not update to this version! It won't work!*
Differentiates between single and double fields for the search, but there's a problem in the current implementation in that only one field is editable at once, so displaying two-field names in a drop-down is a little problematic. While I could display the full names, comma-delimited, and get the discrete parts (which is what Scholar.Utilities.AutoComplete.getResultComment(), included in this commit, is for--the creatorID for the row would be hidden in the autocomplete drop-down comment field), it's a bit unclear what should happen when a user selects a comma-separated name from the drop-down of one of the fields. One option would be to have a row for the last name (in case that's all they want to complete) and other rows for "last, first" matches, and selecting one of the two-part names would replace whatever's in the opposite name field with the appropriate text (and save it to the DB, I'm afraid, unless I change how the creator fields work), keeping the focus in the current textbox for easy tabbing. Not great, but it might work.
Other ideas?
Example usage:
var windowWatcher = Components.classes["@mozilla.org/embedcomp/window-watcher;1"].
getService(Components.interfaces.nsIWindowWatcher);
var progress = new Scholar.ProgressWindow(windowWatcher.activeWindow);
progress.changeHeadline('Indexing item...');
progress.addLines(['All About Foo'], ['chrome://scholar/skin/treeitem-book.png']);
progress.addDescription('Bar bar bar bar bar');
progress.show();
progress.fade();
There are currently two types of fulltext searching: an SQL-based word index and a file scanner. They each have their advantages and drawbacks.
The word index is very fast to search and is currently used for the find-as-you-type quicksearch. However, indexing files takes some time, so we should probably offer a preference to turn it off ("Index attachment content for quicksearch" or something). There's also an issue with Chinese characters (which are indexed by character rather than word, since there are no spaces to go by, so a search for a word with common characters could produce erroneous results). The quicksearch doesn't use a left-bound index (since that would probably upset German speakers searching for "musik" in "nachtmusik," though I don't know for sure how they think of words) but still seems pretty fast.
* Note: There will be a potentially long delay when you start Firefox with this revision as it builds a fulltext word index of your existing items. We obviously need a notification/option for this. *
The file scanner, used in the Attachment Content condition of the search dialog, offers phrase searching as well as regex support (both case-sensitive and not, and defaulting to multiline). It doesn't require an index, though it should probably be optimized to use the word index, if available, for narrowing the results when not in regex mode. (It does only scan files that pass all the other search conditions, which speeds it up considerably for multi-condition searches, and skips non-text files unless instructed otherwise, but it's still relatively slow.)
Both convert HTML to text before searching (with the exception of the binary file scanning mode).
There are some issues with which files get indexed and which don't that we can't do much about and that will probably confuse users immensely. Dan C. suggested some sort of indicator (say, a green dot) to show which files are indexed.
Also added (very ugly) charset detection (anybody want to figure out getCharsetFromString(str)?), a setTimeout() replacement in the XPCOM service, an arrayToHash() method, and a new header to timedtextarea.xml, since it's really not copyright CHNM (it's really just a few lines off from the toolkit timed-textbox binding--I tried to change it to extend timed-textbox and just ignore Return keypress events so that we didn't need to duplicate the Mozilla code, but timed-textbox's reliance on html:input instead of html:textarea made things rather difficult).
To do:
- Pref/buttons to disable/clear/rebuild fulltext index
- Hidden prefs to set maximum file size to index/scan
- Don't index words of fewer than 3 non-Asian characters
- MRU cache for saved searches
- Use word index if available to narrow search scope of fulltext scanner
- Cache attachment info methods
- Show content excerpt in search results (at least in advanced search window, when it exists)
- Notification window (a la scraping) to show when indexing
- Indicator of indexed status
- Context menu option to index
- Indicator that a file scanning search is in progress, if possible
- Find other ways to make it index the NYT front page in under 10 seconds
- Probably fix lots of bugs, which you will likely start telling me about...now.
Addresses #260, Add auto-complete to search window
- New XPCOM autocomplete component for Zotero data -- can be used by setting the autocompletesearch attribute of a textbox to 'zotero' and passing a search scope with the autocompletesearchparam attribute. Additional parameters can be passed by appending them to the autocompletesearchparam value with a '/', e.g. 'tag/2732' (to exclude tags that show up in item 2732)
- Tag entry now uses more or less the same interface as metadata -- no more popup window -- note that tab isn't working properly yet, and there's no way to quickly enter multiple tags (though it's now considerably quicker than it was before)
- Autocomplete for tags, excluding any tags already set for the current item
- Standalone note windows now register with the Notifier (since tags needed item modification notifications to work properly), which will help with #282, "Notes opened in separate windows need item notification"
- Tags are now retrieved in alphabetical order
- Scholar.Item.replaceTag(oldTagID, newTag), with a single notify
- Scholar.getAncestorByTagName(elem, tagName) -- walk up the DOM tree from an element until an element with the specified tag name is found (also checks with 'xul:' prefix, for use in XBL), or false if not found -- probably shouldn't be used too widely, since it's doing string comparisons, but better than specifying, say, nine '.parentNode' properties, and makes for more resilient code
A few notes:
- Autocomplete in Minefield seems to self-destruct after using it in the same field a few times, taking down saving of the field with it -- this may or may not be my fault, but it makes Zotero more or less unusable in 3.0 at the moment. Sorry. (I use 3.0 myself for development, so I'll work on it.)
- This would have been much, much easier if having an autocomplete textbox (which uses an XBL-generated popup for the suggestions) within a popup (as it is in the independent note edit panes) didn't introduce all sorts of crazy bugs that had to be defeated with annoying hackery -- one side effect of this is that at the moment you can't close the tags popup with the Escape key
- Independent note windows now need to pull in itemPane.js to function properly, which is a bit messy and not ideal, but less messy and more ideal than duplicating all the dual-state editor and tabindex logic would be
- Hitting tab in a tag field not only doesn't work but also breaks things until the next window refresh.
- There are undoubtedly other bugs.
- About panel now gets version number automatically
- Change version from 1.0a1 to 1.0b1
* Important: If you're on an SVN install, you need to rename the scholar@chnm.gmu.edu text file in your profile extension directory to zotero@chnm.gmu.edu *
XPI installs will (I think) update automatically, since I kept an entry in updates.rdf with the old GUID
not yet implemented:
- formatted in-text citations, rather than placeholders
- footnotes
- selection of citation style (for now, only APA is available)
- support for non-ASCII characters
- exclusion of notes from select items window
- Windows support (although it shouldn't be difficult)
- probably much more...
Attachments.importFromURL() now first does a HEAD request to get the MIME type and passes that through Scholar.MIME.hasInternalHandler() (now abstracted from Scholar.File, along with the other MIME functions) -- if it can handle the MIME type, it uses a hidden browser; otherwise, it use a remote web page persist to save the file directly
- Fixed bug in File.hasInternalHandler() (no access to navigator from XPCOM)
- Changed "View Attachment" action to check File.hasInternalHandler() and use window.loadURI() for internally handled files and nsIFile.launch() for external -- this prevents the user from getting a helper app dialog when they try to view external files. I basically had to duplicate most of Mozilla's content detection logic and "guess" whether or not it will be able to handle the file internally, which seems a little silly, but, while I feel there are probably better ways to do various parts of this, what's here seems to do the trick. Let me know if you notice it guessing incorrectly (i.e. you get a helper app dialog rather than having a file just open or it launches a file that should've just been loaded into the window). Also look for text files that should be launched rather than opened, especially XML-based data files, as this is a chance for Scholar to be smarter than Firefox itself--for example, OmniGraffle files, which are actually just XML files, normally open up in Firefox as an XML tree, but Scholar will launch them instead. (I imagine the same will need to be done for OmniOutliner, among other things...)
Implemented advanced/saved search architecture -- to use, you create a new search with var search = new Scholar.Search(), add conditions to it with addCondition(condition, operator, value), and run it with search(). The standard conditions with their respective operators can be retrieved with Scholar.SearchConditions.getStandardConditions(). Others are for special search flags and can be specified as follows (condition, operator, value):
'context', null, collectionIDToSearchWithin
'recursive', 'true'|'false' (as strings!--defaults to false if not specified, though, so should probably just be removed if not wanted), null
'joinMode', 'any'|'all', null
For standard conditions, currently only 'title' and the itemData fields are supported -- more coming soon.
Localized strings created for the standard search operators
API:
search.setName(name) -- must be called before save() on new searches
search.load(savedSearchID)
search.save() -- saves search to DB and returns a savedSearchID
search.addCondition(condition, operator, value)
search.updateCondition(searchConditionID, condition, operator, value)
search.removeCondition(searchConditionID)
search.getSearchCondition(searchConditionID) -- returns a specific search condition used in the search
search.getSearchConditions() -- returns search conditions used in the search
search.search() -- runs search and returns an array of item ids for results
search.getSQL() -- will be used by Dan for search-within-search
Scholar.Searches.getAll() -- returns an array of saved searches with 'id' and 'name', in alphabetical order
Scholar.Searches.erase(savedSearchID) -- deletes a given saved search from the DB
Scholar.SearchConditions.get(condition) -- get condition data (operators, etc.)
Scholar.SearchConditions.getStandardConditions() -- retrieve conditions for use in drop-down menu (as opposed to special search flags)
Scholar.SearchConditions.hasOperator() -- used by Dan for error-checking
add Scholar.Cite and Scholar.CSL for parsing items into a bibliography using CSL. unfortunately, the output is not very good at the moment, and the format likely needs some changes, but I'm working with a few other people on getting it to that point.
closes#100, migrate ingester to Scholar.Translate
closes#88, migrate scrapers away from RDF
closes#9, pull out LC subject heading tags
references #87, add fromArray() and toArray() methods to item objects
API changes:
all translation (import/export/web) now goes through Scholar.Translate
all Scholar-specific functions in scrapers start with "Scholar." rather than the jumbled up piggy bank un-namespaced confusion
scrapers now longer specify items through RDF (the beginning of an item.fromArray()-like function exists in Scholar.Translate.prototype._itemDone())
scrapers can be any combination of import, export, and web (type is the sum of 1/2/4 respectively)
scrapers now contain functions (doImport, doExport, doWeb) rather than loose code
scrapers can call functions in other scrapers or just call the function to translate itself
export accesses items item-by-item, rather than accepting a huge array of items
MARC functions are now in the MARC import translator, and accessed by the web translators
new features:
import now works
rudimentary RDF (unqualified dublin core only), RIS, and MARC import translators are implemented (although they are a little picky with respect to file extensions at the moment)
items appear as they are scraped
MARC import translator pulls out tags, although this seems to slow things down
no icon appears next to a the URL when Scholar hasn't detected metadata, since this seemed somewhat confusing
apologizes for the size of this diff. i figured if i was going to re-write the API, i might as well do it all at once and get everything working right.
- changes scrapers table to translators table; all import/export/web translators now belong in this table
- adds Scholar.Translate to handle translation issues. eventually, Scholar.Ingester.Document will become part of this interface
- adds Scholar_File_Interface (in fileInterface.js) to handle UI for export and eventually import. (David, when you have time, please connect Scholar_File_Interface.exportFile to a button.)
- adds an export translator for MODS. all of our metadata, but not our hierarchy (projects, etc.) translates directly and unambiguously into valid MODS. eventually, we can use RDF or another format to handle hierarchy.
- adds utilities.getVersion() and utilities.inArray() for simplified scraper coding
- fixes minor interface issues with the nifty chrome scraping status window