zotero/chrome/chromeFiles/skin/default/scholar
Dan Stillman ab13c3980a Fulltext search support
There are currently two types of fulltext searching: an SQL-based word index and a file scanner. They each have their advantages and drawbacks.

The word index is very fast to search and is currently used for the find-as-you-type quicksearch. However, indexing files takes some time, so we should probably offer a preference to turn it off ("Index attachment content for quicksearch" or something). There's also an issue with Chinese characters: since there are no spaces to go by, they are indexed by character rather than by word, so a search for a word made up of common characters could produce erroneous results. The quicksearch doesn't use a left-bound index (that would probably upset German speakers searching for "musik" in "nachtmusik", though I don't know for sure how they think of words) but still seems pretty fast.
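
As a rough illustration of the two points above (per-character indexing of Chinese text and matching that is not left-bound), here is a minimal sketch. The names tokenizeForIndex and matchesIndex are hypothetical and are not taken from this revision; the real index lives in SQL, which this in-memory version only imitates.

```javascript
// Hypothetical sketch, not the indexer from this commit.
// Non-Han letters/digits are grouped into lowercased words;
// every Han ideograph becomes its own index term.
function tokenizeForIndex(text) {
  const pattern = /\p{Script=Han}|(?:(?!\p{Script=Han})[\p{L}\p{N}])+/gu;
  const matches = text.toLowerCase().match(pattern) || [];
  return [...new Set(matches)]; // unique terms, in first-seen order
}

// A query matches when each of its terms occurs anywhere inside some
// indexed term (substring match, i.e. not anchored to the left edge).
function matchesIndex(indexedTerms, query) {
  return tokenizeForIndex(query).every((q) =>
    indexedTerms.some((term) => term.includes(q))
  );
}

const german = tokenizeForIndex("Eine kleine Nachtmusik");
console.log(matchesIndex(german, "musik")); // true: found inside "nachtmusik"

// The text contains 音 and 乐 separately but never the word 音乐 ("music").
const chinese = tokenizeForIndex("音响很好，大家都很快乐");
console.log(matchesIndex(chinese, "音乐")); // true: the erroneous hit described above
```

The second query shows why common characters are a problem: the index only knows that both characters occur somewhere in the file, not that they are adjacent.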

* Note: There will be a potentially long delay when you start Firefox with this revision as it builds a fulltext word index of your existing items. We obviously need a notification/option for this. *

The file scanner, used in the Attachment Content condition of the search dialog, offers phrase searching as well as regex support (both case-sensitive and not, and defaulting to multiline). It doesn't require an index, though it should probably be optimized to use the word index, if available, for narrowing the results when not in regex mode. (It does only scan files that pass all the other search conditions, which speeds it up considerably for multi-condition searches, and skips non-text files unless instructed otherwise, but it's still relatively slow.)
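
The matching modes of the file scanner described above could be sketched roughly as follows. This assumes the file content has already been read into a string; contentMatches and its option names are hypothetical and are not the scanner's actual interface.

```javascript
// Hypothetical sketch of phrase vs. regex matching, not the scanner itself.

// Escape regex metacharacters so a phrase is matched literally.
function escapeRegExp(str) {
  return str.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// options.regex: treat the pattern as a regular expression (default: phrase)
// options.caseSensitive: match case exactly (default: case-insensitive)
function contentMatches(text, pattern, options = {}) {
  const { regex = false, caseSensitive = false } = options;
  const source = regex ? pattern : escapeRegExp(pattern);
  // "m" makes ^ and $ match at line boundaries (multiline by default).
  const flags = "m" + (caseSensitive ? "" : "i");
  return new RegExp(source, flags).test(text);
}

const text = "First line\nFulltext search support";
console.log(contentMatches(text, "fulltext search"));               // true
console.log(contentMatches(text, "^Fulltext", { regex: true }));    // true
console.log(contentMatches(text, "fulltext", { caseSensitive: true })); // false
```

Because each file has to be read and tested in full, running this only on items that already pass the other search conditions is what keeps multi-condition searches tolerable.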

Both convert HTML to text before searching (with the exception of the binary file scanning mode).

There are some issues with which files get indexed and which don't that we can't do much about and that will probably confuse users immensely. Dan C. suggested some sort of indicator (say, a green dot) to show which files are indexed.

Also added (very ugly) charset detection (anybody want to figure out getCharsetFromString(str)?), a setTimeout() replacement in the XPCOM service, an arrayToHash() method, and a new header to timedtextarea.xml, since it's really not copyright CHNM. It's just a few lines off from the toolkit timed-textbox binding. I tried to change it to extend timed-textbox and just ignore Return keypress events so that we didn't need to duplicate the Mozilla code, but timed-textbox's reliance on html:input instead of html:textarea made things rather difficult.
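
The commit message doesn't say what arrayToHash() does. One plausible reading, given here purely as an assumption, is a helper that turns a list into a lookup table so that membership checks (for example, "is this word already indexed?") avoid repeated array scans:

```javascript
// Assumed behavior only; the actual method in this revision may differ.
// Maps every array value to true so lookups are a single property access.
function arrayToHash(array) {
  const hash = Object.create(null); // no inherited keys such as "toString"
  for (const value of array) {
    hash[value] = true;
  }
  return hash;
}

const indexed = arrayToHash(["nachtmusik", "fulltext"]);
console.log(indexed["fulltext"] === true); // true
console.log("missing" in indexed);         // false
```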

To do:

- Pref/buttons to disable/clear/rebuild fulltext index
- Hidden prefs to set maximum file size to index/scan
- Don't index words of fewer than 3 non-Asian characters
- MRU cache for saved searches
- Use word index if available to narrow search scope of fulltext scanner
- Cache attachment info methods
- Show content excerpt in search results (at least in advanced search window, when it exists)
- Notification window (a la scraping) to show when indexing
- Indicator of indexed status
- Context menu option to index
- Indicator that a file scanning search is in progress, if possible
- Find other ways to make it index the NYT front page in under 10 seconds
- Probably fix lots of bugs, which you will likely start telling me about...now.
2006-09-21 00:10:29 +00:00
about.css Closes #12, credits panel. 2006-07-06 20:43:32 +00:00
addCitationDialog.css - closes #225, ability to cite a specific page/paragraph/etc in Word integration. the output isn't quite right at the moment, but the interface works. 2006-09-11 01:05:26 +00:00
citation-add-gray.png - closes #225, ability to cite a specific page/paragraph/etc in Word integration. the output isn't quite right at the moment, but the interface works. 2006-09-11 01:05:26 +00:00
citation-add.png - closes #225, ability to cite a specific page/paragraph/etc in Word integration. the output isn't quite right at the moment, but the interface works. 2006-09-11 01:05:26 +00:00
citation-delete-gray.png - closes #225, ability to cite a specific page/paragraph/etc in Word integration. the output isn't quite right at the moment, but the interface works. 2006-09-11 01:05:26 +00:00
citation-delete.png - closes #225, ability to cite a specific page/paragraph/etc in Word integration. the output isn't quite right at the moment, but the interface works. 2006-09-11 01:05:26 +00:00
cog.png A cog menu each for collections and items (the same as the contextual menu, for now) 2006-06-22 00:13:21 +00:00
item-attachments-add.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
overlay.css Closes #189, "extra" field should allow multiple lines 2006-09-12 08:47:24 +00:00
scholar.css Fulltext search support 2006-09-21 00:10:29 +00:00
search-cancel-active.png Closes #154, Clicking an item in Related should display that item. 2006-08-02 15:13:31 +00:00
search-cancel.png Closes #154, Clicking an item in Related should display that item. 2006-08-02 15:13:31 +00:00
tag.png Fixes #32, implement tags in tags tab 2006-07-05 13:09:58 +00:00
textfield-dual.png Closes #247, Add interface option for institutional creators in item edit pane 2006-09-08 22:46:49 +00:00
textfield-single.png Closes #247, Add interface option for institutional creators in item edit pane 2006-09-08 22:46:49 +00:00
toolbar-collection-add.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
toolbar-collection-edit-gray.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
toolbar-collection-edit.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
toolbar-fullscreen-bottom.png The fullscreen button is now an image. It looks different depending on whether the pane is on top or bottom, and changes background when activated. 2006-07-28 16:49:19 +00:00
toolbar-fullscreen-top.png The fullscreen button is now an image. It looks different depending on whether the pane is on top or bottom, and changes background when activated. 2006-07-28 16:49:19 +00:00
toolbar-item-add.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
toolbar-large-active.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
toolbar-large-disabled.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
toolbar-large.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
toolbar-note-add.png The fullscreen button is now an image. It looks different depending on whether the pane is on top or bottom, and changes background when activated. 2006-07-28 16:49:19 +00:00
toolbar-openurl.png Closes #172, add preference for EndNote MIME type stealing feature 2006-08-09 15:44:11 +00:00
toolbar-small-active.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
toolbar-small-disabled.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
toolbar-small.png Created Scholar toolbar button (use Customize Toolbar... option) 2006-07-26 16:42:26 +00:00
treeitem-artwork.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-attachment-file.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
treeitem-attachment-link.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
treeitem-attachment-snapshot.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
treeitem-attachment-web-link.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
treeitem-attachment.png Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects. 2006-08-12 00:18:20 +00:00
treeitem-book.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
treeitem-bookSection.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-film.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-interview.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-journalArticle.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
treeitem-letter.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-magazineArticle.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-manuscript.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-newspaperArticle.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-note.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-thesis.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treeitem-website.png closes #22, button in note pane for new note 2006-06-26 12:58:22 +00:00
treesource-collection.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
treesource-library.png Got some new icons for the lists 2006-06-20 17:08:30 +00:00
treesource-search.png Closes #172, add preference for EndNote MIME type stealing feature 2006-08-09 15:44:11 +00:00
zotero_logo_14px.png Replaced "Scholar is loaded" line with Zotero logo 2006-08-30 05:34:12 +00:00
zotero_logo_18px.png Closes #59, logos 2006-08-30 18:56:10 +00:00
zotero_logo_20px.png Transparent and larger Zotero status bar logo; fixed white side margins 2006-08-30 16:37:43 +00:00
zotero_logo_20px_2.png With the three different versions of Zotero, for comparison purposes (a gross misuse of SVN, I know) 2006-08-30 17:00:17 +00:00
zotero_logo_20px_3.png With the three different versions of Zotero, for comparison purposes (a gross misuse of SVN, I know) 2006-08-30 17:00:17 +00:00
zotero_z_32px.png Closes #59, logos 2006-08-30 18:56:10 +00:00