New methods:
Item.addTag(tag)
Item.getTags() -- array of tagIDs
Item.removeTag(tagID)
Tags.getName(tagID)
Tags.getID(tag)
Tags.add(text) -- returns tagID of new tag
Tags.purge() -- purge obsolete tags
The last two are for use by Item.addTag() and Item.removeTag(), respectively, and probably don't need to be used elsewhere.
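For example, tagging an item might look like this (a minimal sketch -- Scholar.Items.get() as the item lookup and the Scholar.Tags namespace are assumptions on my part):

    var item = Scholar.Items.get(2);              // assumed item lookup
    item.addTag('communication');                 // calls Tags.add() internally for new tags
    var tagIDs = item.getTags();                  // array of tagIDs
    for (var i = 0; i < tagIDs.length; i++) {
        Scholar.debug(Scholar.Tags.getName(tagIDs[i]));
    }
    item.removeTag(Scholar.Tags.getID('communication'));  // Tags.purge() cleans up afterward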
Item.save() changed to not call notify() if a transaction is in progress, which is totally going to trip someone up at some point, and probably me, but it's better than a new parameter
Item.toArray() implemented -- builds up a multidimensional array of item data, converting all type ids to their textual equivalents -- currently has empty placeholder arrays for tags and seeAlso
Sample source output:
'itemID' => "2"
'itemType' => "book"
'title' => "Computer-Mediated Communication: Human-to-Human Communication Across the Internet"
'dateAdded' => "2006-03-12 05:25:50"
'dateModified' => "2006-03-12 05:25:50"
'publisher' => "Allyn & Bacon Publishers"
'year' => "2002"
'pages' => "347"
'ISBN' => "0-205-32145-3"
'creators' ...
    '0' ...
        'firstName' => "Susan B."
        'lastName' => "Barnes"
        'creatorType' => "author"
'notes' ...
    '0' ...
        'note' => "text"
        'tags' ...
        'seeAlso' ...
    '1' ...
        'note' => "text"
        'tags' ...
        'seeAlso' ...
'tags' ...
'seeAlso' ...
Sample note output:
'itemID' => "17"
'itemType' => "note"
'dateAdded' => "2006-06-27 04:21:16"
'dateModified' => "2006-06-27 04:21:16"
'note' => "text"
'sourceItemID' => "2"
'tags' ...
'seeAlso' ...
sourceItemID won't exist if it's an independent note.
We'll use the same format in reverse for fromArray, so Simon, let me know if you need more data (preserving type ids, etc) or want anything in a different form.
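A rough consumption sketch (again assuming Scholar.Items.get() as the lookup):

    var arr = Scholar.Items.get(2).toArray();
    Scholar.debug(arr['itemType']);               // "book" -- already converted from the typeID
    for (var i = 0; i < arr['creators'].length; i++) {
        var creator = arr['creators'][i];
        Scholar.debug(creator['lastName'] + ' (' + creator['creatorType'] + ')');
    }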
- Added 'note' item type
- Updated API to support independent note creation
Notes are, more or less, just regular items, with an item type of 1. They're created through Scholar.Notes.add(text, sourceItemID), which returns the itemID of the new note. sourceItemID is optional--if left out, an independent note will be created. (There's currently nothing stopping you from doing getNewItemByType(1) yourself, but the note would be contentless and broken, so you shouldn't do that.) Note data could've been stuffed into itemData, but I kept it separate in itemNotes to keep metadata searching faster and to keep things cleaner.
Method calls that can be called on all items:
isNote() (same as testing for itemTypeID 1)
Method calls that can be called on source items only:
numNotes()
getNotes() (array of note itemIDs for a source)
Method calls that can be called on note items only:
updateNote(text)
setNoteSource(sourceItemID) (for changing source--use empty or false to make independent, which is currently what happens when you delete a note--will get option with #91)
getNote() (note content)
getNoteSource() (sourceItemID of a note)
Calling the above methods on the wrong item types will throw an error.
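Put together, the note lifecycle looks roughly like this (itemIDs are illustrative; Scholar.Items.get() is assumed as the lookup):

    var noteID = Scholar.Notes.add('Some text', 2);  // child note of item 2
    var note = Scholar.Items.get(noteID);
    note.isNote();                   // true
    note.getNote();                  // 'Some text'
    note.getNoteSource();            // 2
    note.updateNote('Revised text');
    note.setNoteSource(false);       // now independent

    var source = Scholar.Items.get(2);
    source.numNotes();               // child note count
    source.getNotes();               // array of note itemIDs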
*** This will break note creation/display until David updates interface code. ***
- Move processDocuments, a function for loading a DOM representation of a document or set of documents, to Scholar.Utilities.HTTP
- Add Scholar.Ingester.ingestURL, a simplified function to scrape a URL (closes #33)
- Added detection for network failure -- debug message is output and noNetwork property is added to the xmlhttp object
- Removed onStatus callback from HTTP.doGet and HTTP.doPost -- that was copied over from the Piggy Bank API, but the onDone callback has to handle errors anyway, so it can just check the status code if it actually cares to differentiate non-200 status codes from any other error
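So a caller that cares about the status code now checks it in onDone -- something like this, assuming onDone receives the xmlhttp object (which the noNetwork note above suggests):

    Scholar.Utilities.HTTP.doGet('http://www.example.com/feed', function(xmlhttp) {
        if (xmlhttp.noNetwork) {
            Scholar.debug('Browser is offline or the network is down');
            return;
        }
        if (xmlhttp.status != 200) {
            Scholar.debug('Request failed with status ' + xmlhttp.status);
            return;
        }
        // process xmlhttp.responseText / xmlhttp.responseXML here
    });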
- Added error handling for empty responseXML to Schema._updateScrapersRemoteCallback
- Renamed SCHOLAR_CONFIG['REPOSITORY_CHECK_RETRY'] to SCHOLAR_CONFIG['REPOSITORY_RETRY_INTERVAL']
Scholar.Prefs also registers itself as a preferences observer and can be used to trigger actions when certain prefs are changed by editing the switch statement in the observe() method
Updated preferences.js to use Scholar.Prefs
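Hooking a new pref is just another case in that switch -- the observer follows the standard Mozilla shape (pattern sketch only; the pref name below is illustrative):

    // Inside the Scholar.Prefs object
    observe: function(subject, topic, data) {
        if (topic != 'nsPref:changed') {
            return;
        }
        // data is the pref name relative to the extension's branch
        switch (data) {
            case 'automaticScraperUpdates':
                // react to the change here
                break;
        }
    }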
- Added methods getID(idOrName) and getName(idOrName) to Scholar.CreatorTypes and Scholar.ItemTypes to take either typeID or typeName
- Removed getTypeName() in each and changed references accordingly
- Streamlined both classes to be as similar as possible
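In other words, both directions now work through either method (the numeric IDs shown are illustrative):

    Scholar.ItemTypes.getID('book');        // -> typeID, e.g. 2
    Scholar.ItemTypes.getName(2);           // -> 'book'
    Scholar.CreatorTypes.getID('author');   // -> typeID, e.g. 1
    Scholar.CreatorTypes.getName(1);        // -> 'author'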
- Make Amazon scraper work with multiple documents
- Fix bugs in processDocuments
- Make Scholar.Ingester.Utilities.getItemArray() willing to take an array of DOM nodes to search for links, and finally take advantage of the fact that objects have no length
- Multiple item detection code is now a part of the scraperJavaScript, rather than the scrapeDetectCode, and code to choose which items to add is part of Scholar.Ingester.Utilities, accessible from inside scrapers. The alternative approach would result in one request (or, in the case of JSTOR, three requests) per new item, while in some cases (e.g. Voyager) only one request is necessary to get all of the items.
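Inside a scraper, that looks roughly like the following (the argument order of getItemArray() and the URL pattern are assumptions on my part):

    // getItemArray() returns an object mapping link URLs to link text,
    // so there's no .length -- iterate with for...in
    var items = Scholar.Ingester.Utilities.getItemArray(doc, doc, '^https?://www\\.example\\.com/item/');
    for (var url in items) {
        Scholar.debug(url + ' => ' + items[url]);
    }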
- When possible, corporate creators/contributors are categorized with their own RDF types (prefixDummy + "corporateCreator"/"corporateContributor")
- Remove extraneous debug code in extensions
- Don't try to display an SQLite error when it's "not an error" (i.e. when the error is in something else)
- Switch to nsIFile instead of nsILocalFile to retrieve the profile directory
Fix in Collection.erase() -- when the DB methods started returning values in their native type, the collection id became an int rather than a string and "new Array(this._id)" became a length declaration rather than an elements declaration
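For reference, the constructor behavior in question:

    new Array(5);      // length declaration: an empty array with length 5
    new Array('5');    // elements declaration: ['5']
    [5];               // always a one-element array, regardless of type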
- Ingester lets callback function save items, rather than saving them itself.
- Better handling of multiple items in API, although no scrapers currently implement this.
Added a separate retry interval so that the extension retries sooner after failures (browser offline, request failure, etc.)
Revision 200 -- w00t i am victorious
- Removed localLastUpdated field from scrapers table and renamed centralLastUpdated to lastUpdated; updated scraper queries accordingly
- Added query in scrapers.sql to update version table 'repository' row to prevent immediate downloads of newly installed scrapers
- Get version property from extension manager in Scholar.init() and assign to Scholar.version
Scholar.HTTP.doGet(url, onStatus, onDone) and Scholar.HTTP.doPost(url, body, onStatus, onDone) -- onStatus and onDone are callbacks to call on non-200 responses and the response body, respectively
Assigned guids to scrapers, replaced INSERT queries with REPLACE queries, and removed table DELETE query at top -- this will allow scrapers to be updated without deleting any others that may exist (e.g. that someone is developing, third-party, etc.)
scholar.properties addition is just to keep metadata panel from breaking and could probably be removed once interface is hard-coded to not display the notes field there
- Send a 'delete' rather than a 'remove' to itemViews when items are actually deleted (removals from collections still get 'remove' unless it's part of a collection erase)
Added creatorTypeID to the PK in itemCreators, because a creator could conceivably have two roles on an item
Throw an error on an attempt to save() an item with two identical creator/creatorTypeID combinations, since, assuming this shouldn't be allowed (i.e. we don't have a four-column PK), there's really no way to handle it elegantly on my end -- interface code can use the new method Item.creatorExists(firstName, lastName, creatorTypeID, skipIndex), with skipIndex set to the current index, to make sure this doesn't happen
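On the interface side, that check would look something like this (setCreator() with this signature is an assumption -- substitute whatever setter the editing code actually uses):

    var firstName = 'Susan B.', lastName = 'Barnes', creatorTypeID = 1;
    var currentIndex = 0;  // index of the creator row being edited
    if (!item.creatorExists(firstName, lastName, creatorTypeID, currentIndex)) {
        item.setCreator(currentIndex, firstName, lastName, creatorTypeID);
        item.save();
    }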
Integrated the scrapers with the schema update mechanism. Changed a bunch of schema methods to handle both schema.sql and scrapers.sql (or others, if need be) and altered the version table to track multiple versions for different files. This theoretically should detect that the version table has changed and force a reinitialization of the DB--let me know if there are problems.
- Broke schema functions into a separate object and got rid of the DB_VERSION config constant in favor of a toVersion variable in the _migrateSchema command (which isn't technically necessary either, since the version number at the top of schema.sql is now always compared to the DB version at startup, but will help reduce the chance that someone will update the schema file without adding migration steps)
- Removed Amazon scraper from schema.sql, as it will be loaded with the rest of the scrapers
- Lots of work to be done. For example, the present way of showing a textbox is sort of a hack - taking a label, assigning certain properties to a textbox, then removing the label and placing the textbox in its place. I will be looking into our options.
- Also, I need to figure out adding/editing/deleting creators.
- When the textbox loses focus, it updates and saves the item (like iCal).
Date: removed Scholar.Date functions, as they are overkill. We can use the built-in Date.toLocaleString()
- Implemented loadDocument API, for loading and parsing the DOMs of HTML documents in the background
- Added scraper code to SVN repository (now includes 12 scrapers, see Writeboard for details)
To update to the latest versions of all scrapers, ensure you have an up-to-date version of sqlite3, then run:
sqlite3 ~/Library/Application\ Support/Firefox/Profiles/profileName/scholar.sqlite < scrapers.sql
The real search function will be considerably more advanced/flexible, but this should work as a placeholder for the moment. Probably quick enough for FAYT, at least with a ~0.5 second delay to avoid unnecessary calls while people are typing (which is probably a good idea anyway). This search doesn't use indexes at all, so if more speed is needed, one option would be to maintain a manual FULLTEXT-type index (using triggers, ideally) that could be quickly searched, but we'd lose intra-word filtering, which people would probably expect...
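The delay itself is just the usual timer reset on each keystroke (sketch; runSearch() is a hypothetical hookup to the placeholder search above):

    var searchTimeout = null;
    function onSearchKeypress(textbox) {
        if (searchTimeout) {
            clearTimeout(searchTimeout);
        }
        searchTimeout = setTimeout(function() {
            runSearch(textbox.value);  // hypothetical hookup
        }, 500);
    }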
parentCollectionID is optional and defaults to moving the collection to the root
Returns true on success, false on an attempt to move a collection into its existing parent, itself, or a descendant collection; throws an exception on an invalid parentCollectionID
Sends a columnTree notify() with previous parent ID, id of collection itself, and new parent ID (unless the previous or new are the root, in which case it's omitted) -- that may or may not make sense for the interface code and can be changed if needed
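Usage would be along these lines (changeParent() is a stand-in name, since the method isn't named above; the ids are illustrative):

    var collection = Scholar.Collections.get(15);  // assumed collection lookup
    collection.changeParent(3);    // move under collection 3
    collection.changeParent();     // parentCollectionID omitted -- move to root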