Commit graph

8 commits

Author SHA1 Message Date
Simon Kornblith
3d881eec13 - Make scrapers return standard ISO-style YYYY-MM-DD dates. Still need to work on journal article scrapers.
- Ingester lets callback function save items, rather than saving them itself.
- Better handling of multiple items in API, although no scrapers currently implement this.
2006-06-17 21:21:15 +00:00
Simon Kornblith
0753d78910 - Add VLTS scraper
- Fix loadDocument/processDocuments (broken by r145)
2006-06-06 21:35:23 +00:00
Simon Kornblith
152c9bf9e7 - Small changes to MARC record support
- Implemented loadDocument API, for loading and parsing the DOMs of HTML documents in the background
- Added scraper code to SVN repository (now includes 12 scrapers, see Writeboard for details)

To update to the latest versions of all scrapers, ensure you have an up-to-date version of sqlite3, then run:
sqlite3 ~/Library/Application\ Support/Firefox/Profiles/profileName/scholar.sqlite < scrapers.sql
2006-06-06 18:25:45 +00:00
Simon Kornblith
85d8153024 Add library, hooks for scraping MARC records. 2006-06-03 22:26:01 +00:00
Simon Kornblith
93652a137c Fix issues with asynchronous scraping and XMLHttpRequest 2006-06-02 23:53:42 +00:00
Simon Kornblith
bb57e6ba7d Provide visual feedback for scraping 2006-06-02 18:22:34 +00:00
Simon Kornblith
639a006efb XPCOM-ize ingester, fix swapped first and last name in ingested info, stop ingesting pages field (this should be for pages of the source used, not the total number of pages, right?) 2006-06-02 03:19:12 +00:00
Simon Kornblith
551582eb7e Still getting the hang of Subversion...the rest of the ingester code 2006-06-01 06:53:39 +00:00