193 commits

Author SHA1 Message Date
Dan Stillman
36af25b3e9 Cache file link mode 2006-08-09 16:25:28 +00:00
Dan Stillman
d7990b0e03 Updated Scholar.Files.linkFromURL() to take title and mime type as parameters, to prevent loading huge external PDFs just to get the content type when the ingester already knows it (though that will hopefully be alleviated by #173 and #174 later) 2006-08-09 06:32:16 +00:00
Dan Stillman
318cf3194f Addresses #171, Add more conditions to advanced search architecture
Added conditions 'tagID', 'tag' (text), 'creator' (concats first and last before comparing), and 'note'
2006-08-08 23:08:52 +00:00
Simon Kornblith
6efd6d2cc4 closes #99, add options for export 2006-08-08 23:00:33 +00:00
Simon Kornblith
af080fe384 allow EndNote MIME handler to be unregistered without restarting Firefox 2006-08-08 21:40:33 +00:00
Simon Kornblith
3edb6e0286 closes #86, steal EndNote download links
Scholar should now attempt to process citation information from EndNote download links (MIME types application/x-endnote-refer and application/x-research-info-systems). in situations where Scholar cannot process the information, a standard helper app dialog will appear. this behavior is controlled by the preference extensions.scholar.parseEndNoteMIMETypes.
2006-08-08 21:17:07 +00:00
Dan Stillman
9f57379415 Make searchConditionIDs a little easier to work with--now accessible with the .id property of search conditions in addition to by index 2006-08-08 15:40:42 +00:00
Dan Stillman
f8739ee6c5 Closes #135, Associate MIME types with abstract file types and implement Scholar.FileTypes.getIDFromMIMEType()
MIME type prefixes are handled using wildcards (e.g. audio/foobar will return the audio file type since it matches 'audio/%')
2006-08-08 08:23:23 +00:00
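The wildcard matching described in f8739ee6c5 can be illustrated outside the database. What follows is a minimal JavaScript sketch, assuming the stored patterns use a LIKE-style '%' wildcard as in the 'audio/%' example. The pattern table and the function names getFileTypeFromMIMEType() and likeToRegExp() are hypothetical and are not the Scholar.FileTypes implementation; the '_' wildcard of LIKE is not handled.

```javascript
// Hypothetical pattern table; the real mapping lives in the Scholar database.
const FILE_TYPE_PATTERNS = [
  { fileType: "audio", pattern: "audio/%" },
  { fileType: "image", pattern: "image/%" },
  { fileType: "pdf", pattern: "application/pdf" },
];

// Convert a LIKE-style pattern ('%' = any run of characters) into a RegExp.
function likeToRegExp(pattern) {
  const escaped = pattern
    .split("%")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp("^" + escaped + "$", "i");
}

// Return the first matching file type, or false when nothing matches.
function getFileTypeFromMIMEType(mimeType) {
  for (const { fileType, pattern } of FILE_TYPE_PATTERNS) {
    if (likeToRegExp(pattern).test(mimeType)) {
      return fileType;
    }
  }
  return false;
}
```

With this table, "audio/foobar" resolves to the audio type through the prefix pattern, while an unlisted type such as "text/plain" yields false.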
Dan Stillman
1de9007608 Take two 2006-08-08 07:05:39 +00:00
Dan Stillman
b5cb0e3a92 Fixed repeat single-file loading with Files.importFromURL() and Files.linkFromURL() (have to use the "pageload" event rather than "load" -- thanks Simon) 2006-08-08 07:05:05 +00:00
Dan Stillman
425d806307 Closes #158, Add linkFromURL() and importFromURL() functions to Scholar.Files 2006-08-08 06:08:21 +00:00
Dan Stillman
d7ed7c256c Fix the startup trouble the search code was causing (moved DB call into init() function rather than constructor) -- sorry about that 2006-08-08 05:26:51 +00:00
Simon Kornblith
504ebf8996 closes #162, do sniffing for import formats
import should now work regardless of file extensions. this should make #86 (steal EndNote download links) fairly easy to implement.
2006-08-08 02:46:52 +00:00
Dan Stillman
d67d96c321 Closes #7, Add advanced search functionality to data layer
Implemented advanced/saved search architecture -- to use, you create a new search with var search = new Scholar.Search(), add conditions to it with addCondition(condition, operator, value), and run it with search(). The standard conditions with their respective operators can be retrieved with Scholar.SearchConditions.getStandardConditions(). Others are for special search flags and can be specified as follows (condition, operator, value):

'context', null, collectionIDToSearchWithin
'recursive', 'true'|'false' (as strings!--defaults to false if not specified, though, so should probably just be removed if not wanted), null
'joinMode', 'any'|'all', null

For standard conditions, currently only 'title' and the itemData fields are supported -- more coming soon.

Localized strings created for the standard search operators


API:

search.setName(name) -- must be called before save() on new searches
search.load(savedSearchID)
search.save() -- saves search to DB and returns a savedSearchID
search.addCondition(condition, operator, value)
search.updateCondition(searchConditionID, condition, operator, value)
search.removeCondition(searchConditionID)
search.getSearchCondition(searchConditionID) -- returns a specific search condition used in the search
search.getSearchConditions() -- returns search conditions used in the search
search.search() -- runs search and returns an array of item ids for results
search.getSQL() -- will be used by Dan for search-within-search

Scholar.Searches.getAll() -- returns an array of saved searches with 'id' and 'name', in alphabetical order
Scholar.Searches.erase(savedSearchID) -- deletes a given saved search from the DB

Scholar.SearchConditions.get(condition) -- get condition data (operators, etc.)
Scholar.SearchConditions.getStandardConditions() -- retrieve conditions for use in drop-down menu (as opposed to special search flags)
Scholar.SearchConditions.hasOperator() -- used by Dan for error-checking
2006-08-08 02:04:02 +00:00
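The calling pattern described in d67d96c321 (create a search, add conditions, run it) can be modelled in memory. This is a sketch under stated assumptions, not the Scholar.Search code: the class SketchSearch is a hypothetical stand-in, the operators 'is' and 'contains' are assumed for illustration, and the default join mode of 'all' is a guess, since the commit message does not state one.

```javascript
// Stand-in for the data layer: NOT Scholar.Search, just an in-memory model
// of the calls named in the commit message.
class SketchSearch {
  constructor(items) {
    this.items = items;    // array of { id, title, ... }
    this.conditions = [];  // standard conditions only
    this.joinMode = "all"; // assumed default
  }

  addCondition(condition, operator, value) {
    if (condition === "joinMode") {
      // Special flag: the setting travels in the operator slot, value is null.
      this.joinMode = operator;
      return null;
    }
    this.conditions.push({ condition, operator, value });
    return this.conditions.length - 1; // index of the new condition
  }

  search() {
    const test = (item, c) => {
      const raw = item[c.condition];
      const field = raw == null ? "" : String(raw);
      if (c.operator === "is") return field === c.value;
      if (c.operator === "contains") return field.includes(c.value);
      throw new Error("Unsupported operator: " + c.operator);
    };
    const matches = (item) =>
      this.joinMode === "any"
        ? this.conditions.some((c) => test(item, c))
        : this.conditions.every((c) => test(item, c));
    // Like search.search() in the commit, return an array of item ids.
    return this.items.filter(matches).map((item) => item.id);
  }
}

const library = [
  { id: 1, title: "Digital History" },
  { id: 2, title: "History of the Book" },
  { id: 3, title: "Scholar Manual" },
];
const search = new SketchSearch(library);
search.addCondition("joinMode", "any", null);
search.addCondition("title", "contains", "Digital");
search.addCondition("title", "is", "Scholar Manual");
const results = search.search();
```

Here results holds the ids of items 1 and 3, because 'any' accepts an item that satisfies either condition.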
Simon Kornblith
216f0c7581 closes #83, figure out how to implement OpenURL
closes #76, implement extensible search/retrieval architecture for obtaining metadata

OpenURL COinS lookup is now implemented using a real search architecture system. at the moment, it works with Open WorldCat for books, CrossRef for journal articles (provided the COinS object contains a DOI or an ISSN), and PubMed when a PMID is available.
2006-08-08 01:06:33 +00:00
Simon Kornblith
6626eba844 addresses #83, figure out how to implement OpenURL
OpenURL lookup now works for books. this means that all that's necessary to add scrapable book metadata to a page is an ISBN, as shown below:

<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info:ofi/fmt:kev:mtx:book&amp;rft.isbn=1579550088"></span>

also, we can now scrape Open WorldCat and Wikipedia Book Sources pages with no specialized code involved.

i'm still looking for a better way of looking up journal article metadata. it's currently implemented with CrossRef, but CrossRef simply will not work without a DOI, and is also incomplete (only holds the last name of the first author).
2006-08-07 05:15:30 +00:00
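To show what a lookup has to work with, the title attribute of the COinS span in 6626eba844 can be split into its key/value pairs. This sketch uses the modern URLSearchParams API, which was not available to the 2006 code, so it illustrates the data format only; parseCoins() is a hypothetical helper, and repeated keys keep only their last value here.

```javascript
// Title attribute of the COinS span above, after HTML entity decoding
// (&amp; becomes &).
const coinsTitle =
  "ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.isbn=1579550088";

// The title is a key/encoded-value string, so a query-string parser can read it.
function parseCoins(title) {
  const params = new URLSearchParams(title);
  const fields = {};
  for (const [key, value] of params) {
    fields[key] = value; // a repeated key overwrites the earlier value
  }
  return fields;
}

const fields = parseCoins(coinsTitle);
const isBook = fields["rft_val_fmt"] === "info:ofi/fmt:kev:mtx:book";
const isbn = fields["rft.isbn"];
```

The ISBN extracted this way is the only item-specific datum in the span; everything else about the book has to come from the lookup service.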
Simon Kornblith
56769079b0 addresses #83, figure out how to implement OpenURL
Scholar.OpenURL.resolve(item) returns the URL that retrieves an item from the user's OpenURL resolver. this means we can implement a "find in my library" feature.
Scholar.OpenURL.discoverResolvers() returns a list of available resolvers for the user's current location (by IP address).
2006-08-06 21:59:50 +00:00
Dan Stillman
d4acec8a77 Addresses #111, minor modifications to field list schema, and http://chnm.grouphub.com/P2995041
Still some outstanding questions on Basecamp, though
2006-08-06 16:10:28 +00:00
Simon Kornblith
fc589a37cf closes #131, make import/export symmetrical
all 4 import/export formats currently supported (MODS, Hybrid RDF, Unqualified Dublin Core, and RIS) now work as both import and export translators
2006-08-06 09:34:51 +00:00
Simon Kornblith
9144b56772 addresses #131, make import/export symmetrical
closes #163, make translator API allow creator types besides author

import and export in the multi-ontology RDF format should now work properly. collections, notes, and see also are all preserved. more extensive testing will be necessary later.
2006-08-05 20:58:45 +00:00
Dan Stillman
1ce4de835b Fixes #167, Item note cache is not set on new note creation 2006-08-05 07:42:32 +00:00
Dan Stillman
8dd972dea1 Make Collection.getDescendents() a[n officially] public method and add second param to limit results to 'collection' or 'item' 2006-08-05 06:39:15 +00:00
Dan Stillman
701762a11f Fixes #166, Scholar.ItemTypes.getID("journalArticle") throws "Invalid item type journalarticle"
Fixed ignoreCase logic (and also set all but CharacterSets to false, since there's no reason for them to be true)

Also made CachedTypes.getID() and getName() return false and '', respectively, on unknown types rather than letting them hit the error (there's still the 'invalid * type' debug message)
2006-08-04 19:39:53 +00:00
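The lookup behaviour described in 701762a11f (case folding only where requested, false and '' for unknown types) might look roughly like the following. This is a sketch, not the CachedTypes source: the constructor name CachedTypesSketch, its arguments, and the ids in the sample data are all hypothetical.

```javascript
// Hypothetical base constructor; names and data are illustrative only.
function CachedTypesSketch(types, ignoreCase) {
  this._ignoreCase = ignoreCase;
  this._byID = new Map();
  this._byKey = new Map();
  for (const { id, name } of types) {
    this._byID.set(id, name);
    this._byKey.set(this._key(name), id);
  }
}

// Normalise names the same way when caching and when looking up, so that
// ignoreCase cannot make the two sides disagree.
CachedTypesSketch.prototype._key = function (name) {
  return this._ignoreCase ? String(name).toLowerCase() : String(name);
};

// Accepts an id or a name; returns false for unknown types.
CachedTypesSketch.prototype.getID = function (idOrName) {
  if (this._byID.has(idOrName)) return idOrName;
  const id = this._byKey.get(this._key(idOrName));
  return id === undefined ? false : id;
};

// Accepts an id or a name; returns '' for unknown types.
CachedTypesSketch.prototype.getName = function (idOrName) {
  const id = this.getID(idOrName);
  return id === false ? "" : this._byID.get(id);
};

// Item types are case-sensitive; character sets ignore case.
const itemTypes = new CachedTypesSketch(
  [{ id: 1, name: "book" }, { id: 2, name: "journalArticle" }],
  false
);
const characterSets = new CachedTypesSketch([{ id: 1, name: "UTF-8" }], true);
```

In this sketch itemTypes.getID("journalArticle") finds the type because the name is no longer lowercased before comparison, while characterSets.getID("utf-8") still matches thanks to ignoreCase.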
Dan Stillman
9d58fac7e0 Abstracted the Scholar.*Types logic to a base function that can be extended and added singletons for the various types -- rock the JS prototype model 2006-08-04 04:34:16 +00:00
Simon Kornblith
b4c8dbe700 closes #157, add database infrastructure for different CSL styles
CSL is stored in a new "csl" table. only metadata relevant to updates and selection (ID, date updated, and title) is stored in columns.
2006-08-03 04:54:16 +00:00
Simon Kornblith
30af2c89df - closes #130, add progress bar for import/export
- eliminates "unresponsive script" message on import/export

i tried to make a progress bar that actually provides useful information, but for some reason, XUL interface updates are done asynchronously, and thus don't actually happen as long as the import/export operation continues. the code is there but disabled, in case there's some solution to this issue; i searched and couldn't find one.
2006-08-02 21:06:58 +00:00
Dan Stillman
5c6d9de4b8 Addresses #77, maintain database backups
Temporarily added in a check of the backup file on startup, since I'm not entirely convinced that the backup mechanism on shutdown couldn't create a corrupt file under certain conditions

If you run with debug output on and notice the "Backup file was corrupt" message, let me know.
2006-08-01 23:32:18 +00:00
Dan Stillman
d3bc693dab Closes #77, maintain database backups
The Scholar database is backed up on browser close. On startup, if the main database is damaged, the extension saves a copy of the damaged file and tries to restore from the last automatic backup. If it fails, it creates a new database file.

New methods:

Scholar.getScholarDatabase(string [ext])
Scholar.backupDatabase()
Scholar.moveToUnique(file, newFile) -- find a unique filename using the leafName of newFile as the suggested name (using the built-in Mozilla functionality) and move the file there
Scholar.Date.getFileDateString(file)
Scholar.Date.getFileTimeString(file)
2006-08-01 23:10:31 +00:00
Dan Stillman
9a0457b43e Register shutdown handler to call Scholar.shutdown() on exit 2006-08-01 18:01:56 +00:00
Dan Stillman
40ef9f669d Closes #90, Add flag to delete child notes when a source is deleted
Item.erase(true) deletes child items as well instead of just unlinking
2006-07-31 06:05:19 +00:00
Dan Stillman
526d368aaf Closes #117, permit dashes and commas in "pages" field
Closes #118, add "translator" creator type
Closes #122, add DOI and abbreviated journal title fields
Addresses #45, reorder item fields -- source/rights moved down to bottom; date fields not yet moved
2006-07-31 04:31:44 +00:00
Dan Stillman
9da1c210a0 Change repository check time back to once a day 2006-07-31 03:38:02 +00:00
Dan Stillman
1adeb840bf Closes #98, Cache note content to avoid repeated DB calls 2006-07-30 21:49:34 +00:00
Dan Stillman
6ab7fd1e18 Closes #119, When Item.isNote(), Item.getField('title') should return first line of note
Returns the first 80 characters of the note content as the title

Also changed setField() to use the loadIn parameter for primary fields so it can be used instead of this._data without affecting _changedItems
2006-07-30 21:01:23 +00:00
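The title rule in 6ab7fd1e18 amounts to a simple truncation. A minimal sketch, assuming the note content is available as a string; getNoteTitle() is a hypothetical helper, not the Item.getField() code:

```javascript
const NOTE_TITLE_LENGTH = 80; // limit stated in the commit message

// Derive a display title from note content: the first 80 characters.
function getNoteTitle(noteContent) {
  return String(noteContent).substring(0, NOTE_TITLE_LENGTH);
}
```

Notes shorter than the limit come back unchanged.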
Dan Stillman
82106afc95 JavaScript, how I love thee.
Fixes URL not being stored with saved web pages.
2006-07-28 16:20:48 +00:00
Simon Kornblith
6305e4cada closes #55, export bibliography to printable version
closes #4, Make printable version

- moves functions for creating and deleting hidden browser objects to scholar.js (from ingester.js), since these are necessary for printing as well
- allows saving bibliography in HTML or printing bibliography. style support is not yet complete (pending finalization of 0.9 version of CSL specification).
2006-07-27 23:01:55 +00:00
Dan Stillman
441696767a Don't return non-independent file items in Scholar.getItems() (thanks David) 2006-07-27 15:55:03 +00:00
Dan Stillman
c093e7b62b Item.isRegularItem() = !(Item.isNote() || Item.isFile()) 2006-07-27 15:04:22 +00:00
Dan Stillman
c50dedc90a Addresses #17, add filesystem/ability to store files
Not finished, but enough to give David something to work with

No BLOBs -- just linking/importing of files and loaded documents


New Scholar.Item methods:

incrementFileCount() (used internally)
decrementFileCount() (used internally)
isFile()
numFiles()
getFile() -- returns nsILocalFile or false if associated file doesn't exist (note: always returns false for items with LINK_MODE_LINKED_URL, since they have no files -- use getFileURL() instead)
getFileURL() -- returns URL string
getFileLinkMode() -- compare to Scholar.Files.LINK_MODE_* constants: LINKED_FILE, IMPORTED_FILE, LINKED_URL, IMPORTED_URL
getFileMimeType() -- mime type of file (e.g. text/plain)
getFileCharset() -- charsetID of file
getFiles() -- array of file itemIDs this file is a source for

New Scholar.Files methods:

importFromFile(nsIFile file [, int sourceItemID])
linkFromFile(nsIFile file [, int sourceItemID])
importFromDocument(nsIDOMDocument document [, int sourceItemID])
linkFromDocument(nsIDOMDocument document [, int sourceItemID])

New class Scholar.FileTypes -- partially implemented, not yet used

New class Scholar.CharacterSets -- same as other *Types classes:
getID(idOrName)
getName(idOrName)
getTypes() (aliased to getAll(), which I'll probably change the others to as well)

Charsets table with all official character sets (copied from Mozilla source)

Renamed Item.setNoteSource() to setSource() and Item.getNoteSource() to getSource() and adjusted to handle both notes and files
2006-07-27 09:16:02 +00:00
Dan Stillman
4959535aff Scholar DB now stored in scholar subfolder of profile directory
New methods for retrieving profile directory, scholar subdirectory and scholar/storage subsubdirectory
2006-07-27 08:45:48 +00:00
Simon Kornblith
cbe3611182 references #110, implement CSL for citation styling
add Scholar.Cite and Scholar.CSL for parsing items into a bibliography using CSL. unfortunately, the output is not very good at the moment, and the format likely needs some changes, but I'm working with a few other people on getting it to that point.
2006-07-22 01:25:46 +00:00
Simon Kornblith
c64e5c841f closes #78, figure out import/export architecture
closes #100, migrate ingester to Scholar.Translate
closes #88, migrate scrapers away from RDF
closes #9, pull out LC subject heading tags
references #87, add fromArray() and toArray() methods to item objects

API changes:
all translation (import/export/web) now goes through Scholar.Translate
all Scholar-specific functions in scrapers start with "Scholar." rather than the jumbled up Piggy Bank un-namespaced confusion
scrapers no longer specify items through RDF (the beginning of an item.fromArray()-like function exists in Scholar.Translate.prototype._itemDone())
scrapers can be any combination of import, export, and web (type is the sum of 1/2/4 respectively)
scrapers now contain functions (doImport, doExport, doWeb) rather than loose code
scrapers can call functions in other scrapers or just call the function to translate itself
export accesses items item-by-item, rather than accepting a huge array of items
MARC functions are now in the MARC import translator, and accessed by the web translators

new features:
import now works
rudimentary RDF (unqualified dublin core only), RIS, and MARC import translators are implemented (although they are a little picky with respect to file extensions at the moment)
items appear as they are scraped
MARC import translator pulls out tags, although this seems to slow things down
no icon appears next to the URL when Scholar hasn't detected metadata, since this seemed somewhat confusing

apologies for the size of this diff. i figured if i was going to re-write the API, i might as well do it all at once and get everything working right.
2006-07-17 04:06:58 +00:00
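The "sum of 1/2/4" scheme from c64e5c841f is a bit mask: each capability owns one bit, so any combination fits in a single integer. A small sketch of how such a value can be built and tested; the helper names makeType() and supports() are hypothetical and do not come from Scholar.Translate.

```javascript
// Flag values taken from the commit message: import = 1, export = 2, web = 4.
const TRANSLATOR_TYPE = Object.freeze({ import: 1, export: 2, web: 4 });

// Combine capability names into a single type value.
function makeType(...capabilities) {
  return capabilities.reduce((type, name) => {
    if (!Object.prototype.hasOwnProperty.call(TRANSLATOR_TYPE, name)) {
      throw new Error("Unknown capability: " + name);
    }
    return type | TRANSLATOR_TYPE[name];
  }, 0);
}

// Test whether a stored type value includes a given capability.
function supports(type, capability) {
  return (type & TRANSLATOR_TYPE[capability]) !== 0;
}

// A translator that both imports and exports: 1 + 2.
const importExportType = makeType("import", "export");
```

Because the flags are distinct powers of two, adding them and OR-ing them give the same result, and a single integer column is enough to store the combination.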
Simon Kornblith
d65328c830 adds Biblio/DC/FOAF/PRISM/VCard RDF export type. Bruce D'Arcus, author of CiteProc and co-lead on the OpenOffice bibliographic project, is currently using this as his ontology, and we can unambiguously encode all of our metadata with it.
caveats:
- it's not human readable. mozilla doesn't nest blank nodes, so everything's scattered throughout the file. it would be relatively easy to do post-processing with E4X or even regexps to correct this.
- there's no generic callNumber field, so all callNumbers are encoded as LCC.

adds container creation routines to dataMode rdf

changes Dublin Core export to Unqualified Dublin Core, and removes DC Terms qualifiers
2006-07-07 18:41:21 +00:00
Simon Kornblith
c02666fcd3 add an API for Mozilla's RDF data source, so that import/export translators will be able to create and parse RDF with minimal effort
convert Dublin Core export to new API
2006-07-06 21:55:46 +00:00
Dan Stillman
40b3ecc996 Addresses #87, Add fromArray() and toArray() methods to Item objects
Item.getTags() (which toArray() uses) now returns actual tags rather than ids -- separate method getTagIDs to return ids
2006-07-06 13:06:32 +00:00
Simon Kornblith
2d8ed16d88 adds export of tags to MODS.
adds export of seeAlso info and project hierarchy to RDF. for now, this is embedded in the modsCollection root element.

uses nodeIDs for Dublin Core RDF.
2006-07-06 03:39:32 +00:00
Simon Kornblith
c0251085a9 Add export filters for RIS and Dublin Core RDF 2006-07-05 21:44:01 +00:00
Simon Kornblith
8b4a44be0f fixes a bug that made the Google Books translator not appear
adjusts the Google Books translator to work with the latest revision of the site

renames the MODS translator to just MODS, because "Metadata Object Description Schema (MODS)" was too long for the export dialog
2006-06-30 19:21:36 +00:00
Dan Stillman
6c88563ded Changed translators.type column to INT and added index
Updated scraper update mechanism to handle new translator schema, roughly
2006-06-29 07:58:50 +00:00
Dan Stillman
35eb1292a5 Changed getRandomID() to use the full SQLite three-byte range if unable to find a two-byte id in 3 tries 2006-06-29 07:03:24 +00:00
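The fallback described in 35eb1292a5 can be sketched as follows. Assumptions: ids start at 1, the isTaken callback is a hypothetical stand-in for the database lookup, and the maxTries cap on the second phase is added here for safety rather than taken from the commit.

```javascript
const TWO_BYTE_MAX = 0xffff;     // 65535
const THREE_BYTE_MAX = 0xffffff; // 16777215
const SMALL_TRIES = 3;           // attempts in the two-byte range, per the commit

// Random integer in [1, max].
function randomInt(max) {
  return Math.floor(Math.random() * max) + 1;
}

// isTaken(id) is a hypothetical callback standing in for the database lookup.
function getRandomID(isTaken, maxTries = 1000) {
  // Prefer short ids: try the two-byte range a few times first.
  for (let i = 0; i < SMALL_TRIES; i++) {
    const id = randomInt(TWO_BYTE_MAX);
    if (!isTaken(id)) return id;
  }
  // Fall back to the three-byte range, where collisions are far less likely.
  for (let i = 0; i < maxTries; i++) {
    const id = randomInt(THREE_BYTE_MAX);
    if (!isTaken(id)) return id;
  }
  throw new Error("Could not find an unused id");
}
```

Trying the small range first keeps ids short while the table is sparse; the wider range only comes into play once collisions become frequent.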