Commit graph

10359 commits

Author SHA1 Message Date
Simon Kornblith
4284132db5 update scrapers.sql version 2006-08-12 04:29:59 +00:00
Simon Kornblith
ddb4fc872c remove the doStatus argument from Scholar.Utilities.HTTP 2006-08-12 04:27:49 +00:00
Dan Stillman
7e1a678f9b Addresses #136, Detect mime type and character set of local files when importing
Use new MIME type detection tricks when linking and importing files -- now for charset detection...
2006-08-12 03:54:13 +00:00
Dan Stillman
794238c23f - Added file.js to chnmIScholarService.js
- Fixed bug in File.hasInternalHandler() (no access to navigator from XPCOM)

- Changed "View Attachment" action to check File.hasInternalHandler() and use window.loadURI() for internally handled files and nsIFile.launch() for external -- this prevents the user from getting a helper app dialog when they try to view external files. I basically had to duplicate most of Mozilla's content detection logic and "guess" whether or not it will be able to handle the file internally, which seems a little silly, but, while I feel there are probably better ways to do various parts of this, what's here seems to do the trick. Let me know if you notice it guessing incorrectly (i.e. you get a helper app dialog rather than having a file just open or it launches a file that should've just been loaded into the window). Also look for text files that should be launched rather than opened, especially XML-based data files, as this is a chance for Scholar to be smarter than Firefox itself--for example, OmniGraffle files, which are actually just XML files, normally open up in Firefox as an XML tree, but Scholar will launch them instead. (I imagine the same will need to be done for OmniOutliner, among other things...)
2006-08-12 03:45:57 +00:00
Dan Stillman
d0d1ed8c1d Scholar.File -- some methods to help with MIME type detection of local files (might be abstracted into MIME class later)
Methods:

getExtension(ext)
isExternalTextExtension(ext)
getSample(nsIFile)
sniffForMIMEType(nsIFile)
sniffForBinary(nsIFile)
getMIMETypeFromFile(nsIFile)
hasInternalHandler(nsIFile)
2006-08-12 01:41:48 +00:00
Dan Stillman
f07ff9ac2a Renamed "Files" to "Attachments" -- since Files could be links as well as actually files (or both, for web page snapshots), things were getting just about as confusing as when Items were called Objects.
If you have attachments to the old terminology, feel free to file a complaint.

Changed interface code too, since David is gone (or at the very least has more important things to do with his remaining time)
2006-08-12 00:18:20 +00:00
Simon Kornblith
36a402713c rename Scholar.Utilities.Ingester.HTTPUtilities to Scholar.Utilities.Ingester.HTTP for consistency 2006-08-11 16:34:22 +00:00
David Norton
e56abbc5f4 Closes #138, Ability to view files
Some minor fixes to editing creators.
2006-08-11 15:48:26 +00:00
Dan Stillman
1447b3be92 Item.getLocalFileURL() -- Return a file:/// URL path to files and snapshots 2006-08-11 15:34:06 +00:00
Simon Kornblith
064ecd17db removes unnecessary pieces of piggy bank API from utilities and updates translators to abide by current translator guidelines 2006-08-11 15:28:18 +00:00
Dan Stillman
1e8aa81c02 Fixes #108, Delay repository check by a few seconds if DB transaction is already in progress 2006-08-11 05:51:55 +00:00
Dan Stillman
9c7f33e21a Extended itemCreators primary key to include orderIndex and removed artificial restriction on adding the same creator/creatorType more than once for the same source -- who knows, maybe they just have the same name...
Properly ignore firstName for institutional creators in Item.setCreator() and Item.creatorExists() (which is now unused)
2006-08-11 05:15:56 +00:00
Dan Stillman
957b220cd3 Closes #56, add "institution" field in creators table to deal with institutional authors
- 'isInstitution' parameter added to Item.setCreator(), Creators.getID(), Creators.add()

- 'isInstitution' property added to return from Creators.get() and Item.getCreator()


var obj = Scholar.Items.getNewItemByType(1);
obj.setField('title', 'Digital History for Dummies');
obj.setCreator(0, '', 'Center for History and New Media', 1, true); // true == institutional creator
var id = obj.save();


Note: 'firstName' field is ignored when 'isInstitution' is true
2006-08-11 04:36:44 +00:00
Dan Stillman
8a56951c0b Moved comments from removed Database Architecture writeboard into schema.sql 2006-08-10 22:49:17 +00:00
Dan Stillman
9e75fc2589 Localized string for website accessDate 2006-08-10 22:46:44 +00:00
Dan Stillman
97628b0d2c More helpful debug error output in DB parameter handling and Notifier.trigger() 2006-08-10 22:46:00 +00:00
Dan Stillman
892478be2e Fixed bug that was breaking Scholar.Files.getFile() 2006-08-10 22:44:45 +00:00
David Norton
d815154efa Removing a saved search removes the item in the left pane.
Search dialog:
 - should work now.
 - Any/All control
 - All 8 operators
Add search button uses an icon
2006-08-10 22:39:21 +00:00
Dan Stillman
9c496770a6 Addresses #63, Add ECL license info to source code
Replaced GPL license.txt with ECL
2006-08-10 22:07:28 +00:00
Dan Stillman
7d48fdbeda Closes #175, Add ability to specify certain conditions as required in an ANY search
Conditions in ANY queries can be made required by passing 'true' as an extra parameter to addCondition() and updateCondition() -- this can be used for limiting ANY queries to particular collections (in place of the removed 'context' condition), but if there was an elegant way to expose it to the user for all ANY queries, it's something users might find very useful.
2006-08-10 21:05:28 +00:00
Dan Stillman
3a410c5acd Addresses #171, Add more conditions to advanced search architecture
- Implemented 'collectionID' and 'savedSearchID' conditions (a.k.a. search within a search) and removed special 'context' condition. Per my conversation with Dan, the 'recursive' flag is now a global flag that applies to all specified collectionIDs, which is less than ideal but probably better than the alternatives (e.g. having condition-specific recursive checkboxes). It does mean, however, that a "Search subfolders" checkbox is irrelevant if there are no collectionID conditions and should probably be greyed out until applicable.

Another side effect is that it's no longer possible to do an ANY search and return results only within a specific folder (though it can now be done by putting the ANY conditions in a subsearch). Since ANY searches are always annoying in this regard, what I might do is add a way to mark particular conditions as required even in ANY mode, which would allow for quite a lot of flexibility...

Note also that while 'collectionID' and 'savedSearchID' are standard conditions, they should probably be combined into a single condition on the interface side (like playlists and smart playlists under just 'Playlist' in iTunes).

- Now skips invalid/obsolete saved conditions in load()
2006-08-10 09:44:08 +00:00
Dan Stillman
4fe960d190 Search updates:
- Remaining searchConditionIDs are no longer affected by removeCondition() (i.e. they now act like autoincrements), which should make interface code simpler

- Changed default join mode to ALL

- Fixed loading of saved searches with no search conditions
2006-08-10 04:32:36 +00:00
Dan Stillman
0061a8d0df Fix problem with addCondition() overwriting existing search conditions (thanks David) 2006-08-10 02:52:29 +00:00
Simon Kornblith
3a1ffb6174 make LOC/WebVoyage scraper and other scrapers using Scholar.loadTranslator work again 2006-08-09 18:59:38 +00:00
Dan Stillman
f3a66085f5 Closes #173, Try to detect content type of linked pages without loading entire file
Closes #174, Don't load images and attached files when detecting content type in linkFromURL()

If mime type not provided, Scholar.Files.linkFromURL() now uses XMLHTTPRequest HEAD request to get the content type without loading file (thanks Simon for the idea)

If title not provided, try to figure it out from URL, though not particularly intelligently (last slash)

Note that order of title and mimeType parameters is now swapped

This code should be a bit smarter about unexpected conditions
2006-08-09 18:37:34 +00:00
Simon Kornblith
cde7170868 references #169, add OpenURL interface hooks
Scholar.OpenURL.discoverResolvers() now returns an array of {name, url, version} objects
2006-08-09 16:58:54 +00:00
Dan Stillman
36af25b3e9 Cache file link mode 2006-08-09 16:25:28 +00:00
David Norton
6877d33e61 Closes #172, add preference for EndNote MIME type stealing feature
Addresses #169, add OpenURL interface hooks
Addresses #170, Put "Link" option before "Import" in drop-down menu

Fixes some advanced search flaws (there are still bugs)
2006-08-09 15:44:11 +00:00
David Norton
8d5f1e62b6 Closes #47, advanced search
Closes #152, Saved Searches (interface layer)

For now, advanced search IS a saved search.

There are still bugs. The 'search' icon is ugly. I wanted to get it out there, however.
2006-08-09 11:43:33 +00:00
Dan Stillman
d7990b0e03 Updated Scholar.Files.linkFromURL() to take title and mime type as parameters, to prevent loading huge external PDFs just to get the content type when the ingester already knows it (though that will hopefully be alleviated by #173 and #174 later) 2006-08-09 06:32:16 +00:00
Dan Stillman
318cf3194f Addresses #171, Add more conditions to advanced search architecture
Added conditions 'tagID', 'tag' (text), 'creator' (concats first and last before comparing), and 'note'
2006-08-08 23:08:52 +00:00
Simon Kornblith
6efd6d2cc4 closes #99, add options for export 2006-08-08 23:00:33 +00:00
Simon Kornblith
af080fe384 allow EndNote MIME handler to be unregistered without restarting Firefox 2006-08-08 21:40:33 +00:00
Simon Kornblith
3edb6e0286 closes #86, steal EndNote download links
Scholar should now attempt to process citation information from EndNote download links (MIME types application/x-endnote-refer and application/x-research-info-systems). in situations where Scholar cannot process the information, a standard helper app dialog will appear. this behavior is controlled by the preference extensions.scholar.parseEndNoteMIMETypes.
2006-08-08 21:17:07 +00:00
Dan Stillman
9f57379415 Make searchConditionIDs a little easier to work with--now accessible by with the .id property of search conditions in addition to index 2006-08-08 15:40:42 +00:00
Dan Stillman
f8739ee6c5 Closes #135, Associate MIME types with abstract file types and implement Scholar.FileTypes.getIDFromMIMEType()
MIME type prefixes are handled using wildcards (e.g. audio/foobar will return the audio file type since it matches 'audio/%')
2006-08-08 08:23:23 +00:00
Dan Stillman
1de9007608 Take two 2006-08-08 07:05:39 +00:00
Dan Stillman
b5cb0e3a92 Fixed repeat single-file loading with Files.importFromURL() and Files.linkFromURL() (have to use the "pageload" event rather than "load" -- thanks Simon) 2006-08-08 07:05:05 +00:00
Dan Stillman
425d806307 Closes #158, Add linkFromURL() and importFromURL() functions to Scholar.Files 2006-08-08 06:08:21 +00:00
Dan Stillman
d7ed7c256c Fix the startup trouble the search code was causing (moved DB call into init() function rather than constructor) -- sorry about that 2006-08-08 05:26:51 +00:00
Simon Kornblith
504ebf8996 closes #162, do sniffing for import formats
import should now work regardless of file extensions. this should make #86 (steal EndNote download links) fairly easy to implement.
2006-08-08 02:46:52 +00:00
Dan Stillman
d67d96c321 Closes #7, Add advanced search functionality to data layer
Implemented advanced/saved search architecture -- to use, you create a new search with var search = new Scholar.Search(), add conditions to it with addCondition(condition, operator, value), and run it with search(). The standard conditions with their respective operators can be retrieved with Scholar.SearchConditions.getStandardConditions(). Others are for special search flags and can be specified as follows (condition, operator, value):

'context', null, collectionIDToSearchWithin
'recursive', 'true'|'false' (as strings!--defaults to false if not specified, though, so should probably just be removed if not wanted), null
'joinMode', 'any'|'all', null

For standard conditions, currently only 'title' and the itemData fields are supported -- more coming soon.

Localized strings created for the standard search operators


API:

search.setName(name) -- must be called before save() on new searches
search.load(savedSearchID)
search.save() -- saves search to DB and returns a savedSearchID
search.addCondition(condition, operator, value)
search.updateCondition(searchConditionID, condition, operator, value)
search.removeCondition(searchConditionID)
search.getSearchCondition(searchConditionID) -- returns a specific search condition used in the search
search.getSearchConditions() -- returns search conditions used in the search
search.search() -- runs search and returns an array of item ids for results
search.getSQL() -- will be used by Dan for search-within-search

Scholar.Searches.getAll() -- returns an array of saved searches with 'id' and 'name', in alphabetical order
Scholar.Searches.erase(savedSearchID) -- deletes a given saved search from the DB

Scholar.SearchConditions.get(condition) -- get condition data (operators, etc.)
Scholar.SearchConditions.getStandardConditions() -- retrieve conditions for use in drop-down menu (as opposed to special search flags)
Scholar.SearchConditions.hasOperator() -- used by Dan for error-checking
2006-08-08 02:04:02 +00:00
Simon Kornblith
216f0c7581 closes #83, figure out how to implement OpenURL
closes #76, implement extensible search/retrieval architecture for obtaining metadata

OpenURL COinS lookup is now implemented using a real search architecture system. at the moment, it works with Open WorldCat for books, CrossRef for journal articles (provided the COinS object contains a DOI or an ISSN), and PubMed when a PMID is available.
2006-08-08 01:06:33 +00:00
David Norton
9e5c15423a Fixes #164, On item delete, if "Erase Files" is not checked, it still shows files and notes being deleted.
If you are simple removing an item from a project, it won't ask you if you want to delete files and notes.

ItemTreeView:
 - notify() now works with multiple ids for action=modify.
 - saveSelection() returns an array of selected IDs instead of saving it to a class variable
 - rememberSelection(selection) takes an array of IDs.
2006-08-07 15:25:29 +00:00
Simon Kornblith
6626eba844 addresses #83, figure out how to implement OpenURL
OpenURL lookup now works for books. this means that all that's necessary to add scrapable book metadata to a page is an ISBN, as shown below:

<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info:ofi/fmt:kev:mtx:book&amp;rft.isbn=1579550088"></span>

also, we can now scrape Open WorldCat and Wikipedia Book Sources pages with no specialized code involved.

i'm still looking for a better way of looking up journal article metadata. it's currently implemented with CrossRef, but CrossRef simply will not work without a DOI, and is also incomplete (only holds the last name of the first author).
2006-08-07 05:15:30 +00:00
Simon Kornblith
e3d062a819 fix inappropriately truncated field values in InnoPAC 2006-08-07 01:49:56 +00:00
Simon Kornblith
2b5b65f4dd addresses #83, figure out how to implement OpenURL
adds preliminary support for COinS microformat data. does not yet support COinS where there is only a DOI or ISBN.
2006-08-07 00:30:36 +00:00
Simon Kornblith
56769079b0 addresses #83, figure out how to implement OpenURL
Scholar.OpenURL.resolve(item) returns the URL that retrieves an item from the user's OpenURL resolver. this means we can implement a "find in my library" feature.
Scholar.OpenURL.discoverResolvers() returns a list of available resolvers for the user's current location (by IP address).
2006-08-06 21:59:50 +00:00
Simon Kornblith
c0bab22016 bring scrapers into sync with updated database schema 2006-08-06 17:34:41 +00:00
Dan Stillman
d4acec8a77 Addresses #111, minor modifications to field list schema, and http://chnm.grouphub.com/P2995041
Still some outstanding questions on Basecamp, though
2006-08-06 16:10:28 +00:00