Re: Excerpts from FOS Newsletter

From: Peter Suber <peters_at_earlham.edu>
Date: Mon, 18 Mar 2002 18:51:58 +0000

      Excerpts from the Free Online Scholarship (FOS) Newsletter
      March 18, 2002


Article-summarizing software

One of the myriad ways that sophisticated software will help researchers is
by writing short summaries of digital articles. Imagine succinct,
AI-generated summaries accompanying URLs in a search engine. Imagine
bookmarking a hundred relevant-looking articles for a research project and
siccing a summarizer on them to see which deserve a full read. Imagine
right-clicking on a paragraph of postmodern discourse, and selecting "cut
the crap" from a pop-up menu.

Gerald DeJong pioneered this kind of AI with FRUMP (Fast Reading
Understanding and Memory Program), a 1979 adaptation of Roger Schank's
script-based AI. FRUMP could read long newspaper stories and write
strikingly accurate short summaries. To see where this technology is
today, visit the Columbia Newsblaster, an AI news portal from Columbia
University's NLP (Natural Language Processing) Group. Newsblaster collects
news in real time from a dozen major free online sources, and breaks it
into general categories (e.g. U.S., World, Science) and specific topics
(e.g. stem cell research). Then it writes its own summary of the news on
each topic, and gives links to full stories for those who want to read
more. Judge for yourself, but I'm sure you'll find the auto-generated
summaries to be clear, accurate, and successful in distinguishing what's
central to a story from what's peripheral.

Apart from the intelligent software, a summarizing service like Newsblaster
depends on the availability of free online content to harvest as data for
the software. Imagine a "Researchblaster" for your discipline, harvesting
the growing number of free, online, full-text articles, and offering
accurate summaries organized by category and topic. The Columbia NLP Group
is working on such a system for the field of medicine.

Columbia Newsblaster
http://www.cs.columbia.edu/nlp/newsblaster/

Columbia Natural Language Processing Group
http://www.cs.columbia.edu/nlp/index.html

Papers from the Columbia NLP Group on summarizing medical articles
http://www.cs.columbia.edu/~noemie/papers/was01.pdf
http://www.cs.columbia.edu/~pablo/community/nlp/ISMB2001disambiguation.pdf
http://www.cs.columbia.edu/~pablo/community/nlp/klavans_amia00.pdf

Stephen Wan's resources on Automatic Text Summarization, including a
history of the field, list of projects, glossary, and bibliography
http://www.ics.mq.edu.au/~swan/summarization/index.htm

* Postscript. Does anyone know of free online research sites in any field
that already run summarizing software? How about freely available
summarizing software capable of taking web content as data?

Text-summarizing or "gisting" software is just one example of software that
will take FOS as data and return services unobtainable or even unimaginable
to researchers in the age of print. Here's another example from this
week's news. In FOSN for 11/2/01, I wondered whether taxonomy or
categorization software, which evolved for business, was being used
anywhere for academic research. This week the Institute of Physics
announced that it is using the Vivisimo Clustering Engine for searching its
online journals.
http://vivisimo.com/docs/IOP_release.doc

----------

Developments

* The Budapest Open Access Initiative has now been translated into Russian.
http://www.soros.org/openaccess/
(Sign it, persuade your institution to sign it, take steps to implement it,
and spread the word.)

* _In Cognito_ is now making its full-text articles freely accessible
online. It will still publish a priced, print edition.

_In Cognito_
http://www.univ-ubs.fr/valoria/cognito/

Press release on the new open-access edition
http://makeashorterlink.com/?U19821B8

* On March 14, the Text-e symposium finished discussing its tenth and last
text. But its forum will remain open for a general discussion of the
issues raised during the symposium: the impact of the web on reading,
writing, research, and the diffusion of knowledge. The moderators have
posted their "conclusions" to the site to stimulate further
discussion. However, their conclusions are much more about the nature and
advantages of online symposia than the impact of the web on reading,
writing, and research.
http://www.text-e.org/index.cfm?switchLang=Eng&

* Apache has released 1.0rc2 (version 1.0 release candidate 2) of Xindice,
its open-source database specifically designed to store large archives of
XML. The code may now be downloaded from the site. (PS: Xindice was
formerly called dbXML; see FOSN for 10/19/01.)
http://xml.apache.org/xindice/
(Thanks to El.pub Weekly.)

* In a March 2 posting to his web site, Henry Gladney gives a short (1.5
pp.) overview of some of his recent work in the long-term preservation of
digital documents.
http://home.pacbell.net/hgladney/tdo.pdf

To see past coverage of these stories in FOSN, use the search engine at the
FOSN archive.
http://www.topica.com/lists/suber-fos/read

* Launched last September, Open Source Schools is a portal for open source
software and "open content" in the service of education. You won't find
much about FOS at the site, yet, but the mission statement for the
organization says it aims "to assist in the movement to broaden 'free' and
'open source' to include more than software". It appears to be "open" to
learning more about FOS, and supporting it, if any readers have an interest
in spreading the word.
http://opensourceschools.org/
(Thanks to C-FIT.)

* In FOSN for 3/11/02, I cited this article on the Budapest Open Access
Initiative. Because I couldn't find the author's name, I called it
anonymous. Helene Bosc has discovered that the author's name is Fabrice
Node-Langlois. Thanks, Helene.

La revolte des savants pour la libre publication (for _Figaro_)
http://www.lefigaro.fr/sciences/20020218.FIG0147.html
(The article is no longer available at this URL.)

Conferences

If you plan to attend one of the following conferences, please share your
observations with us through our discussion forum.

* Digital Resources and International Information Exchange: East-West
http://www.iliac.org/seminar/sem1.html
March 18 (Flushing NY), 20 (Stamford CT)

* Internet Librarian International 2002
http://www.internet-librarian.com/index.html
London, March 18-20

* The New Information Order and the Future of the Archive
http://www.ed.ac.uk/iash/archive.conference.html
Edinburgh, March 20-23

* Institute of Museum and Library Services. Building Digital Communities
http://webwise.mse.jhu.edu/
Baltimore, March 20-22

* Advanced Licensing Workshop
http://www.arl.org/scomm/licensing/advlic.html
Dallas, March 20-22

* Electronic Publishing Strategy
http://www.alpsp.org/tEPS220302.htm
London, March 22

* Association of Information and Dissemination Centers (ASIDIC) Spring 2002
Meeting
http://www.asidic.org/s02program.html
St. Augustine, Florida, March 24-26

* OCLC Institute. Steering by Standards. (A series of satellite
videoconferences.)
http://www.oclc.org/institute/events/sbs.htm
Cyberspace. OAI, March 26. OAIS, April 19. Metadata standards in the
future, May 29.

* WebSearch University
http://www.websearchu.com/
San Francisco, March 25-26; Stamford CT, April 30 - May 1; Washington DC,
September 23-24; Chicago, October 22-23; Dallas, November 19-20.

* European Colloquium on Information Retrieval Research
http://www.cs.strath.ac.uk/ECIR02/
Glasgow, March 25-27

* e-Content: Discovering and Delivering Value
http://www.informationhighways.net/conf/cindex.html
Toronto, March 25-27

* New Developments in Digital Libraries
http://www.iceis.org/workshops/nddl/nddl-cfp.htm
Ciudad Real, Spain, April 2-3

* Copyright Management in Higher Education: Ownership, Access and Control
http://www.umuc.edu/distance/odell/cip/copy_manage2002/
Adelphi, Maryland, April 4-5

* Global Knowledge Partnership Annual Meeting
http://makeashorterlink.com/?F21C3456
Addis Ababa, April 4-5

* What Scholars Need to Know to Publish Today: Digital Writing and Access
for Readers
http://library.albany.edu/symposium/
Albany, New York, April 8

* International Conference on Information Technology: Coding and Computing
http://www.cs.clemson.edu/~srimani/itcc2002/cfp.html
Las Vegas, April 8-10

* NetLab and Friends: 10 Years of Digital Library Development
http://www.lub.lu.se/netlab/conf/
Lund, April 10-12

* E-Content 2002 (on ebooks)
http://litc.sbu.ac.uk/econtent/index.html
London, April 11

* Censorship and Free Access to Information in Libraries and on the Internet
http://www.db.dk/kon/temadag/Censurogytringsfrihed_eng.htm
Copenhagen, April 11

* International Learned Journals Seminar: We Can't Go On Like This: The
Future of Journals
http://www.alpsp.org/s120402.htm
London, April 12

* SIAM International Conference on Data Mining
http://www.siam.org/meetings/sdm02/
Arlington, Virginia, April 11-13

* Creating access to information: EBLIDA workshop on getting a better deal
from your information licences
http://www.eblida.org/conferences/licensing/licensing.htm
The Hague, April 12

* Licensing Electronic Resources to Libraries
http://www.arl.org/scomm/licensing/pworkshop.html
Philadelphia, April 15

* United Kingdom Serials Group Annual Conference and Exhibition
http://www.uksg.org/conference.htm
University of Warwick, April 15-17

* Conference on Computers, Freedom, and Privacy
http://www.cfp2002.org/
San Francisco, April 16-19

* EDUCAUSE Networking 2002
http://www.educause.edu/netatedu/events/net2002/
Washington, D.C., April 17-18

* Museums and the Web 2002
http://www.archimuse.com/mw2002/
Boston, April 17-20

* Legal Guidelines for Use of Intellectual Property in Higher Education
http://www.oneonta.edu/conference/copyright/
Oneonta, NY, April 19

* Information, Knowledges and Society: Challenges of A New Era
http://www.congreso-info.cu/venglish.htm
Havana, April 22-26

* DAI Institute on The State of Digital Preservation: An International
Perspective
http://www.clir.org/agenda-digpres.html
Washington, D.C., April 24-25

* CLIR Sponsors' Symposium: New Challenges, New Solutions: Libraries for
the Future
http://www.clir.org/agenda_sponsorsymp.html
Washington, D.C., April 26

* The European Library: The Gate to Europe's Knowledge: Milestone Conference
http://www.europeanlibrary.org/
Frankfurt am Main, April 29-30

----------

The Free Online Scholarship Newsletter is supported by a grant from the
Open Society Institute.
http://www.osi.hu/infoprogram/

==========

This is the Free Online Scholarship Newsletter (ISSN 1535-7848).

Please feel free to forward any issue of the newsletter to interested
colleagues. If you are reading a forwarded copy of this issue, you may
subscribe by signing up at the FOS home page.

FOS home page, general information, subscriptions, editorial position
http://www.earlham.edu/~peters/fos/index.htm

FOS Newsletter, subscriptions, back issues
http://www.topica.com/lists/suber-fos

FOS Discussion Forum, subscriptions, postings
http://www.topica.com/lists/fos-forum

Guide to the FOS Movement
http://www.earlham.edu/~peters/fos/guide.htm

Sources for the FOS Newsletter
http://www.earlham.edu/~peters/fos/sources.htm

Peter Suber
http://www.earlham.edu/~peters

Copyright (c) 2002, Peter Suber
http://www.earlham.edu/~peters/copyrite.htm
Received on Mon Mar 18 2002 - 18:53:15 GMT
