plazi.org: new repository for taxonomic descriptions

From: Donat Agosti <agosti_at_amnh.org>
Date: Tue, 26 Feb 2008 05:22:09 -0500

Below is the announcement of our new repository of biological species
descriptions.

I am aware that this is not exactly the Green Road, but the system offers
self-archiving via a DSpace repository, where prints, preprints or XML
versions can be deposited. For the latter we offer an XML editor,
GoldenGATE, allowing semiautomatic mark-up and enhancement of the
publications using the TaxonX XML schema. The species description
repository will essentially only be working once we have 100% Green Road
compliance.

Systematics is marred by thousands of journals and by a big fear that Big
Brother is chasing you. This is further reinforced by the dual nature of
our research institutions, the large natural history museums, which are
becoming more like corporations, with the respective legal framework,
which is exactly the opposite of what we scientists want. We want a legal
framework that allows open access.

This became very obvious during last week's meeting of the European
Distributed Institute of Taxonomy (EDIT) at Kew Botanical Gardens
(http://www.editwebrevisions.info/content/ipr-and-web), where Naomi Korn,
an IP consultant to the Natural History Museum, made the point that we
scientists must protect ourselves so that we can make money out of our
research: research which is hardly recognized because there is very
little access to its results, and that despite the ongoing biodiversity
crisis.

My point to the audience was that we ultimately need access to all the
descriptions, and that one of the ways to get there is self-archiving,
urging the institutions to provide self-archives
(http://hdl.handle.net/10199/15439). But the real goal for me as a
scientist is to transform the publications so that they become part of the
semantic Web and we can mine their content.

Finally, another avenue is to begin addressing the legal issues which make
a great number of systematists hesitate to even self-archive their own
publications. Willi Egloff, a well-known Swiss copyright lawyer, brings in
the Swiss and European perspective to the prevailing US view.

Cheers

Donat



PLAZI.ORG - THE DIGITAL REPOSITORY FOR SPECIES DESCRIPTIONS.

Knowledge of the actual number of species on planet Earth is one of the
last frontiers in science. It is not known exactly how many species have
been identified and described, much less the number of as yet undescribed
species.

However, the species we do know are documented in well over a hundred
million pages of printed scientific books and journals. This knowledge is
hidden in libraries, and no single library holds all this knowledge.

The species descriptions are very rich in data, essentially a
quality-controlled summary of what is known at any specific time about a
particular species. In the best cases, this information includes a
detailed morphological description, drawings and images, a summary of
behavior and ecology and a detailed list of all the specimens studied. In
more recent publications, links to DNA sequences or video documentation --
among other forms of data -- may be provided. Recently e-publications have
become available, but many of these are copyrighted and thus not generally
open to the public for perusal or use. Nor are they easily
machine-searchable for discovery and re-use of contents.

Recently, the Biodiversity Heritage Library (BHL) has been launched as a
large-scale operation to digitize this biodiversity literature. Currently, it
includes major US and UK natural history libraries, with the ultimate goal
of including the entire global literature. All publications will be openly
accessible to the public, unless they are copyrighted -- thus most of the
recent publications are still out of reach. The BHL thus falls short of
optimizing the potential uses of these publications.

Tagging the "boundaries" of a species description and identifying the
species dealt with supports discovery and retrieval of data not possible
through Google. Mark-up of species descriptions permits queries such as
"which are the red ants in London", a very common form of query.
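
As a rough illustration of why tagged boundaries help, here is a minimal
Python sketch. The element and attribute names (treatment, taxon, group,
locality, description) are hypothetical and much simplified; they are not
the actual TaxonX schema, and the three sample treatments are made up for
the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: element and attribute names are
# illustrative only and are NOT the actual TaxonX schema.
SAMPLE = """
<publication>
  <treatment taxon="Formica rufa" group="ant" locality="London">
    <description>Head and thorax red, gaster dark brown.</description>
  </treatment>
  <treatment taxon="Lasius niger" group="ant" locality="London">
    <description>Body uniformly dark brown to black.</description>
  </treatment>
  <treatment taxon="Myrmica rubra" group="ant" locality="Bern">
    <description>Body reddish brown, petiole with spines.</description>
  </treatment>
</publication>
"""


def find_treatments(xml_text, group, locality, keyword):
    """Return taxon names whose tagged treatment matches all criteria.

    The keyword test is a crude case-insensitive substring match on the
    description text, so "red" also matches "reddish".
    """
    root = ET.fromstring(xml_text)
    hits = []
    for treatment in root.iter("treatment"):
        text = treatment.findtext("description", default="").lower()
        if (treatment.get("group") == group
                and treatment.get("locality") == locality
                and keyword.lower() in text):
            hits.append(treatment.get("taxon"))
    return hits


# "Red ant in London": only the treatment tagged as an ant from London
# whose description mentions "red" is returned.
print(find_treatments(SAMPLE, "ant", "London", "red"))
```

A plain full-text search for "red ant London" cannot tell which words
belong to which description; once each description is delimited and
labelled, the three conditions can be tested against the same unit.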

Under some national copyright legislation, such as the Swiss, descriptions
cannot be copyrighted: through historical constraints (there are tens of
millions of descriptions) and peer review they are standardized, and they
list factual, in most cases morphological, data describing species. Thus
they can all be made readily accessible.

Plazi.org is a new Web-based service that offers access to descriptions of
species and an archive to store the publications as marked-up documents.
GoldenGATE, a dedicated editor, has been developed to mark up the
publications, supporting the extraction of descriptions, based on TaxonX,
an XML schema modeling the logical content of these publications. The
Plazi Search and Retrieval Server, building on this systematic mark-up of
texts, allows powerful search functions to find species descriptions, or
even simple mentions of species, permitting users to answer questions
like: "Which species occur together?"
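
To make the last question concrete, here is a small Python sketch of how
such an answer could be derived once species mentions are tagged. It is
not how the Plazi Search and Retrieval Server works internally; the
document identifiers and the mention lists are hypothetical, and
"together" is assumed here to mean "mentioned in the same document".

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: document identifiers mapped to the species names
# tagged in each document (as a mark-up step might produce).
MENTIONS = {
    "doc-1": ["Formica rufa", "Lasius niger"],
    "doc-2": ["Formica rufa", "Lasius niger", "Myrmica rubra"],
    "doc-3": ["Myrmica rubra"],
}


def co_occurrences(mentions):
    """Count, for every pair of species, the documents naming both."""
    counts = defaultdict(int)
    for species in mentions.values():
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(species)), 2):
            counts[pair] += 1
    return dict(counts)


for pair, n in sorted(co_occurrences(MENTIONS).items()):
    print(pair, n)
```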

Plazi.org already includes more than 3,700 descriptions of 3,000 taxa,
with a goal of archiving all the forthcoming new descriptions and,
contingent upon additional funding, all the descriptions of the known
12,278 ant species listed in the Hymenoptera Name Server / antbase.org,
enhanced with globally unique species numbers (LSIDs: Life Science
Identifiers). While ants provide the original test case, the service is
not restricted to ants but is potentially open to all groups, from
Bacteria to Plants, and will support most major languages. All
descriptions are machine-readable and thus can be picked up for mash-ups
or individual Websites.

Plazi.org is run and developed by Donat Agosti, Terry Catapano, Christiana
Klingenberg and Guido Sautter. Its development is supported by grants from
the US National Science Foundation (to the American Museum of Natural
History), the German Deutsche Forschungsgemeinschaft (to University of
Karlsruhe) and the Global Biodiversity Information Facility (GBIF; to
Plazi.org and Zootaxa). Plazi.org is collaborating with the Hymenoptera
Name Server at Ohio State University (Norm Johnson), Zoobank (Richard
Pyle), University of Massachusetts (Robert Morris), antweb.org (Brian
Fisher) and Zootaxa (Zhi-Qiang Zhang).

Plazi.org was released to the public at the EDIT "IPR and the web:
challenges for taxonomy" meeting in London, Feb. 20, 2008
(http://www.editwebrevisions.info/content/ipr-and-web).

Related Links:

First descriptions of the first ant described (Linnaeus, 1758)
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=7986054728766B7340246F844D016C8E

Recent publications with fine-grained mark-up allowing extraction of
specimen data and automatic plotting of maps
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=4CED5222CB80220AD603CE26264DAA64

Description in Russian (please set encoding=utf-8 in your browser)
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=AE3F888CA4C71EE36CC01A3D4FAF58F0

Description in Chinese
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=5F64B0928243F0606C12CF587904DCFB

Fish
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=30E2ACE97FCC02A34806994547F8E1F5

Bacterium
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=A4EA41C37AB446BF1E98F819177A8299

Plant
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=59DE9EE281471FA25FC7E4DAE168652C

Fungi
http://plazi.org:8080/GgSRS/search?searchMode=displayDocument&idQuery=C9BDF9CE2F33390723224B21E216FE01
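
All of the links above share one pattern: the same search address with
searchMode=displayDocument and a document identifier in idQuery. For
anyone wanting to pick documents up from a script, a minimal Python
sketch follows. The function names are my own, and nothing is assumed
here about other parameters the server may accept or about the format of
the response.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Address taken from the links listed above.
BASE = "http://plazi.org:8080/GgSRS/search"


def document_url(doc_id):
    """Build a display link for one document identifier."""
    query = urlencode({"searchMode": "displayDocument", "idQuery": doc_id})
    return BASE + "?" + query


def document_id(url):
    """Extract the identifier again from such a link."""
    return parse_qs(urlparse(url).query)["idQuery"][0]


# Identifier of the Linnaeus (1758) example from the list above.
link = document_url("7986054728766B7340246F844D016C8E")
print(link)
print(document_id(link))
```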


Donat


-- 
Dr. Donat Agosti
Research Associate, American Museum of Natural History and Smithsonian
Institution
Email: agosti_at_amnh.org
Web: http://antbase.org
CV: http://research.amnh.org/entomology/social_insects/agosticv_2003.html
Dalmaziquai 45
3005 Bern
Switzerland
+41-31-351 7152
Received on Tue Feb 26 2008 - 12:04:17 GMT
