ABSTRACT

The current products of traditional publishers such as Elsevier and Springer use no integrated metadata models, XML containers or models for scholarly communication. Enhanced Publications are mostly just articles with additional files containing data sets and multimedia material.

In the eCrystals Federation Project, metadata about the crystal, such as the chemical formula of the crystallised material, the molecule name and the authors, is stored in simple Dublin Core (DC). Additional chemical information is stored as Qualified Dublin Core. The derived data sets are stored as files in the repository, together with the files for representations or images of the crystal and the molecule. The eCrystals Federation Project is based on the eBank project, which exports metadata via OAI-PMH in two metadata formats, simple DC and METS. Because DC and METS are too limited, the eCrystals Federation Project will be a test bed for the OAI-ORE model (Atom Publishing Model).
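As an illustration of the OAI-PMH export described above, the following is a minimal sketch of a harvester that builds a ListRecords request and reads simple DC fields from the response. The repository URL, the record identifier and the sample record values are hypothetical; only the `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications. It is not the eBank or eCrystals implementation.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_dc_records(xml_text):
    """Return one dict per record, mapping DC element names to lists of values."""
    root = ET.fromstring(xml_text)
    records = []
    for dc in root.iterfind(".//oai_dc:dc", NS):
        fields = {}
        for child in dc:
            name = child.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append((child.text or "").strip())
        records.append(fields)
    return records


# Hypothetical response, invented for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical crystal structure</dc:title>
          <dc:creator>Example, A.</dc:creator>
          <dc:creator>Example, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(parse_dc_records(SAMPLE_RESPONSE))
```

Because simple DC elements are repeatable, each field is kept as a list of values. The flat structure also shows why simple DC is limiting: nothing in the record relates the article, its data sets and its images to one another.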

The ARROW/DART/ARCHER strategy does not require a single metadata schema to describe all digital objects stored in the repository. Multiple formats can be supported to suit individual content models. OCLC and ARROW are jointly testing the Interoperability Core, a mapping tool developed by OCLC that is based on mappings and crosswalks between different metadata formats. Metadata can be stored and searched in the native format generated by the community of practice. Using this strategy, ARROW can be populated with metadata in a variety of formats, which are converted through various mappings to an interoperable core and then to DC for harvesting via OAI-PMH by resource discovery services.
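The two-step conversion described above (native format to an interoperable core, then core to DC) can be sketched as a pair of field-level crosswalks. Every field name, format label and mapping table below is hypothetical and chosen only for illustration; this is not OCLC's tool, whose actual mappings are not described here.

```python
# Hypothetical crosswalks: native field name -> core field name, per format.
TO_CORE = {
    "crystal": {"compound_name": "title", "author": "creator", "formula": "subject"},
    "thesis": {"thesis_title": "title", "candidate": "creator", "awarded": "date"},
}

# Hypothetical crosswalk: core field name -> simple DC element.
CORE_TO_DC = {
    "title": "dc:title",
    "creator": "dc:creator",
    "subject": "dc:subject",
    "date": "dc:date",
}


def crosswalk(record, mapping):
    """Rename the fields of `record` via `mapping`; unmapped fields are dropped."""
    out = {}
    for field, values in record.items():
        target = mapping.get(field)
        if target is not None:
            out.setdefault(target, []).extend(values)
    return out


def to_dc(record, native_format):
    """Convert a native record to simple DC by way of the core representation."""
    core = crosswalk(record, TO_CORE[native_format])
    return crosswalk(core, CORE_TO_DC)


if __name__ == "__main__":
    native = {
        "compound_name": ["Hypothetical compound"],
        "author": ["Example, A.", "Example, B."],
        "cell_volume": ["1234.5"],  # has no core equivalent, so it is dropped
    }
    print(to_dc(native, "crystal"))
```

Routing every format through one core means that supporting a new native format requires one new mapping table rather than one per target format. The dropped `cell_volume` field shows the cost: detail survives only in the native record, which is why storing and searching the native format remains useful.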

Across the projects described above, repositories are used not only to store and ensure permanent access to publications of multiple types, such as articles, conference papers, reports and books, but also to store and offer access to data sets, research data, images and multimedia. Metadata varies from DC to METS or individual content models for community-specific applications. Links between objects in repositories are now mostly bidirectional. Repositories may be discipline-specific or more general, like institutional repositories.

In future, repositories will increasingly be used for all kinds of data: different publication types, data sets, research data and extra materials such as images and video. Enhanced Publications can be created on the basis of the objects held in repositories. The internal format and repository infrastructure must be flexible enough to deliver common metadata formats such as DC, MODS, DIDL or METS, as well as more community-specific metadata formats. Above all, repository infrastructures must support the OAI-ORE model. The development of systems to manage the complete cycle of e-research and scientific collaboration will be based on repository infrastructures.
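To make the idea of building an Enhanced Publication from repository objects more concrete, the sketch below expresses a publication as an OAI-ORE aggregation: a resource map describes an aggregation, which in turn aggregates the article, data set and image. The `ore:` terms and the `rdf:type` predicate come from the OAI-ORE and RDF vocabularies; the function name and all example URIs are hypothetical, and a real resource map would also carry metadata such as creator and modification date.

```python
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def resource_map_triples(rem_uri, aggregation_uri, resource_uris):
    """Return (subject, predicate, object) triples for a minimal resource map."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
        (rem_uri, ORE + "describes", aggregation_uri),
    ]
    # One ore:aggregates statement per object in the Enhanced Publication.
    for uri in resource_uris:
        triples.append((aggregation_uri, ORE + "aggregates", uri))
    return triples


if __name__ == "__main__":
    # Hypothetical URIs for an article with a data set and an image.
    parts = [
        "https://repository.example.org/objects/article.pdf",
        "https://repository.example.org/objects/dataset.csv",
        "https://repository.example.org/objects/crystal.png",
    ]
    for triple in resource_map_triples(
        "https://repository.example.org/rem/1",
        "https://repository.example.org/aggregation/1",
        parts,
    ):
        print(triple)
```

Because the aggregated objects are identified only by URI, they can live in different repositories, which matches the idea that an Enhanced Publication is assembled from objects already held in repositories.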