
Towards a Web of Culture and Science

Editorial

The rapidly increasing role of the Internet for a knowledge-based society represents a major challenge for cultural heritage and science as we know it today. The crisis of information management in the natural sciences is evident from the ever-growing costs of scientific journals as well as from inadequate access structures, which document the failure to transfer the established publication system of the natural sciences from the print medium to the electronic one. Within the present, commercially dominated system of dissemination and access, science is simply unable to reveal its full impact, so that investments in science fail to reach the returns they could in principle attain. Equally dramatically, cultural heritage as well as the humanities dedicated to its study are in danger of being left behind by technological development. The main body of sources constituting cultural heritage has in fact not yet been transferred to the new medium. This deficit concerns not only the humanities, which vitally depend on information representing cultural heritage, but also society at large.

The humanities and the social sciences so far have been incapable of initiating a dynamics comparable to that launched by the new information technologies in the natural sciences. The imminent risk of culture and science losing ground in the new medium and the society shaped by it has been identified in several international strategy documents pointing to the lack of implementation of cultural resources in the new electronic media, as well as of sustainable standards and adequate tools. Existing programs have, however, so far failed to launch a self-sustaining infrastructure and dynamics for establishing a synergy between cultural heritage, scientific knowledge, and technological potential. If such a dynamics could be launched, on the other hand, it would massively challenge traditional boundaries blocking cooperation between different national cultures and scientific disciplines such as those related to language, thus creating entirely new potentials for research and public culture.

This report is dedicated to the challenges of the ongoing information revolution for culture and science. It gives an account of the recent efforts to create a future-oriented infrastructure on the Web ensuring open access to these fundamental resources of humanity, in particular the Berlin Declaration launched by the Max Planck Society and the European Cultural Heritage Online (ECHO) Initiative. Such an infrastructure should ease the transfer of contents relevant to culture and science from the traditional media to the medium of the future; it should exploit and further develop the technical potential inherent in the Web; and it should turn the Web into an open, interactive, and sustainable public think-tank strengthening our capacity to solve the global problems of mankind. Readers of this booklet are invited to join the process initiated by the Berlin Declaration and the ECHO Initiative by subscribing to an open-access policy and enriching a growing infrastructure for scientific knowledge and cultural heritage, thus giving rise to the Web of the future, which will have to be a Web of Culture and Science.

"Cultural Heritage encompasses material culture, in the form of objects, structures, sites and landscapes, as well as living (or expressive) culture as evidenced in forms such as music, crafts, performing arts, literature, oral tradition and language. The emphasis is on cultural continuity from the past, through the present and into the future, with the recognition that culture is organic and evolving."
The World Bank summary of a meeting in Washington, 26-27 January 1988

The Crisis of Culture and Science in the Information Age

1.1 Challenges of the Information Revolution for Culture

This is a time in which technology and culture seem to be as decoupled from each other as they have ever been in the recent past. Technological visions of progress, in particular, have lost their appeal as guarantors of cultural progress into the bargain. While catchwords such as "information society" or "postgenomic society," not to speak of "traffic of the future," have lost glamour and credibility as promises of a better civil society, scepticism if not outright hostility towards science and technology is spreading. European culture, in particular, jointly created by the homo faber and the homme des lettres, faces a crisis: while it provided the foundation for magnificent technological achievements in a long-range development reaching back to antiquity, cultural heritage and its values are dramatically losing ground in the techno-scientific world that has emerged from it.

The medium of today and tomorrow, the Internet, might in fact leave behind a culture which is the heritage of our past but urgently needed to meet the challenges of the future. This cultural heritage, which binds us together even more strongly than our institutions, is presently in danger of being left behind, of missing the train, so to speak, of the rapid technological developments carrying us into a new information age. Moreover, wars and dwindling public funds for the preservation of cultural heritage are contributing to its rapid degradation. Creating a larger space for culture on the Web would not only be important to the present scientific and public communities, but also help to secure cultural heritage against the threats of war and natural catastrophes for future generations. The Web could preserve digital representations of cultural objects for the memory of mankind and also serve to easily identify originals which have been lost because of theft or plundering, thus helping to reduce their value on the antiquities market. Openly accessible catalogues of the cultural heritage of mankind kept in museums and other collections would therefore constitute a direct and powerful support of the UNESCO preservation policy whose urgency and viability the recent events in Iraq have made all too clear.

At present, however, the bulk of information which forms the core of cultural heritage, the great works of literature and art, as well as the treasures of scientific, scholarly, and philosophical writings going back to the dawn of our civilization, is largely excluded from the information system already constituting the backbone of an ever-more knowledge-based world. And the little culture that is included in the World Wide Web due to the efforts of a few pioneers is almost drowned by the tides of information garbage.

It is precisely the few shining examples of culture on the Web that make evident the potentials of the bulk of information constituting our cultural memory, which is still not represented within the new medium. Among these potentials is the chance of overcoming the fragmentation of cultural heritage by traditional institutions and disciplines. This fragmentation process has been determined, to a large extent, by preservation concerns (it is easier to preserve paintings by gathering them together in the same building, and the same holds true for the preservation of books, archival records, drawings, natural history objects, video tapes, sound tracks and so on). But, once the information has been processed and transformed into a digital entity, it need no longer follow the fate of its physical support. There is thus no reason to store it according to the same systems used to preserve the object it emulates. On the Web, there are no buildings or walls, and we are not obliged to reproduce distinctions based on the topologies of material objects or on the varied nature of their physical shells. It is possible - and indeed necessary - to reorganize digital records into new cognitive architectures, where the strict constraints of the physical world no longer apply.

What is needed is a vision exploiting the new technological possibilities for the creation of a public culture of science, a vision that includes the humanities and thus keeps alive the roots of our techno-scientific world in our cultural history. Such a vision must address a double challenge presenting itself to cultural heritage in the age of the Internet, a quantitative and a qualitative one: the need to make a substantial amount of the sources constituting the cultural memory of mankind electronically available, and the need to create an adequate intellectual, technological, and social infrastructure rendering this cultural memory accessible as a resource for addressing the questions of today, be they scholarly or from an orientation-seeking public.

The deficit in the extent to which cultural information is available on the net is indeed accompanied by another deficit with perhaps even more pernicious consequences: the underdevelopment of cultural techniques adequate to the new information technologies. In the fields of language technology, image analysis, and the implementation of mathematics on the net, that is, in fields of high economic and technological impact, bottlenecks become visible that are related to the neglect of an adequate transfer of the traditional cultural techniques of writing, depicting, and calculating to the new medium. The problems of language technology, for instance, have long been considered merely an engineering challenge and not a field to which the humanities can bring their centuries-long expertise in the linguistic representation of meaning. Meaning is, after all, not only in the text but also in the cultural context so strikingly absent from the new medium. In short, the lack of implementation in the new medium of the cultural information and techniques which are the domain of the humanities represents a major stumbling block to what might otherwise become a second Internet revolution.

This revolution will, however, not take place automatically, merely as a consequence of technical developments, but it requires the creation of new, content-laden information structures which can only result from an effort to overcome the present marriage of ignorance between scholarship and technology. What is needed is not just hardware and software, but a grid-like, open infrastructure supporting the accumulation and extraction of meaning from information distributed over the Web. An infrastructure capable of responding to these challenges will have to support preservation and free access to the cultural heritage of mankind against the ruthless and short-sighted pragmatism of technical and economic progress. It will have to provide public facilities for web-supported training and education. It will have to encourage the free exchange of information and arguments across currently existing social, political, and religious boundaries. It will have to guarantee a lasting memory of mankind comprising a representation of human history in a new form. And it will finally have to develop mechanisms for self-organisation and for the evaluation of the information it makes available. These requirements cannot be fulfilled by merely adding content or by relying on the further evolution of existing technologies. The goals necessitate a massive, well-reflected and carefully concerted push of technological innovation, social organization and content enrichment. Precisely because the creation of such an infrastructure depends on scholarly as well as technological competence, it will not come about without the active participation of the scientific community at large and without the massive support of science policy.

1.2 Challenges of the Information Revolution for Science

The new electronic media of information production and dissemination are in the process of dramatically changing the conditions under which scientific information circulates. They will affect the infrastructure of science no less profoundly than the invention of printing. In the Gutenberg era of printed information, the responsibilities for the main parts of the flow of scientific information are clear: Research results are produced by scientists. They are disseminated by publishers and archived by libraries. Information is filtered by a process of evaluation performed by scientists (peers) and organized by publishers. Only that which survives this filtering process is disseminated. Information is retrieved by scientists using bibliographical tools within an infrastructure offered by libraries. The system is well-established and has been impressively stable. It is now, however, endangered by technological changes with radical consequences. Even within the system of printed information these technological changes are felt in the increasing prices charged by publishers for dissemination, prices which scientific organizations are no longer able to cover.

The information revolution has radically changed the technical and economic basis for maintaining the scientific information flow. Research results are being produced and can be immediately disseminated in electronic form. Dissemination is no longer a cost-intensive component. It can in principle be handled by scientists without the services of the publishers. This is demonstrated, for instance, by electronic research archives for some areas of the natural sciences, which spread research results without any costs for the users once an institution is hooked up to the net. Furthermore, in the electronic medium evaluation follows and does not precede dissemination. This is the outcome of the self-reflecting capacities of a universal representation of knowledge. If this vision of the Internet era is to lead to an acceptable model for the flow of scientific information, significant investments must be made in the solution of two major open problems. The first is that of archiving, ensuring the long-term availability of electronic information. The second is that of an adequate access and retrieval infrastructure.

The forms of scientific representation will radically change as a consequence of the information revolution. There is no reason why future scientific representations should take the forms of journals or books, forms that are largely determined by the print medium and the respective agents. Instead they may take any form suitable as a contribution to a global scientific information network and its structure. Even now we are familiar with a variety of possible forms of representation, ranging from entries in data-bases, via digital archives and collections of links, to interactive research environments. There is, in particular, no longer any reason to preclude access to the information hinterland (observational and experimental data, software tools, historical sources), which presently serves only as a logistic background for published research results. This will help to ensure the reliability of scientific information, to broaden the scope of available resources, and to avoid the duplication of effort. The electronic medium offers scientists the opportunity to reach, practically without delay and at little additional cost, the primary objective of their work, an impact on the body of knowledge of their scientific community. It is therefore in their natural interest to freely broadcast their information rather than to force their fellow-researchers to buy it with a considerable time-lag from a publisher.

The immediacy and in principle unrestricted scope of electronic dissemination increases the likelihood of rapid responses, distinguishing valuable from non-valuable contributions. The quick settling of the cold fusion issue on the Internet even before most discussions appeared in print is a case in point. Of course, even for print information it is, at least in the long run, not peer review but usage that eventually decides on the quality of a scientific contribution. Under the new conditions, the same process can exercise its selective effect much more rapidly. In contrast to traditional peer-reviewing, open peer-commentary as it can be realized in an electronic network does not lose valuable information but rather adds to the available body of knowledge. At the same time, open peer reviewing allows for more differentiated judgements rather than just for an "in/out" decision about publication which does not distinguish quality differences between materials that have survived the selection process. The new medium could thus facilitate and improve the quality of the selection process. An appropriate infrastructure for transforming quality judgement into a navigational aid in the ocean of information is, however, at present still not available.

The question of how to ensure the longevity of electronically represented scientific information is, of course, not only a technical but also an institutional problem. No single type of institution is presently equipped to offer a complete solution. Most probably, only a kind of "New Deal" among the agents in the scientific information flow can provide the basis for a solution. In any case, this problem represents a major challenge that has to be addressed by research organizations not only at an administrative but also at a political level.

Since the costs for information dissemination have been dramatically reduced, the publishers will in the long run risk losing their main source of revenue unless they offer new services, adding value to the scientific information. The new role of libraries in the Internet era is also open. They have to find their place in the new distribution of labour.

1.3 The Insufficiency of Existing Solutions

How can the pernicious situation described above be changed? Different scenarios are conceivable: The "scout solution" is based on the assumption that the transfer of cultural heritage to the new medium can be achieved by pilot ventures only. The other common scenario is the "big player" solution. It essentially assumes that the dominating forces of the economic or academic market will sooner or later take care of bringing cultural heritage to the net.

The big player solution is most familiar from present debates on electronic journals. While the few publishers who hold a near monopoly in certain areas of scientific publishing are indeed offering more and more material on the Internet, their approach has been rightly characterized as a "Faustian deal" in which a fatal price has ultimately to be paid by the scholarly community. In fact, although electronic dissemination is considerably cheaper than print dissemination, journal prices - in general still coupled to those of print subscriptions - continue to increase; worse, the revenue accumulated by the publishers is in general not reinvested in a future-bound infrastructure for scientific information on the Web. On the contrary, the great challenges for such an infrastructure - for example, the archiving problem or the problem of an integrated retrieval environment - remain, for the time being, largely unsolved, menacing the longevity and interoperability of scientific and scholarly information in the electronic medium. This is the fatal price of the Faustian deal. There will be no escape from it as long as the scholarly community has to repurchase from the big players the information it produced in the first place, at the same time being left responsible for its infrastructure on the Web.

The situation of the digital availability of the primary sources of cultural memory is even more problematic. While sceptics are still debating the compatibility between culture and the Web, the big players have long since begun to secure exclusive rights on the reproduction of cultural artefacts and even to purchase important documents and collections with the intention of commercialising their digital images. From the codices of Leonardo da Vinci to the photographs of Ansel Adams, every piece of cultural heritage is a potential asset in this new market. In the hope of spectacular gains, new firms have been founded, claims staked out, and portals opened up with a "gold rush" mentality. And indeed, looking back at its first phase, one already recognizes the typical ruins documenting the transiency of every gold rush: portals promising to become gateways to unimaginable cultural treasures which actually lead nowhere; key documents of European history are, on the other hand, confined to CD-ROMs which are condemned to gather dust until they become outdated with the next generation of software or hardware. Meanwhile they are banished from the World Wide Web, which enlivens and enhances every significant piece of information exposed to it by its self-organizing connectivity, at least as long as this connectivity is not smothered by passwords or pay-per-view access. It has become particularly evident that the big players have failed, in spite of their eagerness to control large domains of cultural heritage, to create an infrastructure that guarantees a steady and reliable flow of this heritage from the old medium into the new. On the contrary, they have contributed to an increasing inaccessibility of cultural heritage - not only because of the restrictive copyright laws they seek to impose but also because sources are now often held back by museums, archives, and libraries in the dim hope of future commercialisation. This hope can, however, hardly be sustained by a practice that amounts to a ruinous exploitation of limited resources rather than representing a concerted effort to augment them.

The scout solution, on the other hand, is based on the assumption that the transfer of cultural heritage to the new medium can be achieved by pilot ventures, perhaps in combination with an establishment of standards for production and dissemination. In contrast to the big player solution, it amounts to the realization that bringing culture to the Internet actually means settling a new continent rather than just exploiting its resources in a gold rush. But it also amounts to the assumption that this can be done by merely sending out a few scouts to survey the new territory and set up a model farm here and there. However, this approach should not be criticized too harshly. As a matter of fact, almost everything presently available in terms of digital libraries demonstrating the potential of the new media for cultural heritage is due to the breakthroughs achieved by this strategy. But it must be legitimate to ask whether this strategy is adequate to meet the principal challenge of the future: the creation of a self-sustaining representation of culture and science in the new medium.

Looking back at the successes and failures of the projects funded by national agencies as well as by the European community, one finds indeed that many of the feasibility studies, pilot projects, test beds, and proofs of concept, however impressive they may be taken by themselves, have actually failed to launch such a self-sustaining dynamics. The dead links, blind alleys, and empty databases characterizing some of the most ambitious homepages of such projects signal that they did not succeed in making a difference for the scientific community at large, let alone for the role of cultural and scientific memory in an Internet society. Such projects are rather like chip factories in the jungle, incapable of setting off productive development because even the most basic infrastructure is lacking.

Admittedly, the humanities, responsible for preserving, exploring, and keeping cultural heritage alive, are a difficult environment for technical innovations. Scholars in the humanities have hardly even begun to realize not only that the new information technologies confront them with a competence problem that is unparalleled in the natural sciences but also that they open up entirely new possibilities to overcome the deeply entrenched boundaries of narrow specialization. The assumption that the humanities can be catapulted into the Internet age by enticing them with exemplary pilot projects reminds a historian of the overly astute attempts of the Jesuits in the 17th century to convert the Chinese mandarins to Christianity: they in fact offered a few extraordinarily beautiful clocks as a gift to the Chinese emperor in the futile hope that he would ask for more European technology and religion once the donated clocks needed rewinding.

The Vision of a Web of Culture and Science

2.1 The Open-Access Initiative and the Agora Solution

In order to initiate the far-reaching upheaval that a comprehensive digitization of our cultural, historical, and scientific heritage would amount to, neither missionary zeal, brute force, nor standardization and coordination efforts alone will do. What is needed is rather an infrastructure that makes scientific contributions as rapidly and effectively available as possible, using the potential of the Internet to constitute a global and interactive representation of human knowledge, including cultural heritage and the guarantee of worldwide access. This is the main goal of the open-access initiative. In order to realize its vision of a global and accessible representation of knowledge, the future Web has to be sustainable, interactive, and transparent. Content and software tools must be openly accessible and compatible. Open access contributions must include original scientific research results, raw data and metadata, source materials, digital representations of pictorial and graphical materials and scholarly multimedia material. An open-access infrastructure should enable users to pursue their specific interests while contributing, at the same time, to a shared body of digitally represented knowledge. This is the key to the agora solution. It aims at launching a dynamics that combines the development of the whole with the benefit of the individual, a combination that has actually been the hallmark of all great civilizing enterprises, beginning with the foundation of the Greek polis, which achieved such a synthesis of interests in its agora. A self-accelerating dynamics leading to an ever-more comprehensive electronic representation of cultural and scientific heritage can only emerge if certain minimal conditions are fulfilled. Among them are the requirements of open access, interoperability, modularity, and interactivity. Only if digital sources are made freely available on the Web, only if the same tools can be applied because they share compatible structures, only if diverse digital collections can be integrated to yield an interconnected whole, and only if it is possible to combine the power of computing with the power of the human mind in the analysis of sources, will a set of data turn into a meaningful representation of human knowledge.

It has turned out that even the most convincing standards, models, or tools will remain island solutions as long as those still lacking expertise in electronic information management or access to appropriate equipment are unable to join in. It would be an error to consider the implementation of the agora solution simply as a matter of technological developments which, once completed, have to trickle down from the initiated to the laymen. It makes just as little sense to develop standards without the tools to implement them as it does to develop tools without understanding the questions they should help to answer. The real challenge of the agora solution is thus to achieve an integration of intellectual and technical work and to promote technological developments which are driven by content. Its realization therefore presupposes an environment in which not only technology but also the knowledge about its innovative application to the pressing problems of culture and science is spreading.

In summary, the aim of the agora solution is to establish an open-source culture of the public and scholarly exploitation of cultural and scientific heritage on the Internet, comprising the promotion of content-driven technology in information management. The resulting infrastructure should allow every scientific institution, archive, library, museum, or educational institution to make their resources available online with little effort and in a way that guarantees their interoperability with other representations of human knowledge. In order to make participation in the agora attractive, every potential contributor should gain a surplus value when entering the agora by making contents or tools available on the Web. In particular, all possible meaningful links between a newly available corpus of materials and the already existing ones should be enabled; tools developed for particular aspects of culture or science should be transformed into modules of a universal working environment applicable to all pertinent domains of human knowledge.

2.2 A Web of Culture and Science

The establishment of an open-access infrastructure must go along with a further transformation of the Web. The Web represents a powerful achievement in the connectivity of human knowledge that until recently seemed inconceivable. The revolution it has caused has rightly been compared with those of the invention of writing and of printing technology. But the rapid development of the Web itself is about to outpace that of the basic technologies on which it rests. In its present form, the Web makes more promises than it can actually keep - at least as long as it is restricted to the specific paradigm that originally gave rise to it.

As was the case when the Internet was created by turning a network of computers into a medium representing a universal hypertext, its future, too, will depend on requirements and possibilities that are revealed only in the context of innovative usage scenarios. Such usage scenarios will emerge when the Internet is used as a virtual public think-tank, a web of culture and science serving as a medium of reflection on current global challenges of human civilization such as the destruction of ecological equilibria, the social impacts of epidemics and drug addiction, or terrorism and other devastating consequences of oppression and increasing mass impoverishment.

If the natural sciences and the humanities are to remain capable, even under these conditions of global challenge, of providing the knowledge crucial for solving the problems of the human species, then this knowledge must also be represented, integrated, and made available in a form allowing for global orientation and action. It is time to take up the opportunity offered by the Internet to create such a medium of global human reflection.

Due to its origin in the idea of hypertext, the World Wide Web is centred on textual data enriched by illustrative insertions of audio-visual materials. The status quo paradigm of the Web is a client-server interaction, that is, a fundamentally asymmetric relationship between providers inserting content into the Web hypertext (server) and users who essentially read texts or provide answers to questions by filling out forms (clients). The hyperlinks of the Web represent structures of meaning that transcend the meaning represented by individual texts, but, at present, these "webized" structures of meaning, lacking any longevity, can only be used blindly, for example by search engines, which at best optimize navigation by taking into account the statistical behaviour of web users. However, these meaning structures themselves can so far hardly be made the object of interventions by the web community. There is at present no way to construct complex networks of meaningful relations between web contents. In fact, the providers have no influence on the links to the contents provided by them and the users have no impact on the available access structures to the content, except by becoming content providers themselves.

This asymmetric client-server relation largely determines the functionalities of the existing web-software. Web servers are not the standard tools of users, while the web-browsers used by them are restricted to accessing existing information in a standard form with only limited possibilities of further processing that information (such as e.g. changing fonts and background colours). As a consequence, the present Web offers no possibility for (radically) different views of the same underlying content, depriving users of the creative potential inherent in the dynamics of the ever-changing Web hypertext.

The Web of the future will thus continue to be essentially based on the representation of meaning by text. However, contrary to the existing web, its emerging paradigm is no longer constituted by the client-server asymmetry but by informed peer-to-peer interactions, that is, by a cooperation of equally competent partners who act as providers and users of content at the same time. Future users will work on shared knowledge by constructing new meaning while accessing the existing body of knowledge represented in the Web through meaningful links to texts and text corpora. An important framework for creating such meaningful links can be provided by what is presently discussed as the semantic web, that is, the automated creation of links between machine-understandable metadata. In a further perspective, however, such semantic linking will not be restricted to the use of specifically prepared metadata sets but will exploit the meaning structure of the Web itself in order to provide a content-based semantic access to information.
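
To make this somewhat abstract notion concrete, the following sketch (in Python, with invented URLs and property names) illustrates in a minimal way how machine-readable metadata can give rise to automatically generated links between resources. It illustrates the principle only; it does not describe the actual mechanisms of the semantic web or of the ECHO environment.

# A minimal sketch (not ECHO's actual data model) of how machine-readable
# metadata can be used to create links between web resources automatically.
# All URLs and property names below are invented for illustration.

from collections import defaultdict

# Each resource is described by (property, value) pairs, in the spirit of
# subject-predicate-object triples.
metadata = {
    "http://example.org/texts/galilei-discorsi": {
        ("author", "Galileo Galilei"), ("topic", "mechanics"),
    },
    "http://example.org/texts/newton-principia": {
        ("author", "Isaac Newton"), ("topic", "mechanics"),
    },
    "http://example.org/texts/leibniz-dynamica": {
        ("author", "G. W. Leibniz"), ("topic", "dynamics"),
    },
}

def semantic_links(metadata, property_name):
    """Link all resources that share a value for the given property."""
    by_value = defaultdict(list)
    for resource, pairs in metadata.items():
        for prop, value in pairs:
            if prop == property_name:
                by_value[value].append(resource)
    links = []
    for value, resources in by_value.items():
        for a in resources:
            for b in resources:
                if a < b:  # avoid duplicates and self-links
                    links.append((a, f"shares {property_name} '{value}' with", b))
    return links

for link in semantic_links(metadata, "topic"):
    print(*link)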

A basic faculty of human thinking is in fact the ability to reflect on existing knowledge and to produce metadata in a much more general sense than the term presently suggests. The outcome of such reflections typically constitutes a network of meaningful relations. Although the representation of such networks of reflection within the Web does not raise unsolvable technical problems, the very mechanism of such a network of reflection can only to a very limited extent be mapped onto the structures of the Web due to the incompatibility of its traditional paradigm with the creation of meaningful relations between contents. The generation of the specific kind of metadata produced by the sciences and the humanities through analysing, annotating, and reformulating the content of texts is hardly supported by the present infrastructure of the Web. For this reason interactive working environments have to be developed that support the creation of such metadata, and which can then also be used for an improved navigation through scientific contents.

The new infrastructure of the Web which could emerge from a realization of the explanatory power of metadata in this more general sense provides a solution for a serious problem of the current as well as of the future Web, namely how to produce order in the ever-growing complexity of the Web by content-sensitive linking. Even the most sophisticated search engines will reach their limits as long as the search criteria can at best exploit the statistics of human decisions about the quality of data. If, however, navigation can be based on content-specific metadata resulting in dynamically changing ontologies, the situation will change. Future "knowledge weaving web environments" will have powerful link-editing functionalities and thus engender a self-organizing mechanism of the Web, improving its hypertext structure.

The Berlin Declaration and the Implementation of the Vision

3.1 The Berlin Declaration

How can the vision of a Web of Culture and Science be realized? Recently, for the first time the humanities joined forces with the natural sciences in an effort to create a common infrastructure for the representation of culture and science on the Internet. In October 2003 the Max Planck Society, together with the ECHO Initiative, held an international Open Access conference in Berlin. Rather than just adding a new temporary project, the ECHO Initiative aims at creating the core of a future permanent infrastructure to guarantee open access to cultural heritage in Europe. With its Heinz Nixdorf Center for Information Management and "e-Lib", an electronic library without walls, the Max Planck Society has already created an innovative, common infrastructure for its scientists. This infrastructure provides a promising platform for extending the open access culture, already well established in the domain of natural sciences, to include the humanities as well. This joining of forces between the sciences and the humanities was manifested in the "Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities," publicly requesting a profound change in the dissemination of scientific knowledge. The declaration has been signed by the Max Planck Society together with the alliance of the German non-university research organizations as well as by international science and cultural heritage organisations. The ECHO Initiative took the occasion of the conference to make a first group of major collections, covering such diverse fields as the history of arts and architecture, anthropology, linguistics and the history of science, freely available in its new open access environment.

3.2 The ECHO Initiative

The ECHO (European Cultural Heritage Online) Initiative is one of the first major projects funded by the EU Commission to directly bridge the gap between the social sciences and humanities on the one hand and the new information technologies on the other. In its initial phase, sixteen partner institutions from nine European countries, including candidate countries, are set to integrate content and technology in a pan-European infrastructure adequate to the Internet age. The ECHO consortium has set itself a charter to ensure basic values, goals and restrictions. These general propositions include the free availability of tools and content (in particular European cultural heritage) on the Internet, the support of open standards, measures for long-term archiving and the provision of a common infrastructure.

3.3 Basic Functionalities

The ECHO Initiative has made available seed collections including digitized texts and images of sources, video films, as well as scholarly metadata. The seed collections presently covered by the project range from cuneiform tablets to historical sources, from texts by Galilei, Leibniz and Newton to language and psychology research data, and from ethnological representations to collections on the relativity revolution. Seed collections are, by definition, constituted by digitized collections of cultural heritage that are freely available and sufficiently structured to allow the cumulative association of further materials and instruments. Such seed collections should, in particular, offer criteria as well as tools for adding further content, ensuring as far as possible that all possible meaningful links with the contents already in existence can be automatically or interactively implemented.

Much of the material now freely available could only be digitized because the ECHO project helped institutions all over Europe to overcome the competence and technology thresholds for entering the Web. Crucial for lowering the threshold of active contributions to a Web of culture was the idea of an open-access kernel. An open-access kernel is constituted by a grid-like basic infrastructure of the Web with distributed, modularly interlocking contents and tools which are both freely available and serve to fulfil the basic needs associated with a digital representation of cultural heritage.

Unlike specifically developed digital library environments, the more universal ECHO environment offers interoperability between corpora and facilities for web-based collaborations. Among the basic functionalities of its open-access kernel are the zooming of images and tools for web-based commenting and annotation. Among the features envisaged for a fully developed open-access kernel are also distributed archival solutions with stable metadata for the contents and tools, the coordinated display of images and texts, language technology, core ontologies for the major types of sources, web-based environments for generating, handling, presenting, and annotating documents, and web-based services supporting the translation and semantic analysis of texts.
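
To give a concrete, if simplified, impression of the annotation functionality, the following Python sketch shows what a web-based annotation record might minimally contain. The field names, the URL, and the region convention are invented for illustration and do not reproduce the data format actually used within ECHO.

# A minimal sketch of what a web-based annotation record might contain;
# an illustration only, not the format actually used by the ECHO environment.
from dataclasses import dataclass, asdict
import json

@dataclass
class Annotation:
    target: str      # URL of the annotated source (e.g. a page image or text)
    region: str      # reference to a part of the target, e.g. an image region
    body: str        # the scholarly comment itself
    author: str      # who wrote the annotation
    created: str     # ISO date, so the annotation remains citable over time

note = Annotation(
    target="http://example.org/sources/florence-cathedral/page-12",  # hypothetical URL
    region="xywh=140,220,400,80",  # one possible convention for a rectangular region
    body="This sketch shows the herringbone brickwork of the dome.",
    author="A. Scholar",
    created="2003-10-22",
)

# Stored as plain JSON, such records remain independent of any single viewer.
print(json.dumps(asdict(note), indent=2))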

The language technology embedded in the existing open-access kernel allows the integration of distributed language resources into local data processing, such as morphological analysis and searching, as well as linking to dictionary entries. The fully developed ECHO environment will allow for searches across morphological forms and across languages. Such technology thus offers the precondition for identifying, accessing, extracting and processing specific contents of text documents independently of the particular language in which they are represented. This may also serve as an illustration of how a content-driven technical innovation can foster the development of fundamentally new ways of semantic linking in the Web. It would be desirable to extend this technology soon to also cover formal languages such as those embodied in mathematical and chemical formulae.
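
The principle behind searching across morphological forms can be illustrated by a deliberately small Python sketch: word forms are mapped to a common lemma, the text is indexed by lemma, and a query for any inflected form then retrieves the occurrences of all forms. The miniature lexicon is invented; real morphological analysis and cross-language retrieval are, of course, far more involved.

# A toy illustration of lemma-based search across morphological forms.
# The tiny Latin lexicon below is invented for the example.

LEMMAS = {
    "motus": "motus", "motu": "motus", "motum": "motus",          # "motion"
    "corpus": "corpus", "corpora": "corpus", "corporum": "corpus",  # "body"
}

def lemma_index(tokens):
    """Map each lemma to the positions where any of its forms occurs."""
    index = {}
    for pos, token in enumerate(tokens):
        lemma = LEMMAS.get(token.lower(), token.lower())
        index.setdefault(lemma, []).append(pos)
    return index

text = "De motu corporum corpora in motu manent".split()
index = lemma_index(text)

query = "motum"                      # any inflected form works as a query
hits = index.get(LEMMAS.get(query, query), [])
print(f"'{query}' matches positions {hits}: {[text[i] for i in hits]}")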

3.4 The Max Planck Strategy

The Max Planck Society is a research organisation active in almost all fields of science and in the humanities. Its scientific communities are thus very diverse in their needs and expectations regarding dissemination systems. Some communities are far advanced in using the Internet for publishing and conducting scientific discourse. Mathematics, computer science, astronomy and physics operate intensely with Internet resources but keep some conventional publication channels. The life sciences use the Internet for global cooperation, increasingly also for publication of primary research results and for their scientific quality management. Traditional publication channels for archival and interdisciplinary communication remain dominant. In the humanities the traditional channels of scientific discourse through books and journals still prevail, but a rapidly growing community is developing high-end digital systems for representing primary sources of text, artefacts and other objects in high quality with editorial material (transcripts, language corpora) and with multiple links to related resources.


In view of the varying time scales of the transition from print to the new media, the Max Planck Society adopted at an early stage a dual strategy towards the transfer of the scientific process to the Internet age. The dual strategy aims at versatile access to scientific primary and secondary information for all scientists of the Society without the traditional limitations of local physical libraries. The dual strategy consists of a more consumption-oriented and a more production-oriented wing, represented by hosting traditional and commercial electronic information sources, on the one hand, and by building up innovative multimedia dissemination systems, on the other. Taken together, this system represents a virtual library without walls of the entire Max Planck Society. Along the consumption-oriented strategy, this virtual library will hold traditional journals in electronic form and databases for information retrieval (the so-called e-lib module). Along the production-oriented strategy, the virtual library will assemble primary resource collections, electronic journals edited by communities organised by members of the society, the archives of the society and an institutional repository of the scientific achievements of the society (the so-called e-doc module). The latter module is characterised by a high degree of internal linking between related resources, by an interactive character and by high-end digital representations of objects and processes of scientific relevance (paintings, plans, video documents, observation data).

The realisation of the library without walls of the Max Planck Society will modularly encompass existing technologies but will also require the development of novel technologies, both on the level of the metastructure of the system and for individual collections. Interfaces to similar projects of other organisations (open archive standard), digital object locators and multi-institutional backend long-term archival and backup systems will be required to ensure the sustainable growth and the reliability known so far from physical libraries.
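
The "open archive standard" referred to here is presumably the OAI protocol for metadata harvesting. Under that assumption, and using a hypothetical repository address, the following Python sketch indicates how such an interface can be queried for Dublin Core records; a production harvester would additionally handle resumption tokens, error conditions, and incremental updates.

# A minimal sketch of harvesting metadata over OAI-PMH. The endpoint URL is
# hypothetical; real harvesters also process resumption tokens and errors.

import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def list_titles(base_url):
    """Fetch one page of Dublin Core records and return identifier/title pairs."""
    url = f"{base_url}?verb=ListRecords&metadataPrefix=oai_dc"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    results = []
    for record in tree.iter(f"{OAI}record"):
        identifier = record.findtext(f"{OAI}header/{OAI}identifier")
        title = record.findtext(f".//{DC}title")
        results.append((identifier, title))
    return results

if __name__ == "__main__":
    # Hypothetical endpoint; replace with a real repository's OAI interface.
    for identifier, title in list_titles("http://example.org/oai"):
        print(identifier, "-", title)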


The installation of such a system that conveniently enables the generation of peer-reviewed electronic journals will put substantial pressure on the commercial and institutional players of the traditional system based on exclusive licensing to open their products to public access. The Max Planck Society aims for a co-operation and an integration of the two wings of the strategy, provided that all stakeholders of the process agree to the open access approach.


The library without walls will, in the future, provide a key resource for intelligent ways of scientific data mining. With the ever-increasing complexity of research fields and with the rapid growth of knowledge in the fundamental sciences that are needed for the development of technological applications, we shall see the evolution of data-mining tools that are far more powerful than present search engines in finding and analysing relations in the content of the information. Today such relations are commonly used only with regard to metadata in a formal sense (e.g. citations).

The modularity of the library without walls makes it possible for other institutions with comparable needs to share the achievements of the developments within the Max Planck Society and vice-versa. It furthermore ensures that the library without walls will become another important cornerstone of a future Web of Culture and Science.

3.5 The Berlin Road Map towards Open Access

The Max Planck Society has taken a leading role in Europe in promoting the change of the scientific dissemination system from a closed-user-group structure into a resource with open access for everybody. This commitment rests upon the principle that publicly funded research should be transparent, and upon the recognition that fundamental scientific and cultural knowledge drives the technological and social evolution that forms the basis of sustainable development.

This process is being pursued in concert with similar movements, known as the Budapest process and, in the US, the Bethesda process. The current European Berlin process is, however, like the preceding activities, only useful if it creates practical consequences leading to the implementation of open access structures in the scientific discovery process. These structures should be international in scope so as to reflect the global nature of the scientific discovery process.

In recognition of this, signatories of the Berlin Declaration are expected to define a road map towards the implementation of open access. This road map naturally has two tracks: one pointing towards the inner structures of an institution and one towards the outside, namely the other co-signatories of the declaration. An international follow-up conference in 2004 will be staged together with the US Bethesda initiative to advance the coordination process and to review the progress made so far. To this end, elements of the road map of the Max Planck Society may serve the other signatories as examples of possible measures and requirements for achieving the core demands of the Berlin Declaration.

The first priority on the internal and on the external track is the creation of awareness of the issue. The average scholar and their decision-makers have until now been, and still are, barely aware of the information revolution. For this reason, on the internal track a series of seminars and presentations to the researchers within the Max Planck Society will be conducted to create an information base amongst those using and producing the relevant content. On the external track, talks will be initiated about the institutional consequences, about issues of quality control and the transfer of the peer-review process, and about the recognition of open access publication in career evaluations.


The second priority is to create practical experience with open access publication. On the internal track the necessary electronic library and publishing toolkits need to be set up and provided as open source to all who are interested. To this end the Max Planck Society operates the Heinz Nixdorf Center for Information Management, providing the design and partly the development capacity for the necessary infrastructure. An alliance with an institutional provider and hosting agent (the FIZ Karlsruhe) is being implemented to support and consolidate the initial efforts emerging from within the Society to build up an electronic document server, a virtual library, and "living reviews," as well as other electronic journal initiatives. The administrative structures to support and steer the development on a Society level have been created. On the external track a national network of institutions from the list of signatories of the Berlin Declaration will be put together to share and complement the systemic developments of the Max Planck Society. The coordinating function of the Federal Ministry of Education and Research is needed here. Negotiations have started to ensure the active participation of the national political structures. The earlier the institutional system in all its facets becomes operational, the sooner its exemplary effect will convince other players to convert to open access.

A further priority on the internal track is to redirect the financial streams within the Max Planck Society to support the development and operation of the central electronic library. This will be done by re-focusing distributed funds previously used for local content acquisition and by the allocation of fresh money for the development and operation of the central structures. External funding for development projects will be sought, like the support from the Heinz Nixdorf Foundation that was instrumental in initiating the Max Planck Society's lead into the electronic information domain. Additional significant funds for supporting upfront publication costs (page charges) in external open access media for the individual researcher will be available from 2005, at least for a transitory period, to encourage the acceptance of the new media. On the external track, negotiations with German funding agencies will be conducted to support publications in open access media as an integral part of research project costs.


A pressing but complex priority is the legal issue of open access. The legal basis needs to be created for granting the public the right to use open access material. In a first step, a legal opinion is required to identify potential conflicts with present copyright legislation. From there, the necessary modifications will have to be brought to the political institutions. On the internal track the MPI for International and Private Law holds excellent expertise on these issues and will advise the Max Planck Society. One particular issue is the adaptation of the work contracts of all co-workers of the Max Planck Society to reinforce the practice of granting exclusive copyright on publications to third parties only under tight temporal restrictions. It is obvious that on the external track a harmonisation of these issues on the national and European level will be required. These modifications will require a significant amount of time.

The Max Planck Society seeks to internationalize its efforts in open access. To achieve a sufficient impact on current publishers to change over to a meaningful form of open access, it is essential that the scientific community, represented by eminent research organizations, expresses its needs and desires with one voice. It would be preferable to convert a large number of existing journals, with their functioning reviewing structures, to the new system rather than to erect parallel activities with overlapping and competing tasks. The group of signatory institutions of the Berlin Declaration should thus try to align their political targets with those of other major open access initiatives so as to stand together as a network of initiatives. The Max Planck Society will use its contacts to foster the process of forming an alliance for open access.

Efforts will be made on the external track to establish a standing conference at working level among the signatories of the Berlin Declaration. This forum should meet regularly and prepare the harmonisation efforts. It should further watch the progress of implementation and alert the leadership when serious obstacles occur. Finally, it should further develop the process and coordinate the implementation of new concepts that are bound to result from the roll-out of an initial critical nucleus of open-access media. The measures indicated are expected to be effective within the next three years, many of them requiring only the action of internal bodies within the next eighteen months.

The Next Steps

4.1 How to join the ECHO Initiative

The ECHO Initiative has started during its pilot phase to build up an open infrastructure which allows a growing number of institutions beyond those who started the initiative to join the endeavour. Already in this phase several new collections have been incorporated into the seed collections of the first phase by projects and institutions who have joined the initial partners of the ECHO Initiative in the course of this phase, and several others have expressed their intention to participate. This immediate response demonstrates the potential of existing activities which urgently need a stable infrastructure that guarantees the interoperability and long-term availability of the results of their work. The rapid accumulation of new seed collections also demonstrates the ease with which an adequate infrastructure can achieve results without high costs and without extensive support, but with highly motivated institutional and personal engagement.

The seed collections which have been made freely available on the Web in the first ten months since the founding of the ECHO Initiative comprise sources on the origins of writing, a collection of cuneiform tablets, a collection of books and manuscripts on natural philosophy and mechanics by authors ranging from Aristotle to Einstein, photos of construction details of the Florentine Cathedral, as well as texts and images in the field of the life sciences, video sequences demonstrating the intuitive physical knowledge of children, a comparison of European sign languages, and ethnological collections.

The ECHO presentation environment is based on standardized XML formats. It serves as a tool for presenting materials such as those described above. The environment allows for the coordination of texts and images and offers an image viewing tool with powerful functionalities for scaling high-resolution images according to constraints set by the browser and for referencing specific parts of these images. Thumbnails of such images help in navigating through books and manuscripts much as one would browse a physical copy. For several languages, digitized texts are automatically analyzed upon uploading by means of language technology and linked to available dictionaries. Moreover, this application of language technology makes it possible to search for all morphological forms of words and word combinations. A powerful tool is being developed for editing and semantically analyzing texts in XML format.
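
The server-side step behind such an image viewer can be pictured, in simplified form, as cropping a referenced region out of a high-resolution scan and scaling it to the constraints reported by the browser. The following Python sketch, using the Pillow imaging library and placeholder file names and coordinates, illustrates the idea; it is not a description of the ECHO viewer itself.

# A rough sketch of serving a scaled detail of a high-resolution scan.
# File names and coordinates are placeholders, not ECHO's actual data.

from PIL import Image

def render_region(scan_path, region, max_width, max_height):
    """Crop region = (left, top, right, bottom) from the scan and scale it
    so that it fits within max_width x max_height, preserving the aspect ratio."""
    with Image.open(scan_path) as scan:
        detail = scan.crop(region).convert("RGB")
        detail.thumbnail((max_width, max_height))  # only ever shrinks the image
        return detail

if __name__ == "__main__":
    # Example: show a referenced part of a digitized page in an 800x600 viewport.
    view = render_region("manuscript_page_012.tif", (300, 900, 2300, 1500), 800, 600)
    view.save("detail_for_browser.jpg", quality=85)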

The growing network of institutions and projects using these opportunities will in the future be supported and maintained by a number of innovation centers to be founded. Their task will be to offer advice concerning the digitization of sources, to provide server and storage facilities, to support the integration of projects into the ECHO presentation environment, and to support the creation and implementation of open-source software used within the ECHO Initiative. Such centers should provide the resources to generalize and disseminate tools developed within the ECHO community.

At present, as long as such centers do not exist, the Max Planck Institute for the History of Science, coordinating the pilot phase, serves as contact for all proposals directed at realizing this infrastructure. In particular, support within the framework of presently available resources will be provided to make sources and tools freely accessible on the Web.

Please contact:

Simone Rieger
Project Coordinator
Max Planck Institute for the History of Science
Wilhelmstr. 44
D-10117 Berlin
Germany
e-mail: rieger@mpiwg-berlin.mpg.de

4.2 How to sign the Berlin Declaration

The Berlin Declaration emphasises the need for new institutional boundary conditions required for fostering the transition of the sciences and the humanities to the Internet Age. It was signed on October 22, 2003 by the President of the Max Planck Society, Peter Gruss, together with representatives from other large German and international research organizations. In signing the Berlin Declaration the research organizations advocate the consistent use of the Internet for scientific communication and publishing. Their recommendations in favor of open access are directed not only at research institutions but also, and to the same extent, at institutions of culture. The initial signatories have expressed their hope that their initiative will be joined by other representatives of leading organizations and institutions, since only a global implementation of the open access paradigm will be able to ensure that the Web of the future will be a Web of Culture and Science.

Governments, universities, research institutions, funding agencies, foundations, libraries, museums, archives, learned societies and professional associations who share the vision expressed in the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities are therefore invited to join those who have already signed the Declaration.

Please contact:

Prof. Dr. Peter Gruss
President of the Max Planck Society
Hofgartenstraße 8
D-80539 Munich
Germany
e-mail: praesident@gv.mpg.de

A printed version of this report is available on request.