Towards a Web of Culture, History, and Science
A strategy for content-driven technological innovation in the framework of the ECHO-Initiative
Jürgen Renn (MPIWG)
Berlin, 25 August 2003
The Crisis of European Culture
European cultural heritage is at present dramatically losing ground in the techno-scientific world that has emerged from it. In particular, the bulk of our cultural heritage, all the great works of history, literature and art, the treasures of scientific, scholarly, and philosophical writings, is still strikingly absent from the medium of the future, the Internet. And what is included in the Web due to the efforts of some pioneering projects is far from shaping its infrastructure; rather, it is almost drowned by the tides of information garbage. The deficit in the extent to which cultural information is available on the net is paralleled by the underdevelopment of cultural techniques and infrastructures adequate to the new information technologies. In short, the media transfer of cultural heritage from the Gutenberg galaxy to the Internet is evidently not a self-accelerating process that takes place by itself or is triggered by existing intellectual, technological, or economic forces.
How can this pernicious situation be changed? Different scenarios are conceivable:
- The "scout solution" is based on the assumption that the transfer of cultural heritage to the new medium can be achieved by pilot ventures alone. Evidently, this approach has failed to launch a self-sustaining dynamic.
- The "big player" solution essentially assumes that the dominating forces of the economic or academic market will take care of bringing cultural heritage to the net. However, in spite of their eagerness to control large domains of cultural heritage, often imposing proprietary restrictions on free accessibility, the big players have also failed to create an infrastructure that guarantees a steady and reliable flow of this heritage from the old medium to the new.
In order to initiate the far-reaching upheaval that a comprehensive digitisation of our cultural, historical, and scientific heritage would amount to, neither missionary zeal, nor brute force, nor standardization and coordination efforts alone will do. What is needed is rather an infrastructure that enables every single participant in this process, including the interested public and schools as well as those not yet involved in it, to pursue their specific interests while contributing, at the same time, to a shared body of digitally represented knowledge. This is the key to the "Agora solution" adopted by the ECHO (European Cultural Heritage Online) Initiative.
Its aim is to establish an open-source culture of the public and scholarly exploitation of cultural heritage on the Internet. This idea comprises the promotion of content-driven technology in information management - instead of the usual technology that puts constraints on content. The resulting infrastructure should allow every archive, library, museum, educational or scientific institution to make their sources available online with little effort and in a way that guarantees their interoperability with other elements of the European cultural heritage. Every potential ECHO-Associate will therefore gain a characteristic ECHO-surplus value when entering the Agora by making contents or tools available on the Web.
The ECHO-surplus value can be achieved by:
- transforming tools developed for particular aspects of cultural heritage into modules of a universal working environment applicable to all pertinent domains of cultural heritage,
- enabling all possible meaningful links, for instance between texts and dictionaries, but also conceptual links, for instance between sources documenting Renaissance art and sources documenting related contemporary scientific and technological knowledge,
- launching a self-sustaining dynamic leading to a steady increase of the cultural heritage available on the web, and to the development of ever more sophisticated instruments for its analysis and dissemination,
- presenting content that until now could not be made freely accessible on the web to the scientific and general public, and
- making computer-assisted tools, hitherto prevented from becoming standard instruments, available to a broad community of users.
The ECHO-Initiative is at present supported by or has established ties with major institutions capable of investing content, competence, and/or technology into such an Agora. Among them are major organizations of basic research, universities, technological centers, museums, archives, libraries, firms, projects, and foundations such as those indicated on the transparency.
The Web of the Future
Realizing the ambitious goals of ECHO is not just a matter of filling the Web with content; it also means turning content into a motor for the innovation of the Web itself. The aim is to transform the Web from an ephemeral communication network of providers to an enduring representation of the shared knowledge embodied in the cultural, historical, and scientific heritage of mankind. This brings me to the vision of a Web of culture, history, and science.
The Web represents a powerful achievement in the connectivity of human knowledge that until recently seemed inconceivable. The revolution it has caused has rightly been compared with those brought about by the invention of writing and the invention of printing. But the rapid development of the Web itself is about to surpass the basic development of the technologies it is based on. In its present form, the Web makes more promises than it can actually keep, at least as long as it is restricted to the specific paradigm that originally gave rise to it. In particular, it lacks longevity, interactivity, and transparency.
Just as was the case when the Internet was created by turning a network of computers into a medium representing a universal hypertext, its future will depend on requirements and possibilities that are revealed only in the context of innovative usage scenarios. Such usage scenarios will emerge when the Internet is used as a virtual public think tank serving as a medium of reflection on the global challenges of human civilization, such as the destruction of ecological equilibria, the social impact of epidemics and drug addiction, or terrorism and other devastating consequences of oppression and increasing mass impoverishment.
If both the natural sciences and the humanities are, even under these conditions of global challenges, to be capable of providing the knowledge crucial for solving the problems of the human species, then this knowledge must also be represented, integrated, and made available in a form allowing for global orientation and action. It is time to take up the opportunity offered by the Internet to create such a medium of global human reflection.
As yet, nobody knows whether or not the net will be developed in this direction and what precisely it will look like if it does become such a public think tank, but some requirements are evident. A future web of culture, history, and science will have to
- offer free access to cultural heritage;
- guarantee an enduring memory of mankind;
- encourage the free exchange of information and arguments across still existing social, political, and religious boundaries;
- develop mechanisms for self-organisation and for the evaluation of the information it makes available.
In its present form, however, the Web still lacks such longevity, interactivity, and transparency. To realize a web of culture, history, and science, it is therefore not sufficient merely to make the current web more efficient. The status quo paradigm of the present Web is client-server interaction, that is, a fundamentally asymmetric relationship between providers inserting content into the Web hypertext (server) and users who essentially read texts or provide answers to questions by filling out forms (clients). The hyperlinks of the Web represent structures of meaning that transcend the meaning represented by individual texts, but, at present, these webized structures of meaning, lacking any longevity, can only be used blindly, e.g. by search engines, which at best optimize navigation by taking into account the statistical behavior of web users. So far, however, these meaning structures can hardly themselves be made the object of interventions by the web community. There is at present no way to construct complex networks of meaningful relations between web contents. In fact, the providers have no influence on the links to the contents provided by them and the users have no impact on the available access structures to the content, except by becoming content providers themselves.
This asymmetric client-server relation largely determines the functionalities of the existing web software. Web servers are not the standard tools of users, while the web browsers used by them are restricted to accessing existing information in a standard form with only limited possibilities of further processing that information (such as changing fonts and background colors). As a consequence, the present Web offers no possibility for (radically) different views of the same underlying content, depriving users of the creative potential inherent in the dynamics of the ever-changing Web hypertext.
The emerging paradigm of the future Web is no longer constituted by this client-server asymmetry but by informed peer-to-peer interactions, that is, by a cooperation of equally competent partners who jointly act as providers and servers at the same time. Future users will work on shared knowledge by constructing new meaning while accessing the existing body of knowledge represented in the Web through meaningful links to documents and document corpora. An important framework for creating such meaningful links can be provided by what is presently discussed as the "semantic web," that is, the automated creation of links between machine-understandable metadata. In a further perspective, however, such semantic linking will not be restricted to the use of specifically prepared metadata sets but will exploit the meaning structure of the Web itself in order to provide a content-based semantic access to information.
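The idea of automatically creating links from machine-understandable metadata can be illustrated with a minimal sketch. It is not an ECHO component or a semantic-web standard implementation: the record identifiers, the `subject` field, and the `shares-subject` relation are all hypothetical, chosen only to show how links might be derived from metadata rather than inserted by hand.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical metadata records (identifiers and subject terms are invented).
RECORDS = {
    "doc:art-treatise": {"subject": {"perspective", "drawing"}},
    "doc:engineering-notebook": {"subject": {"mechanics", "drawing"}},
    "doc:mechanics-manuscript": {"subject": {"mechanics"}},
}

def derive_links(records):
    """Return sorted (source, relation, target) triples for every pair of
    documents whose metadata share at least one subject term."""
    by_term = defaultdict(list)
    for doc_id, meta in records.items():
        for term in meta["subject"]:
            by_term[term].append(doc_id)

    links = set()
    for term, docs in by_term.items():
        # Sort so that each unordered pair yields exactly one triple.
        for source, target in combinations(sorted(docs), 2):
            links.add((source, f"shares-subject:{term}", target))
    return sorted(links)

for triple in derive_links(RECORDS):
    print(triple)
```

In this toy setting the art treatise and the engineering notebook become linked through the shared term "drawing" without either provider having authored the link, which is the kind of cross-domain connection the text has in mind.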
The aim of the initiators of the ECHO project is to help close the gap between the social sciences and humanities and the new information technologies in order to establish a new quality of access to cultural heritage on the Web, thus transforming the latter from an ephemeral communication network of providers to an enduring representation of the shared knowledge embodied in the cultural heritage of mankind. However, in order to represent culture adequately on the web, what is required is not only a reorientation from the currently prevailing use of the net for commercial and topical issues to its use as a publicly accessible comprehensive knowledge system, but also the development of new technologies supporting such a reorientation. The functionality of the software that is needed for a web of culture, history and science will accordingly depend not so much on exploiting technological potentials immediately at hand, such as the increase of mass storage capacity or of transmission rates, but rather on the requirements of the content to be represented and processed. In turn, the content-driven development of such technology may lead to fundamentally important contributions to the infrastructure of the web of the future, leading for instance to a new generation of browsers which might then more adequately be designated as "knowledge weavers."
The design of such software can be based on experiences and technical developments already accumulated by institutions with competencies in the web processing of cultural heritage data. By way of conclusion, let me briefly survey some of the main issues addressed by the software platform needed as an infrastructure for building up a web of culture, history, and science:
- Basic data structures: The platform has to be based on the processing of XML documents exploiting the hitherto unused potential of this standard of the future.
- Data transformations: It will provide facilities for processing the browsed data according to the content they represent.
- Natural language technology: The platform has to enhance the range and scope of language technology.
- Data on data: Taking into account that a basic faculty of human thinking is the ability to reflect on existing knowledge and to produce, so to speak, data on data, the software platform to be developed has to support the creation of scholarly metadata by interactive working environments, resulting not only in qualitatively improved relations between documents but also in richer metadata sets for navigation through scientific contents and their mutual relationships.
- Content-sensitive linking: The realization of the explanatory power of data on data also provides a solution for a serious problem of the current as well as of the future Web, that is, how to produce order in the ever-growing complexity of the Web by content-sensitive linking. If navigation can be based on content-specific metadata resulting in dynamically changing ontologies, and if powerful link editing functionalities become part of the future "knowledge weavers," a self-organizing mechanism of the Web will be implemented which will improve the hypertext linking of the Web. Using content-specific metadata will ensure that the archiving of the cultural content to be put online will not result in a frozen data repository but give rise to an infrastructure for a living history of culture.
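Several items in the list above (XML as the basic data structure, content-dependent transformations, and data on data) can be made concrete with a small sketch. It assumes a made-up XML vocabulary (`document`, `passage`, `annotation`) and hypothetical helper names; it uses only Python's standard `xml.etree.ElementTree` module and is meant as an illustration, not as the platform's actual design.

```python
import xml.etree.ElementTree as ET

# A made-up source document; element and attribute names are assumptions.
SOURCE = """<document id="ms-001">
  <title>Sample manuscript</title>
  <passage n="1">Bodies fall with increasing speed.</passage>
  <passage n="2">A lever multiplies force.</passage>
</document>"""

def annotate(root, passage_n, author, note):
    """Attach a scholarly annotation ("data on data") to one passage."""
    passage = root.find(f"passage[@n='{passage_n}']")
    if passage is None:
        raise KeyError(passage_n)
    annotation = ET.SubElement(passage, "annotation", {"by": author})
    annotation.text = note
    return annotation

def reading_view(root):
    """One view of the content: the plain reading text of each passage."""
    return [(p.text or "").strip() for p in root.iter("passage")]

def commentary_view(root):
    """A different view of the same content: only the annotations,
    each paired with its passage number and author."""
    return [
        (p.get("n"), a.get("by"), a.text)
        for p in root.iter("passage")
        for a in p.findall("annotation")
    ]

root = ET.fromstring(SOURCE)
annotate(root, "2", "reader-a", "Compare with contemporary engineering drawings.")
print(reading_view(root))
print(commentary_view(root))
```

The point of the sketch is that the annotation is stored alongside the source rather than in a separate, frozen copy, so that a reader's contribution and the original text can be rendered as two different views of one underlying document.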