The amount of information on the Web is enormous and growing exponentially. Indeed, it is a major challenge simply to measure the amount of information contained in the Web. It is even harder to assess how much of this information is useful or original. In addition, the information on the Web comes in a huge range of formats from a vast number of disparate sources. All of these aspects raise a crucial research question: how are we to browse, explore and query the Web at this scale? Once again, this theme requires the inter-disciplinary approach embodied in Web Science. From a computer science perspective, we need to know how inference can be supported at Web scale; for example, how can context be represented and supported? What do psychology and linguistics tell us about the design of interfaces for querying complex data? How can the data sources within the Web be exploited to help us understand the sociological aspects of the Web? Understanding the possibilities for inference online is an important skill for those moving into scientific research and development.
All economic, social and legal interactions are based on certain assumptions: that individuals can verify identities; that they can rely on the rules and institutions governing the interactions; and that they are assured that certain information will remain private. These assumptions are challenged by the Web: an environment where security, privacy and trust can be very difficult to monitor, verify and enforce. Will the Web grind to a halt as a result? Will ways be found to ensure that these basic features are present? Or will users of the Web find their own ways to cope with the absence of, for example, trust? These questions call on a broad range of Web Science disciplines: to understand how individuals perceive trust and privacy when they use the Web; to see how concepts such as trust can be computationally represented; and to develop the legal institutions needed to govern Web interactions. An understanding of the technology underlying security, the variables underlying trust and the extent of the privacy that Web users demand is clearly valuable in a number of industries.
The Web is different from most hitherto-studied systems in that it is changing at a rate which is of the same order as, or maybe greater than, our ability to observe it. This introduces many new inter-disciplinary research challenges. How are we to instrument the Web, and how can we log it or identify behaviours? Once we are able to measure what is happening in this world of constant flux, we can then turn to the issue of how to model and understand it. Mathematical tools (graph theory, for example) help to analyse the changing structure of the Web. Sociology can develop an understanding of the two-way process by which individuals and technologies shape each other. A legal perspective is needed to assess whether law is a catalyst for Web dynamics, or merely reactive to them. Linguistics will allow us to assess how language (including, for example, the preponderance of people for whom English is a second language) is affecting the development of the Web. As the Web changes, so does practice. Understanding the dynamics and pace of change is very important in a number of industries.
The Web, as it exists today, is a complex mixture of open, public areas and closed, private zones. There are prominent advocates of both positions: those who maintain that the Web must be based on open platforms, and those who argue that property rights provide the strongest incentive for innovation in the Web. There has been little systematic and coherent research to resolve these positions. That research must be inter-disciplinary. From a technical computing perspective, we need to know exactly what is meant by “openness”. How can legal frameworks be constructed to deal with openness on the Web? Is openness necessary for innovation, or are private and commercial incentives more effective? Is openness compatible with the security requirements of, for example, e-health applications? Economic and legal issues predominate when we examine the open Web, and there are important questions of balance for proprietary Web development organisations. When is it important to release intellectual property to build a user base, and when should a more restrictive business model come into place? These are central issues for those concerned with providing software services and content on the Web.
Collective intelligence is the surprising result of collaborative endeavour that, with only light rules of co-ordination, leads to the emergence of large-scale, coherent resources (such as Wikipedia). The existence and stability of these resources present major challenges for all the researchers engaged in Web Science. How, from a technical point of view, can collective intelligence be enabled? What are the socio-economic reasons why individuals participate in collective endeavour? What legal framework governs (or should govern) the resources that are created? What is the psychology of identification with an online collective community? How can collective intelligence emerge, given the different languages used by different genders, races, classes and communities? What role is there for policy-makers to engage in and facilitate collaborative endeavour? In an age where political participation is declining, harnessing the potential of collective and collaborative intelligence is an important theme for governments as they try to engage citizens, verify their legitimacy and find creative policy levers.