Newsletter Autumn 2023





We are pleased to announce a new WSTNet Lab as we welcome the University of Texas at Austin under the leadership of Dhiraj Murthy and his Computational Media Lab to the network.
We will catch up with Dhiraj and his team over the coming months for an interview. We are delighted that the network continues to grow and shows wide and increasing support for Web Science research and principles.
The UK’s Competition and Markets Authority (CMA) has issued a warning about the potential risks of artificial intelligence (AI) foundation models. These AI systems, trained on massive, unlabeled data sets, underpin large language models and can be used for various tasks. The CMA has proposed principles to guide the development and use of foundation models, including accountability, access, diversity, choice, flexibility, fair dealing, and transparency. The report warns that poorly developed AI models could lead to societal harm, such as exposure to false and misleading information and AI-enabled fraud. The CMA also warns that market dominance by a few firms could raise competition concerns, with established players using foundation models to entrench their position and deliver overpriced or poor-quality products and services. The CMA will provide an update on its thinking in early 2024. The UK government has tasked the CMA with weighing in on the country’s AI policy, but has opted to give responsibility for AI governance to sectoral regulators.
All dates are 23:59 Anywhere on Earth time
Understanding the Web
Making the Web Inclusive
The Web and Society
Doing Web Science
In addition to the topics at the heart of Web Science, we also welcome submissions addressing the interplay between the Web, AI, and society. New advances in AI are revolutionizing the way in which people use the Web and interact through it. As these technologies develop, it is crucial to examine their effect on society and the socio-technical environment in which we find ourselves. We are nearing the crossroads wherein content on the Web will increasingly be automatically generated, blended with that created by humans. This creates new potential yet brings new challenges and exacerbates existing ones in relation to data quality and misinformation. Additionally, we need to consider the role of the Web as a source of data for AI, including privacy and copyright concerns, as well as bias and representativity of resulting systems. The potential impact of new AI tools on the nature of work may bring a transformation of some careers while creating whole new ones. This year’s conference especially encourages contributions documenting different uses of AI in relation to how people use the Web, and in the ways the Web affects the creation and deployment of AI tools.
Please upload your submissions via EasyChair:
https://easychair.org/conferences/?conf=acmwebsci24
There are two submission formats:
All accepted submissions will be assigned an oral presentation (of two different lengths):
All papers should adopt the current ACM SIG Conference proceedings template (acmart.cls). Please submit papers as PDF files using the ACM template, either in Microsoft Word format (available at https://www.acm.org/publications/proceedings-template under “Word Authors”) or with the ACM LaTeX template on the Overleaf platform, which is available at https://www.overleaf.com/latex/templates/association-for-computing-machinery-acm-sig-proceedings-template/bmvfhcdnxfty. In particular, please ensure that you are using the two-column version of the appropriate template.
All contributions will be judged by the Program Committee upon rigorous peer review standards for quality and fit for the conference by at least three referees. Additionally, each paper will be assigned to a Senior Program Committee member to ensure review quality.
WebSci-2024 review is double-blind. Therefore, please anonymize your submission: do not put the author(s) names or affiliation(s) at the start of the paper, and do not include funding or other acknowledgments in papers submitted for review. References to authors’ own prior relevant work should be included but should not specify that this is the authors’ own work. It is up to the authors’ discretion how much to further modify the body of the paper to preserve anonymity. The requirement for anonymity does not extend outside of the review process, e.g., the authors can decide how widely to distribute their papers over the Internet. Even in cases where the author’s identity is known to a reviewer, the double-blind process will serve as a symbolic reminder of the importance of evaluating the submitted work on its own merits without regard to the authors’ reputation.
Authors who wish to opt out of publication in the proceedings will be offered this option upon acceptance. This is intended to encourage the participation of researchers from the social sciences who prefer to publish their work as journal articles. All authors of accepted papers (including those who opt out of proceedings) are expected to present their work at the conference.
Oshani Seneviratne (Rensselaer Polytechnic Institute)
Luca Maria Aiello (IT University of Copenhagen)
Yelena Mejova (ISI Foundation)
For any questions and queries regarding the paper submission, please contact the chairs at acmwebsci24@easychair.org.
t.b.a.
t.b.a.
t.b.a.
Ian: Matt, it feels strange to welcome you as a more recent Lab Director when I think I’ve known you as part of the Web Science community for at least 10 years.
Matt: Probably longer – I think my interest in Web Science and particularly Web data goes back to the very first Web Science conference in 2010 and perhaps before that.
Ian: So was Web Data your point of entry to Web Science?
Matt: That’s right, I’d spent a lot of time looking at what was thought of as archived web data and trying to render those as large-scale researchable data collections. We went through a number of iterations, from a system called Hub Zero through to Archives Unleashed, and most recently that work was integrated into the Internet Archive research services by a team at University of Waterloo, so that people who are looking to extract value and sound research conclusions from these data sets can find them and access them through well-supported, high-quality platforms and tools.
Ian: How hard is it to get everyone involved?
Matt: Well, one of the major challenges is trying to get people to share and engage with these data sets outside of tightly controlled commercial offerings.
Ian: Well, we’ve certainly seen Palantir, Recorded Future et al. work to derive interesting conclusions and predictions from large data sets like this.
Matt: I think the difference here is partly that many users (even if they are data rich) are much less interested in creating/curating data sets than they are in using them. We’ve seen humanities, CIS and engineering groups all derive huge benefits from well-curated third-party data. Getting those groups to create and share their own data too is tough without aligning the process with their academic objectives and the academic recognition system.
Ian: Has anyone cracked that problem in this space?
Matt: The Harvard Dataverse is an attractive platform which hosts data sets and generates benefits for both the contributors and the community as a whole by tracking/reporting which datasets are downloaded via a data DOI.
Ian: Which translates to recognisable impact in academe?
Matt: Absolutely, I had a data set which I was able to show had been downloaded more than 35,000 times. That’s significant impact.
Ian: So let’s talk about the NetSci lab at Rutgers.
Matt: This is a collaboration between a great team of leading academics in Communication, Information Science, and Journalism who are addressing a wider view of Human Networks interacting through Technological Networks as well as other contexts.
Ian: What is your current focus?
Matt: We are looking at systems of local information that feed/support their communities and how this intersects with the phenomena of misinformation. We’ve mapped the transition to a more regional news structure and a steady decline in the production of quality local news (critical information, politics, education, disaster/safety) in favour of less substantial/serious content (sports, human interest etc.) which, whilst potentially of interest, does little to support local communities in more serious situations.
Ian: Do users simply live with less local content as a result?
Matt: In fact, this gap in local news coverage tends to increase the use of (local) social media such as Nextdoor and Facebook for news, where stories are largely unverified, not edited by a third party and, in some cases, anonymous. This leads to a greater risk that the information provided may be misinformation or even malicious.
Ian: How serious is the potential impact?
Matt: For example, we have seen a troubling loss of local news connections between communities and infrastructure providers. In the event of power outages during adverse weather, there is no longer a trusted, independent local news source to disseminate news updates, timetables and disaster response information from the power company to the community; there is only what potentially poorly informed social media commentators may be saying. We are focused on better understanding the impact of the loss of a robust and trusted connection between physical systems and information systems.
Ian: What could be a potential response to address this disconnect?
Matt: We are considering the process of re-establishing a trust-based relationship between communities and service providers (industrial, government) via trusted intermediaries – a role that quality news/media organisations used to fill.
Ian: This sounds like really interesting work.
Matt: We don’t believe we are even close to seeing the potential impact of misinformation, whether inadvertent or the deliberate weaponisation of (dis)information, as it will continue to affect local and national news and our understanding of the truth.
Ian: Thanks for speaking to me today and welcome to the WSTNet.