WebSci’23 News Bulletin 4

15th ACM Web Science Conference: Inequalities in the Face of Concurrent Crises

30 April – 1 May 2023

Austin, Texas, USA (and online)

https://websci23.webscience.org/

With less than two weeks to go until the start of this year’s Web Science Conference, here comes our third news bulletin, introducing the paper sessions taking place on the second conference day.

Additionally, we still have some free online tickets available for students and early career scholars. To apply for a free ticket, please complete the online application at the link below as soon as possible, but no later than Friday, 21 April 2023. Notifications will be sent out by Monday, 24 April 2023.

Link to the application form: https://forms.gle/8SvUsxB4nrGc7YEb6

In case you have not had a look at our full program yet, you may find it on our website (https://websci23.webscience.org/program/). While you are there, you may also find more details on the publications belonging to the different paper sessions (https://websci23.webscience.org/paper-sessions/).

In case you missed it, our previous news bulletin provided an overview of the different events happening throughout the two conference days, as well as a deep dive into the paper sessions of the first conference day. You may find this one and all other news bulletins in the News section on our website.

In case you have not yet registered for the conference and would like to attend either virtually or in person, you may still do so by following the registration instructions provided on the website (https://websci23.webscience.org/registration/).

Still need more arguments for why this conference will be well worth your while? Keep on reading below and check out what great presentations we will have during the second day of the conference! Missing info on the presentations of day 1? Look for our News Bulletin #2; we have got you covered there.

For this and everything else #WebSci23, keep an eye on the hashtag on social media, and check our website for further updates.

Best wishes

WebSci’23 Conference Committee

Paper Session 4 (Monday, 11:00 AM – 12:30 PM): Fairness and Bias

Following the keynote by Dhiraj Murthy, right before lunch, the paper session on Fairness and Bias is the first of the three outstanding paper sessions on the second day of this year’s conference.

Ángel Pavón Pérez et al. propose an approach based on covariance analysis to identify attributes that encapsulate sensitive information in datasets used in high-stakes decision-making situations, like credit and loan approvals. Their approach promises to reduce model bias while maintaining the overall performance.

Tuğrulcan Elmas studies the common issue of data decay that many Social Media datasets suffer from. By dissecting the issue into different topical contexts and looking into potentially harmful contents in particular detail, the study emphasizes the importance of collecting this type of data in real time to avoid the development of various biases.

The issue of bias in knowledge graphs such as Wikidata is explored by Paramita Das and colleagues. By looking into different knowledge graph embeddings for professions sampled from around the globe, they raise awareness of the need for precise design choices when it comes to data and algorithms to avoid the propagation of biases into knowledge graphs.

Weixiang Wang and Sucheta Soundarajan study the multi-faceted topic of fairness in yet another domain: recommendation systems. Their proposed FairLink framework allows for the specification of the appropriate fairness metric for proposing new links in a link recommendation system.

Adding to the comprehensive treatment of bias in this session, Fabian Haak and Philipp Schaer present a whole suite of resources for the evaluation of bias in online search. Their dataset of Google and Bing search queries helps to understand the ways in which standard search behavior may lead to biased opinion formation.

Margherita Berte and colleagues monitor the gender gap in the Italian labor market via data from the LinkedIn Advertising Platform. Their study exposes patterns on a subnational level that also relate to what they call a digitalization gender gap.

Paper Session 5 (Monday, 1:30 PM – 3:00 PM): Harmful and Problematic Behavior

After lunch, we will hear all about the problems of Harmful and Problematic Behavior on Social Media and beyond.

Mithun Das and Animesh Mukherjee take the problem of detecting harmful content on Social Media to the visual level. Their work explores different techniques of detecting abusive memes in a multilingual and multimodal setting.

Giuseppe Russo and colleagues study the ways in which the migration of banned communities to fringe platforms affects their degree of radicalization as well as their spillover back onto the mainstream platforms they were initially banned from. Their exploration of individual- and social-level factors builds the foundation for evidence-based moderation policies going forward.

Maricarmen Arenas and colleagues take a look at the networks of sex traffickers promoting OnlyFans accounts on Twitter. Their Multi-Level Clustering method allows for the detection of networks based on features beyond the mere textual content of their Tweets.

Joseph Kwarteng and colleagues not only shed light on the phenomenon of “Misogynoir”, a very specific type of hate at the intersection of racism and sexism that is inherently difficult to detect on platforms like Twitter for its subjectivity and intersectionality, but also provide a systematic investigation of the influence of annotators’ demographics on their annotation behavior. Their work thereby highlights the relevance of annotators’ perspectives and content comprehension when it comes to tasks like labelling hate speech.

Xinyu Wang and colleagues study the racist narratives and conspiracy theories targeted at Asians in the context of the Covid-19 pandemic. Their work with data from Twitter explores how these narratives and conspiracy theories are deeply rooted in historical stereotypes, uncovering insights for improved anti-racist efforts going forward.

Hina Qayyum and colleagues cover yet another type of harmful and problematic behavior on Twitter, focusing on the users that are responsible for the largest share of toxic content that proliferates on the platform, characterizing their thematic contents and posting behaviors.

Paper Session 6 (Monday, 3:15 PM – 4:45 PM): Misinformation and Misperceptions

The last paper session of the conference – before we prepare the center stage for David Rand’s keynote, our “The Future of Web Science Panel”, and the Best Paper and WST Test of Time Awardees – is focused on the topic of Misinformation and Misperceptions.

Francesco Pierri and colleagues shed light on the spread of propaganda and misinformation on Facebook and Twitter during the Russian invasion of Ukraine. Their various analyses touch upon questions of content amplification and moderation.

Mohamad Hoseini and colleagues study the global spread of the QAnon conspiracy theory on Telegram, offering a global overview of popular themes and different linguistic communities.

Focusing on countermeasures to the problem of online misinformation, Yingchen Ma and colleagues take a closer look at the phenomenon of social correction. Their work studies the dynamics around the correction of misinformation through the intervention of other users on Twitter.

Akram Sadat Hosseini and Steffen Staab look behind the scenes of online misinformation by exploring the emotional framings of different types of claims. Taking this approach yet another step further, they also evaluate the emotional responses and sharing behavior of users seeing these claims.

Jinkyung Park and colleagues study the problems of automatic misinformation detection methods that use source-level labels. Their work shows how a focus on the article level helps to conduct fairness audits and improve the detection of misinformation.

Satrio Yudhoatmojo and colleagues look at two different web communities – Reddit and 4chan – and explore how they interact with scientific work shared as e-prints. Their work shows how scientific knowledge might be misinterpreted through dissemination and discussion in these channels.

Web3 – the promise and the reality

Web3 describes a group of technologies for managing collective interactions on the internet while avoiding centralised control, granting users agency over access to their data, and managing distribution of value as digital assets. The technologies are distributed ledgers including blockchain, cryptocurrencies, distributed autonomous organisations, decentralised finance, and non-fungible tokens. Web3 technologies offer technical solutions to problems of trust and verifiability online. Their open source basis makes them available to developers globally and across sectors and communities. Some of these technologies are already in use across many sectors and have been proposed as applicable to a much greater range of uses in the future. If the technologies prove successful sustainably at scale for a very wide range of functions, they might change and expand what the internet delivers for a high proportion of users, and genuinely warrant the description Web3.

Bob Metcalfe to give Turing Award lecture

CONNECTIVITY

The most important new fact about the human condition is that we are now suddenly connected. When I say “suddenly” I refer to the Internet’s birthday, October 29, 1969 and how two thirds of the human race, five billion people, are already on the Internet, in only 50 years. Suddenly.

The Arpanet started the Internet in 1969 by networking time-shared minicomputers serving dumb character terminals. Then in 1973 Xerox Palo Alto Research Center (PARC) decided to put a personal computer on every desk. Ethernet was invented on May 22, 1973 to provide local connectivity among those PCs, one on every desk, if you can imagine that.

The PARC Ethernet was formed by combining Jerrold coaxial vampire taps, Manchester on-off keying, and Alohanet randomized retransmissions. Then we wrapped it in internet protocols according to a layered reference model. Then we standardized it all: Ethernet, IP, TCP, TELNET, FTP, Mail, URL, HTML, HTTP.

Ethernet evolved rapidly away from its Jerrold-Manchester-Alohanet prototype. Ethernet’s legacy is instead packets to the desktop, abundance of bandwidth, and standardization. Come hear all about it. GPT is writing my lecture now.

About

When: April 30, 2023, 5-7 PM Central Time
Where: Zlotnik Ballroom at AT&T Conference Center at UT Austin
Location: 1900 University Ave, Austin, TX 78705

WebSci’23 News Bulletin

15th ACM Web Science Conference: Inequalities in the Face of Concurrent Crises

30 April – 1 May 2023

Austin, Texas, USA (and online)

With the full conference program now available on our website (https://websci23.webscience.org/program/), this second bulletin gives you an overview of the different events taking place throughout our two conference days.

Day 1 (Sunday, 30th of April) starts with the opening ceremony, introducing this year’s conference theme of Inequalities in the Face of Concurrent Crises and providing some information about the logistics and the various activities surrounding the WebSci’23 conference. The paper sessions happening throughout the day are on Politics and Ideology, Language and Emotions, and Online Communities and Digital Analytics. They are presented below in a bit more detail, with little sneak peeks of the exciting contents you may expect to learn about. The day closes with the first of three keynotes; we are honored to invite internet pioneer Bob Metcalfe to the stage, who will share his thoughts on the past, present, and future of the web with us. Afterwards, the Web Science Trust is inviting the conference participants to a reception event.

On Day 2 (Monday, 1st of May), we start with yet another keynote; make sure to get up early and join for Dhiraj Murthy’s talk. Afterwards, we will have the paper sessions on Fairness and Bias, Harmful and Problematic Behavior, and Misinformation and Misperceptions. David Rand’s keynote will then provide the perfect segue into our “The Future of Web Science” panel. On the panel, we will have our conference chairs Dame Wendy Hall and Noshir Contractor discuss the future of web science with Deborah McGuinness (RPI), Weihang Wang (USC), and Ricardo Baeza-Yates (Northeastern University). The panel is moderated by Emőke-Ágnes Horvát, yet another of our conference chairs. After the panel and to close the conference in style, we will have the Awards Session. Make sure to join either in person or virtually when the winners of this year’s Web Science Trust Test of Time and Best Paper Awards are revealed! And for those who like to plan ahead, we will also learn about the location of next year’s conference.

If this all sounds as interesting to you as it does to us, make sure to use the opportunity to register. Registration is still open both for in-person attendance in Austin and for virtual attendance. Read more about the registration modalities on our website (https://websci23.webscience.org/registration/).

Not yet convinced? Then keep on reading below and learn about the amazing presentations we will have during the first day’s paper sessions! And stay tuned, as a deep dive into the paper sessions on the second day will follow soon.

For this and everything else #WebSci23, keep an eye on the hashtag on social media, and check our website for further updates.

Best wishes

WebSci’23 Conference Committee

Paper Session 1 (Sunday, 11:00 AM – 12:30 PM): Politics and Ideology

Our first paper session of the conference is packed with social media studies into issues of politics and ideology online.

We will learn about the communication patterns around changes in governments in democratic nations through our first two papers. Kunihiro Miyazaki and colleagues will present their work on the “honeymoon” effect, exploring how the activities of social media users might help explain the phenomenon of high approval ratings for newly elected leaders. Right after that, we will hear from Francesco Pierri, who puts a focus on the understudied but highly impactful topic of digital advertising in the run-up to elections.

Political communication is also the subject of study in the next papers of this session. Hong Zhang and colleagues not only propose a hashtag-based stance labelling method, but also explore the hidden connections between seemingly independent political topics. After that, Paulo Henrique Santos and colleagues present their work on the political debate that is happening through TikTok videos in Brazil, sharing their insights on how to study this much-discussed but arguably still understudied platform.

The first paper session ends with an exploration of the ideological spaces that women create for themselves online by Utkucan Balci and colleagues. While they extend their study of Reddit communities to the places to which some of the banned communities migrated, Amaury Trujillo and Stefano Cresci study the impact of exactly such moderation decisions. In the final paper of this session, they take a closer look at the user-level effects of moderation decisions on Reddit.

Paper Session 2 (Sunday, 1:30 PM – 3:00 PM): Language and Emotions

The first session on Sunday afternoon provides an overview of some of the most recent research into the intersection between language and emotion.

Jinfen Li and Lu Xiao go first and present how they developed their Multi-EmoBERT tool to help identify multiple, co-existing emotions within a single text. Applying their method to fake news datasets, they explore the relationships between veracity, stance, and emotion. While Julie Jiang and colleagues also use emotions as a dimension of analysis, their research takes another direction, focusing on the subset of geotagged Tweets. Comparing them to a random dataset of Tweets, they discuss the characteristics of geotagged Tweets and raise awareness of potential issues of representativeness when working with this specific type of social media data.

Hadi Asghari and colleagues take a closer look at the important topic of accessibility and inclusion. They explore the prevalence of “Leichte Sprache” – language meant to be easily understandable – on the German web and derive a set of technical and policy recommendations for a more inclusive web based on their findings.

The first of two studies into mental health issues within this paper session comes from Juhi Mittal and colleagues. By comparing and contrasting the Reddit posts sent to mental health subreddits by users who are also active in immigration subreddits to those of users who are not, they explore the changes in the language of mental health around immigration experiences. Also leveraging online language markers on Reddit, Tingting Liu and colleagues develop prediction models for detecting symptoms of depression in discourse and show that these models even work reliably for texts from other sources.

In between these two studies working with Reddit data, Dominik Bär and colleagues present their work on the social media activities around open-source journalism, using the example of the Bellingcat Twitter account. They look at the relative importance of open-source journalism for the traditional media ecosystem, explore the characteristics of successful user engagement, and assess the impact of the Russian invasion of Ukraine on the sentiment of the follower base towards Bellingcat.

Paper Session 3 (Sunday, 3:15 PM – 4:45 PM): Online Communities and Digital Analytics

The last paper session of Day 1 is all about exploring the characteristics of online communities and the digital analytics tools available for that.

Tom Alby starts the session with a talk on Google Analytics and the use of its features by webmasters, discussing whether or not their capabilities and data literacy as well as the requirements imposed by the GDPR led to the most efficient use of web analytics methods.

Mohit Chandra and Munmun de Choudhury – one of the winners of last year’s Test of Time award – will present their recent work on the effects of Covid-19 on employee experiences, focusing on the state and a potential new future of (remote) work post-pandemic.

Yuki Yanagida and colleagues go next with their study on the relationship between web information-seeking behavior and post-purchase satisfaction, leveraging insights from web search logs and product ratings on an e-commerce site.

In a study that is potentially relevant for the many web scientists relying on crowdworkers for their research, Catherine C. Marshall and Frank Shipman investigate the differences in bad data from surveys fielded in 2013, 2018, 2019, and 2022.

Focusing on yet another platform, Pier Paolo Tricomi and colleagues take a closer look at what is popular on Instagram, studying the underlying mechanisms that drive engagement. From their interpretable models, they derive guidelines for creating successful posts – so don’t miss this presentation if you want to kickstart your Instagram-influencer career!

The day of presentations closes with Alyssa Sha and colleagues, who propose a tool to link topics across Q&A platforms like Zhihu and Quora. They compare the results of their Wikidata-based method with outputs generated via GPT-3, addressing existing issues in the large language model.

Upcoming Brave Conversations

The next Brave Conversations event will take place on Friday 12th May, 2023, this time in the heart of the Eurozone in Brussels … this is the first public face-to-face event we’ve held in a few years …

See the Brave Conversations website: braveconversations.org (Brussels 2023)

For this event we are working in partnership with the Digital Enlightenment Forum, a small organisation that operates in Brussels with a big agenda, some great aspirations, and a network of very interesting people that Brave Conversations feels is a perfect complement to Web Science and the work of the past decade.

For this event we are going to explore the context of doing business and government in 2023, focusing specifically on the Smart Human – how to stay human and have agency in an increasingly digitised world reliant on smarter and smarter machines. People are calling for a pause in AI development in order for us humans to take a breath and think …

We would like to ask: is that a bad thing?

As we increasingly integrate digitisation and digitalisation into our lives, we gain the benefits (which are worth exploring), but there is always a cost to pay, and that cost will take time to understand.

The event webpage can be found at: https://braveconversations.org/brussels2023/

Registration is via Eventbrite: https://www.eventbrite.co.uk/e/brave-conversations-brussels-2023-tickets-603615268517