New paper on LLMs vs SLMs for fake news detection

In a recent arXiv paper, Beizhe Hu et al. compare the performance of large language models (LLMs) with other, more focussed approaches to detecting fake news.

Summary

Detecting fake news requires both a delicate sense of diverse clues and a profound understanding of the real-world background, which remains challenging for detectors based on small language models (SLMs) due to their knowledge and capability limitations. Recent advances in large language models (LLMs) have shown remarkable performance in various tasks, but whether and how LLMs could help with fake news detection remains underexplored. In this paper, we investigate the potential of LLMs in fake news detection. First, we conduct an empirical study and find that a sophisticated LLM such as GPT 3.5 could generally expose fake news and provide desirable multi-perspective rationales but still underperforms the basic SLM, fine-tuned BERT. Our subsequent analysis attributes such a gap to the LLM’s inability to select and integrate rationales properly to conclude. Based on these findings, we propose that current LLMs may not substitute fine-tuned SLMs in fake news detection but can be a good advisor for SLMs by providing multi-perspective instructive rationales. To instantiate this proposal, we design an adaptive rationale guidance network for fake news detection (ARG), in which SLMs selectively acquire insights on news analysis from the LLMs’ rationales. We further derive a rationale-free version of ARG by distillation, namely ARG-D, which services cost-sensitive scenarios without querying LLMs. Experiments on two real-world datasets demonstrate that ARG and ARG-D outperform three types of baseline methods, including SLM-based, LLM-based, and combinations of small and large language models.

 

Google to stop caching webpages

Google is retiring the “cached” links in Google Search, which gave users access to the backup copies of webpages that Google stores.

This feature allowed users to access websites that were down or had changed. Google attributes the decision to improvements in webpage reliability over time. Although cached links have been disappearing from search results since December, users can still build their own cache links by modifying the URL or by using Google Search. Previously, cached links were accessible through a drop-down menu next to search results.
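The URL-modification route mentioned above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `webcache.googleusercontent.com` prefix and the `cache:` search operator are the forms widely reported when the change was announced, not details given in this summary, and they may stop working as Google retires the feature. The function names are hypothetical.

```python
from urllib.parse import quote

# Assumed prefix for Google's cache viewer (as reported at the time of the
# announcement; it may no longer resolve once the feature is fully retired).
CACHE_PREFIX = "https://webcache.googleusercontent.com/search?q=cache:"


def cache_url(page_url: str) -> str:
    """Build a cache-viewer link by prepending the assumed prefix."""
    # Leave ordinary URL punctuation alone; percent-encode only unsafe
    # characters such as spaces.
    return CACHE_PREFIX + quote(page_url, safe=":/?&=%")


def cache_search_query(page_url: str) -> str:
    """Build the equivalent query to type into Google Search."""
    return "cache:" + page_url


if __name__ == "__main__":
    print(cache_url("https://example.com/news"))
    print(cache_search_query("https://example.com/news"))
```

Both helpers only construct strings; whether the resulting link returns a cached copy depends on Google still serving it.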

Google’s decision to stop caching webpages will save resources, but it also means users lose insight into how Google’s web crawler views the web.

The removal of cached sites will increase the burden on the Internet Archive to archive and track changes on webpages worldwide.

NYT sues OpenAI for copyright infringement

The New York Times is suing OpenAI and Microsoft, alleging that they used millions of NYT articles to train their AI systems without permission, citation, or payment, which the newspaper says unfairly disadvantages it and infringes its rights.

NYT claims this violates copyright law and is seeking billions of dollars in damages. NYT approached the companies for a resolution but couldn’t reach one. OpenAI says it is “surprised and disappointed” by the lawsuit and is hoping for a mutually acceptable resolution.

This lawsuit is part of a larger trend of legal actions regarding AI and copyright. The core of NYT’s complaint is that AI tools like ChatGPT can replicate the information in articles without benefiting the original source and may also spread incorrect information.

OpenAI does not deny making use of the articles but argues that there is precedent for using copyrighted materials in the creation of new content.

New ACM Fellows include three Web Science colleagues

We are delighted to announce that the recently published list of 68 ACM Fellows for 2023 includes not one but three of our Web Science colleagues. In alphabetical order:

  • Prof. Sir Tim Berners-Lee – co-founder, WST patron and former trustee on the Web Science Trust board
  • Prof. Deborah McGuinness – Web Science Lab Director at RPI
  • Prof. Steffen Staab – current trustee of the Web Science Trust board and Web Science Lab Director at the University of Stuttgart

The ACM press release follows:

ACM, the Association for Computing Machinery, has named 68 Fellows for transformative contributions to computing science and technology. All the 2023 inductees are longstanding ACM Members who were selected by their peers for groundbreaking innovations that have improved how we live, work, and play.

“The announcement each year that a new class of ACM Fellows has been selected is met with great excitement,” said ACM President Yannis Ioannidis. “ACM is proud to include nearly 110,000 computing professionals in our ranks and ACM Fellows represent just 1% of our entire global membership. This year’s inductees include the inventor of the World Wide Web, the ‘godfathers of AI’, and other colleagues whose contributions have all been important building blocks in forming the digital society that shapes our modern world.”

In keeping with ACM’s global reach, the 2023 Fellows represent universities, corporations, and research centers in Canada, China, Germany, India, Israel, Norway, Singapore, the United Kingdom, and the United States. The contributions of the 2023 Fellows run the gamut of the computing field, including algorithm design, computer graphics, cybersecurity, energy-efficient computing, mobile computing, software analytics, and web search, to name a few.

Additional information about the 2023 ACM Fellows, as well as previously named ACM Fellows, is available through the ACM Fellows website.

Tim Berners-Lee
WWW Consortium

For inventing the World Wide Web, the first web browser, and the fundamental protocols and algorithms allowing the Web to scale

Deborah McGuinness
Rensselaer Polytechnic Institute

For contributions to knowledge technologies including ontologies and knowledge graphs

Steffen Staab
University of Stuttgart, University of Southampton

For contributions to semantic technologies and web science, and distinguished service to the ACM community

New UK tax rules on online transactions

To combat tax evasion and increase revenue, the UK’s HM Revenue and Customs (HMRC) has introduced new tax rules starting January 1, 2024, aimed at small sellers on platforms like Etsy, Depop, Airbnb, and Vinted.

These rules require platforms to record and report sellers’ income directly to HMRC, following guidelines from the Organisation for Economic Co-operation and Development (OECD). While HMRC already has power over UK platforms, the OECD rules will facilitate quick access to data from platforms outside the UK. This affects around two to five million businesses on digital platforms, including taxi services, food delivery, freelancers, and short-term rentals.

Platforms will gather sellers’ information like name, address, earnings, and fees, including property details for landlords. Sellers meeting their tax obligations won’t be much affected, but those neglecting them may face HMRC demands.

Individuals can earn up to £1,000 a year tax-free under the Trading Allowance, but income above that threshold must be reported. Sellers renting through platforms like Airbnb can use the rent-a-room scheme for tax-free earnings of up to £7,500 a year.
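As a rough illustration of the two thresholds described above, the check below compares a year's income against each allowance. It is a simplified sketch based only on the figures quoted here; the function and constant names are hypothetical, and actual tax liability depends on rules this summary does not cover.

```python
# Figures taken from the text above; everything else is illustrative.
TRADING_ALLOWANCE_GBP = 1_000   # tax-free trading income per year
RENT_A_ROOM_LIMIT_GBP = 7_500   # tax-free rent-a-room income per year


def exceeds_allowance(annual_income_gbp: float, allowance_gbp: float) -> bool:
    """Return True when income is above the allowance, so reporting is required."""
    return annual_income_gbp > allowance_gbp


def amount_over(annual_income_gbp: float, allowance_gbp: float) -> float:
    """Return how far income exceeds the allowance (0.0 if it does not)."""
    return max(0.0, annual_income_gbp - allowance_gbp)


if __name__ == "__main__":
    # A casual seller earning £850 stays within the Trading Allowance.
    print(exceeds_allowance(850, TRADING_ALLOWANCE_GBP))
    # A host earning £8,000 is over the rent-a-room limit.
    print(exceeds_allowance(8_000, RENT_A_ROOM_LIMIT_GBP))
    print(amount_over(8_000, RENT_A_ROOM_LIMIT_GBP))
```

The comparison is strict, so income of exactly £1,000 counts as within the allowance, matching the "up to £1,000" wording.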

HMRC plans to send reminders to those unaware of their tax duties regarding online earnings. It has allocated £36.69 million and 24 full-time staff to enforce these rules.

The first reporting deadline for platforms is January 31, 2025, one year after implementation.