Amir, thanks for agreeing to be interviewed. Can you tell us where you are based and what your main research interests are?
I’m based at Cardiff University, and my focus is on a particular type of cyber attack called a “drive-by download”, which is typically combined with social media posts on platforms like Twitter.
How are these different from typical viruses or other attacks?
A drive-by download involves one or more malicious scripts that execute without requiring the user to specifically download or click a suspicious object: the act of visiting the URL is enough to infect the host machine.
How does the social media element play out here?
Social media platforms often host/distribute click-bait in the form of a message which provokes interest and/or an emotional reaction in the user and encourages them to follow a (typically shortened and hence unrecognisable) URL to respond to it.
So what angle is your research taking on this?
Rather than attempting to look at the vast range of topics/ideas that might prompt a user to follow click-bait, we are looking at types of stimulus such as events (e.g. sports matches) which have a specific date/time around which the click-bait and URLs may be focused. If we can use specific events as a focus, we may be able to analyse patterns of (social) attack, discovering which users and sites are involved and how these are structured in terms of topics and social vectors. From there we can work to dampen the scale of the retweet network that is generated, and ultimately predict where attacks may happen and find ways to inoculate against them.
What has your research shown so far?
We analysed tweets from several events and categorised them as malicious or benign and, within the malicious group, by the type of emotion (we discovered eight) that the tweets were trying to elicit to get a click-through or retweet. We found that fear-provoking tweets were most likely to be retweeted and persisted longer than tweets eliciting other emotions.
We then analysed the effect of the different drive-by download scripts on the machine state of a test machine in order to subject these to a machine learning process. We were able to identify activities/patterns that the scripts attempted to execute on visiting the infected site, and attempted to match/recognise these patterns within a short window as the script starts to execute. Success here would facilitate developing a “kill-switch” protocol that could potentially save the machine/network from infection. Our current model is identifying malicious URLs about 86% of the time, which is very promising.
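As a rough illustration of the early-window idea described above, the sketch below averages a few machine-state metrics over the first snapshots taken after a page starts loading and labels the window by its nearest class centroid. Everything in it is an assumption made for illustration: the three metrics, the toy numbers, the nearest-centroid classifier and the `should_kill` helper are invented here and are not the model or features used in the research.

```python
from statistics import mean

# Hypothetical snapshot layout: (cpu_percent, new_processes, kilobytes_sent).
# These metrics and all values below are invented toy data, not measurements.

def window_features(snapshots):
    """Average each metric over the snapshots captured in the early window."""
    return tuple(mean(column) for column in zip(*snapshots))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

class NearestCentroid:
    """Tiny stand-in classifier: one mean feature vector per label."""

    def fit(self, windows, labels):
        self.centroids = {}
        for label in set(labels):
            rows = [window_features(w) for w, l in zip(windows, labels) if l == label]
            self.centroids[label] = tuple(mean(column) for column in zip(*rows))
        return self

    def predict(self, window):
        features = window_features(window)
        return min(
            self.centroids,
            key=lambda label: squared_distance(features, self.centroids[label]),
        )

def should_kill(model, window):
    """Hypothetical kill-switch check: stop the page if the window looks malicious."""
    return model.predict(window) == "malicious"

# Toy training windows, two snapshots each.
training_windows = [
    [(5, 0, 1), (6, 0, 2)],
    [(4, 1, 1), (5, 0, 1)],
    [(60, 4, 200), (70, 5, 300)],
    [(55, 3, 150), (65, 6, 250)],
]
training_labels = ["benign", "benign", "malicious", "malicious"]
model = NearestCentroid().fit(training_windows, training_labels)
```

In this sketch, `should_kill(model, window)` would be called once the short observation window has elapsed; a real system would need far richer features and a properly trained and evaluated model.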
Where are you going next with the work?
We are keen to build a better profile of the influential users, the common topics and the infected sites (though these shift). We also want to create an efficient and scalable method to scan for attacks/attackers using various factors (e.g. tweets from user accounts created only hours/minutes before), so that we can weaken/disrupt the scale of the attack and ultimately inoculate users through an efficient combination of blacklisting and real-time detection processes.
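One way to picture the combination of blacklisting and real-time detection mentioned above is a cheap triage step that runs before any expensive analysis. The sketch below is only an assumption about how such a step might look: the `triage` function, the blacklist contents, the six-hour account-age threshold and the three outcome labels are all hypothetical.

```python
from datetime import datetime, timedelta
from urllib.parse import urlparse

# Hypothetical blacklist and threshold, chosen only for illustration.
BLACKLISTED_HOSTS = {"malicious.example"}
MAX_NEW_ACCOUNT_AGE = timedelta(hours=6)

def triage(tweet_url, account_created, tweet_posted):
    """Return 'blacklisted', 'needs_realtime_scan' or 'allow' for one tweet."""
    host = urlparse(tweet_url).hostname or ""
    if host in BLACKLISTED_HOSTS:
        # Known-bad site: block without further analysis.
        return "blacklisted"
    if tweet_posted - account_created < MAX_NEW_ACCOUNT_AGE:
        # Very new account: hand the URL to the slower real-time detector.
        return "needs_realtime_scan"
    return "allow"

posted = datetime(2017, 1, 1, 12, 0)
fresh_account = posted - timedelta(minutes=30)
old_account = posted - timedelta(days=400)
```

Because click-bait links are typically shortened, a check like this would presumably have to resolve the shortened URL to its final destination before the blacklist lookup.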
How useful has the Web Science perspective been on this work?
Traditionally, cybersecurity has focused on machine impacts and technical networks. But whilst the idea of the social exploit is far from new, social media enables social attacks and trust exploits on a scale we’ve never seen before, so understanding how social networks function and how they can be managed/influenced for better security is vital.
Where would you like to see Web Science go next as a discipline?
With a growing war between hackers and cybersecurity specialists, there is not only a need to understand each specific attack in terms of machine learning/pattern matching but also to understand the broader social process of deliberate deception (feinting) in order to avoid detection. How do we filter for “noise”, fake data and other methods designed to fool automated detection, and make our model resilient against such noise?
Amir has submitted his thesis at Cardiff University and is shortly to be appointed a lecturer at Cardiff.