Q. Thanks for joining me, Sungwon – could you tell us a little about yourself?

A. I’m a doctoral student in Journalism & Media at the University of Texas at Austin, with a focus on social media. I’m really interested in what social media can tell us about group behaviour.

Q. How did you come to be interested in the Web and Web Science methods?

A. I guess, like a lot of other colleagues, it comes from an interdisciplinary background. My Bachelor’s was in Sociology, where I got interested in how people come together to take collective actions (so-called network actions) and the processes underlying that. To understand that I thought computational methods would be really helpful, so I got a Master’s in Data Science, which ultimately led me to researching social media data as a proxy for how people act and interact.

Q. What shape does that take?

A. Broadly speaking, I am using computational methods to look at how people behave on social media platforms, where individual actions may become collective actions (via networks), and the extent to which this might predict/explain larger societal actions.

Q. What projects have you been working on?

A. Initially I worked on political polarisation between different Indian groups using TikTok data. The chief focus was on polarisation between Indian diaspora groups and Indian homeland groups, though there were also religious divisions between Hindu and Muslim groups.

Q. So within religious groups there would have been a common cultural background but differences in social environment coming from local influences in India or overseas. Interesting.

A. We were looking to develop new techniques to study social media data, both in terms of the content of the messages and the metadata from hashtags. This can be quite challenging to interpret as a researcher without an Indian cultural background, as in the case of group hashtags such as #NRI and #Modi (NRI being “Non-Resident Indian” and Modi a leading figure in Indian politics), so we are dealing with a user-developed “folksonomy” rather than a more formal taxonomy.

Q. What is your current research focussing on?

A. Now I am working with AI-based vision and data science techniques to study the impact of social media on health, using social media data on vaping and e-cigarettes. We believe social media influences/shapes young people’s understanding of the health outcomes of smoking/vaping and, at this early stage of understanding vaping health issues, social influence and peer pressure are potentially very important.

Q. In the same way that media depictions (movies and TV) shaped the perception of tobacco usage for earlier generations of young people?

A. Exactly. The average age of users in this TikTok group is 18-25, and they may well be significantly affected by peer pressure on social media.
e.g. “VapeCloud” competitions, where users claim bragging rights/status based on the size of cloud that can be produced.

Q. Presumably, whilst this is less negative than, say, competitive self-harm or anorexia support groups, it nonetheless involves group behaviour and peer pressure.

A. Exactly. We also observed significant amounts of co-reporting (tacking on) of vaping to other activities:
e.g., “I am playing X + vaping” or “I am doing Y + vaping”. So I am also interested in why these groups are reporting vaping in other contexts.
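
As an illustration of how such co-reporting might be counted, the minimal Python sketch below tallies the hashtags that appear alongside a target tag in the same caption. The sample captions, the `co_tags` function name and the `vape` target are assumptions made up for this example, not the project's actual data or code.

```python
from collections import Counter
import re

HASHTAG_RE = re.compile(r"#(\w+)")

def co_tags(captions, target="vape"):
    """Count hashtags that co-occur with `target` in the same caption."""
    counts = Counter()
    for caption in captions:
        # Lower-case the tags so that #Vape and #vape are treated alike.
        tags = {t.lower() for t in HASHTAG_RE.findall(caption)}
        if target in tags:
            counts.update(tags - {target})
    return counts

# Hypothetical captions, invented for illustration only.
sample = [
    "Late night session #gaming #vape",
    "Road trip! #travel #Vape #friends",
    "New recipe #cooking",
    "#vape #gaming clouds",
]
result = co_tags(sample)
print(result.most_common(3))
```

Counts like these only show which activities are tagged together with vaping; they do not explain why users report them together.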

Q. How are you looking at the data?

A. I’m using TikTok (meta)data around each post and developing computer vision techniques to look at images and video. That way we analyse the post itself, using image analysis and video speech-to-text conversion, as well as any annotation from user text descriptions and metadata/tags. There is no TikTok API so we need to scrape manually.

Q. What are the challenges here?

A. Whilst it is not hard to get data, it may be harder to confirm that it is valid/complete. We may not be looking at all the relevant hashtags (and these may change over time), and posts may include target hashtags such as #vape even when the post is not actually focussed on vaping – perhaps users are including popular hashtags in the post to get more likes. The data itself is largely unstructured, so we have to do more cross-checking: however good our analytical approach may be, if the source data is flawed then we are going to get unreliable results (garbage in, garbage out). This will be especially true for image/video analysis, as we are starting to see challenges from fake data generated by bots and LLMs, and with the current rise of AI video, AI content (deep fakes etc.) is polluting data streams, which may distort our research findings. Ultimately we can try to analyse what is happening, but the causes may remain elusive. Why do they vape and even compete at vaping? What are the underlying models driving the behaviour? Social science research at this scale was previously not possible (i.e., analysing 50 paper questionnaires vs 50 million social media data points). This is the new norm and seems impressive, but whilst it is much easier to gather more data than ever, we need to worry more about quality than ever.
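
One of these cross-checks, spotting posts that carry a target hashtag without actually being about vaping, could be sketched roughly as follows. This is only an assumption-laden illustration: the `VAPE_TERMS` keyword list, the `is_hashtag_only` helper and the post fields (`caption`, `transcript`) are hypothetical, and a real study would need a validated lexicon and manual review.

```python
import re

# Hypothetical keyword list; a real study would need a validated lexicon.
VAPE_TERMS = {"vape", "vapes", "vaping", "e-cigarette", "e-cigarettes", "ecig"}

def is_hashtag_only(post, target="#vape"):
    """Return True if the post carries the target hashtag but its remaining
    text (caption without hashtags, plus speech transcript) never mentions
    a vaping term."""
    caption = post["caption"].lower()
    tags = set(re.findall(r"#\w+", caption))
    if target not in tags:
        return False
    # Strip hashtags so that the tag itself does not count as a mention.
    body = re.sub(r"#\w+", " ", caption) + " " + post.get("transcript", "").lower()
    words = set(re.findall(r"[a-z][a-z-]*", body))
    return not (words & VAPE_TERMS)

# Hypothetical posts, invented for illustration only.
posts = [
    {"caption": "Dance challenge #fyp #vape", "transcript": "check out this dance"},
    {"caption": "Biggest cloud yet #vape", "transcript": "I have been vaping all day"},
    {"caption": "Morning run #fitness", "transcript": ""},
]
flagged = [is_hashtag_only(p) for p in posts]
print(flagged)
```

A flag like this is a prompt for closer inspection rather than a verdict, since a post can be about vaping visually without any matching words in the caption or speech.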

Q. What are the future objectives for this research?

A. Understanding vaping as a “normal” activity vs a deviant activity. Understanding social bonding and competitive behaviour. Looking at the idea of “Vape” vs “Vape challenge”. Looking at how social rewards correlate with individual behaviour, creating larger network (group) behaviours, and the extent to which these behaviours buy group membership, getting the user more attention and higher status.

Q. Thanks for speaking to me today and good luck with the rest of your research.

Sungwon Jung is a doctoral student in Journalism & Media at the University of Texas at Austin.

She is interested in the impacts of social media on health and in studying how individual actions can become collective (network) actions.

Can this approach shed any light on future health trends and the importance of messaging for young people as they form more/less healthy habits as part of social learning?