Published August 28, 2025 | Version v1
Dataset Restricted

User Comments from a SNL 'Fast Fashion Ad' Sketch Combined with RoBERTa and BERTopic Outputs

  • 1. TU Wien
  • 2. Linnaeus University

Description

Context

This dataset was created for a Master's thesis in Digital Humanities by Ka Yee Suvini Lai (see Related Works for the thesis, titled: Emotion Classification, Topic Modelling, and Discourse Evaluation of Audience Responses to SNL's Fast Fashion Sketch on Social Media: Leveraging RoBERTa, BERTopic and Discourse Analysis). The dataset consists of user comments on an SNL sketch titled 'Fast Fashion Ad', collected from YouTube, Instagram and TikTok (n=4028). It also contains the emotion classification and topic modelling outputs that RoBERTa and BERTopic produced for each comment.

Data Structure

The dataset consists of the following columns (with explanations in brackets):

  • comment_text (the user comments on the SNL sketch from YouTube, Instagram and TikTok)
  • top_emotion (RoBERTa's output of the highest-scoring emotion for the comment)
  • emotion_scores (RoBERTa's output of all the emotions and their scores from the comment)
  • topic (BERTopic's output for the topic number for the comment)
  • topic_label (BERTopic's output for the topic number and topic label for the comment)
  • probability (BERTopic's output for the probability of the topic assigned to the comment)

This dataset is a single .csv file, so it can be opened in many digital tools. It contains the aggregated results from the RoBERTa and BERTopic Python pipelines (see Related Works for the source code).
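As a rough illustration of how the columns listed above might be read, the following Python sketch checks that a CSV has the six documented columns and counts comments per emotion and per topic label. It is a minimal sketch, not the author's pipeline: the helper names (read_comments, count_by), the sample rows, and the emotion and topic label values are placeholders invented for illustration and are not taken from the dataset. The format of the emotion_scores column is not documented here, so the sketch leaves it unparsed.

```python
import csv
import io
from collections import Counter

# Column names as listed in the Data Structure section.
EXPECTED_COLUMNS = [
    "comment_text", "top_emotion", "emotion_scores",
    "topic", "topic_label", "probability",
]

def read_comments(handle):
    """Read all rows from an open CSV handle and verify the documented columns exist."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)

def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)

# Placeholder rows for illustration only; these are NOT real comments or labels.
SAMPLE = """comment_text,top_emotion,emotion_scores,topic,topic_label,probability
placeholder comment A,joy,,0,0_placeholder,0.9
placeholder comment B,anger,,1,1_placeholder,0.7
placeholder comment C,joy,,0,0_placeholder,0.8
"""

rows = read_comments(io.StringIO(SAMPLE))
emotion_counts = count_by(rows, "top_emotion")
topic_counts = count_by(rows, "topic_label")
print(emotion_counts.most_common())
print(topic_counts.most_common())
```

With access to the actual file, the same read_comments function could be given a handle from open(path, newline="", encoding="utf-8"), where path is wherever the downloaded .csv was saved; the file name and encoding are assumptions to confirm against the real file.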

Further details

To gain access to the dataset, please reach out to the author via email: ka.lai@tuwien.ac.at

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Related works