Karma Chameleons: Data Collection Techniques and Account Characterization for Bot Detection on Reddit

Authors

Floam, Marissa

Issue Date

2025

Type

Thesis

Language

en_US

Keywords

bot detection, reddit, social bots, social media

Abstract

Malicious bots on social media platforms present an ever-evolving threat to the integrity of online communication. As platforms like Twitter/X, Meta, and Reddit continue to grow in popularity, so do opportunities for malicious actors to create bot accounts. These bots, which are often designed to spread deceptive material, spam, or unoriginal content, can significantly influence public opinion and suppress creativity. As a result, it is crucial that general users are able to easily identify these bots so they can be removed, protecting the authenticity of these online spaces. This thesis addresses one of the main challenges in social media bot detection: the collection of bot or human account datasets with a clearly established ground truth that can be used to train machine learning models. Without these datasets, developing accurate models to distinguish between bot and human accounts becomes difficult. In contrast to Twitter/X, bot detection research targeting Reddit is scarce, even though it is an increasingly popular platform. This thesis outlines characteristics that differentiate bot accounts from human accounts on Reddit and proposes methods for creating reliable datasets for bot detection. Human and bot datasets are combined and tested in six decision tree models to determine the accuracy of each data collection method for use in bot detection. Accurately distinguishing between human and bot accounts is essential for dataset creation, and focusing on specific characteristics allows for clearer determinations between bot and human accounts. Ultimately, in order to maintain the integrity and trustworthiness of one of the most popular social media platforms, Reddit, this thesis addresses a critical gap in bot detection research by providing a reliable framework for data collection.
