Distilling Public Data from Multiple Sources for Cybersecurity Applications
Authors
Schnebly, James D
Issue Date
2020
Type
Thesis
Abstract
The amount of data produced every day is growing rapidly, opening the door to new knowledge while also creating cyber breach opportunities for malicious users. The objective of this thesis is to analyze public data to gain valuable insight for cybersecurity applications. Using public Twitter account data, a machine learning model is trained to identify bot accounts, which helps reduce the amount of fake news and the number of malicious users. A survey of text summarization techniques is presented to identify the best method for summarizing public data in the cybersecurity domain. A web application is also created to serve as a public tool that lets users summarize input text of their choosing using a variety of algorithms. The contribution of this thesis is thus twofold: a model capable of identifying Twitter bots with high accuracy, and a web application for summarizing cybersecurity information from public data.