Analysis of Failed SSH Attempts for Intrusion Detection

Authors

Alkan, Akin

Issue Date

2024

Type

Thesis

Keywords

Brute Force Attack, Btmp, Host-based Anomaly Detection, Intrusion Detection, Machine Learning, SSH

Abstract

SSH brute force attacks remain among the most common attack types in computer systems. Recent threat analysis reports consistently highlight their prevalence as a top web security vulnerability. Various passive and active methodologies have been developed to deal with this problem, each with its own advantages and disadvantages. Among these, analysis of monitoring logs is important for understanding the root cause of the problem and implementing the necessary countermeasures. Hence, this thesis focuses on implementing automated intrusion detection solutions by analyzing historical failed SSH attempts. The data was captured between March 2022 and June 2022 from a server located on the University of Nevada, Reno (UNR) campus. We first present a thorough analysis of the dataset to understand common patterns such as the origin of IP addresses, usernames, and the time of SSH attempts. We identified various types of attack patterns, including slow, steady, and stealthy ones. Since the logs contain both benign and malicious attempts, we used external databases (e.g., IPWHOIS and ABUSEIPDB) to label them as malicious or benign, and these labels served as training data for machine learning models. We developed several machine learning models to categorize SSH attempts as malicious or benign. The models relied on several features, including username, time difference between attempts, the number of previous attempts, and similarity of IP addresses. We trained Random Forest, Decision Tree, XGBoost, SVM, and Logistic Regression models and evaluated their performance. The Decision Tree model exhibited the best performance, achieving 100% precision and a recall rate of 97.9%. In comparison, the best-performing rule-based formulation failed to identify 1.5% of malicious IPs, whereas the Decision Tree model missed only 0.01%. We also validated the results against public datasets. We noticed that the proposed model detected five malicious IP addresses before they appeared in public databases such as ABUSEIPDB, which is a promising result for the proposed model.
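
As a rough illustration of the kind of per-IP features the abstract mentions (number of attempts, usernames tried, and time difference between attempts), the following minimal Python sketch aggregates failed-login records by source IP. It is not the thesis's implementation: the function name `extract_features`, the record layout `(ip, username, ISO timestamp)`, the feature names, and the sample addresses (taken from documentation-reserved ranges) are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def extract_features(attempts):
    """Aggregate failed-login records into per-IP features.

    `attempts` is an iterable of (ip, username, iso_timestamp) tuples.
    This record layout is hypothetical, not the thesis's data format.
    """
    by_ip = defaultdict(list)
    for ip, user, ts in attempts:
        by_ip[ip].append((datetime.fromisoformat(ts), user))

    features = {}
    for ip, events in by_ip.items():
        events.sort(key=lambda e: e[0])  # order attempts chronologically
        times = [t for t, _ in events]
        # Seconds elapsed between consecutive attempts from the same IP.
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        features[ip] = {
            "attempts": len(events),
            "distinct_usernames": len({u for _, u in events}),
            "mean_gap_seconds": mean(gaps) if gaps else None,
        }
    return features


# Made-up sample records using documentation-reserved IP ranges.
sample = [
    ("203.0.113.7", "root", "2022-03-01T00:00:00"),
    ("203.0.113.7", "admin", "2022-03-01T00:00:10"),
    ("203.0.113.7", "test", "2022-03-01T00:00:30"),
    ("198.51.100.4", "alice", "2022-03-01T08:15:00"),
]
features = extract_features(sample)
```

A feature table of this shape could then be joined with labels from an external reputation source and passed to a classifier; the sketch stops short of that step because the abstract does not specify how the labels and features were encoded.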

License

Creative Commons Attribution 4.0 International