Log analysis for cyber attack detection and classification using Apache Spark and Python
Date
2022
Authors
Mukhamedkali Ye.
Abstract
Over the past decades, the IT infrastructure of many companies has grown from a handful of servers to hundreds or thousands, and the volume of digital footprints has grown with it. These digital footprints consist of access logs that record information about specific events, for instance the IP address of the user, the timestamp at which the event occurred, and hostile activity that can affect the network. IT companies have mostly relied on Apache Hadoop, whose architecture is based on MapReduce, to analyze their access log files. Because these files contain confidential information related to server security, companies are seeking an architecture that can analyze access logs in real time. To overcome the limitations of Apache Hadoop in this respect, this paper presents efficient architectures, such as streaming-based Apache Spark, that can analyze both batch and real-time data.
Keywords
IT infrastructure, network, IT companies, Apache Spark, data