Data Science in Cybersecurity and Cyberthreat Intelligence


With the rapid increase of cyberattacks, accurate security information is becoming increasingly difficult to obtain, for example because of the difficulty of handling growing data volume, complexity, and variety, uncertain data veracity, and data processing algorithms that scale poorly. To manipulate security data efficiently, one has to deal with the heterogeneity and inconsistency of the data sources from which security information is fused. These sources come with different data structures, file formats, and serializations, and may contain unstructured, semi-structured, and structured data. The quality and trustworthiness of data depend on how certain the represented knowledge is, which can be backed by data provenance.

Data scientists in organizations often do not work in security; instead, they focus on business outcomes. However, data science can be a viable component of security strategies: it can be used to gain cyber-situational awareness via advanced data processing techniques, fuse technical data from diverse sources, and ensure data quality and unambiguity. By using data science techniques, security professionals can manipulate and analyze network and security data efficiently, and uncover valuable insights from data-driven risk assessment and security performance management. Data science can be applied in cybersecurity to data preparation, anomaly detection, exploratory data analysis, data visualization, modeling, and optimization. It is useful for preprocessing raw security data for machine learning, detecting anomalous behavior and malicious content, and creating machine learning algorithms to identify potential cyberthreats. The enormous datasets used in security applications require big data analytics and parallel computing frameworks, which data scientists can also provide to make security-related information meaningful.

This book presents a collection of state-of-the-art approaches to utilizing machine learning, formal knowledge bases and rule sets, and semantic reasoning to detect attacks on communication networks, including IoT infrastructures, to automate malicious code detection, to efficiently predict cyberattacks in enterprises, to identify malicious URLs and DGA-generated domain names, and to improve the security of mHealth wearables. This book details how analyzing the likelihood of vulnerability exploitation using machine learning classifiers can offer an alternative to traditional penetration testing solutions. In addition, the book describes a range of techniques that support data aggregation and data fusion to automate data-driven analytics in cyberthreat intelligence, allowing complex and previously unknown cyberthreats to be identified and classified, and countermeasures to be incorporated in novel incident response and intrusion detection mechanisms.

  • Sikos, L. F., Choo, K.-K. R. (eds.) (2020) Data Science in Cybersecurity and Cyberthreat Intelligence. Cham, Switzerland: Springer. DOI: 10.1007/978-3-030-38788-4

Can be ordered from Springer: https://www.springer.com/gp/book/9783030387877

Can be ordered from Amazon: https://www.amazon.com/gp/product/3030387879/ref=dbs_a_def_rwt_bibl_vppi_i6