Optimal method for the generation of the attack path based on the Q-learning decision

doi:10.19665/j.issn1001-2400.2021.01.018

Abstract

This paper proposes a method, based on the Q-learning algorithm, for dynamically searching for the optimal attack path, with the aim of improving the efficiency and adaptability of attack path generation. Building on Q-learning and taking network connectivity into account, the method partitions the network and simplifies the topology by deleting unreachable paths. It then simulates a hacker's attack through machine learning: by combining states with actions, the simulated attacker improves its adaptation and decision-making ability through continued learning, so that the optimal attack path can be generated efficiently. In the experiments, the simulated attacker obtains the state-action value table (Q-table) of the Q-learning method in an environment equipped with IDS alarm devices, and obtains the optimal attack path sequence from the source host to the destination host by traversing the Q-table, which verifies the validity and accuracy of the model and algorithm. In addition, analyzing host reachability by partition in advance removes redundant nodes, which is a considerable advantage in large network topologies.

Key words: attack graph, network security, reinforcement learning, optimization algorithm, Q-learning
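The pipeline summarized in the abstract (prune unreachable hosts, learn a Q-table in an environment with IDS alarms, then traverse the Q-table to read off a path) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the toy topology, the function names `prune`, `train_q_table` and `extract_path`, the reward values (+100 for reaching the target, -50 for stepping onto an IDS-monitored host, -1 per ordinary hop) and the hyperparameters are all hypothetical. The pruning step here is a plain forward/backward reachability filter, which stands in for the partition-based reduction described in the paper.

```python
import random
from collections import deque

def reachable_hosts(adj, start):
    """Breadth-first search: set of hosts reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        host = queue.popleft()
        for nxt in adj.get(host, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def prune(adj, source, target):
    """Keep only hosts reachable from `source` that can also reach `target`."""
    reverse = {}
    for u, vs in adj.items():
        for v in vs:
            reverse.setdefault(v, []).append(u)
    keep = reachable_hosts(adj, source) & reachable_hosts(reverse, target)
    return {u: [v for v in adj.get(u, ()) if v in keep] for u in keep}

def train_q_table(adj, source, target, ids_hosts, episodes=2000,
                  alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; state = current host, action = next host."""
    rng = random.Random(seed)
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    q = {u: {v: 0.0 for v in adj.get(u, ())} for u in nodes}
    if source not in q:
        return q
    for _ in range(episodes):
        state = source
        for _ in range(4 * len(q)):          # step cap per episode
            if state == target or not q[state]:
                break
            actions = list(q[state])
            if rng.random() < epsilon:       # explore
                action = rng.choice(actions)
            else:                            # exploit
                action = max(actions, key=q[state].get)
            # Hypothetical reward scheme (assumption, not from the paper).
            if action == target:
                reward, future = 100.0, 0.0
            else:
                reward = -50.0 if action in ids_hosts else -1.0
                future = max(q[action].values(), default=0.0)
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = action
    return q

def extract_path(q, source, target):
    """Greedy walk over the Q-table; returns None if the walk gets stuck."""
    path, state = [source], source
    while state != target:
        options = {v: val for v, val in q.get(state, {}).items() if v not in path}
        if not options:
            return None
        state = max(options, key=options.get)
        path.append(state)
    return path

# Hypothetical toy topology: host -> hosts it can attack directly.
topology = {
    "attacker": ["web", "mail"],
    "web": ["app", "ids_gw"],
    "mail": ["printer"],       # dead end, cannot reach the target
    "ids_gw": ["db"],          # monitored by an IDS
    "app": ["db"],
    "printer": [],
    "db": [],
    "backup": ["db"],          # not reachable from the attacker
}

pruned = prune(topology, "attacker", "db")
q_table = train_q_table(pruned, "attacker", "db", ids_hosts={"ids_gw"})
print(extract_path(q_table, "attacker", "db"))
```

In this sketch the pruning step drops `mail`, `printer` and `backup` before any learning takes place, so the Q-table only holds entries for hosts that can lie on a source-to-target path. The IDS penalty makes the learned values favor the route through `app` over the one through `ids_gw`.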

CLC number: TP309

LI Teng, CAO Shijie, YIN Siwei, WEI Dawei, MA Xindi, MA Jianfeng. Optimal method for the generation of the attack path based on the Q-learning decision[J]. Journal of Xidian University, 2021, 48(1): 160-167.

Figures and tables: 11 (Fig. 1 to Fig. 11)
