EEG emotion recognition based on the 3D-CNN and spatial-frequency attention mechanism

doi:10.19665/j.issn1001-2400.2022.03.021

Abstract

Many deep learning methods have been proposed for EEG-based emotion recognition, but most of them do not jointly consider the temporal, spatial, and frequency information in EEG signals. To address this, a three-dimensional convolutional neural network combined with a frequency-spatial attention mechanism (FSA-3D-CNN) is proposed, which uses information from all three dimensions to improve the accuracy of emotion recognition. First, a novel four-dimensional feature structure is designed according to the characteristics of EEG signals: differential entropy features are extracted from each time-domain segment of the EEG signal and converted into a four-dimensional matrix for training the deep model. Then, the existing 3D-CNN emotion recognition model is adapted to the four-dimensional feature matrix so that it exploits temporal, spatial, and frequency information at the same time. Finally, a frequency-spatial attention mechanism is designed to adaptively assign weights to the frequency and spatial channels of the EEG signal, extracting the spatial and frequency information that most clearly reflects changes in emotional state. On the public DEAP emotion dataset, FSA-3D-CNN achieves average two-class accuracies of about 95.87% on the valence dimension and 95.23% on the arousal dimension, and an average four-class accuracy of about 94.53% on the valence-arousal dimension, a significant improvement over existing CNN and LSTM emotion recognition models.

Key words: electroencephalography (EEG) signals, emotion recognition, differential entropy, deep learning, attention mechanism
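
The differential entropy step described in the abstract can be illustrated with a minimal sketch. It assumes the signal has already been band-pass filtered into one frequency band and is approximately Gaussian, in which case the differential entropy reduces to 0.5·ln(2πe·σ²). The function names, the 128 Hz sampling rate, and the 1 s segment length are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def differential_entropy(segment):
    """DE of one segment under a Gaussian assumption: 0.5 * ln(2*pi*e*variance)."""
    variance = np.var(segment)
    return 0.5 * np.log(2.0 * np.pi * np.e * variance)

def segment_de(signal, fs, segment_seconds):
    """Split a 1-D band-filtered signal into non-overlapping segments
    and return one DE value per segment."""
    samples_per_segment = int(round(fs * segment_seconds))
    n_segments = len(signal) // samples_per_segment
    # Drop any trailing samples that do not fill a whole segment.
    trimmed = np.asarray(signal[: n_segments * samples_per_segment], dtype=float)
    segments = trimmed.reshape(n_segments, samples_per_segment)
    return np.array([differential_entropy(s) for s in segments])

# Synthetic stand-in for one band-filtered EEG channel (assumed 128 Hz, 3 s long).
rng = np.random.default_rng(0)
fs = 128
window = rng.normal(0.0, 2.0, size=3 * fs)
de_values = segment_de(window, fs, 1.0)
print(de_values.shape)
```

Repeating this for every electrode and every frequency band would give one feature vector per time segment, which is the raw material for the four-dimensional structure.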

CLC number: TP391

ZHANG Jing, ZHANG Xueying, CHEN Guijun, YAN Chao. EEG emotion recognition based on the 3D-CNN and spatial-frequency attention mechanism[J]. Journal of Xidian University, 2022, 49(3): 191-198.

References

[1]	ZHANG J H, YIN Z Y, CHEN P, et al. Emotion Recognition Using Multi-Modal Data and Machine Learning Techniques: A Tutorial and Review[J]. Information Fusion, 2020, 59: 103-126. doi: 10.1016/j.inffus.2020.01.011
[2]	QUAN Xueliang, ZENG Zhigang, JIANG Jianhua, et al. Physiological Signals Based Affective Computing: A Systematic Review[J]. Acta Automatica Sinica, 2021, 47(8): 1769-1784. (in Chinese)
[3]	LI X, SONG D, ZHANG P, et al. Emotion Recognition from Multi-Channel EEG Data through Convolutional Recurrent Neural Network[C]// Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine. Piscataway: IEEE, 2016: 352-359.
[4]	KOELSTRA S, MUHL C, SOLEYMANI M, et al. DEAP: A Database for Emotion Analysis Using Physiological Signals[J]. IEEE Transactions on Affective Computing, 2011, 3(1): 18-31. doi: 10.1109/T-AFFC.2011.15
[5]	MA J X, TANG H, ZHENG W L, et al. Emotion Recognition Using Multimodal Residual LSTM Network[C]// Proceedings of the 27th ACM International Conference on Multimedia. New York: ACM, 2019: 176-183.
[6]	ESPOSITO R, BORTOLETTO M, MINIUSSI C. Integrating TMS, EEG, and MRI as an Approach for Studying Brain Connectivity[J]. Neuroscientist, 2020, 26(5-6): 471-486. doi: 10.1177/1073858420916452
[7]	KWON Y H, SHIN S B, KIM S D. Electroencephalography Based Fusion Two-Dimensional (2D)-Convolution Neural Networks (CNN) Model for Emotion Recognition System[J]. Sensors, 2018, 18(5): 1383. doi: 10.3390/s18051383
[8]	YANG Y L, WU Q, FU Y, et al. Continuous Convolutional Neural Network with 3D Input for EEG-Based Emotion Recognition[C]// Proceedings of the International Conference on Neural Information Processing. Berlin: Springer, 2018: 433-443.
[9]	SHEN F, DAI G, LIN G, et al. EEG-Based Emotion Recognition Using 4D Convolutional Recurrent Neural Network[J]. Cognitive Neurodynamics, 2020, 14(6): 815-828. doi: 10.1007/s11571-020-09634-1
[10]	SHI L C, JIAO Y Y, LU B L. Differential Entropy Feature for EEG-Based Vigilance Estimation[C]// Proceedings of the 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Piscataway: IEEE, 2013: 6627-6630.
[11]	YANG Xiaoli, LIN Suzhen. Method for Multi-Band Image Feature-Level Fusion Based on the Attention Mechanism[J]. Journal of Xidian University, 2020, 47(1): 120-127. (in Chinese)
[12]	CHEN J X, JIANG D M, ZHANG Y N. A Hierarchical Bidirectional GRU Model with Attention for EEG-Based Emotion Classification[J]. IEEE Access, 2019, 7: 118530-118540. doi: 10.1109/ACCESS.2019.2936817
[13]	TAO Wei. Research on EEG Emotion Recognition Methods Based on the Attention Mechanism[D]. Hefei: Hefei University of Technology, 2020. (in Chinese)
[14]	CAO Weidong, LI Jiaqi, WANG Huaichao. Analysis of Targeted Sentiment by the Attention Gated Convolutional Network Model[J]. Journal of Xidian University, 2019, 46(6): 30-36. (in Chinese)
[15]	LI Y, HUANG J, ZHOU H, et al. Human Emotion Recognition with Electroencephalographic Multidimensional Features by Hybrid Deep Neural Networks[J]. Applied Sciences, 2017, 7(10): 1060. doi: 10.3390/app7101060

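Table 1
Signal	Organization	Dimensions
Computer vision	time × width × height × color channels (RGB)	N×W×H×3
EEG signal	time × width × height × frequency bands (θ, α, β, γ)	N×9×9×4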
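
The N×9×9×4 organization for EEG can be sketched as follows, assuming per-segment differential entropy features are already available as an array of shape (N, channels, 4). The electrode-to-grid layout used here is a hypothetical placeholder that simply fills 32 distinct cells. It is not the real scalp layout, which this page does not give. `to_4d` and `positions` are illustrative names.

```python
import numpy as np

GRID = 9     # 9x9 spatial map, as in the N×9×9×4 structure
N_BANDS = 4  # theta, alpha, beta, gamma

def to_4d(de, positions):
    """Place per-channel features on a 2-D grid.

    de:        array of shape (N, n_channels, N_BANDS)
    positions: one (row, col) grid cell per channel
    returns:   array of shape (N, GRID, GRID, N_BANDS); unused cells stay zero
    """
    n_segments, n_channels, n_bands = de.shape
    out = np.zeros((n_segments, GRID, GRID, n_bands), dtype=de.dtype)
    for ch, (row, col) in enumerate(positions):
        out[:, row, col, :] = de[:, ch, :]
    return out

# Hypothetical placeholder layout: 32 distinct cells, NOT real electrode positions.
positions = [(i // 6 + 1, i % 6 + 1) for i in range(32)]

rng = np.random.default_rng(0)
de = rng.normal(size=(3, 32, N_BANDS))  # synthetic features: N=3 segments, 32 channels
x = to_4d(de, positions)
print(x.shape)
```

The frequency axis then plays the role that the RGB channel axis plays for video, which is what allows a 3D-CNN designed for N×W×H×3 input to be reused.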

Table 2
N	EEG segment length/s	Valence (2 classes)/%	Arousal (2 classes)/%	Valence-arousal (4 classes)/%
12	0.25	94.26±3.92	93.12±5.08	88.84±8.45
6	0.50	94.36±3.86	93.28±5.00	92.01±6.58
3	1.00	95.87±2.86	95.23±3.48	94.53±5.01
2	1.50	95.82±3.16	94.81±3.66	93.89±4.75
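
In every row above, N multiplied by the segment length equals 3 s, which suggests that each sample covers a fixed 3 s window split into N segments. The short check below makes that arithmetic explicit. The 128 Hz sampling rate is an assumption (it is the rate of the preprocessed DEAP release), and `window_summary` is an illustrative helper.

```python
FS_HZ = 128  # assumed sampling rate

# (N, segment length in seconds), copied from the table rows
settings = [(12, 0.25), (6, 0.50), (3, 1.00), (2, 1.50)]

def window_summary(n_segments, segment_seconds, fs_hz=FS_HZ):
    """Return the window length and sample counts implied by one setting."""
    samples_per_segment = int(round(segment_seconds * fs_hz))
    return {
        "window_seconds": n_segments * segment_seconds,
        "samples_per_segment": samples_per_segment,
        "samples_per_window": n_segments * samples_per_segment,
    }

summaries = [window_summary(n, s) for n, s in settings]
for (n, s), info in zip(settings, summaries):
    print(n, s, info)
```

Under these assumptions the settings differ only in how finely the same window is divided along the time axis.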

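Table 3
Model	Dimension information	Valence (2 classes)/%	Arousal (2 classes)/%	Valence-arousal (4 classes)/%
3D-CNN	spatial + frequency	91.12±4.19	90.14±4.59	85.00±6.40
LSTM	temporal + frequency	85.22±7.65	87.42±7.07	77.68±9.27
CCNN[8]	spatial + frequency	89.80±2.76	90.50±2.98	85.30±6.47
CRNN[9]	spatial + temporal	91.98±3.60	92.46±3.35	85.84±2.41
4D-CRNN[9]	spatial + frequency + temporal	94.22±2.61	94.58±3.69	88.87±2.29
ACRNN[13]	spatial + frequency + temporal	93.72±3.21	93.38±3.73	-
3D-CNN (this paper)	spatial + frequency	91.02±5.31	90.71±5.72	90.02±7.81
FA-3D-CNN (this paper)	spatial + frequency + temporal	94.25±4.23	94.22±4.43	92.57±5.62
SA-3D-CNN (this paper)	spatial + frequency + temporal	93.21±5.09	92.80±6.13	92.82±5.73
FSA-3D-CNN (this paper)	spatial + frequency + temporal	95.87±2.86	95.23±3.48	94.53±5.01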
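
The FA, SA, and FSA rows differ in which attention weights are applied. The sketch below shows one generic way to weight the frequency and spatial axes of an N×9×9×4 input: pool over the other axes, turn the result into weights, and rescale. It is a simplified squeeze-and-reweight illustration, not the paper's actual attention design. The parameters `w_f` and `w_s` stand in for values that would be learned during training.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def frequency_attention(x, w_f):
    """x: (N, H, W, B). Returns x rescaled per band and the band weights (B,)."""
    squeezed = x.mean(axis=(0, 1, 2))   # average over time and space -> (B,)
    weights = softmax(w_f @ squeezed)   # band weights that sum to 1
    return x * weights, weights         # broadcasts over the last axis

def spatial_attention(x, w_s):
    """x: (N, H, W, B). Returns x rescaled per grid cell and the weight map (H, W)."""
    pooled = x.mean(axis=(0, 3))        # average over time and bands -> (H, W)
    weights = sigmoid(w_s * pooled)     # one weight in (0, 1) per cell
    return x * weights[None, :, :, None], weights

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 9, 9, 4))       # synthetic 4-D feature matrix
w_f = np.eye(4)                         # placeholder for learned parameters
w_s = 1.0                               # placeholder for a learned parameter

x_f, band_w = frequency_attention(x, w_f)   # frequency attention only
x_fs, map_w = spatial_attention(x_f, w_s)   # frequency attention, then spatial
print(x_fs.shape, band_w.shape, map_w.shape)
```

Applying only `frequency_attention` or only `spatial_attention` corresponds loosely to the FA and SA variants, and chaining both to the FSA variant.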