基于语音的阿尔茨海默症检测方法研究

Research on Speech Based Detection Methods for Alzheimer’s Disease

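Author: Chen Xuchu (陈旭初)
  • Student ID
    2020******
  • Degree
    Master's
  • Email
    xuc******com
  • Defense date
    2023.05.19
  • Advisor
    Zhang Weiqiang (张卫强)
  • Discipline
    Information and Communication Engineering
  • Pages
    84
  • Confidentiality level
    Public
  • Department
    023 Department of Electronic Engineering
  • Chinese keywords
    阿尔茨海默症, 端到端, 音素后验概率图, 离散变分自编码器, BERT
  • English keywords
    Alzheimer’s disease, end-to-end, phonetic posteriorgrams, discrete variational autoencoder, BERT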

Abstract

Alzheimer’s disease (AD), commonly known as "senile dementia", is the most common form of dementia in older people. Current treatments can only maintain cognitive ability or slow its decline; they cannot reverse dementia that has already progressed. As life expectancy rises and populations age faster, the number of people diagnosed with Alzheimer’s disease is increasing rapidly. One of the early symptoms of Alzheimer’s disease is a deterioration of language ability, and as the disease progresses, symptoms such as aphasia, difficulty in expression, and empty speech become more pronounced. Speech-based testing is easy to administer, low in cost, and well suited to routine screening, so speech-based detection of Alzheimer’s disease has broad application prospects. The key problems in this line of research are how to extract effective features from speech and how to build detection models. To address these problems, this thesis proposes three detection methods, described below.

(1) An end-to-end detection method based on the raw waveform is proposed. Traditional Alzheimer’s disease detection methods mainly use hand-crafted features, which require careful design and depend on specific domain knowledge. This study therefore proposes an end-to-end method that learns task-specific representations directly from the raw waveform, obtaining richer, lossless information. The method uses one-dimensional convolution and residual blocks containing dilated convolution to extract features from speech, and adds squeeze-excitation modules to the residual blocks to improve detection performance. Experiments confirm that the method can extract effective representations from raw waveforms for Alzheimer’s disease detection.

(2) A detection method based on phonetic posteriorgram (Phonetic PosteriorGrams, PPGs) features and the BERT model is proposed. Studies have found that Alzheimer’s disease patients develop speech and pronunciation disorders early in the course of the disease. Such changes in speech are directly visible in phonetic posteriorgrams, but their use for Alzheimer’s disease detection has not yet been studied in depth. This study therefore proposes a detection method based on a PPGs-BERT model. The method takes phonetic posteriorgram features extracted from speech as input, uses the BERT model to extract high-dimensional representations from them, and then applies a binary classifier to detect Alzheimer’s disease. Experimental results show that the method performs well on Alzheimer’s disease detection.

(3) A detection method based on a discrete variational autoencoder (discrete Variational Autoencoder, dVAE) and the BERT model is proposed. Some researchers have found that classifiers trained on phoneme sequences perform well on Alzheimer’s disease detection. However, because of factors such as age and disease, the error rate when recognizing this speech as phonemes is high. This study therefore proposes a detection method based on a dVAE-BERT model. The method uses a discrete variational autoencoder to convert continuous speech into one-hot encoded pseudo-phoneme sequences, and then uses the BERT model to model the sequential relationships within those pseudo-phoneme sequences. Experiments show that, with limited data, the method can effectively distinguish the speech of patients from that of healthy speakers.
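
As an illustration of the building blocks named in method (1), the following is a minimal pure-Python sketch of a dilated one-dimensional convolution, a squeeze-excitation gate, and a residual block that combines them. It is not the thesis implementation: the abstract gives no kernel sizes, channel counts, or layer ordering, so the function names (`dilated_conv1d`, `squeeze_excite`, `se_residual_block`), the per-channel (depthwise) convolution, the zero padding, and the toy weights are all assumptions made for readability.

```python
import math


def dilated_conv1d(x, kernel, dilation=1):
    """Zero-padded ('same' length) 1-D cross-correlation with dilated taps.

    Assumes an odd-length kernel so the padding is symmetric.
    """
    pad = dilation * (len(kernel) - 1) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(x))
    ]


def squeeze_excite(channels, w1, w2):
    """Rescale each channel by a learned gate in (0, 1).

    squeeze: average each channel over time to one number;
    excite:  dense layer + ReLU, then dense layer + sigmoid.
    w1 has shape [hidden][channels]; w2 has shape [channels][hidden].
    """
    squeezed = [sum(c) / len(c) for c in channels]
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    scaled = [[g * v for v in c] for g, c in zip(gates, channels)]
    return scaled, gates


def se_residual_block(channels, kernels, dilation, w1, w2):
    """Residual block: x + SE(ReLU(dilated_conv(x))), one kernel per channel.

    The depthwise convolution is a simplification for this sketch only.
    """
    conv = [
        [max(0.0, v) for v in dilated_conv1d(c, k, dilation)]
        for c, k in zip(channels, kernels)
    ]
    scaled, _ = squeeze_excite(conv, w1, w2)
    return [[a + b for a, b in zip(c, s)] for c, s in zip(channels, scaled)]


if __name__ == "__main__":
    wave = [[1.0, 2.0, 3.0, 4.0, 5.0]]  # one channel of toy "waveform" samples
    out = se_residual_block(wave, kernels=[[1.0, 0.0, -1.0]], dilation=2,
                            w1=[[0.0]], w2=[[0.0]])
    print(out)
```

Increasing `dilation` widens the span of samples each output sees without adding kernel weights, which is the usual motivation for dilated convolution on long raw-waveform inputs. The squeeze-excitation gate then lets the block emphasize or suppress whole channels based on their global average.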