Breast cancer is the most common malignant tumor among women, making early diagnosis and precise treatment critical. Traditional pathological diagnosis of breast cancer relies on a pathologist's microscopic observation and experience, which introduces subjective error and can be time-consuming. This study proposes an intelligent diagnostic assistance system for breast cancer based on digital pathology images to improve diagnostic accuracy and efficiency.

For epithelial tissue segmentation and subtype identification on H&E (hematoxylin and eosin) stained slides, this study first proposed a new segmentation model, ResMTUNet. It employs a dual-branch encoder combining a CNN (convolutional neural network) and a Transformer, and incorporates a multi-task learning strategy to enhance the model's information processing capability. The model achieved an Intersection over Union (IoU) of 83.49% and a Dice score of 91.04% on the test set, significantly outperforming the comparison models. In addition, the study proposed a fused-feature classification method for breast epithelial tissue subtypes, which achieved 62.61% accuracy and a 62.34% F1 score on the seven-class test set, a notable improvement over single-feature methods.

Fully automated immunohistochemistry (IHC) quantification faces two major challenges: segmenting invasive carcinoma in IHC images and identifying tumor cells. To address the segmentation challenge, this paper proposed a novel method that, for the first time, segments invasive carcinoma and carcinoma in situ separately from IHC images. The method effectively captures the key feature distinguishing carcinoma in situ from invasive carcinoma, the tumor-region edge characteristics, and achieved a mean IoU of 80.25% on the test set, surpassing other segmentation models.
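The IoU and Dice scores reported above have standard definitions for binary masks. As a minimal sketch (not the thesis's evaluation code), they can be computed as:

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, target: np.ndarray):
    """Compute IoU and Dice for a pair of binary (0/1) segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    iou = inter / union if union else 1.0          # empty-vs-empty counts as perfect
    dice = 2 * inter / total if total else 1.0
    return iou, dice

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
iou, dice = iou_and_dice(pred, gt)  # inter=2, union=4 -> IoU=0.5, Dice=2/3
```

Dice weights the overlap twice relative to the mask sizes, so it is always at least as large as IoU, which matches the reported pair (91.04% Dice vs. 83.49% IoU).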
To tackle the detection challenge, the study introduced a domain adaptation method that makes full use of the information in H&E stained images and successfully achieved tumor cell detection on Ki-67 stained images.

Building on the above work, this paper constructed a breast pathology intelligent diagnostic assistance system that provides Ki-67, ER (estrogen receptor), and PR (progesterone receptor) quantitative analysis, as well as H&E epithelial tissue segmentation and subtype identification. The system has been piloted at a collaborating hospital, and experiments demonstrated its potential for clinical application. In the clinical performance evaluation, the Ki-67 quantification module performed well on 56 cases, with a prediction error within 5% for 83.9% of cases; the ER and PR quantification modules reached accuracies of 83.93% and 90.74%, respectively. The runtime evaluation showed that the system runs efficiently on the existing hardware configuration.

Furthermore, to address the scarcity of annotated data, this paper proposed a sparse patch annotation strategy: a small amount of sparsely annotated data is combined with a large amount of unannotated data, and the model is trained with semi-supervised learning. Experimental results showed that even when trained with only 25% sparsely annotated data, the proposed model's semantic segmentation performance remains comparable to that of full pixel-level annotation.

In summary, through innovative model design and data utilization strategies, this study realized an intelligent diagnostic assistance system for breast cancer based on digital pathology images, making a positive contribution to the advancement of intelligent pathology.
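A sparse patch annotation strategy implies that the supervised loss is computed only over annotated pixels, leaving the rest to the semi-supervised signal. A minimal sketch of such a masked loss (the -1 ignore label and the function name are illustrative assumptions, not the thesis's implementation):

```python
import numpy as np

def masked_pixel_ce(logits: np.ndarray, labels: np.ndarray, ignore_index: int = -1) -> float:
    """Cross-entropy averaged over annotated pixels only.

    logits: (H, W, C) raw class scores; labels: (H, W) class ids,
    with ignore_index marking pixels left unannotated by the sparse strategy.
    """
    mask = labels != ignore_index
    if not mask.any():                      # a patch with no annotation contributes no loss
        return 0.0
    z = logits - logits.max(axis=-1, keepdims=True)               # numerically stable log-softmax
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    picked = log_probs[mask, labels[mask]]  # log-prob of the true class at labeled pixels
    return float(-picked.mean())

# Toy 1x2 "image", 2 classes: left pixel labeled class 0, right pixel unannotated.
logits = np.array([[[10.0, 0.0], [0.0, 10.0]]])
labels = np.array([[0, -1]])
loss = masked_pixel_ce(logits, labels)      # small positive value: the labeled pixel is well classified
```

Unannotated pixels would then typically be covered by a consistency or pseudo-label term, which is what lets 25% sparse annotation approach the performance of full pixel-level labels.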