Breast Tumor Recognition Based on Machine Learning, by Huo Shuanghong (霍双红)
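Institution code: 10110
Student ID: S1408035

Breast Tumor Recognition Based on Machine Learning

Master's thesis
Library classification number / UDC: [illegible in source]
Supervisor (name, title): [illegible in source]
Degree applied for: [illegible in source]
Major: [illegible in source]
Thesis submission date: [illegible in source]
Thesis defense date: [illegible in source]
Degree conferral date: [left blank]
Thesis reviewers: [illegible in source]
Chair of the defense committee: [illegible in source]
June 1, 2017

Declaration of Originality

I solemnly declare that the thesis submitted here is the result of research I carried out independently under the guidance of my supervisor. Except for content already cited in the text, this thesis contains no research results published or written by any other individual or group. Individuals and groups who made important contributions to this research are clearly identified in the text. I bear the legal responsibility for this declaration.

Author's signature: [illegible in source]

Statement on the Right to Use the Thesis

I fully understand the regulations of North University of China (中北大学) on the retention and use of degree theses, including the following: the university has the right to retain the thesis and to submit the original and copies to the relevant authorities; the university may reproduce and keep the thesis by photocopying, reduced-size printing, or other means of reproduction; the university may allow the thesis to be consulted or borrowed; the university may copy, present, and exchange the thesis for the purpose of academic exchange; and the university may publish all or part of the thesis (confidential theses are subject to these provisions once declassified).

Author's signature: [illegible in source]
Supervisor's signature: [illegible in source]

North University of China degree thesis

Breast Tumor Recognition Based on Machine Learning

Abstract

Breast tumors seriously endanger women's health, and no reliable method of predicting breast cancer has yet been found. At the current level of medical care, the key to raising the cure rate and lowering the mortality of breast cancer is early detection, early diagnosis, and early treatment, so that the best window for treatment is not missed. In recent years, to improve diagnostic efficiency, research has moved toward intelligent, tool-based methods. As science and artificial intelligence have advanced, artificial neural networks have matured and their classification ability has grown, which offers a new approach to breast tumor recognition. Because microscopic images of cell nuclei from diseased breast tissue differ from those of normal tissue, algorithms with strong classification ability can be applied to breast tumor diagnosis.

This thesis studies breast tumor diagnosis using machine learning and artificial neural networks. Several methods were tested in simulation experiments. All of them performed well and improved diagnostic accuracy, which indicates that the approach is effective for breast tumor diagnosis and has practical medical value. The main contents of the thesis are:

(1) Three statistical discriminant methods (Fisher discriminant, distance discriminant, and Bayes discriminant) were applied to the breast tumor data and compared. In the simulation experiments the Fisher discriminant reached an accuracy of 97.1%, the distance discriminant 84.1%, and the Bayes discriminant 88.41%. Of the three, the Fisher discriminant had the highest accuracy and the lowest probability of misclassification, so it gave the best experimental results.

(2) The K-means algorithm and the self-organizing neural network algorithm were applied to the breast tumor data, with accuracies of 80% and 81.58% respectively. The data used for recognition are 10 quantified features of microscopic images of cell nuclei from breast tumor lesion tissue. To limit error, the mean, standard deviation, and worst value of each of the 10 features are taken, giving 30 values per sample. Because the input dimension is large, with much redundant information and a long running time, PCA was used to reduce the data to 8 dimensions, at which point the cumulative contribution rate reaches 99.91%. K-means and the self-organizing neural network were then run again on the reduced data: the running time fell and the accuracy rose, to 88.95% and 88.42% respectively.

(3) Because the LVQ algorithm is sensitive to its initial weights, the thesis proposes optimizing LVQ with a genetic algorithm. This improved LVQ addresses the shortcomings of plain LVQ in breast tumor diagnosis and raises recognition accuracy. The experiments show that the improved LVQ diagnoses breast tumors with an accuracy of 91.3%, which is 4.4% higher than that of LVQ.

Keywords: breast tumor diagnosis; Fisher discriminant; K-means algorithm; self-organizing competitive neural network; learning vector quantization neural network; genetic algorithm

Research on Diagnosis of Breast Cancer Based on Machine Learning

Abstract

Breast cancer has become the leading threat to women's health. At present there is no first-rate method for predicting breast cancer. The key to improving the cure rate and reducing the mortality of breast cancer is early detection, early diagnosis, and early treatment, which has become the only way forward. Traditional breast tumor diagnosis focuses on medical images of the lesion. It relies on human experience, so it is subjective and less accurate, and its results can be very unreliable: experts classify a patient's tumor image according to their own experience. To improve the efficiency of diagnosis, research methods have been moving toward intelligent, tool-based approaches in recent years. The artificial neural network (ANN) is a self-adaptive intelligent algorithm that is widely used in the diagnosis of breast cancer, and it also gives doctors a method of auxiliary diagnosis. As artificial intelligence and neural network technology have matured, their classification ability has become very strong. They provide a new method for identifying breast tumors because microscopic images of breast lesions differ from those of normal tissue, so an algorithm can diagnose breast tumors from the two kinds of images.

This paper studies breast tumor diagnosis through experiments based on machine learning and artificial neural networks. Several methods were used in simulation experiments. They improve diagnostic accuracy, so the approach is an effective method for breast tumor diagnosis with high value in medical applications. The main contents of this paper are as follows.

First, the paper describes the background and significance of the study, briefly describes methods of breast tumor recognition and the main problems of the research, and introduces the organization of the paper.

Second, three statistical discriminant methods were applied to the breast tumor data and compared: Fisher discriminant analysis, distance discriminant analysis, and Bayes discriminant analysis. The simulation results show that the accuracy of the Fisher discriminant is 97.1%, that of the distance discriminant is 84.1%, and that of the Bayes discriminant is 88.41%. Of the three methods, Fisher discriminant analysis has the highest accuracy and the lowest probability of misclassification, so it gives the best experimental results.

Then the K-means algorithm and the self-organizing neural network algorithm were tested on the breast cancer data, with accuracies of 80% and 81.58% respectively. The data used for identification are 10 quantitative features of microscopic images of the breast tumor. Each sample consists of 30 values: the mean, standard deviation, and worst value of each feature over the nuclei in the sampled tissue. The input dimension is therefore large, redundant information is plentiful, and the running time is long. PCA was used to reduce the data to 8 dimensions, where the cumulative contribution rate reaches 99.91%. The experiment was then repeated with K-means and the self-organizing neural network. The running time is shorter, and after principal component analysis the accuracies of K-means and the self-organizing neural network are 88.95% and 88.42% respectively.

Finally, because the LVQ algorithm is sensitive to its initial weights, a genetic algorithm is used to optimize LVQ. An improved LVQ algorithm based on the genetic algorithm is proposed for the diagnosis of breast cancer, and it improves recognition accuracy. The experimental results show that the accuracy of the improved LVQ in diagnosing breast tumors is 91.3%, which is 4.4% higher than that of LVQ.

Key words: breast cancer diagnosis; Fisher discriminant analysis; K-means neural network; self-organizing neural network; LVQ; genetic algorithms

Contents

Chapter 1 Introduction
1.1 Breast cancer incidence in China and abroad and the significance of the research
1.1.1 Breast cancer incidence in China and abroad
1.1.2 Significance of the research
1.2 Methods of breast tumor examination
1.2.1 Current state of computer-aided breast tumor diagnosis
1.3 Main contents and structure of the thesis
Chapter 2 Breast Tumor Diagnosis Based on the Fisher Discriminant
2.1 Principle of the Fisher discriminant
2.1.1 Determining the discriminant function
2.1.2 Discriminant rule
2.2 Simulation experiments
2.2.1 Selection of feature attributes and correlation analysis
2.2.2 Building the discriminant analysis model
2.2.3 Evaluation of the discriminant criterion
2.3 Conclusion
Chapter 3 Breast Tumor Classification Based on PCA-K-means and PCA Self-Organizing Competitive Networks
3.1 Dimensionality reduction of sample features based on principal component analysis
3.1.1 Principle of PCA
3.1.2 Definition and geometric meaning of principal components
3.1.3 Definition of the contribution rate
3.1.4 Steps of the principal component analysis algorithm
3.2 K-means clustering
3.2.1 Principle of the K-means clustering algorithm
3.2.2 Objective function of the K-means clustering algorithm
3.2.3 Flow of the K-means clustering algorithm
3.3 Self-organizing competitive networks
3.3.1 Structure and algorithm of the self-organizing competitive network
3.3.2 Competitive network structure and learning algorithm
3.4 Steps of the algorithms used
3.5 Analysis of experimental results
3.6 Conclusion
Chapter 4 Breast Tumor Classification Based on an Improved LVQ Neural Network
4.1 Overview of the LVQ neural network
4.1.1 Structure of the LVQ neural network
4.1.2 GA-optimized LVQ network
4.2 Simulation analysis
4.3 Chapter summary
Chapter 5 Summary and Outlook
5.1 Main contents and results of the thesis
5.2 Open problems and outlook for future work
References
Papers published during the master's program
Acknowledgments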
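The abstract's PCA-then-K-means step (reduce the 30 inputs until the cumulative contribution rate is high enough, then cluster into two groups) can be illustrated with a minimal sketch. This is not the thesis's code: the function names `cumulative_contribution`, `components_needed`, and `kmeans` are hypothetical, the eigenvalues and points are made-up toy values rather than the thesis data, and seeding the centroids with the first k points is an assumption chosen only to keep the example deterministic.

```python
from math import dist


def cumulative_contribution(eigenvalues):
    """Cumulative contribution rate of the leading principal components.

    `eigenvalues` are covariance-matrix eigenvalues (component variances);
    they are sorted in descending order before accumulating.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    rates, running = [], 0.0
    for value in ordered:
        running += value
        rates.append(running / total)
    return rates


def components_needed(eigenvalues, target=0.99):
    """Smallest number of components whose cumulative rate reaches `target`."""
    for count, rate in enumerate(cumulative_contribution(eigenvalues), start=1):
        if rate >= target:
            return count
    return len(eigenvalues)


def kmeans(points, k=2, iterations=100):
    """Plain Lloyd's k-means; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


if __name__ == "__main__":
    # Toy eigenvalues (hypothetical), not taken from the thesis.
    print(components_needed([6.0, 3.0, 0.9, 0.1], target=0.95))
    # Toy 2-D points standing in for PCA-reduced samples.
    toy = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
    print(kmeans(toy, k=2)[0])
```

In a real pipeline the eigenvalues would come from the covariance matrix of the standardized 30-feature data, and the cluster labels would still have to be matched to the benign and malignant classes before an accuracy figure could be computed.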