Introduction to Metabolomics (PPT slides)
Introduction to metabonomics/metabolomics
2009-06-26

The flow of the “omics” sciences: genomics, proteomics, and metabolomics
Spratlin J.L., et al. Clin Cancer Res, 2009 January 15, 15(2):431-440

What's in a name?
Metabolome: “refers to the complete set of small-molecule metabolites (such as metabolic intermediates, hormones and other signalling molecules, and secondary metabolites) to be found within a biological sample, such as a single organism” (Oliver et al., 1998)
Metabolome: “the collection of all downstream products, that is, the end products, of the genome. These products are small-molecule compounds that take part in metabolism and maintain the normal function, growth, and development of the organism, mainly endogenous small molecules with a relative molecular mass below 1000 Da” (Xu Guowang. Metabolomics: Methods and Applications (代谢组学: 方法与应用). Science Press, 2008, 1st ed., Chapter 1, pp. 1-10)
Metabonomics: “measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification” (Nicholson et al., 1999)
Metabolomics: “the complete set of metabolites/low-molecular-weight intermediates, which are context dependent, varying according to the physiology, developmental or pathological state of the cell, tissue, organ or organism” (Oliver, 2002)
Metabolomics: “a science that studies biological systems (cells, tissues, or organisms) by examining how their metabolites change, or change over time, after the system is stimulated or perturbed (for example, after a specific gene is mutated or the environment changes)” (Xu Guowang, 2008)

What's in a name?

Analytical platforms:
(1) Nuclear magnetic resonance (NMR)
(2) Gas chromatography-mass spectrometry (GC-MS)
(3) Liquid chromatography-mass spectrometry (LC-MS)
etc.
[Figure panels: GC-MS, LC-MS]

Metadata acquisition
Tao X.M., et al. Anal Bioanal Chem., 2008, 391:2881-2889
[Figure: total ion chromatogram]

Data acquisition
(1) Filtering and peak detection
(2) Deconvolution (resolving overlapping peaks)
(3) Peak alignment and matching
(4) Normalization

Data analysis and interpretation
(5) Unsupervised pattern recognition:
The acquired sample information is used to group the samples, and the grouping is displayed with suitable visualization techniques. No background information about sample classes is required. The resulting grouping is then compared with the original information about the samples (such as disease type) to link metabolites to that information, to screen for markers associated with it, and then to examine the metabolic pathways involved.
Common unsupervised methods include principal components analysis (PCA) and hierarchical cluster analysis.
Basic idea of PCA: apply a linear transformation to the variables X to form new composite variables PC; select 2-3 PCs for analysis as needed, which reduces dimensionality and simplifies the problem (multivariate to bivariate/trivariate).
$PC_1 = a_{11}X_1 + a_{21}X_2 + \cdots + a_{p1}X_p$
$PC_2 = a_{12}X_1 + a_{22}X_2 + \cdots + a_{p2}X_p$
Xu Guowang et al. Metabolomics: Methods and Applications (代谢组学: 方法与应用). Science Press, 2008, 1st ed., Chapter 12, pp. 146-156
[Figure: PCA scores plot of onset ALL and AML patients]

Data analysis
(6) Supervised pattern recognition:
A set of samples with known classes serves as the training set. The computer learns from it to obtain a basic classification model, which can then be used to assign classes to another set of samples whose classes are unknown.
Common supervised methods include partial least squares-discriminant analysis (PLS-DA), orthogonal partial least squares-discriminant analysis, and Fisher linear discriminant analysis.
Xu Guowang et al. Metabolomics: Methods and Applications (代谢组学: 方法与应用). Science Press, 2008, 1st ed., Chapter 12, pp. 146-156

Idea of partial least squares analysis:
Divide the variables into two groups, p dependent variables $Y_1, \ldots, Y_p$ and m independent variables $X_1, \ldots, X_m$, and model the two groups. Extract the first component $T_1$ of the independent variables and the first component $U_1$ of the dependent variables so that the correlation between $T_1$ and $U_1$ is maximized, then build the regression equation of $U_1$ on $T_1$. If the regression does not reach satisfactory accuracy, extract $T_2$ and $U_2$ in the same way.
$T_1 = w_{11}X_1 + \cdots + w_{1m}X_m$
$T_2 = w_{21}X_1 + \cdots + w_{2m}X_m$

Idea of discriminant analysis:
The dependent variable is qualitative, with two or more groups; the independent variables are measurable metric variables. Compute the (linear) discriminant function, substitute the independent variables into it to obtain the discriminant Z score of each observed sample, and then classify each sample according to its score.

[Scores plot; axes: t(1), t(2)]
The scores t, one vector for each model dimension, are new variables computed as linear combinations of the Xs.
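The linear combinations behind the PCA and PLS scores described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the procedure used in the cited book or in any particular software package. It assumes mean-centered data, a single numeric class label y coded 0/1, and the common covariance-based choice of the first PLS weight vector for one response. The function names `pca_scores` and `pls_first_score` and the toy data are hypothetical and introduced here for illustration.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PCA scores and loadings for the data matrix X (samples x variables)."""
    Xc = X - X.mean(axis=0)                      # mean-center each variable
    # Rows of Vt are the loading vectors, ordered by decreasing variance.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components].T               # column k holds a_1k ... a_pk
    scores = Xc @ loadings                       # PC_k = a_1k*X_1 + ... + a_pk*X_p
    return scores, loadings

def pls_first_score(X, y):
    """Return the first PLS weight vector w1 and score t1 for a single response y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc                                # direction of maximal covariance with y
    w = w / np.linalg.norm(w)                    # normalize to unit length
    t = Xc @ w                                   # T_1 = w_11*X_1 + ... + w_1m*X_m
    return w, t

# Hypothetical toy data: 20 samples x 5 metabolite intensities, two classes.
rng = np.random.default_rng(0)
y = np.array([0.0] * 10 + [1.0] * 10)
X = rng.normal(size=(20, 5))
X[:, 0] += 3.0 * y                               # metabolite 0 differs between classes

scores, loadings = pca_scores(X, n_components=2)
w1, t1 = pls_first_score(X, y)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a PCA scores plot; `t1` is the first axis of a PLS-DA style scores plot, and by construction its mean is higher in the class coded 1 than in the class coded 0.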
The scores provide a summary of X that both approximates X and predicts Y.
[Figure: PLS-DA scores plot of onset ALL and AML patients]
Other statistical approaches, such as the t test and ANOVA, are alternatives at this step.

VIP (variable importance in the projection) values
The influence of every term in the matrix X on all the Ys. VIP is normalized so that Sum(VIP^2) = K (the number of terms in the matrix X). Terms with VIP > 1 have an above-average influence on Y.

Deviation of each variable from ALL (standard deviations from average)
Potential biomarker identification: standard Student's t test or ANOVA

Blind prediction test of the PLS-DA model
[Figure axis: Y-Predicted]

Three major steps of metabolomics analysis
Spratlin J.L., et al. Clin Cancer Res, 2009 January 15, 15(2):431-440

Clinical applications of metabolomics in oncology
1. Search for early diagnostic biomarkers
Breast cancer: tCho, glycerophosphocholine, glucose
2. Assessment of response to drugs and other therapies
Both as a predictive measure of efficacy and as a pharmacodynamic marker
Tiziani S, Lodi A, Khanim FL, Viant MR, Bunce CM, et al. PLoS ONE, 2009, 4(1):e4251
Bathen TF, et al. Breast Cancer Res Treat, 2007;104:181-189.

Some knowledge about prostate cancer
1. Prostate cancer is the most frequently diagnosed cancer in men.
2. Current diagnostic methods: a combination of digital rectal examination and measurement of the level of the enzyme PSA in blood serum.
3. Limitation of current diagnosis: the features of this kind of cancer are notoriously variable among patients.

Metabolomic profiling of prostate cancer
Sreekumar A., et al. Nature, 2009 February 12, 457(7231):910-914
a, Venn diagram of the total metabolites detected across 42 prostate-related tissues and 110 matched plasma and urine samples.
b, Venn diagram of 626 metabolites in tissues measured across 16 benign adjacent prostate tissues, 12 clinically localized prostate cancers (PCA) and 14 metastatic prostate cancers (Mets).

Metabolomic profiling of prostate cancer
Sreekumar A., et al. Nature, 2009 February 12, 457(7231):910-914

Hierarchical cluster analysis of prostate tissue samples
Sreekumar A., et al. Nature, 2009 February 12, 457(7231):910-914
Legend: blue circles, benign adjacent prostate; yellow squares, localized prostate cancer; red triangles, metastatic prostate cancer

Principal components analysis of prostate tissue samples
Sreekumar A., et al. Nature, 2009 February 12, 457(7231):910-914
[Figure panel: blue, benign; yellow, localized. Two-tailed Wilcoxon rank sum test]
[Figure panel: yellow, localized; red, metastatic]

A role for sarcosine in prostate cancer cell invasion and androgen signaling