Uyghur Speech Emotion Recognition Based on Uncertain Linear Discriminant Analysis - Tashpolat Nizamidin.pdf
Journal of Southeast University (English Edition), Vol. 33, No. 4, pp. 437-443, Dec. 2017, ISSN 1003-7985

Emotion recognition of Uyghur speech using uncertain linear discriminant analysis

Tashpolat Nizamidin 1,2, Zhao Li 1, Zhang Mingyang 1, Xu Xinzhou 1, Askar Hamdulla 2
(1 Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China)
(2 School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China)

Abstract: To achieve efficient and compact low-dimensional features for speech emotion recognition, a novel feature reduction method using uncertain linear discriminant analysis is proposed. Using the same principles as for conventional linear discriminant analysis (LDA), uncertainties of the noisy or distorted input data are employed in order to estimate maximally discriminant directions. The effectiveness of the proposed uncertain LDA (ULDA) is demonstrated in the Uyghur speech emotion recognition task. The emotional features of Uyghur speech, especially the fundamental frequency and formant, are analyzed in the collected emotional data. Then, ULDA is employed in dimensionality reduction of emotional features and better performance is achieved compared with other dimensionality reduction techniques. The speech emotion recognition of Uyghur is implemented by feeding the low-dimensional data to support vector machine (SVM) based on the proposed ULDA. The experimental results show that when employing an appropriate uncertainty estimation algorithm, uncertain LDA outperforms the conventional LDA counterpart on Uyghur speech emotion recognition.

Key words: Uyghur language; speech emotion corpus; pitch; formant; uncertain linear discriminant analysis (ULDA)
DOI: 10.3969/j.issn.1003-7985.2017.04.008

Speech is one of the most effective ways of human-computer interaction in the era of artificial intelligence. Therefore, in the human-computer interaction system, in order to make the machine understand human emotion, the identification of the emotional state in the speech signal becomes increasingly important. Speech emotion recognition (SER) involves several different fields, including speech signal
processing, pattern recognition, machine learning, psychology, and so on. For SER, it is generally regarded as the default method for capturing paralinguistic information to generate a single high-dimensional representation of an utterance from a set of underlying low-level acoustic descriptors. Previous investigations consistently demonstrate the usefulness of this technique when applied to a range of different SER problems.

Received 2017-05-17, Revised 2017-08-30.
Biographies: Tashpolat Nizamidin (1988-), male, graduate; Zhao Li (corresponding author), male, doctor, professor, zhaoliselledu cn.
Foundation item: The National Natural Science Foundation of China (No. 61673108, 61231002).
Citation: Tashpolat Nizamidin, Zhao Li, Zhang Mingyang, et al. Emotion recognition of Uyghur speech using uncertain linear discriminant analysis[J]. Journal of Southeast University (English Edition), 2017, 33(4): 437-443. DOI: 10.3969/j.issn.1003-7985.2017.04.008.

Dimensionality reduction is frequently used in the preprocessing stage to make the input data more suitable for modeling. Linear discriminant analysis (LDA)[2] is one of the simplest and most popular transforms to enhance class separability for multidimensional observations. Conventional LDA assumes that each class follows a normal distribution and classes share the same covariance structure. Although these assumptions do not generally hold in practice, the conventional approach and its variants have been found useful in many applications including automatic speech and speaker recognition. When the dimensionality of the data becomes comparable with the number of samples per class, the sample covariance estimation becomes unstable. Regularization and Bayesian estimation of covariance models have been discussed in existing literature to overcome this issue. It is also possible to obtain a nonlinear class separation using subclass discriminant analysis and the kernel trick in LDA. When each class is composed of several partitions, subclass discriminant analysis aims to maximize the distance between the class means and the subclass means in the same class at the same time. Compared to the principal component analysis (PCA)
[3], class-dependent dimensionality reduction is expected to be more effective in modeling classes. The extension of LDA includes heteroscedastic LDA, quadratic discriminant analysis, and mixture discriminant analysis. A distance preserving dimensionality reduction transform maps the D-dimensional data samples to a d-dimensional space (d < D).

In this paper, we address the task of finding linear discriminant directions, using a probabilistic description instead of a point-estimation for an observation. We achieve such a probabilistic description by using so-called observation uncertainties. In this approach, the feature extraction process outputs the point-estimation of a feature vector along with an uncertainty. The point-estimation is assumed to form a Gaussian mean, while the corresponding variance is set as the estimated uncertainty. Throughout
this paper, we note this process as an uncertain observation. Accordingly, we utilize uncertain LDA (ULDA)[4] to account for the observation uncertainties in estimating scatter matrices for LDA.

Based on the Uyghur speech emotion database, we set up a benchmark for Uyghur speech emotion recognition, which involves a set of SER tasks in various training-test conditions. Additionally, the dimensionality reduction technique ULDA (uncertain linear discriminant analysis) is applied. We provide the complete data description, system architecture, experimental setup and evaluation performance. These can be used as a full reference for Uyghur speech emotion recognition research.

1 Algorithm

Dimensionality reduction techniques are widely applied in speech emotion recognition research, such as PCA, LDA, locality preserving projections (LPP)[5], local discriminant embedding (LDE)[6], graph-based Fisher analysis (GbFA)[7] and so on. It is important to note that these methods do not solve recognition and hypothesis testing problems directly, and they are used as a pre-processing stage to reduce dimensionality. A conventional SER task requires a large number of features; hence, we should use an efficient dimensionality reduction technique to deal
with this high-dimensional case. In this paper, we applied uncertain linear discriminant analysis (ULDA) to dimensionality reduction.

1.1 Conventional LDA

Let $X = \{x_1, x_2, \ldots, x_L\}$ be a set of $L$ samples (features), each sample belonging to one of $K$ classes and partitioning the data into clusters $C_1, C_2, \ldots, C_K$. The conventional LDA aims at finding a linear transformation of those features that can maximize the separability of the clusters. Each class is assumed to be Gaussian distributed and has the same Gaussian covariance structure. In order to find the discriminant directions, we first calculate the sample mean $\mu$ and class mean $\mu_k$ as

$$\mu = \frac{1}{L} \sum_{i=1}^{L} x_i \tag{1}$$

$$\mu_k = \frac{1}{|C_k|} \sum_{i \in C_k} x_i \tag{2}$$

where $|C_k|$ is the cardinality of class $k$. Next, the within-class $S_W$ and between-class scatters $S_B$ are given as

$$S_W = \sum_{k=1}^{K} \sum_{i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^T \tag{3}$$

$$S_B = \sum_{k=1}^{K} (\mu_k - \mu)(\mu_k - \mu)^T \tag{4}$$

The optimization problem, the Fisher-Rao criterion, is then solved by maximizing

$$\{\hat{w}_1, \hat{w}_2, \ldots, \hat{w}_{K-1}\} = \arg\max_{w} \frac{w^T S_B w}{w^T S_W w} \tag{5}$$

where $\hat{w}_i$ ($i = 1, 2, \ldots, K-1$) is the i-th eigen
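
The conventional LDA steps described above (sample mean, class means, within-class and between-class scatters, and the Fisher-Rao criterion) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `lda_directions`, the synthetic data, the unweighted between-class scatter, and the use of a pseudo-inverse followed by an eigendecomposition are all assumptions made for the example. It covers conventional LDA only and does not model observation uncertainties.

```python
import numpy as np


def lda_directions(X, y):
    """Return up to K-1 discriminant directions as columns (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]

    mu = X.mean(axis=0)                       # overall sample mean
    S_W = np.zeros((n_features, n_features))  # within-class scatter
    S_B = np.zeros((n_features, n_features))  # between-class scatter

    for k in classes:
        X_k = X[y == k]
        mu_k = X_k.mean(axis=0)               # class mean
        centered = X_k - mu_k
        S_W += centered.T @ centered
        diff = (mu_k - mu)[:, None]
        # Assumption: unweighted sum over classes; many formulations
        # weight each term by the class size |C_k|.
        S_B += diff @ diff.T

    # Maximizers of the Fisher-Rao ratio are the leading eigenvectors of
    # S_W^{-1} S_B; the pseudo-inverse is an assumption that keeps the
    # sketch usable when S_W is close to singular.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs[:, order].real


# Synthetic demo data (not from the paper): 3 classes in 4 dimensions,
# separated only along the first two axes.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0, 0.0, 0.0],
                  [3.0, 0.0, 0.0, 0.0],
                  [0.0, 3.0, 0.0, 0.0]])
X_demo = np.vstack([rng.normal(m, 1.0, size=(30, 4)) for m in means])
y_demo = np.repeat(np.arange(3), 30)

W = lda_directions(X_demo, y_demo)  # shape (4, 2): K - 1 = 2 directions
Z = X_demo @ W                      # low-dimensional features, shape (90, 2)
```

The projected features `Z` are what would then be passed to a classifier such as an SVM. In practice a generalized symmetric eigensolver or a regularized `S_W` may be preferable to the pseudo-inverse used here.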