English Acoustics and Phonetics (English edition).ppt


Speech acoustics and phonetics
Louis C.W. Pols
Institute of Phonetic Sciences (IFA)
Amsterdam Center for Language and Communication (ACLC)
NATO-ASI “Dynamics of Speech Production and Perception”
Il Ciocco, Tuscany, Italy, July 1, 2002

Overview
- Dynamics in speech acoustics
- Contour modeling (mainly formants)
- Aspects of spectral undershoot
- Modeling V and C reduction
- Phonetic knowledge from speech corpora
  - IFA, CGN, TIMIT, found speech
- Conclusions

July 1st, 2002 | Speech acoustics and phonetics, Il Ciocco

Dynamics in speech acoustics
- Dynamics is the norm, not stationarity
  - articulatory efficiency
- Dynamics is everywhere
  - generally no word boundaries in speech
  - deletion of words, syllables, phonemes; insertion
  - within/between word coarticulation/assimilation
  - vowel and consonant reduction
- Acoustic manifestations
  - segment duration, F0, loudness, spectral quality

Dynamics is the norm
- The speaker speaks as sloppily as the listeners allow him to do in communication
  - communicative efficiency
- Articulatory vs. perceptual efficiency
  - do spectral transitions facilitate or hamper perception? (see other presentation)
- Speaker flexibility; speaking style (clear vs. sloppy); speaking rate

Dynamics is everywhere
- Deletion
  - bread and butter /brEmbY3/
  - Amsterdam (Du) /AmstrdAm/ /AmsdAm/
  - koninklijke (Du) /konIklk/ /kolk/
- Insertion
  - homorganic glide insertion: die een (Du) /dijn/
- Degemination
  - is zichtbaar (Du) /Is zIxtbar/ /IsIxbar/
- Reduction, coarticulation, assimilation

Acoustic manifestations
- pitch, loudness, formant, component contours
- contour stylization (e.g., pitch in Praat)
- contour modeling
  - n-th degree curve fitting (D. van Bergem)
  - Legendre polynomials; 16 points per segment (R. van Son)
- (phoneme) segmentation
  - by hand (time consuming; non-consistent)
  - automatically (via forced phoneme recognition and a pronunciation lexicon with alternatives; systematic errors)

Contour modeling
- allows modeling of specific phenomena
  - pitch accentuation (vs. vowel onset)
  - reduction, centralization, undershoot
- allows generation of stimuli for perception experiments
  - phoneme identification in extending context
  - 2-alternatives forced choice identification of continua
  - discrimination, RT
- allows statistics on large speech corpora
  - TIMIT, CGN, IFA-corpus, Switchboard

Static vs. dynamic V recognition
- see Weenink (2001)
  - “Vowel normalizations with the TIMIT acoustic phonetic speech corpus”, IFA Proc. 24, 117-123
- 438 males, both train & test sent. of TIMIT
- 35,385 vowel segments, hand segmented
- 13 monophthongal vowel categories
- 1-Bark bandfilter analysis (18), intensity normalized
- 3 frames per segment: central and 25 ms L/R

Some results
- Vowel classification (%) with discriminant functions

Condition               #Items                  Static, 1 frame   Dynamic, 3 frames
Original                35,385 (438x13x(125))   59.3              66.9
  speaker normalized    35,385                  62.2              69.2
V centers per speaker   5,374 (438x13)          78.9              90.1
  speaker normalized    5,374                   87.9              94.5

Formant tracks / speaking rate
- Ph.D. thesis Rob van Son (1993)
  - “Spectro-temporal features of vowel segments”
  - see also Speech Comm. 13, 135-148 (Pols & vSon)
- 850-words text, read at normal and fast rate
- hand segmentation of 7 most freq. V + schwa
- formant tracks
  - via 16 points per segm. or 5 Legendre polynomials
- influence of rate, V-dur., context, sent. acc.
- evidence for duration-controlled undershoot?

Some results
- no differences for F1/F2 in vowel center for normal- or fast-rate speech; only some over-all rise in F1 for fast rate (irrespective of V)
- same formant track shape (normalized to 16 points) for normal- or fast-rate speech
- same results when using the more elaborate Legendre polynomials
- Concl.: changes in V-duration do not change the amount of undershoot → active control of articulation speed

Formant representations
[Figure: Legendre polynomial coefficients; zeroth order (mean Fi in vowel segment) and second order polynomials (axes reversed)]
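The Legendre representation of formant tracks mentioned on these slides can be illustrated with a small sketch. It is not the procedure from the cited thesis: it assumes a track resampled to 16 equally spaced points, a least-squares fit, and only orders 0 to 2 (the slides mention 5 polynomials). The function names and the synthetic F2 values are made up for illustration.

```python
def legendre_basis(x, order):
    """Return [P0(x), ..., P_order(x)] using Bonnet's recursion."""
    values = [1.0, x]
    for n in range(1, order):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * c[k] for k in range(r + 1, n))
        c[r] = (m[r][n] - s) / m[r][r]
    return c


def fit_legendre(track, order=2):
    """Least-squares Legendre coefficients for an equally spaced track."""
    n_pts = len(track)
    # Map the sample positions onto [-1, 1], the domain of the polynomials.
    xs = [-1.0 + 2.0 * i / (n_pts - 1) for i in range(n_pts)]
    basis = [legendre_basis(x, order) for x in xs]
    k = order + 1
    gram = [[sum(b[i] * b[j] for b in basis) for j in range(k)] for i in range(k)]
    rhs = [sum(b[i] * y for b, y in zip(basis, track)) for i in range(k)]
    return solve_linear(gram, rhs)


# Synthetic 16-point F2 track (Hz): level 1500, slope 100, curvature -200.
xs = [-1.0 + 2.0 * i / 15 for i in range(16)]
track = [1500.0 + 100.0 * x - 200.0 * (3 * x * x - 1) / 2 for x in xs]
coeffs = fit_legendre(track, order=2)
print([round(c, 3) for c in coeffs])
```

In this reading the zeroth-order coefficient describes the overall level of the track, the first-order coefficient its tilt, and the second-order coefficient its curvature, which is the quantity of interest when looking for undershoot.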

Modeling vowel reduction
- Ph.D. thesis Dick van Bergem (1995)
  - “Acoustic and lexical vowel reduction”
  - see also Speech Communication 16, 329-358
- lexical V reduction: Fr /bet/ vs. Du /btOn/
- acoustic V reduction: /banan, bAnan, bnan/
  - f(sent. acc., w. str., w. class): can - candy - canteen
- coarticulatory effects on the schwa
  - C1C2V- and VC1C2-type nonsense words
- perceptual effects (full V or schwa, e.g. ananas)

Some results
The schwa is not just a centralized vowel but something that is completely assimilated with its phonemic context.
[Figure; surviving labels: t-n, w-l]

Modeling consonant reduction
- Sp. Comm. (1999) 28, 125-140 (vSon & Pols)
- 20 min. speech, both spontaneous and read
- 2 x 791 similar VCV; hand segmented
- 5 aspects of V and C reduction
  - related to coarticulation: F2 slope differences at CV- vs. VC-boundaries; F2 locus equations (F2 onset vs. F2 target)
  - related to speaking effort: duration; spectral COG (mean freq.); V-C sound energy differences

Some results
- V markedly reduced in spontaneous speech
- lower F2-slope diff. in spontaneous speech → decrease in articulation speed
- no systematic effect on F2 locus equation; V onsets and targets change in concert → any V reduction mirrored by comparable change in C
- spont. sp.: V and C shorter; lower COG → decrease in vocal and articulatory effort

Access to large corpora
- more, and more realistic, data
- phonetic knowledge via statistical analyses
- e.g. highly accessible IFA-corpus (free, SQL)
  - see “Structure and access of the open source IFA-corpus”, IFA Proc. 24, 15-26 (vSon & Pols)
  - on-line http://www.fon.hum.uva.nl/IFAcorpus/
  - 4 M / 4 F speakers, 5.5 hrs of speech
  - from informal to read + sent., words, syllables
  - 50 Kwords segm. and labeled at phoneme level

Some results
- speech + annot. + meta data: relational DB
- realization of final n, e.g. Du geven /xev(n)/

Style                  #wrds    /n/   (unlabeled)    All    % /n/
Informal               5,250      1       304         305    0.3
Retelling              6,229     13       236         249    5.2
Narr. story (read)    14,453    180       372         552   33     (LF 42, HF 30)
Sentences (read)      14,970    203       340         543   37
Pseudo-sent. (read)    2,554     62        19          81   77
All                   43,456    459     1,271       1,730   36

Spoken Dutch Corpus (CGN)
- 10 M words, 1,000 hrs of speech
- variety of styles, incl. telephone speech
- adult Dutch and Flemish speakers
- for linguistic and technological research
- see various LREC and ICSLP papers (2002)
- see also http://lands.let.kun.nl/cgn/home.htm
- fully transcribed: orthogr., POS, lemmas
- partly transcr.: phonemic, prosodic, syntactic

TIMIT
- popular DB in acoustic phonetics and ASR
- also telephone version (NTIMIT)
- hand segmented & labeled at phoneme level
- 438 males, 192 females (8 dialect regions)
- 10 sent./sp. (2 fixed, 1 pact, 7 diverse)
  - sa1: “She had your dark suit in greasy wash water all year”
- includes separate test data (112 M, 56 F)
- e.g. Ph.D. thesis X. Wang (1997), “Incorporating knowledge on segmental duration in HMM-based continuous speech recognition”

Useful info: durational variability
- Adopted from Wang (1998)
[Figure; surviving labels: normal rate = 95; primary stress = 104; word final = 136; utterance final = 186; overall average = 95 ms]
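Durational figures like those above come from hand-labeled phone segments. As a rough sketch of the bookkeeping involved (not Wang's method, and without any of the rate or stress conditions), the following reads TIMIT-style `.phn` lines, where each line gives a begin sample, an end sample and a phone label at 16 kHz, and averages the durations per label. The example segments are invented, and the helper names are hypothetical.

```python
from collections import defaultdict

SAMPLE_RATE_HZ = 16000  # TIMIT waveforms are sampled at 16 kHz


def parse_phn(text):
    """Turn 'begin end label' lines into (label, duration in ms) pairs."""
    segments = []
    for line in text.strip().splitlines():
        start, end, label = line.split()
        duration_ms = (int(end) - int(start)) * 1000.0 / SAMPLE_RATE_HZ
        segments.append((label, duration_ms))
    return segments


def mean_durations(segments, skip=("h#",)):
    """Mean duration per phone label, ignoring the labels listed in skip."""
    by_label = defaultdict(list)
    for label, duration_ms in segments:
        if label not in skip:
            by_label[label].append(duration_ms)
    return {label: sum(d) / len(d) for label, d in by_label.items()}


# Invented segment boundaries, for illustration only.
example = """
0 2400 h#
2400 4000 sh
4000 5600 iy
5600 7200 sh
7200 9600 iy
9600 12000 h#
"""

segments = parse_phn(example)
means = mean_durations(segments)
speech = [d for label, d in segments if label != "h#"]
overall = sum(speech) / len(speech)
print(means, overall)
```

Breaking the averages down further (by stress, position in the word or utterance, or speaking rate) would need those annotations alongside the segment labels, which this sketch does not model.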

[Figure; axis labels: normalized phone duration, speaking rate; all 3,696 training sent. (sx+si) of TIMIT training set]

found speech
- DARPA-LVSR community rather ambitious
- Broadcast News (BN), Sp. Comm. 37 (2002)

Year   Data                                Best %WER on test set
95     WSJ NAB read sp.
1995   Market place                        27.0%
1996   F0-F5, FX partitioned               27.1% (1:46 hrs)
1997   3 hrs test unpartit.                16.2% (3 hrs)
1998   + non Engl. speech, also 900 M      13.5 / 16.1% (3 hrs; 10 xRT)

- For Proc. DARPA Workshops, see http://www.nist.gov/speech/proc/darpa99/index.htm

Articul.-acoustic features in ASR
- “A Dutch treatment of an elitist approach to articulatory-acoustic feature classification”, Proc. Eurospeech-2001, 1729-1732 (M. Wester et al.)
- “Integrating articulatory features into acoustic models for speech recognition”, Phonus 5, 73-86 (K. Kirchhoff, 2000)
- “An overlapping-feature-based phonological model incorporating linguistic constraints: Applications to speech recognition”, JASA 111(2), 1086-1101 (J. Sun & L. Deng, 2002)

Conclusions
- examples of dynamics in speech acoustics
- going from formal to informal speech:
  - less dynamics, more reduction (artic. guided)
- undershoot vs. speaking style
  - sloppiness or articulatory limits?
- functionality of dynamics? (other paper)
- systematicity of dynamics?
  - easing ASR, rules for TTS, acquiring knowledge?
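The %WER figures quoted for Broadcast News are word error rates. A minimal sketch of the usual definition, (substitutions + deletions + insertions) divided by the number of reference words, is given below. It uses a plain word-level edit distance and is not the scoring tool used in the DARPA evaluations; the function name and the hypothesis sentence are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


reference = "she had your dark suit in greasy wash water all year"
hypothesis = "she had her dark suit in greasy wash water year"  # invented output
wer = word_error_rate(reference, hypothesis)
print(f"{100 * wer:.1f}% WER")
```

Here one substitution and one deletion against 11 reference words give 2/11. Reduction, deletion and assimilation in informal speech are the kind of dynamics that can raise this number.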
