corpus-introduction--section-1--语料库.ppt
CL timetable
  27/03 (Wed)  18:30-21:30  E6-224
  28/03 (Thu)  13:15-16:40  E6-219
  29/03 (Fri)  14:05-17:30  E6-219
  03/04 (Wed)  18:30-21:30  E6-224
  07/04 (Fri)  14:05-17:30  E6-219
  10/04 (Wed)  18:30-21:30  E6-224
  11/04 (Thu)  13:15-16:40  E6-219
  12/04 (Fri)  14:05-17:30  E6-219

Introducing Corpus Linguistics
Corpus Linguistics
Richard X

Module description
- Since the 1990s, the corpus methodology has revolutionized nearly all branches of linguistics
  - Corpus analysis can be illuminating in “virtually all branches of linguistics or language learning.” (Leech 1997)
- One of the strengths of corpus data lies in its empirical and attested nature
  - It pools together the intuitions of a great number of speakers
  - It makes linguistic analysis more objective
- This module
  - introduces the theoretical and practical issues of using corpora in linguistic studies
  - explores how the corpus-based approach and other methodologies can be combined in linguistic studies

Aims of the module
The module aims to
- provide an introduction to corpus linguistics;
- familiarise students with major corpus resources and tools;
- pass on essential knowledge and skills for building DIY corpora;
- keep students up to date with the latest developments in corpus research;
- develop students' ability in corpus-based language studies.

Contents
1) Introducing corpus linguistics
2) Corpus design and types of corpora
3) Data capture and markup
4) Corpus annotation
5) Making statistical claims
6) Corpus analysis (1): concordance and wordlist
7) Corpus analysis (2): keyword analysis
8) Corpora in lexicographic and lexical studies
9) Corpora in grammatical studies
10) Corpora in diachronic studies
11) Corpora in language variation research
12) Corpora in sociolinguistic studies
13) Corpora in language education
14) Corpora in literary and stylistic studies
15) Corpora in critical discourse analysis
16) Corpora in contrastive and translation studies

Learning outcomes
On successful completion of the module, students will be able to
- understand the major theoretical frameworks in corpus linguistics and formulate research questions that are amenable to corpus research;
- think critically about the strengths and weaknesses of the corpus methodology and decide when and how to interface it with other methodologies;
- become familiar with major corpus resources and tools and develop DIY corpora when necessary;
- apply the corpus-based approach in their own research.

Teaching/learning strategies
- With a dual focus on the "why" and the "how to" of corpus-based language studies, this practical module will be delivered through a series of lectures and hands-on lab sessions
- The module also engages students in extensive reading and interaction with corpus data outside of class

Assessment
- Option A
  - A 1,000-word essay that critically
    reviews a corpus exploration tool or a corpus-based study (40%)
  - A 2,500-word project report (60%)
- Option B
  - One 3,500-word essay based on a research project of your own choice (100%)
- Deadline: Friday 31 May 2013
- Submission
  - A Word copy as email attachment

Reading list
- Set text
  - McEnery, A., Xiao, R. and Tono, Y. (2006) Corpus-Based Language Studies: An Advanced Resource Book. London & New York: Routledge.
  - Wynne, M. (2005) Developing Linguistic Corpora. Oxford: Oxbow Books. Available online at http://www.ahds.ac.uk/creating/guides/linguistic-corpora
- Recommended reading
  - See the module syllabus at the course website www.lancs.ac.uk/fass/projects/corpus/ZJU/CL_syllabus.htm (password for unzipping ebooks: lancs)

Outline of this session
- Lecture: introducing key concepts and debates in corpus linguistics
  - What is and is not a corpus?
  - Why use corpora?
  - Corpora vs. intuitions
  - The corpus methodology
  - A brief history of Corpus Linguistics
  - Nature and applications of corpus-based studies
- Lab: testing your intuitions + exploring online resources

What is a corpus?
- The word corpus comes from Latin (“body”) and the plural is corpora
- A corpus is a body of naturally occurring language
  - but rarely a random collection of text
  - Corpora “are generally assembled with particular purposes in mind, and are often assembled to be (informally speaking) representative of some language or text type.” (Leech 1992)
- “A corpus is a collection of (1) machine-readable (2) authentic texts (including transcripts of spoken data) which is (3) sampled to be (4) representative of a particular language or language variety.” (MXT 2006: 5)

What is not a corpus?
- A list of words is not a corpus
  - Building blocks of language
- A text archive is not a corpus
  - A random collection of texts
- A collection of citations is not a corpus
  - A short quotation which contains a word or phrase that is the reason for its selection
- A collection of quotations is not a corpus
  - A short selection from a text chosen on internal criteria by human beings
- A text is not a corpus
  - Intended to be read in different ways
- The Web is not a corpus
  - Its dimensions unknown, constantly changing, not designed from a linguistic perspective
Sinclair (2005)

What is a corpus for?
- A corpus is made for the study of language in a broad sense
  - To test existing linguistic theory and hypotheses
  - To generate and verify new linguistic hypotheses
  - Beyond linguistics, to provide textual evidence in text-based humanities and social sciences subjects
- The purpose is reflected in a well-designed corpus

Why use corpora?
- Even expert speakers have only a partial knowledge of a language
  - A corpus can be more comprehensive and balanced
- Even expert speakers tend to notice the unusual and think of what is possible
  - A corpus can show us what is common and typical
- Even expert speakers cannot quantify their knowledge of language
  - A corpus can readily give us accurate statistics

Why use corpora?
- Even expert speakers cannot remember everything they know
  - A corpus can store and recall all the information
    that has been stored in it
- Even expert speakers cannot make up natural examples
  - A corpus can provide us with a vast number of examples in real communicative contexts
- Even expert speakers have prejudices and preferences, and every language has cultural connotations and underlying ideology
  - A corpus can give you more objective evidence

Why use corpora?
- Even expert speakers are not always available to be consulted
  - A corpus can be made permanently accessible to all
- Even expert speakers cannot keep up with language change
  - A constantly updated corpus can reflect even recent changes in the language
- Even expert speakers lack authority: they can be challenged by other expert speakers
  - A corpus can encompass the actual language use of many expert speakers

Intuitions as an alternative
- Intuitions are always useful in linguistics
  - To invent (grammatical, ungrammatical, or questionable) example sentences for linguistic analysis
  - To make judgments about the acceptability / grammaticality or meaning of an expression
  - To help with categorization

Intuitions as an alternative
- Intuitions should be applied with caution
  - Possibly biased as they are likely to be influenced by one's dialect or sociolect
  - Introspective data is artificial and may not represent typical language use as one is consciously monitoring one's language production
  - Introspective data is decontextualized because it exists in the analyst's mind rather than in any real linguistic context
  - Intuitions are not observable and verifiable by everyone as corpora are
  - Excessive reliance on intuitions blinds the analyst to the realities of language usage because we tend to notice the unusual but overlook the commonplace
  - There are areas in linguistics where intuitions cannot be used reliably, e.g. language variation, historical linguistics, register
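
The slides argue that intuition notices the unusual, while a corpus shows what is common and typical and can readily give accurate statistics. The following is a minimal sketch of that idea, not part of the module materials. It assumes a tiny invented sample (SAMPLE_TEXTS) and hypothetical helper names (tokenize, word_list, concordance). Real corpus tools handle tokenization and sampling far more carefully.

```python
import re
from collections import Counter

# Hypothetical three-sentence sample, invented for illustration only.
# A real corpus would be far larger and sampled to be representative.
SAMPLE_TEXTS = [
    "The corpus shows what is common in the language.",
    "Speakers notice the unusual but overlook the commonplace.",
    "A corpus can give us statistics about the language.",
]


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_list(texts):
    """Count how often each word form occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, node, width=3):
    """Return keyword-in-context lines with up to `width` words on each side of `node`."""
    lines = []
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == node:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                lines.append(f"{left} [{token}] {right}".strip())
    return lines


if __name__ == "__main__":
    # Frequency list: the most frequent word forms in the sample.
    for word, freq in word_list(SAMPLE_TEXTS).most_common(3):
        print(word, freq)
    # Concordance: every occurrence of "corpus" shown in its context.
    for line in concordance(SAMPLE_TEXTS, "corpus"):
        print(line)
```

Even in this toy sample, the most frequent item is the function word "the", which is the kind of commonplace pattern that introspection tends to overlook.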