Verb-particle constructions in a computational grammar of English

Aline Villavicencio
University of Cambridge Computer Laboratory,
William Gates Building, JJ Thomson Avenue,
Cambridge, CB3 0FD, UK
Aline.Villavicencio@cl.cam.ac.uk

Ann Copestake
University of Cambridge Computer Laboratory,
William Gates Building, JJ Thomson Avenue,
Cambridge, CB3 0FD, UK
Ann.Copestake@cl.cam.ac.uk

Abstract

In this paper we investigate the phenomenon of verb-particle constructions, discussing their characteristics and the challenges that they present for a computational grammar. We concentrate our discussion on the treatment adopted in the LinGO ERG. We also analyse how different (conventional and electronic) dictionaries capture them, and the inherent limitations in terms of coverage. Given the constantly growing number of verb-particle combinations, possible ways of dealing with these limitations are investigated, taking into account the regular patterns found in some productive combinations of verbs and particles. One possible way to try to capture these is by means of lexical rules, and we discuss the difficulties encountered when adopting such an approach. We also investigate possible ways of restricting the productivity of lexical rules to deal with subregularities and exceptions to the patterns found.

1 Introduction

In this paper we investigate verb-particle constructions in English and discuss some of the challenges that they pose for a broad-coverage computational grammar. By verb-particle constructions, we mean both idiosyncratic or semi-idiosyncratic combinations, such as make up, where the meaning of the combination cannot be straightforwardly inferred from the meaning of the verb and the particle, and also more regular combinations, such as wander up. Verb-particle constructions are often highly polysemous: eight senses are listed for make up in the Collins Cobuild Dictionary of Phrasal Verbs, for instance.
They also show syntactic variation: some particles have a fixed position in relation to the verb, such as come up in She came up with the idea, where the particle is expected immediately after the verb, hence the ungrammaticality of *She came with the idea up. Others have a more flexible order in relation to the verb, and can equally well occur immediately after the verb or after another complement.

In terms of usage, verb-particle constructions tend to be thought of as informal: they are sometimes said to be inappropriate in formal writing, and conversely slang is a rich source. Presumably because of this, dialect variation in the use of verb-particle constructions is quite marked: the examples and judgements in this paper are British English, except where otherwise stated.

This paper is organised as follows: in section 2 we analyse the treatment of verb-particle constructions adopted in the LinGO ERG. In section 3 we discuss possible ways of extending this treatment, through the use of lexical rules. In section 4 we analyse how different dictionaries capture them and the coverage they provide. In section 5 we investigate ways of identifying more regular patterns among verb-particle combinations, and in section 6 we discuss the problem of semi-productivity and how the application of these rules needs to be restricted. We finish with some conclusions and future work.

2 Verb-particle constructions in a computational grammar of English

The grammar we will take as our starting point is the LinGO English Resource Grammar (ERG).¹ The LinGO ERG treats verb-particle constructions by means of verb entries which subcategorize for particles.
There is a wide range of constructions captured in the grammar, and these vary, for instance, in terms of the subcategorisation frame of the verb-particle combination, the position of the particle and the semantics of the particle. A lexical rule, NP_particle_lr, changes the order of the complements to deal with the NP-particle alternation: its application is controlled by the lexical type of the verb. The selection for the specific particle is via the particle's semantic relation. Particles and prepositions share a lexical entry with an underspecified relation (e.g., on_rel), but in the structure for an utterance, the semantic relation for a particle is specialized differently from the independent preposition because of the selection (e.g., to on_rel_s as opposed to on_rel_p).² For instance, the entry for wander up is as follows:

    wander_up_v1 := v_particle_le &
      [ STEM < "wander" >,
        SYNSEM.LOCAL.KEYS [ KEY _wander_up_rel,
                            --COMPKEY _up_rel_s ] ].

where the semantics of up is specialized to the semantically vacuous up_rel_s. The scoped logical form for the dog wandered up is as follows (ignoring some complications irrelevant for current purposes, such as optional arguments, and an extra event argument for prepositions):

    prpstn(def(x4, dog(x4), wander_up(e2,x4) ∧ up_s(e15,v14)))

Note that there is no coindexation between the arguments of up_s and wander_up. The idea is that selected-for relations, such as up_s, are semantically vacuous and can therefore be ignored in the logical form (LF). Contrast this with the logical form for the sentence The dog wandered along the street:

    prpstn(def(x4, dog(x4), def(x12, street(x12), wander(e2,x4) ∧ along_p(e2,x12))))

An earlier approach in the ERG followed Nerbonne (1995) in actually removing the semantic contribution of the selected-for particle within the process of composition. However, there is now a strong monotonicity assumption underlying semantic composition in the ERG which makes that analysis impossible.
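The difference between ignoring a selected-for relation in the finished LF and removing it during composition can be illustrated with a minimal Python sketch. It is not the ERG or LKB implementation: the `EP` class, the `compose` and `contentful` functions, and the convention that selected-for relations are recognised by an `_s` suffix are all assumptions made for illustration. Composition here only ever accumulates elementary predications, in the spirit of the monotonicity assumption, and the vacuous relation is skipped afterwards by whatever component consumes the LF.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EP:
    """One elementary predication: a relation name plus its arguments."""
    relation: str
    args: Tuple[str, ...]


def compose(*parts: List[EP]) -> List[EP]:
    """Monotonic composition: the result keeps every EP of every part."""
    combined: List[EP] = []
    for part in parts:
        combined.extend(part)
    return combined


def is_selected_for(ep: EP) -> bool:
    # Assumption: selected-for relations carry an "_s" suffix, as up_s does.
    return ep.relation.endswith("_s")


def contentful(lf: List[EP]) -> List[EP]:
    """View of an LF in which vacuous selected-for relations are ignored."""
    return [ep for ep in lf if not is_selected_for(ep)]


# Toy encoding of "the dog wandered up" (quantifier scope is left out).
dog = [EP("def", ("x4",)), EP("dog", ("x4",))]
wander_up = [EP("wander_up", ("e2", "x4"))]
particle = [EP("up_s", ("e15", "v14"))]

lf = compose(dog, wander_up, particle)
```

In this sketch `lf` still contains the `up_s` predication, so nothing has been deleted during composition, while `contentful(lf)` gives the view that a downstream component would work with.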
An analysis analogous to that of Wechsler (1997), in which the semantic structures for the verb and particle are merged, is tempting, but this is also unavailable in the ERG because there is an assumption that the lexical entries contribute individual elementary predications.

There are two main practical problems with the ERG's analysis. The first is that verb-particle entries are never treated as productively formed, which leads to omissions: for instance, while walk is in the lexicon, walk up is not. This is discussed further below. The second problem concerns semantics. Although the idea that the particle is idiosyncratic and contributes no semantics makes sense for some verb-particle combinations, such as make up (in at least some of its uses), it is not so reasonable for the productive cases. For instance, we will argue below that wander up can be regarded roughly as:

    prpstn(def(x4, dog(x4), wander(e2,x4) ∧ up_s(e2)))

where up_s has either a directional or locational/aspectual interpretation, which in both cases can be regarded as qualifying the event of wandering (the semantics is discussed further below). The existing treatment means that the commonality between wander up and walk up is not captured in the LF, which means that generalizations will be missed in an inference component or in semantic transfer for Machine Translation. Similarly, there is no semantic connection between wander and wander up, which also has the disadvantage that it makes it impossible to construct the latter productively.

¹ November 2001 version, available from http://lingo.stanford.edu/ftp
² There are some cases in the LinGO ERG where this has not been carried through systematically. The discussion below ignores this, since these seem to be infelicities rather than deliberate distinctions.

The semantic vacuity idea also causes some problems for generation, at least when using the chart generator provided in the LKB system (Copestake, 2002).
It is unreasonable to assume that a grammar-independent component will be able to produce input LFs with the vacuous selected-for particles, and they thus have to be inserted into an input LF as a separate stage before normal generation with the ERG will work.

3 Regularities in verb-particle constructions: lexical rules

Some verb-particle combinations follow a productive pattern that can be captured, with the particle contributing a fixed meaning to a number of combinations. This is the case for the particle up, indicating movement or position, in the verb-particle combinations jump up, get up and stand up. These combinations involve the literal meanings of the verb and particle, and have a transparent semantics.

A simple way of allowing for productive verb-particle combination is to produce an entry similar to the one above from a base verb via a rule that adds particles to the complements list. This is shown schematically below:

    [ main_verb
      SYNSEM.LOCAL.CAT.VAL.COMPS #1 ]
    ->
    [ main_verb
      SYNSEM.LOCAL.CAT.VAL.COMPS [ FIRST [ HEAD prt ],
                                   REST #1 ] ]

This rule simply takes a verb lexeme and adds an extra complement, the particle, to its subcategorization list. The particle contributes a fixed meaning to the meaning of the verb: we discuss the details of the semantics below. This leaves the analysis in the ERG essentially unchanged as far as syntax is concerned. In our current implementation, this rule is restricted to applying only to intransitive and simple transitive verbs, through the typing system, since these are by far the most frequent candidates for a productive approach.

In computational terms, the motivation for capturing productive cases is partly to add coverage, but also to improve reliability of the coding. This rule could be used to generate the verb-particle entry for wander up from the entry for wander. However, it will of course overgenerate: it needs to be specialized to account for various classes of verb-particle constructions.
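The rule that adds a particle to a verb's complements list can be sketched in Python as below. This is only an illustration under stated assumptions, not the LKB implementation: `VerbEntry`, `add_particle` and `ALLOWED_FRAMES` are hypothetical names, complements are reduced to the name of their head, and the restriction to intransitive and simple transitive verbs is modelled as a set lookup rather than through a type hierarchy.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VerbEntry:
    stem: str
    comps: Tuple[str, ...]  # heads of the complements, in order


# Assumption: only intransitive and simple transitive frames are accepted.
ALLOWED_FRAMES = {(), ("noun",)}


def add_particle(verb: VerbEntry) -> Optional[VerbEntry]:
    """Return a new entry with a particle as first complement.

    The original complements become the rest of the list, mirroring the
    FIRST/REST structure of the schema. Returns None when the input frame
    is outside the frames the rule accepts.
    """
    if verb.comps not in ALLOWED_FRAMES:
        return None
    return VerbEntry(verb.stem, ("prt",) + verb.comps)


# Toy lexicon entries; the frames are illustrative assumptions.
wander = VerbEntry("wander", ())
carry = VerbEntry("carry", ("noun",))
give = VerbEntry("give", ("noun", "noun"))
```

Because the sketch places no constraint on which verb combines with which particle, it accepts every verb with a permitted frame, which is exactly the overgeneration that a more specialized rule has to rule out.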
For instance, even though the particle up occurs with a wide range of verbs, it only combines productively with some of these classes. Bame (1999) discusses two such cases: the resultative and the aspectual up. For example:

(1) Kim carried the television up (resultative up)
(2) Kim ate the sandwich up (aspectual up)

With the resultative up, the argument is affected (i.e., at the end of the action the television is up). In contrast, the aspectual or completive up suggests that the action is taken to some conclusion. Bame's analysis follows Wechsler (1997) in merging semantic structures in order to restrict the verb-particle combinations and also in order to give contrasting semantic structures for these two cases. Unfortunately, as mentioned above, this cannot be directly implemented in the ERG: it also does not lend itself to underspecification, which is important to avoid proliferation of analyses.

One complication, however, is that up has a use with some motion verbs in which it simply denotes a contextually salient endpoint to the action:

(3) Kim was standing in the bottom of the valley. Sandy galloped up.

It is tempting to analyse this as an aspectual up, in which the end of the path is indicated.
Assuming an approach to event semantics where an activity verb such as gallop denotes an event which is underspecified as to whether it includes an end point, the very simple analysis below can be defended:

    gallop(e,x) ∧ up-end-pt(e)

where up-end-pt is taken as a predicate which is true of terminated events (accomplishments).

An alternative to Bame's account would then be to extend this approach to transitive verbs, where although the up also generally has a directional component, the sense of completed path is still present:

    carry(e,x,y) ∧ up-end-pt-and-dir(e) ∧ television(y)

Under this approach, given that the end of the path is up, it necessarily follows from the semantic properties of carry that the television is also up, so it isn't necessary to make the compositional semantics express this directly. We can then utilize a very simple lexical rule, which inherits from the schema given above, but which only takes as input the class of motion verbs with the correct aspectual properties.³ However, we should also note that there is a particle use of up which is very similar to the PP argument of a verb such as put:

(4) Kim put the picture up.
(5) The picture is up.
(6) Kim put the picture on the table.
(7) The picture is on the table.

Associating individual particles with subtypes of lexical rules is very similar to the treatment of productive derivational morphology available within the LKB system. For encoding subregularities we use redundancy rules, with the verb-particle lexical entry default inheriting from the result of applying a rule to a verb. This makes it possible to relate a base verb form to the verb-particle construction derived from it, so that the latter inherits from the former all the common information, such as inflectional morphology: if the base verb is irregular, so is the verb-particle combination. Moreover, the same idea applies to register and dialect information, which is shared between the base verb and the verb-particle combination (e.g.
both piss and piss off are generally perceived as informal and impolite). However, in other respects the treatment of productive verb-particle formation is somewhat different, in that it is possible to also group the particles, so that any one verb of a given group could occur with any one particle of a related group. For instance, the movement verbs (come, go, jump, run, walk, ...) and the location or direction particles (down, in, out, up, ...) can be productively combined by a lexical rule that will generate all the possible verb-particle combinations allowed by these groups (come down, come in, come out, come up, go down, ...). This is done more stipulatively than in Bame's analysis, in the sense that the types for the classes of verbs and the classes of prepositions are separately defined, but the actual work involved in doing the encoding for the computational lexicon is much the same. We consider how we can acquire these classes in the next sections.

³ The availability of the hierarchy of lexical rules is a strong counter-argument to Ackerman and Webelhuth's (1998) claims that they are unsuitable for capturing this type of phenomenon (see also Ackerman and Webelhuth (1998:162)).

4 Verb-particle combinations in dictionaries

Although it seems intuitively plausible that there is some degree of productive formation of some verb-particle combinations, it is not clear what proportion of verb-particles might be accounted for in this way. We investigated this using several dictionaries and lexicons: the paper versions of the Collins Cobuild Dictionary of Phrasal Verbs (Collins-PV) and of the Cambridge International Dictionary of Phrasal Verbs (CIDE-PV), the electronic versions of the Alvey Natural Language Tools (ANLT) lexicon (Carroll and Grover, 1989) (which was derived from the Longman Dictionary of Contemporary English, LDOCE), the COMLEX lexicon (Macleod et al., 1998), and the Cambridge International Dictionary of English (CIDE+) lexicon.
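A comparison of verb-particle coverage across several resources comes down to set operations over their entries. The Python sketch below shows one way this could be done; it is not the procedure used for this study, the names `common_core` and `missing_from` are hypothetical, and the three toy lexicons contain invented placeholder entries rather than data from any of the dictionaries listed above.

```python
from typing import Dict, Set

# Invented placeholder entries, NOT taken from any real dictionary.
lexicons: Dict[str, Set[str]] = {
    "lexicon_a": {"make up", "come up", "wander up", "walk up"},
    "lexicon_b": {"make up", "come up", "eat up"},
    "lexicon_c": {"make up", "come up", "walk up", "piss off"},
}


def common_core(resources: Dict[str, Set[str]]) -> Set[str]:
    """Entries that are listed in every resource."""
    entry_sets = list(resources.values())
    return set.intersection(*entry_sets) if entry_sets else set()


def missing_from(target: str, resources: Dict[str, Set[str]]) -> Set[str]:
    """Entries listed in some other resource but absent from the target."""
    others: Set[str] = set().union(
        *(entries for name, entries in resources.items() if name != target)
    )
    return others - resources[target]
```

With these toy sets, `common_core(lexicons)` contains only the two combinations shared by all three lexicons, and `missing_from("lexicon_b", lexicons)` lists the combinations that a rule-based extension of that lexicon would need to supply.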
Table 1 shows the number of verb-particle entries for each of these dictionaries.⁴

Table 1: Verb-Particle Entries in Dictionaries

    Dictionary    Entries
    ANLT          2,649
    CIDE-PV       over 4,500
    CIDE+         1,433
    Collins-PV    over 3,000
    Comlex        3,433
    LinGO ERG     276

As we can see from these numbers, the coverage of each dictionary varies considerably. There is a common core of verbs that is described in every dictionary. For instance, there are 1,291 verb-particle combinations that are described in CIDE+