ESSLLI99, Butt/Frank/Kuhn: Exercises

ESSLLI99, Butt/Frank/Kuhn:
Development of large scale LFG grammars --
Linguistics, Engineering and Resources

Collection of Exercises

Exercises on rule annotation, subcategorization and constraining equations

Start from the grammar rule-anno-ex.lfg

Exercise 1 -- Missing Entries

If the problem is that a word is not listed in the lexicon, a message to this effect will appear in the XLE window, in addition to the morphology window appearing:

    
% parse "a monkey in the garden devours a banana"
parsing {a monkey in the garden devours a banana}  

 Chart unconnected because of unknown words
 Word possibly causing problem: garden 
0 solutions, 0.01 CPU seconds, 0 subtrees
0
%

Add the missing entry.
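
A minimal sketch of what a full-form noun entry can look like in XLE notation. The feature names (PRED, NUM) and values are assumptions here; model the real entry on the existing noun entries in rule-anno-ex.lfg:

```xle
"hypothetical entry; copy the features your other nouns use"
garden  N * (^ PRED) = 'garden'
            (^ NUM)  = sg.
```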

Exercise 2 - Extending c-structure rules to cover more adjuncts

Extend the lexicon and the VP-rule to cover temporal adverbs:

the girl saw the monkey yesterday
the girl saw the monkey yesterday in the park

In terms of f-structure, their contribution should end up in the same set as that of local PPs.

Exercise 3 - New verbs with different subcategorization

Add a treatment of ditransitive verbs to the grammar and the lexicon. Classically in LFG the second object is analyzed as OBJ2.
- the girl gave the monkey a banana
- the girl gives the monkey a banana
- the girls give the monkey a banana
Make sure that the grammatical functions you are using are defined in the CONFIG section.
Your analysis should avoid adding a spurious ambiguity to ordinary transitive sentences:
- the monkey devoured the banana
Add verbs taking a prepositional object. The standard LFG analysis assumes a function OBL for these.
- the monkey thought of a banana
How do you have to modify the f-annotation of the PPs in the verb rule?
Again, don't forget to add the new grammatical function in the CONFIG section.
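
One possible shape for the new material, given only as a sketch. The TENSE feature and its value, the set name ADJUNCT, and the decision to let each PP be either an adjunct or the oblique are assumptions; your grammar may organize this differently:

```xle
"hypothetical lexical entries"
gave     V * (^ PRED) = 'give<(^ SUBJ)(^ OBJ)(^ OBJ2)>'
             (^ TENSE) = past.

thought  V * (^ PRED) = 'think<(^ SUBJ)(^ OBL)>'
             (^ TENSE) = past.

"possible annotation on the PP in the verb rule:"
"each PP is either a member of the adjunct set or the verb's OBL"
PP*: { ! $ (^ ADJUNCT)
     | (^ OBL) = ! }
```

Once OBL is declared as a governable function, coherence rules out the second disjunct for verbs that do not select an OBL.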

Exercise 4 - Constraining equations and underspecification

In the current version of the grammar, subject-verb number agreement is checked by a constraining equation. Can you explain the difference in the grammar's behaviour for the following two sentences?
- the monkey sleeps
- the sheep sleeps
What are the options for fixing the problem?
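
To make the contrast concrete, here is a sketch of three entries. It assumes, as the exercise title suggests, that sheep is left unspecified for NUM; the actual entries in rule-anno-ex.lfg may differ:

```xle
"constraining: some other entry must supply NUM = sg for the subject"
sleeps  V * (^ PRED) = 'sleep<(^ SUBJ)>'
            (^ SUBJ NUM) =c sg.

"defining: supplies the value that the check above looks for"
monkey  N * (^ PRED) = 'monkey'
            (^ NUM) = sg.

"no NUM equation: nothing defines the value, so the =c check fails"
sheep   N * (^ PRED) = 'sheep'.
```

Two directions to explore: give sheep a disjunction such as { (^ NUM) = sg | (^ NUM) = pl }, or turn the verb's check into a defining equation.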

Solution

rule-anno-ex.sol.lfg

Exercises on templates, lexical rules and functional uncertainty (English)

Exercise 1 -- Templates

Restructure the lexicon using (a hierarchy of) templates.

Using the handful of existing templates as a model, write more templates for the following (and use these names):

Name	Function
CASE	Case Assignment
PROPERNOUN	For Proper Nouns
NOUN-PL	For plural common nouns
DET	For Determiners
TENSE	Tense Marking
ASPECT	Aspect Marking
TRANS	For transitive verbs
INTRANS	For intransitive verbs
PREP	For prepositions
V-S-RAISING	For modals

The use of the templates is illustrated with the noun girl and the pronouns he, she, it, them.

If you want to write more templates or organize things differently, you are most welcome to do so.

Note: In this exercise we have introduced the use of the variable %stem. This is a very useful device within XLE which allows you to make XLE do the work of figuring out the stem (based on the headword in the lexical entry you have written). This avoids cut-and-paste errors such as the monkey and park situation of the first exercise. However, it is really useful only in conjunction with a morphological analyzer that will return the "real" stems (eat as opposed to eats). This will be introduced at the end of the week.
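
As an illustration of a small template hierarchy, the following sketch defines a general noun template and builds NOUN-PL on top of it. NOUN and the feature names are assumptions (only NOUN-PL, TRANS and INTRANS are names from the table above); the entries show one call with %stem and two calls with an explicit stem:

```xle
"hypothetical base template"
NOUN(P)    = (^ PRED) = 'P'.

"NOUN-PL inherits from NOUN and adds number"
NOUN-PL(P) = @(NOUN P)
             (^ NUM) = pl.

TRANS(P)   = (^ PRED) = 'P<(^ SUBJ)(^ OBJ)>'.
INTRANS(P) = (^ PRED) = 'P<(^ SUBJ)>'.

"lexicon entries calling the templates"
bananas  N * @(NOUN-PL banana).
monkey   N * @(NOUN %stem).
devours  V * @(TRANS devour).
```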

Exercise 2 -- Lexical Rules

The grammar contains a pseudo-morphological analysis in order to demonstrate the lexical rule for passive (we will soon use a real morphological analysis).
A template for passivizable stems is already in the TEMPLATE section, but you still have to fill in the rewrite rules that make up the core of the lexical rule for passive.
First, write the rule that will suppress the subject, and make the object the subject.
Then add an alternative where the underlying subject is realized as an oblique PP with by. (Constrain this disjunct appropriately.)
The template should also work for ditransitive verbs. Add the template call to the appropriate lexical entries.
- the girl was see -n
- the girl was see -n by the boy
- the girl was give -n a telescope
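
The general shape of such a lexical rule, sketched with hypothetical names (PASS, PASSIVE and OBL-AG are assumptions; use the template name already present in the TEMPLATE section). The first disjunct keeps the active frame, the second rewrites the designators inside the frame:

```xle
PASS(FRAME) = { FRAME
              | FRAME
                (^ PASSIVE) = +
                (^ OBJ) --> (^ SUBJ)
                { (^ SUBJ) --> NULL
                | (^ SUBJ) --> (^ OBL-AG) } }.

"hypothetical call in a lexical entry"
see  V * @(PASS @(TRANS %stem)).
```

The constraint that restricts the oblique disjunct to by-phrases is deliberately left out of this sketch; it belongs inside the second inner disjunct.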
Advanced Additional and Optional Exercise:

Add a new lexical rule for dative shift. Make sure it interacts appropriately with the passive lexical rule (which one has to apply first, i.e., should be embedded in the other one?)
- the telescope was give -n to the girl

Exercise 3 -- Functional Uncertainty

Run the testfile and make sure you understand the treatment of topicalization via functional uncertainty (as was demonstrated in class).

Relative clauses are another candidate for treatment via functional uncertainty. At the moment, the relative pronoun can only be interpreted locally (i.e., without modals or embedded clauses).

Parse the two sentences in the testfile (repeated here) and make sure you understand the analysis of relative clauses.

[a.] The boy who laughed saw a monkey.
[b.] The boy who she saw is sleeping.

Add a functional uncertainty analysis to cover the following data.

[a.] The boy who should laugh is sleeping.
[b.] The banana which the monkey should eat is in the park.
[c.] The boy who she thinks she sees is in the park.
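
The step from a local to a long-distance analysis can be sketched as a change in one annotation. This assumes that modals are raising verbs taking XCOMP and that thinks takes COMP; the attribute names in your grammar may differ:

```xle
"local only: the relative pronoun is SUBJ or OBJ of the relative clause itself"
(^ TOPIC) = (^ {SUBJ|OBJ})

"functional uncertainty: any number of XCOMPs or COMPs may intervene"
(^ TOPIC) = (^ {XCOMP|COMP}* {SUBJ|OBJ})
```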

Optional: Add a treatment of relative pronouns in PPs.

[a.] The monkey of whom she is thinking is in the park.
[b.] The monkey of whom she should think is in the park.
[c.] The monkey of whom she thinks that he is sleeping is in the park.

What should the f-structure under the TOPIC be when the PP is a relative phrase? (Recall that this phenomenon has been called "Pied Piping": the fronted relative pronoun is joined by the preposition, which can't be stranded.)

Exercise 4 -- Constraining Equations

Make sure that you can't parse sentences like

[a.] *The boy which is in the park is sleeping.
[b.] *The banana who the monkey should eat is in the park.

Hint: This will involve introducing a feature ANIM with the values + or -. Use constraining equations to make sure you get only the right analyses.

Solution

fu-ex-engl.sol.lfg

Exercises on templates (French)

Start from the grammar mini-french2.lfg

The testfile is mini-french-tests

Exercise 1

The grammar comes with a number of predefined template definitions. Introduce templates for the following phenomena. Try to use the following template names:

num, gend, case, common_noun, proper_noun, n_agr, n_xx_agr, pron_pers, spec, prep, v_s, v_s_o, v_s_o_o2, v_s_obl, v_s_xcomp_rais, v_s_compfin, s_v_agr

Distinction between common nouns and proper nouns:
Common nouns should be specified (NTYPE) = common;
proper names should be specified (NTYPE) = proper.
Pronouns will now be distinguished and specified by the following features:
(PRON-TYPE)= pers/rel and (PRON-FORM)= %stem.
Templates for relative pronouns are already defined. Complete this for personal pronouns.
Specification of gender and number for nouns, pronouns and determiner.
Again, you find some predefined templates which are used in noun and adjective entries.
How do you proceed for pronouns like les and lui, leur? Try different possibilities and observe the number of analyses in sentences like: Jean les voit.
In the current grammar, specifier features unify with noun features through the trivial equation ^=!. Introduce a SPEC attribute for determiners, which embeds SPEC-TYPE and SPEC-FORM features introduced by the determiners. How do you preserve number and gender agreement with the noun in this new analysis?
Define subcat templates for the different types of subcategorization. If you make use of the %stem variable (which is instantiated by the lexicon entry's headword), you avoid cut-and-paste errors like the one you observed with monkey and park.
In your testsuite you noticed that subject-verb agreement is not yet captured. Define subject-verb agreement using templates.
Use your testfile mini-french-tests for regression testing.

Solution

french-templates.lfg

Exercises on functional uncertainty (French)

Start from the grammar french-templates.lfg

Exercise 1

Introduce templates and lexicon entries for raising verbs, as e.g. devoir, vouloir.

Il doit venir.
* Il doit vient.

Exercise 2

The grammar contains a restricted analysis of relative clauses. Look at the analysis of relative clauses in your testfile mini-french-tests. Only subject and object relative pronouns that act as TOPIC of the local relative clause are captured.

Extend the analysis to cover the following phenomena:

Relative clause topics from embedded infinitives via functional uncertainty:
Jean voit le chat que il doit donner à Marie.
Jean pense au chat que la fille doit aimer.
How do you proceed for subject relative pronouns?
Jean voit la fille qui doit venir.
Extend the relative clause analysis to cover relative pronouns in PPs:
Jean voit la fille à laquelle il pense.
Note that the head noun and the relative pronoun must agree in number and gender.
Again, consider the treatment of long dependencies via functional uncertainty:
Jean voit la fille à laquelle il doit penser.
How do you proceed for VP-internal subjects in relative clauses?
Jean voit le chat que aime Marie.
Jean voit la fille que doit aimer Marie.

Solution

french-fu.lfg

German grammar - Exercises on templates, lexical rules and functional uncertainty

Start from the grammar fu-ex-ger.lfg

A testfile is in fu-ex-ger-testfile

1.

Restructure the lexicon using (a hierarchy of) templates. Look at how common nouns are done to get an initial idea. (You need not deal with every single entry, but do look at pronouns and verbs.)

2.

Run the testfile and try to understand the functional uncertainty analysis of topicalization.

3.

Add the rule parts required to extend the analysis to topicalization of prepositional objects:

an den Mann denkt Maria.
an den Mann soll Maria denken.
an den Mann glaubt Hans, daß Maria denkt.

4.

Currently, only a few relative clauses work, since the relative pronoun can only be interpreted locally, i.e., without modal verbs or embedded clauses. Add a functional uncertainty analysis to cover the following data:

der Mann, den sie sieht, lacht.
der Mann, den sie sehen soll, lacht.
der Mann, den sie sehen wollen soll, lacht.
der Mann, den sie glaubt, daß sie sieht, lacht.

5.

Add a treatment of relative pronouns in PPs.

der Mann, an den sie denkt, lacht.
der Mann, an den sie denken soll, lacht.
der Mann, an den sie glaubt, daß er denkt, lacht.

Note that the TOPIC of the relative clause plays a role in checking for gender agreement with the noun that the relative clause modifies (look at the NP rule where CPrel is introduced).

In this light, what should the f-structure under TOPIC be when the relative phrase is a PP? (Recall that this phenomenon has been called "Pied Piping": the fronted relative pronoun is joined by the preposition, which can't be stranded in German.)

Exercises on Coordination (English)

Start from the small grammar coord-ex-engl.lfg
The testfile is coord-ex-testfile

1.

Add a simple coordination analysis in the NP rule. Assume for the moment that there's no f-structure contribution coming from the conjunction. Parse just a coordinated NP:

parse {NP: Mary and John}

Is there any case information?

Now parse

parse {I saw Mary and John}

How does the case information get there?

(Click on the c-structure nodes with both mouse buttons to get the f-structure that's projected from that node.

If you hold down the control key and click with both buttons, you'll see the annotations that are in the rules for that node.)

2.

Add coordination at the level of S. Look at how the f-structure information is propagated for an example that involves functional uncertainty and coordination.

bananas, John likes and Mary should like
bananas, John likes and Mary thinks that Bill likes

3.

Use a rule macro to generalize all usages of the coordination scheme. Note: macro definitions go in the RULES section (not the TEMPLATES section).
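
A two-conjunct coordination macro might look like the following sketch. COORD is a hypothetical name and D N stands in for whatever the NP rule currently contains; CONJ is left unannotated, in line with the assumption in part 1 that the conjunction contributes no f-structure:

```xle
"each conjunct becomes a member of the set projected by the mother"
COORD(CAT) = CAT: ! $ ^;
             CONJ
             CAT: ! $ ^.

NP --> { D N
       | @(COORD NP) }.
```

The same macro can then be reused for S with the call @(COORD S).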

Now, extend the scheme to cover coordinations with more than two conjuncts:

I saw John, Mary, and Bill
John laughs, Mary laughs, and Bill likes bananas

4.

Now, see what happens when you have the conjunctions introduce their own f-structure contribution, e.g. (^ CONJ-FORM) = and.
Check the different possibilities of annotating the conjunction in the coordination rule.

First, try the (default) annotation ^=!

Where does the CONJ-FORM feature end up for

I saw Mary and John

What happens with

I saw Mary and John or Bill

Next, try ! $ ^

Compare the effect of

parse {NP: Mary and John}

and

parse {I saw Mary and John}

Finally, define CONJ-FORM as a non-distributive feature. See where the information ends up in the f-structure. You should now be able to parse all previous sentences.

5.

What do you have to do to get the following data correct?

Mary and John laugh
*Mary and John laughs

(Hint: try saying something like - either I'm in a structure where nothing is said about number, or I'd like to be plural.)

What about

Bill or John laughs
*Bill or John laugh

(In the testfile, you can find a special annotation to mark ungrammatical sentences - the first number followed by an exclamation mark denotes the number of expected readings.)

Optional additional exercise:
When you're finished with this exercise sheet, build the coordination analysis into the larger English, German, or French grammar.

Solution

coord-ex-engl.sol.lfg

Exercises on Coordination and Macros (French)

Start from the grammar french-sublex.lfg

Exercise 1: NP-/PP-coordination

Analyze the following phrases and declare the necessary features as NONDISTRIBUTIVES in the configuration.
NP: Jean et Marie
NP: la fille et les chats
Jean et Marie viennent souvent.
What do you have to do in order to treat the following data correctly?
* La femme et la fille dort.
La femme ou la fille doit dormir.
La femme ou la fille doivent dormir.
Jean et moi viennent souvent.
Il voit la fille qui dort et le chat qui court.
Define coordination also for PPs
La fille dort à l'école et sur le bateau.
* La fille pense au vacances et sur le bateau.

Exercise 2: Parameterized macros for coordination

Replace the explicit coordination rules in NP and PP by a macro for coordination (in the RULES section).
Define PP coordination also for relative clauses.
Il voit la table sur laquelle ou sous laquelle elle dort.
Extend the coordination macro to capture coordination structures with more than two conjuncts.
Jean, Pierre et Marie viennent souvent.
Le chat court dans la rue, dans le jardin, sur les toits et sur la table.
How could you analyze right node raising structures like the following?
NP: le format et l'orientation du papier
NP: le format et l'orientation du papier et la qualité de numérisation

Solution

french-coord-macros.lfg

The Interface to Morphological Analyzers
- Exercises (English)

Start from the small grammar morph-ex-engl.lfg

The testfile is morph-ex-testfile
The CONFIG is set up in such a way that you can use the morphological analyzer.

1.

Play with the XLE command analyze-string (for one or two words). It shows you the output of the tokenizer.

analyze-string the
analyze-string leaves
analyze-string {leaves fall}

2.

Parse the sentence

the leaf falls

Hold down the control key and click on a tree node with the left mouse button in order to expand the tree display to show sublexical nodes.

You can now see which words in the sentence are still analyzed using full form entries.

Furthermore, you can look at the transition graph by choosing the menu item `Show Morph Window' in the `Commands' menu of the tree window (upper left-hand window).

3.

Write a sublexical rule for the D (determiner) category and make sure there are "sublexical lexicon entries" for the tags occurring in the morphology output for determiners (+SP is for `singular or plural' - how can you capture this in terms of f-structure?).
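
A sketch of what this could look like, following the N-S / V-S naming used for stem categories later in this sheet. The category names D-S and D-SFX are assumptions, and the treatment of +SP shown is only one of the options (leaving NUM unspecified is another); check the actual tags with analyze-string:

```xle
"sublexical rule: a stem followed by any number of tags"
D --> D-S
      D-SFX*.

"tag entry: +SP is compatible with either number"
+SP  D-SFX XLE { (^ NUM) = sg
               | (^ NUM) = pl }.
```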

4.

Now also include the verb in the morphology-based part of the grammar. Where should the person and number information given by the morphology go in the f-structure?

Note that you have to introduce the subcategorization frame for each verb somewhere. This information is not included in the morphology output. So where can you attach it?

You should be able to parse the following sentences:

the leaf falls
leaves fall
fall leaves

Recall that a string of letters can be assigned several categories, divided by semicolons. So you can do the following:

  fall         N-S XLE    @MY-TEMPLATE-FOR-NOUN-STEMS;
               V-S XLE    @MY-TEMPLATE-FOR-VERB-STEMS.

5.

As a next step, make use of the -Lunknown mechanism to deal with input for which there is no lexicon entry. This is useful for the lemma forms output by the morphology. Write an entry for -Lunknown and assign it all sublexical stem categories you want to deal with in this way.

(For verbs you have to "guess" what the subcategorization is.)
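
Following the entry format shown in part 4, an -Lunknown entry might be sketched as follows. The template names are the placeholders used there, and whether the verb template guesses intransitive, transitive, or both is up to you:

```xle
-Lunknown  N-S XLE @MY-TEMPLATE-FOR-NOUN-STEMS;
           V-S XLE @MY-TEMPLATE-FOR-VERB-STEMS.
```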

What analyses do you now get for

leaves fall

Try to parse sentences with other common nouns and intransitive verbs - in principle, they should all work now! If you guess verbs to be either transitive or intransitive, it will also work for transitive verbs.

(Note that using %stem in the PRED values now gives you the lemma form as the predicate.)

6.

Further things the tokenizer will do for you

You can now use sentence-initial capitalization for arbitrary words, since the tokenizer will turn it back into the lower-case variant. The tokenizer does a lot of such preprocessing tasks.

Hippos snore.

7.

Optional exercise
Build the morphology interface into the larger English/French/German grammar (maybe just for some categories). You can either use your own grammar or the solution grammar coord-ex-engl.sol.lfg

It is of course okay to keep around a few full form entries for particularly complicated words (e.g., auxiliaries).

For French and German, there are several issues that are more interesting than what occurs with the English morphology. We have some particular exercises for these languages - so if you're interested, ask us.

Solution

morph-ex-engl.sol.lfg

Exercises on morphology interface (French)

Start from the grammar french-sublex.lfg

Exercise 1

The grammar is now interfaced with tokenization and morphology for French. The relevant finite-state transducers are referenced in the grammar configuration.

For some categories you already find sublexical rules and lexical entries in the grammar sections FRENCH SUBLEX RULES and FRENCH SUBLEX LEXICON.

Lexical entries for nouns, verbs, prepositions and (new!) adverbs are now only specified for stem forms. Try to understand the differences that arise between the previous grammar version without sublexical rules and the new version, by looking at template definitions and e.g. the definition of agreement constraints.

Test how the grammar behaves now. Use "unknown" vocabulary for the categories that are described by generic Lunknown lexical entries. You may also introduce further generic Lunknown lexical entries.

Why are there no Lunknown entries for verbs?
Proper names are now analyzed as N. What has been changed in the NP rule?
Why do we now get 2 analyses for Marie aime le chat and La fille aime le chat? Restrict the grammar.
Observe the effect of tokenization by parsing examples like PP: aux enfants, PP: des enfants. (For NP: des enfants (e.g. in Jean voit des enfants) we will have to introduce indefinite determiners (see below).)
The tokenizer optionally decapitalizes the first word of a sentence. You can now type in sentences such as Le chat dort. However, decapitalization causes an unwarranted ambiguity for Jean dort. Lunknown lexical entries can be declared suboptimal in terms of so-called optimality marks. We will discuss optimality-style constraint ranking in detail only later. For the time being, we will filter the unwarranted ambiguities by the following additions:
Please add:
- to the CONFIG section: OPTIMALITYRANKING: NEUTRAL Unknown NOGOOD.
- to the LUnknown lexical entry for N: Unknown $ o::*
Le chat dort. is assigned MOOD = INDICATIVE, whereas Le chat aime Marie. is not. Inspect the sublexical nodes in both cases, and discuss the consequences of an alternative treatment, where mood is specified in terms of disjunction instead of being underspecified.
Why do we obtain 2 analyses for Jean voit la fille qui aime le chat? Restrict the grammar.

Exercise 2: Sublexical rules

Define the morphology interface for clitics and determiners, following the model of the existing sublexical rules. The lexical entries will have to be marked by XLE instead of *.
The relevant morphological analyses for clitics and determiners are stated below. You can obtain these morphological analyses in XLE with the command
analyze-string <string>.

je   je+Nom+InvGen+SG+P1+PC
tu   tu+Nom+InvGen+SG+P2+PC
il   il+Nom+Masc+SG+P3+PC
elle il+Nom+Fem+SG+P3+PC
ils  il+Nom+Masc+PL+P3+PC

le   le+Masc+SG+Def+Det
     le+Acc+Masc+SG+P3+PC
la   le+Acc+Fem+SG+P3+PC
     le+Fem+SG+Def+Det
les  le+InvGen+PL+Def+Det
     le+Acc+InvGen+PL+P3+PC

lui  lui+Dat+InvGen+SG+P3+PC
leur leur+Dat+InvGen+PL+P3+PC

un  un+Masc+SG+Indef+Det
une un+Fem+SG+Indef+Det
des un+InvGen+PL+Indef+Det

Solution

french-sublex2.lfg

Processing Aspects, Part I
- Exercises

Start from the small English context-free grammar proc-ex1-engl-cf.lfg

Parse the sentences
- flying planes can be dangerous
- flying planes with wings of wood can be dangerous
- a flying plane with wings of wood can be dangerous
- flying a plane with wings of wood can be dangerous
and try to understand how the chart works. The second sentence is the one used on the slides. You can browse through the trees represented in the parse forest by clicking the left mouse button on nodes with grey (or dotted) lines in their subtree (to get back, hold down Shift and click).
The chart can be displayed from the `Commands' menu in the tree window. (What is the number of the chart edge for the N' category that's at the interface between the two non-interacting ambiguities?) Click on the boxes representing edges with both mouse buttons to get a display of this particular "sub-forest".
Extend the grammar with some minimal f-annotations to remove the ambiguity from the following sentences (you can use the commented out lexicon):
- flying planes is dangerous
- flying planes are dangerous
Are the "wrong readings" still among the trees you can find by clicking through the possible c-structures?

Solution

proc-ex1-engl-cf.sol.lfg

Processing Aspects, Part II
- Exercises

The idea of this exercise is to compare two versions of a grammar of German: one making the clause type distinction at the level of f-structure (proc-ex2-german-f.lfg), the other at the level of c-structure (proc-ex2-german-c.lfg). The latter employs the concept of rule parametrization by complex category symbols.

The parametrization in german-c.lfg has not quite been finished. You still have to do it for the NP and PP rule. As you can see, for these categories the type information is still passed via the f-structure feature TYPE. Introduce complex categories for these categories.
It is best to proceed in two steps.
- First, reconstruct the rules for these categories (starting with the NP rule) and check whether they work internally. To do this internal check, you can use the parse command with a different category than the normal root category:
  parse {NP[std]: der Mann}
  parse {NP[std]: sie}
  - Make sure that within the NP rule, you replace all the categories that influence the type of the NP (standard/relative/interrogative) by complex categories. (You need not worry about the sublexical rules for new categories like DET[std] or PRON[rel] - they have been defined for you.)
    parse {NP[std]: diesen Mann}
    parse {NP[int]: welchen Mann}
    parse {NP[std]: ihn}
    parse {NP[rel]: den}
    In the disjuncts for which the TYPE feature was fixed to one particular value, you should also fix the _type parameter to this value.
  - Once the internals of the NP rule work you can proceed to the PP rule. Where is the type information projected from?
- Next, modify the rules where the NP and PP category is called from. You have to replace the old category with the (^ TYPE) = ... annotation by the new complex categories. Make sure you pass in the correct instantiation of the parameter. (Sometimes this will be a formal parameter that is percolated through the rule, sometimes it will be a particular instance of the possible clause types.)
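
As a sketch of the target shape, a parametrized NP rule and a PP rule that passes the parameter on could look like this. The right-hand sides and the OBJ annotation are simplified assumptions, not the rules from proc-ex2-german-c.lfg:

```xle
"the type of the NP is determined by its determiner or pronoun"
NP[_type] --> { DET[_type] N
              | PRON[_type] }.

"the PP inherits its type from the NP it contains"
PP[_type] --> P
              NP[_type]: (^ OBJ) = !.
```

A call site then either names a concrete instance, such as NP[rel], or hands on its own parameter, as the PP rule does here.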
Now, you can compare the performance of the f-structure based grammar and the c-structure based grammar. You should make copies of the grammars in your own directory.
Run the testfile testfile-german with the former grammar:
parse-testfile testfile-german
Now, load your new grammar and compare the performance. To do so, you can run the parse-testfile command on the output of the previous test run:
parse-testfile testfile-german.new
(In case you didn't manage to solve the first part of this exercise entirely, you can use the solution grammar proc-ex2-german-c.sol.lfg.
You may also compare the performance with the grammar proc-ex2-multi-param.lfg, which is more heavily parametrized.)

Ambiguity, Generation and Overgeneration

Start from the grammar french-ambig.lfg

The testfile is test-ambiguity

Exercise 1: Object Drop

Modify the transitive verb template (v_s_o) to allow for object drop.

Check your analysis with the relevant examples in test-ambiguity (introduced by the comment #object drop).

Exercise 2: Headless NPs

Extend the present NP rule to account for headless NPs. In the headless construction, the f-structure for the missing noun head should contain a feature PRED = 'pro', with PRON-TYPE = null. The adjective is to be represented as an adjunct, as usual.

First test your analysis with the NP le petit (`the small one').

Then analyze the following sentences: (also contained in the testfile)

(Context:
Deux boutons s'allument. `Two buttons light up.')

Le vert indique si l'imprimante marche.
`The green (one) indicates whether the printer works.'

Appuyez sur le rouge dans la fenêtre droite.
`Press the red (one) in the right window.'

Ne pas appuyer sur le rouge du côté droit.
not press on the red on the side right
`Do not press the red (one) on the right-hand side'

Exercise 3: Reflexive Constructions

The grammar contains a simple analysis for inherently reflexive verbs, e.g. se casser (break), s'évanouir (faint), s'allumer (light up) or se réveiller (wake up). These verbs are analyzed as intransitive reflexive verbs (VTYPE= reflexive). The reflexive clitic ( se) is not represented as an argument. It is represented as part of the verbal morphology by the feature VMORPH= cl(itic).

Explain the interaction of the constraints (^ VTYPE) =c reflexive and (^ VMORPH) =c cl that you find in the entries of reflexive clitics and reflexive verbs, respectively.
What happens if you replace these constraining equations by simple defining equations? Try the following ungrammatical sentences:
* Marie évanouit. (`Mary faints')
* Marie se dort. (`Mary sleeps')
Reintroduce the constraining equations.
Introduce a template REFLEXIVIZATION which defines a lexical rule for semantic reflexivization with transitive verbs, as in the example Marie voit Jean (Mary sees John) - Marie se voit (Mary sees herself).
Note: The passive lexical rule for transitive verbs is different from the previous versions in that it only defines the passive alternation. The active construction is stated separately in the transitive template. Proceed in a similar way for reflexivization, and only provide a template with designator rewrites (lexical rule) for the reflexive construction.
The lexical rule for reflexivization suppresses the OBJ of the corresponding nonreflexive active construction (NULL). As in the case of inherently reflexive verbs, the reflexive clitic is not represented as a grammatical function subcategorized by the verb, but as a part of verbal morphology by the feature VMORPH= cl. As opposed to inherently reflexive verbs, which define a feature REFL = - (no semantic reflexivization), semantic reflexivization introduces the feature REFL= +. The argument binding of the suppressed argument (NULL) to the subject is supposed to take place in argument- or semantic structure, which are not used here.
Check your analysis with the sentence Marie se voit (`Mary sees herself').
Now analyze the following sentences and observe the number of analyses.
Le conducteur casse le moteur.
`The driver breaks the motor'
Le moteur se casse.
`The motor breaks'
Les feux s'allument.
`The lights go on'
Les enfants se réveillent.
`The children wake up'
Which readings of those that you obtain do you consider appropriate?

Exercise 4: Lexical Ambiguities

The NP rule contains an analysis for noun compounds (photocopie couleur (`color photocopy'), document papier (`paper document')), which is currently commented out.

Make the compound rule active and check the effect by reparsing the examples you analyzed before, using parse-testfile <num>.

In the following, it may be wise to put the noun-compound rule into comments again.

Exercise 5: Generation

Make sure that your .xlerc-file contains the following line:

set defaultSocketPorts(generator) 20<n>, with <n> your student number,
e.g. set defaultSocketPorts(generator) 2001.

Otherwise your generation process will prevent other users from starting their own generation processes.

Parse the sentences:
Marie et Jean chantent. (`Mary and John sing.')
Marie et Jean se réveillent. (`Mary and John wake up')
Un grand chat et un chien paisible se rencontrent dans une rue sans issue.
A big cat and a dog peaceful (se) meet in a street without exit
`A big cat and a peaceful dog meet in a dead end street'
Marie danse dans le parc après le lever du jour.
`Mary dances in the park after sunrise'.
For each of them, go to the left-hand side f-structure window, click the Commands button, and choose Generate from this F-structure. The first time the generation server needs some time to get started. (Cf. the upcoming window Generator.) Go back to the XLE shell and see whether a sentence is generated. Which result do you get?
What do you observe with respect to the reflexive ambiguities and the PP attachment ambiguities in generation?
Try to state surface order constraints in the coordination macro to restrict the order of conjuncts in coordination. Test the results in generation.
Add similar constraints to the PP position in the VP rule, and check the result in analysis and (re)generation with:
Marie danse dans le parc après le lever du jour.
`Mary dances in the park after sunrise'.
Optional: Parse the following phrases and regenerate from the f-structure(s):
Appuyez sur le rouge dans la fenêtre droite.
`Press the red one in the right window.'
NP: un grand chien affamé
a big dog starving (hungry)
`a big, starving dog'
Think about ways to constrain the order of adjectives in generation.
Optional:
Extend your grammar slightly, to be able to analyze the following sentence:
Un grand joli chat affamé et un chien lourd, paisible et anxieux de chats se rencontrent dans une rue sans issue.
A big pretty cat starving and a dog heavy, peaceful and anxious of cats (se) meet in a street without exit
`A big, pretty, starving cat and a heavy, peaceful dog, anxious of cats meet in a dead end street'
Proceed in the following way:
- NP: un grand joli chat affamé
  a big pretty cat starving
- anxieux is an adjective with an oblique PP complement (de chats). Use the subcategorization frame (^ PRED) = 'anxieux<(^ OBL)>', and proceed similarly to the cases of oblique arguments subcategorized by verbs.
  NP: un chien anxieux de chats
  a dog, anxious of cats
- Call the coordination macro in AP to allow coordination of APs:
  NP: un chien [paisible] et [anxieux de chats]
  a dog peaceful and anxious of cats
- Un chat et un chien se rencontrent dans une rue sans issue.
  Un grand joli chat affamé et un chien lourd, paisible et anxieux de chats se rencontrent dans une rue sans issue.
- Parse the following sentence and regenerate from the correct analysis:
  Un grand joli chat affamé et un chien lourd et paisible se rencontrent dans une rue sans issue.
  Which alternatives are generated? Try generation without conjunct and adjunct order constraints.

Solution

french-ambig2.lfg

Exercises on Constraint Ranking

Start from the grammar french-ambig2.lfg

The testfile is test-ot

CONFIG for OT-style constraint ranking

The CONFIG section of your grammar contains two lines which state keywords for OT-constraint ranking in analysis and generation.

OPTIMALITYRANKING: NEUTRAL UNGRAMMATICAL NOGOOD.
GENOPTIMALITYRANKING: NEUTRAL UNGRAMMATICAL NOGOOD.

The ot-marks you will define in the o-projection in the following exercises have to be added to these two lines and ranked relative to each other there.

In XLE, o-descriptions are defined as follows: otmark $ o::*

The o-projection can be inspected by clicking the o:: button in the left-hand f-structure window.

For analyses that involve ot-marks, XLE displays the active ot-marks in their ranked order in the lower right-hand side window. You can reactivate suboptimal analyses by clicking on selected ot-marks, or you can activate all suboptimal analyses by choosing Unoptimal in the Options menu.

XLE displays the number of optimal vs. unoptimal solutions in the following way:
<numopt>+<numsubopt>, e.g. 1+7.
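
The effect of a ranking on this count can be sketched outside XLE. The following Python fragment is only an illustration under simplifying assumptions: every mark is treated as a dispreference mark, an analysis is represented as a plain list of the marks it carries, and the ranking shown (HeadlessNP above ObjDrop above RareNoun) is hypothetical, not the one used in the solution grammar. The function names are likewise invented for this sketch.

```python
from collections import Counter

# Hypothetical ranking, most important mark first. The mark names come from
# the exercises; their relative order here is an assumption.
RANKING = ["HeadlessNP", "ObjDrop", "RareNoun"]

def profile(marks, ranking=RANKING):
    """Violation counts in ranked order; tuples compare lexicographically."""
    counts = Counter(marks)
    return tuple(counts[m] for m in ranking)

def split_optimal(analyses, ranking=RANKING):
    """Partition analyses (lists of marks) into optimal and suboptimal ones."""
    best = min(profile(a, ranking) for a in analyses)
    optimal = [a for a in analyses if profile(a, ranking) == best]
    suboptimal = [a for a in analyses if profile(a, ranking) != best]
    return optimal, suboptimal

def summary(analyses, ranking=RANKING):
    """Format the counts in the <numopt>+<numsubopt> style."""
    optimal, suboptimal = split_optimal(analyses, ranking)
    return f"{len(optimal)}+{len(suboptimal)}"

# Four made-up analyses of one sentence: one unmarked, three marked.
analyses = [[], ["ObjDrop"], ["HeadlessNP"], ["HeadlessNP", "ObjDrop"]]
print(summary(analyses))  # 1+3
```

Because the profiles are compared position by position, a single violation of a higher-ranked mark outweighs any number of violations of lower-ranked ones, which is why the order of the marks in the CONFIG lines matters.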

Exercise 1: Object Drop

Define an OT mark ObjDrop for object drop that avoids the unwarranted object-drop ambiguity in:

Le vert indique si le moteur marche.
`The green one indicates whether the motor works.'

Do you still get an analysis for:

ne pas ouvrir. (`do not open')

Exercise 2: Headless Noun Phrases

Introduce an OT constraint HeadlessNP that filters unwarranted ambiguities in the following cases:

NP: le conducteur (`the driver')
NP: le moteur (`the motor')

Now reconsider the following sentences:

Le vert indique si l'imprimante marche.
`The green (one) indicates whether the printer works.'

Appuyer sur le rouge dans la fenêtre droite.
`Press the red (one) in the right window.'

Exercise 3:

If you activate the (currently commented-out) compound analysis in NPap and reparse the previous sentence, you get unwarranted ambiguities for la fenêtre droite (`the right window') due to the lexical ambiguity of la as a noun (meaning `tone A'). Introduce a lexicalized OT-constraint in the noun entry for la that marks la as a RareNoun, and thus avoids one of these ambiguities.

Reparse the following sentences, and then comment out the compound rule again.

Appuyer sur le rouge dans la fenêtre droite.
`Press the red (one) in the right window.'

Ne pas appuyer sur le rouge du côté droit.
`Do not press the red (one) on the right-hand side.'

Exercise 4: Reflexive Constructions

State an OT-constraint (using an ot-mark InhRefl) that filters unwarranted ambiguities for the following sentences, by expressing a preference for inherently reflexive verbs over semantic reflexivization:

Le moteur se casse. (`The motor breaks')

Les feux s'allument. (`The lights go on')

Again, consider the additional examples:

Les enfants se réveillent (mutuellement). (`The children wake (each other) up')

Marie se voit. ('Mary sees herself.')

Exercise 5: Oblique vs. Adjunct PPs

Introduce an OT-constraint (using the mark PPobl) to prefer the oblique PP analysis over the adjunct reading. Test the results with the following sentences:

Marie renonce au voyage. ('Mary abandons the travel')

Marie renonce au premier essai. ('Mary abandons (at) the first attempt')

Le chat est tué par un chasseur. (`The cat is killed by a hunter.')

Le chat est tué par inadvertance. (`The cat is killed by accident.')

OPTIONAL: Prenominal vs. postnominal APs

Try to state OT constraints (both in the grammar and in the lexicon) to cover the following generalization (in generation only):

Attributive adjectives are preferred in postnominal position.
A small number of adjectives are nevertheless preferred in prenominal position (e.g. joli, long, mauvais, petit). These adjectives can be explicitly listed in the lexicon.
[We disregard the fact that some of these adjectives have a different meaning depending on whether they are in pre- or postnominal position.]

Use the OT-marks Postnom and Prenom.

Check your constraint mechanism in generation with the following sentences:

NP: un grand joli chat (`a big pretty cat')

NP: un grand joli chat affamé (`a big, pretty, starving cat')

Les jolis chats de Marie dorment sous une grande table grise.
`Mary's pretty cats sleep under a big grey table'

Your generation preference constraints should determine the order of adjectives relative to the head correctly, i.e. as given in these example sentences.
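
The intended generalization can be sketched independently of XLE. The Python fragment below is a rough illustration, not the solution grammar's mechanism: it enumerates the pre- and postnominal placements of each adjective, counts one violation for every adjective in its dispreferred position, and keeps the candidates with the fewest violations. The function names are hypothetical, agreement morphology is ignored, the relative order among adjectives on the same side of the noun is simply taken from the input, and grand is added to the prenominal list as an assumption based on the example NPs.

```python
from itertools import product

# Assumption: lexically listed prenominal adjectives. "grand" is not in the
# list given in the text; it is added only because the example NPs place it
# before the noun.
PRENOMINAL = {"joli", "long", "mauvais", "petit", "grand"}

def violations(adjective, position):
    """Return 1 if the adjective sits in its dispreferred position, else 0."""
    preferred = "pre" if adjective in PRENOMINAL else "post"
    return 0 if position == preferred else 1

def best_orders(det, noun, adjectives):
    """Return the candidate NPs with the fewest placement violations."""
    scored = []
    for positions in product(("pre", "post"), repeat=len(adjectives)):
        pre = [a for a, p in zip(adjectives, positions) if p == "pre"]
        post = [a for a, p in zip(adjectives, positions) if p == "post"]
        cost = sum(violations(a, p) for a, p in zip(adjectives, positions))
        scored.append((cost, " ".join([det, *pre, noun, *post])))
    best = min(cost for cost, _ in scored)
    return [np for cost, np in scored if cost == best]

print(best_orders("un", "chat", ["grand", "joli", "affamé"]))
# ['un grand joli chat affamé']
```

The default (postnominal) preference corresponds to a constraint stated once in the grammar, while the exceptions correspond to marks attached to individual lexical entries.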

Now execute parse-testfile test-ot.new and inspect the file test-ot.new.errors to see the effect of your constraints on the number of (optimal) analyses assigned.

Solution

french-ot.lfg
