dependency relation acl in French (and Spanish and Portuguese) #4

Open
jheinecke opened this issue Jun 22, 2018 · 22 comments

@jheinecke

Hello

The UD annotation guidelines define the acl relation as

acl stands for finite and non-finite clauses that modify a nominal. The acl relation contrasts with the advcl relation, which is used for adverbial clauses that modify a predicate. The head of the acl relation is the noun that is modified, and the dependent is the head of the clause that modifies the noun.

This implies that the head of an acl deprel is a verb only in rare cases. However, in UD_French-GSD I found 1542 acl relations (excluding acl:relcl) with a verb as head, which is 24% of all acl; in UD_Spanish the figure is 2336, or 49%. In all other treebanks (including French-Sequoia and Spanish-AnCora), fewer than 1% of all acl relations are attached to a verbal head. I wonder whether this is an error, since I expected ccomp (or xcomp) in these cases.

regards
Johannes
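
A count like the one described above could be reproduced with a short script over a CoNLL-U file. The following is only a minimal sketch, not the method actually used in this thread: the function name and the sample sentence are made up for illustration, and it assumes that "verb" means UPOS VERB (not AUX) and that only the plain acl label is counted.

```python
from typing import Tuple

# Hand-made CoNLL-U fragment for illustration only (not taken from any treebank).
SAMPLE = """\
# text = toy example
1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tpart\tpartir\tVERB\t_\t_\t0\troot\t_\t_
3\tlaissant\tlaisser\tVERB\t_\t_\t2\tacl\t_\t_
4\tune\tun\tDET\t_\t_\t5\tdet\t_\t_
5\tlettre\tlettre\tNOUN\t_\t_\t3\tobj\t_\t_
6\técrite\técrire\tVERB\t_\t_\t5\tacl\t_\t_
"""


def count_acl_with_verbal_head(conllu: str) -> Tuple[int, int]:
    """Return (plain acl relations whose head is a VERB, all plain acl relations)."""
    verbal = total = 0
    sentence = []  # token rows of the sentence currently being read
    for line in conllu.splitlines() + [""]:  # the extra "" flushes the last sentence
        if line.startswith("#"):
            continue  # sentence-level comment
        if line.strip():
            cols = line.split("\t")
            if cols[0].isdigit():  # skip multiword ranges (1-2) and empty nodes (1.1)
                sentence.append(cols)
            continue
        # Blank line: the sentence is complete, so heads can be looked up by ID.
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            if cols[7] == "acl":  # exact match, so acl:relcl is not counted
                total += 1
                if upos_by_id.get(cols[6]) == "VERB":
                    verbal += 1
        sentence = []
    return verbal, total


verbal, total = count_acl_with_verbal_head(SAMPLE)
print(f"{verbal} of {total} acl relations have a VERB head")
```

In the toy sentence, token 3 is an acl attached to a verb and token 6 is an acl attached to a noun, so only the first one is counted as suspicious.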

@dan-zeman
Member

I have checked the first ~ 25 occurrences in UD_Spanish-GSD and agree that they are either GSD annotation errors or errors in conversion from GSD to UD. Many of them should be advcl. Some seem to be examples of secondary predication and should be xcomp, or they should stay acl but be reattached to a nominal argument of the verb.

@dan-zeman changed the title from "dependency relation acl in French (and Spanish)" to "dependency relation acl in French (and Spanish and Portuguese)" Jun 22, 2018
@dan-zeman
Member

Same situation in UD_Portuguese-GSD: 791 acl and 20 acl:relcl. A frequent (but not the only) example is a present participle (gerundio) attached to a verb.

@dseddah

dseddah commented Jun 22, 2018

Hi Dan and everyone,
Will the checking of that acl relation be included in a future version of the validator?

Best,
Djamé

@dan-zeman
Member

@dseddah : I think it should. Sometimes it is difficult to establish that a rule must hold strictly in all languages but I think acl attached to a VERB might qualify as such a rule.

@bguil
Contributor

bguil commented Jun 22, 2018

@jheinecke: I didn't manage to find the same numbers.
For the UD_French-GSD corpus, I found:

  • in version 2.1: 447 cases of acl rel with a verbal head (Grew-match)
  • in version 2.2: 153 cases (Grew-match)

How did you find the 1542 acl relations you mention?

@dan-zeman
Member

@bguil, @jheinecke : I also have 447 cases in UD_French-GSD 2.1: http://hdl.handle.net/11346/PMLTQ-XYAO

@dseddah

dseddah commented Jun 22, 2018

Maybe the next validator version could also include a threshold below which non-compliance with a rule is classified as residual error, which is almost always unavoidable?
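
One possible reading of this threshold idea is sketched below. It is an illustration only and not part of the UD validator: the function name, the category labels, and the 5% default threshold are all assumptions chosen for the example.

```python
def classify_violation_rate(rate: float, threshold: float = 0.05) -> str:
    """Label a rule-violation rate; the 0.05 default is an arbitrary, hypothetical cut-off."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if rate == 0.0:
        return "clean"
    # Below the threshold the violations are treated as residual noise;
    # at or above it they are treated as a systematic annotation problem.
    return "residual" if rate < threshold else "systematic"


# 24% is the share reported above for UD_French-GSD; 0.01 stands in for the
# "less than 1%" reported for the other treebanks (an upper bound, not a measured value).
print(classify_violation_rate(0.24))
print(classify_violation_rate(0.01))
```

Under this reading the validator would still report every violation, and the threshold would only change how the treebank as a whole is labelled.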

@dan-zeman
Member

dan-zeman commented Jun 22, 2018

If it is unavoidable because the grammar/guidelines allow it, then it is not a matter for the validator.

If it is "unavoidable" because annotators make errors, then it is actually avoidable :-)

@dseddah

dseddah commented Jun 22, 2018

After all this time treebanking, I don't believe in error-free annotations for anything longer than a toy dataset (I'm sure you don't either, btw.)

@jheinecke
Author

@bguil, @dan-zeman: You are right, I used version 2.0 by mistake. Much has been corrected since. After a git pull I have only 150 acl relations attached to a verb (3%).

@dan-zeman
Member

@dseddah : You're right, I don't believe in error-free annotation. But I do believe that a subset of the errors can be detected automatically and eliminated (although the elimination sometimes has to be done manually).

@dseddah

dseddah commented Jun 22, 2018

"I do believe that a subset of the errors can be detected automatically and eliminated (although the elimination sometimes has to be done manually)"
Yes, but that's strictly a function of the time and resources available for that task, so when those are no longer available, we need to learn to live with those residual errors, like the 3% of POS annotation errors in the PTB that have been lying there for 25 years or so. We survived :)

dan-zeman added a commit to UniversalDependencies/UD_Spanish-GSD that referenced this issue Jun 22, 2018
@amir-zeldes

Well, the beauty of releasing corpora on GitHub is that if we find those POS errors, instead of getting to know and learn them by heart for all those years, we can just fix them, commit, and they're gone in the next release :)

But on the current topic: I don't think we should forbid VERB --acl:relcl--> __ since sometimes this is a legitimate construction. For example in English:

http://match.grew.fr/?custom=5b2d3cd1a63d5&[email protected]

@dan-zeman
Member

@amir-zeldes : I think there has been this discussion somewhere already but I cannot remember where. Anyways, acl “stands for finite and non-finite clauses that modify a nominal”, which is not the case of the example you gave.

(BTW, some of the examples I saw in other languages resembled this English one. But it does not mean that their annotation is not a violation of the guidelines.)

@nschneid

advcl:relcl, then?

bguil added a commit that referenced this issue Jun 23, 2018
pattern {
  GOV [upos = "VERB"];
  GOV -[acl:relcl]-> DEP;
}
@amir-zeldes

@dan-zeman I didn't know that, so what is the current recommendation? I don't know whether a special subtype of relative clauses is needed for when they modify verbs (it would be quite rare), and if so, I think advcl:relcl is maybe the wrong way around: I think it's a relative clause first, as indicated by the relative pronoun, and happens to modify a verb second (so maybe acl:advcl?)

But if you think about it, relative clauses can modify all sorts of things and I'm not sure this is reason enough to give different labels:

NOUN (normal): the day which we agreed on
VERB: she returned, which I thought was great
ADV: she's singing now, which is too late
ADJ: The shirt is green, which fits me perfectly

The class of token being modified is already marked on that token's part of speech, so I don't see what is gained by more subtyping.
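
To illustrate the point that the category of the modified token can be read off the tree, here is a small sketch that tallies the UPOS of the head for every acl:relcl dependent. The token tuples are hand-made analyses of the first two examples above, written only for this illustration (they are not taken from a treebank and are not meant as the recommended annotation), and the function name is hypothetical.

```python
from collections import Counter

# Each token is (id, form, upos, head, deprel). Hand-made for illustration only.
SENTENCES = [
    [  # "the day which we agreed on"
        (1, "the", "DET", 2, "det"),
        (2, "day", "NOUN", 0, "root"),
        (3, "which", "PRON", 5, "obl"),
        (4, "we", "PRON", 5, "nsubj"),
        (5, "agreed", "VERB", 2, "acl:relcl"),
        (6, "on", "ADP", 3, "case"),
    ],
    [  # "she returned, which I thought was great"
        (1, "she", "PRON", 2, "nsubj"),
        (2, "returned", "VERB", 0, "root"),
        (3, ",", "PUNCT", 6, "punct"),
        (4, "which", "PRON", 8, "nsubj"),
        (5, "I", "PRON", 6, "nsubj"),
        (6, "thought", "VERB", 2, "acl:relcl"),
        (7, "was", "AUX", 8, "cop"),
        (8, "great", "ADJ", 6, "ccomp"),
    ],
]


def head_upos_counts(sentences, deprel="acl:relcl"):
    """Count, per head UPOS, how many dependents carry the given relation."""
    counts = Counter()
    for sent in sentences:
        upos_by_id = {tok[0]: tok[2] for tok in sent}
        for tok in sent:
            if tok[4] == deprel:
                # Head ID 0 has no token, so it is reported as "ROOT".
                counts[upos_by_id.get(tok[3], "ROOT")] += 1
    return counts


print(dict(head_upos_counts(SENTENCES)))
```

With a single acl:relcl label, the noun-modifying and verb-modifying cases can still be separated by looking up the head, which is the redundancy argued for above.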

@dan-zeman
Member

I do not know what the current recommendation is (@sebschu, @manning?). But if anything with a relative pronoun qualifies as a relative clause, then it was probably wrong to assume that relative clauses are always a special case of acl.

@amir-zeldes : note that the part of speech of the parent token will not tell you everything. If it is a NOUN, you still don't know whether acl is correct:

  1. This is the car which I want to buy. ... acl, modifies just the noun car.
  2. Ivan is a dancer, which is something I'll never understand. ... modifies the clause, should be treated as if it modifies a verb.

@jnivre

jnivre commented Jun 23, 2018

It is yet another case where the correlation between structural form (relative clause) and syntactic function (modifying a nominal) is less than perfect. We need to review all these cases systematically for v3, but for the time being I think the best compromise is to extend the use of acl:relcl to cover 2 as well (better than introducing advcl:relcl, since it is not really an adverbial clause either).

@amir-zeldes

@dan-zeman yes, that's a good point, it's like nmod modifying a nominal predicate vs obl modifying the whole predication.

I'm with @jnivre in preferring acl:relcl for these and not modifying the current inventory if possible.

@nschneid

Relative clauses seem like such a common strategy cross-linguistically that it feels odd to have them as a subtype anyway. Maybe we should make relcl a universal relation for v3.

@perrier54
Contributor

All the examples given by @amir-zeldes are translated into French with "ce que", for example:
she's singing now, which is too late - elle chante maintenant, ce qui est trop tard
The pronoun "ce" is the antecedent of the relative pronoun, and there is an anaphoric relation between "elle chante maintenant" and "ce", which is represented in the UD annotation with a PARATAXIS dependency from "chante" to "ce".
Similarly, it might be reasonable in English to put a PARATAXIS dependency from "singing" to "late".

@amir-zeldes

For me parataxis, notwithstanding its use for parentheticals, implies a certain degree of independence - it's like a coordination without an explicit conjunction. These relative cases seem clearly subordinate for me, at least in English, so I wouldn't want to use that label.

In French I guess you could say 'ce qui ...' is an independent NP, so it's more natural to use parataxis (there's no formal subordination, although both sides of the parataxis are not of the same category, which is maybe unusual).

8 participants