Abstract: On the relationship between (Enhanced) Universal Dependencies and Phrase Structure Grammar
Recent work on neural parsing using pre-trained language models has shown that accurate dependency parsing requires very little supervision and no induction of grammar rules in the traditional sense. We show that this holds even for dependency annotation that goes beyond labeled edges between word tokens and that is able to capture much of the same information as richer phrase-based models such as HPSG. We further show that dependency parsing can still benefit from a combination with phrase-based syntax, not unlike that of HPSG, and that such combinations may be readily available for many of the treebanks in the UD corpus.
Gosse, thank you so much for this talk! I greatly enjoyed it. Looking forward to the QA.
For some reason it took me a little while to find the Mrini et al. paper; in case anyone else also has trouble, here it is: https://arxiv.org/pdf/1911.03875.pdf
Thanks, Gosse – very interesting talk!
Do the results of Zhou and Zhao (2019) and Mrini et al. (2020) show only that it is beneficial to have both constituency and dependency information (as it seems to me at the moment), or do they point to any other advantages that HPSG (or LFG) representations might bring? I can see that they use dependency structures without labels, which is consistent with HPSG not naming each and every argument with a different name. I wonder if there are similar attempts for joint constituency and dependency parsing, but with labelled dependencies, which would be more consistent with LFG-like representations, where f-structures name each argument.
A minor point: the exact interpretation of the table around the 10th minute of your talk was not clear to me: did the participating systems aim at maximising ELAS only (with the LAS in the table calculated from the EUD representations they produced), or did they aim to maximise both LAS and ELAS, i.e., was their task to produce both basic UD and full EUD representations?
Thanks again!
The official evaluation metric was ELAS. The task was to produce complete EUD annotations, so with features, lemmas, POS tags, and UD labels. But LAS was reported only to see to what extent the two metrics correlate for the various approaches.
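For readers unfamiliar with the two metrics, here is a minimal sketch of how LAS and ELAS differ (my own illustration, not the official shared-task evaluation script; the function names and data shapes are assumptions). LAS assumes exactly one (head, label) pair per token, while ELAS is an F1 score over sets of enhanced edges, since an enhanced graph can have more (or fewer) edges than tokens:

```python
def las(gold, pred):
    """Labeled attachment score: fraction of tokens whose predicted
    (head, label) pair matches the gold one."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)

def elas(gold_edges, pred_edges):
    """Enhanced LAS: F1 over sets of (head, dependent, label) edges."""
    tp = len(gold_edges & pred_edges)
    precision = tp / len(pred_edges)
    recall = tp / len(gold_edges)
    return 2 * precision * recall / (precision + recall)
```

Because an enhanced graph is a set of edges rather than a function from tokens to heads, a system can score well on one metric and less well on the other, which is why it is interesting to see how the two correlate.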
Thanks, Gosse. Great talk! I hope you will publish a paper in the proceedings. I have to write a paper about DG and HPSG and it would be great if I could cite your work.
Do the UD people really assume that the modified noun in a noun–relative clause construction is the object of the verb inside the relative clause? Looks like an analysis by Richard Kayne. But this would not work since the case of the modified noun is independent of the case assigned to the object.
Yes, I think this is the current analysis of relatives. I see your problem, but I am not sure whether it would be a reason to change the UD analysis. The advantage of the current analysis is mostly that, in languages without relative pronouns, the basic UD annotation has no way of registering which argument of the relative clause is relativized, which is obviously a problem for semantic interpretation; the enhanced analysis makes that argument explicit. Note also that UD is strange in that arguments from language X influence the analysis of language Y (it is supposed to be universal, after all).
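To make the point concrete, here is a sketch (my own illustration, not from the talk) of the basic versus enhanced UD edges for "the book that I read", written as (head, dependent, label) triples. In the enhanced graph, the relative pronoun is attached to its antecedent via `ref`, and the antecedent noun itself receives the `obj` edge inside the relative clause:

```python
# Basic UD: the relative pronoun "that" is the object of "read".
basic = {
    ("book", "the", "det"),
    ("book", "read", "acl:relcl"),   # relative clause modifies the noun
    ("read", "I", "nsubj"),
    ("read", "that", "obj"),
}

# Enhanced UD: the antecedent "book" is the object; "that" merely co-refers.
enhanced = {
    ("book", "the", "det"),
    ("book", "read", "acl:relcl"),
    ("read", "I", "nsubj"),
    ("read", "book", "obj"),         # antecedent gets the in-clause relation
    ("book", "that", "ref"),         # pronoun linked to antecedent via ref
}

print(sorted(enhanced - basic))
```

The two extra edges are exactly what makes the relativized argument recoverable in languages that have no overt relative pronoun to carry the `obj` label.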
Thank you for the talk!!!
I have a question on your remark on NPIs. What is the prediction task here that the systems are not good at? – predicting the licenser or the NPI?
Hi, Gosse! Thanks for the very interesting talk! The idea of successfully mixing constituency and dependency is something that makes HPSG so attractive (as well as other related constraint-based theories like LFG).
Would you comment more on the slide that compares the constituency- and dependency-based F1 of various parsers? The constituency-based ones seem to have better results, while lately we have been observing the opposite trend.
They are two different evaluations, on different annotations, so comparing them is not easy, I think. The message is that you can improve on both scores by combining the tasks/representations in some way.
Thanks for your interesting talk. Did anyone try to see beyond the scores what types of sentences benefit from adding constituency information? I wonder whether they are really the “usual suspects”.
Good question! I would be very interested in more error analysis, but have seen very little of this recently.
Another line of work that uses HPSG phrase structure in a neural model:
https://www.aclweb.org/anthology/P18-1038.pdf
https://www.aclweb.org/anthology/2020.acl-main.605.pdf
And following on from the answer to Adam’s question — this line of work also uses the phrasal types in the tree.
Thanks, I had not seen those (I think)!