Olga Zamaraeva & Guy Emerson (University of Washington & University of Cambridge): Multiple Question Fronting without Relational Constraints: An analysis of Russian as a basis for cross-linguistic modeling

Posted on 12. August 2020 by Stefan Müller

16 thoughts on “Olga Zamaraeva & Guy Emerson (University of Washington & University of Cambridge): Multiple Question Fronting without Relational Constraints: An analysis of Russian as a basis for cross-linguistic modeling”

Stefan Müller on 14. August 2020 at 16:25 said:

Dear Olga, dear Guy,

Thanks for the interesting talk and for prerecording it! I am wondering how powerful the append is that you describe. Do you have a publication describing the details? I remember having read something exciting by Guy some while ago. It would be good to link it here for other readers who are interested.

Can you use your append to get an arbitrary element from the middle of a list? I use [1] + < [2] > + [3] to combine a head with the element [2]. Petter claimed that this is impossible in the LKB and I believed this as well until recently. But if your append is a full append, it may actually be possible.

Thanks and best wishes

Stefan

Reply ↓
- olzama on 16. August 2020 at 20:22 said:
  
  Hi Stefan,
  
  Just quickly — a couple of talks by Guy are cited in the abstract (Emerson 2017 and 2019). Those are a bit dense though (they are just slides). I believe a publication on the details of the append is forthcoming (but he will tell you more).
  
  We can talk about selecting a specific element during the QA.
  
  Reply ↓
  - Guy Emerson on 17. August 2020 at 10:32 said:
    
    “Forthcoming” might be a little generous, as I haven’t written the paper yet… I am planning to write it, though.
    
    This approach to replacing relational constraints with “computation types” is very general. Because type constraints can apply recursively, it is Turing-complete. There are some more complicated examples in the grammar I wrote to accompany the 2019 talk that Olga mentioned, including logical operations (could be linguistically useful) and numerical operations (probably not linguistically useful, but an illustration that it’s very general):
    
    http://users.sussex.ac.uk/~johnca/summit-2019/wrapper-types.pdf
    http://users.sussex.ac.uk/~johnca/summit-2019/wrapper-type-grammar.tar.gz
    
    But unlike true relational constraints, all of the computation is part of the feature structure — this can lead to unintuitive behaviour if there ends up being a re-entrancy with these computation features. (This actually caused a problem for us when incorporating append-lists into the Grammar Matrix, because of the combination of lexical threading of SLASH, the analysis of adjuncts, and the analysis of coordination, all of which introduce re-entrancies of the SLASH lists.)
    
    Yes, it is possible to get an arbitrary element in the list. There are two ways that this could be arbitrary: if you mean an arbitrary but fully specified position (e.g. one rule asks for the 2nd element, another rule for the 3rd, etc.) then it would be possible to write a computation type that requires two inputs (a list and an integer) and gives two outputs (the element at that integer position, and the rest of the list with that element removed).
    
    Alternatively, if you mean an arbitrary and unspecified position (which is probably what you mean?), this is a little more complicated, but still possible. The operation is non-deterministic, because the possible outcomes (removing an element from each position in the list) cannot be expressed as a single feature structure. This can still be done using unary rules, and I’ve implemented this. A rule which requires non-deterministic computation uses a feature that triggers the unary rules. For example, the following rule (part of a grammar built on the Grammar Matrix and run using the LKB) is a head-comp rule where the complement can be an element of the COMPS list. After the unary rules have finished (recursively) applying, there will be one edge in the chart for each position in the list.
    
    basic-head-any-comp-phrase := basic-head-comp-phrase &
    [ SYNSEM.LOCAL.CAT.VAL.COMPS #new-comps,
    HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS #old-comps,
    NON-HEAD-DTR.SYNSEM #synsem,
    NONDETERMINISTIC [ POP-INPUT #old-comps,
    POP-OUTPUT-LIST #new-comps,
    POP-OUTPUT-ITEM #synsem ] ].
    
    tl;dr Anything’s possible but it could be difficult or unintuitive!
    
    Reply ↓
    - Frank Richter on 17. August 2020 at 14:36 said:
      
      Hi Guy,
      From the little information on the technique that’s given in the talk, I assume that the append type and similar types is an adaptation of the junk slot encoding of relations in description logics / feature logics that came up in the 1980s and was extensively used in HPSG in the early 1990s. Am I correct assuming that this is now a specific adaptation of this idea to the flavor of feature logic implemented in LKB?
      More specifically, is there anything that’s crucially different from “classical junk slots”, e.g. the junk slot encoding of append in Ait-Kaci (1984)?
      
      Reply ↓
      - Guy Emerson on 17. August 2020 at 16:52 said:
        
        Thanks for the reference — I will try to have a look, although I haven’t been able to find it online…
        
        I’ve had a look at Götz and Meurers (1996) and from what it says there about junk slot encoding of append, what I’m proposing is different. Junk slot encoding requires disjunctive types, which isn’t part of the Delph-in Joint Reference Formalism.
        
        I should clarify that this isn’t an *encoding* of an append operation, but an actual calculation, using only unification of typed feature structures.
      - Frank Richter on 17. August 2020 at 20:29 said:
        
        Here’s the complete reference:
        Ait-Kaci, Hassan 1984. A Lattice Theoretic Approach to Computation Based
        on a Calculus of Partially Ordered Type Structures. PhD thesis, University
        of Pennsylvania.
        As far as I recall the discussion of the 1990s, this thesis was generally assumed to be the source of the idea of encoding relations in feature logics. From the beginning, these structures were seen as encodings of computations. There were also manuscripts showing how different kinds of Turing Machines could be encoded in feature / description logics based on such techniques.
        Manfred Sailer and I wrote a joint MA thesis (yes, that was possible at the time) in which we wrote an HPSG linearization grammar with junk slots. It’s been a long time, but I recall that the particulars of the feature logic play an important role as to what these structures mean and how close you can get to them to relations.
        For an LKB style feature logic, it might have an effect that they could be seen as encoding knowledge about relations rather than relations (just a guess).
      - Guy Emerson on 18. August 2020 at 0:41 said:
        
        Sorry, I should been clearer: I can’t find a copy of Ait-Kaci’s thesis, except behind a paywall.
        
        Based on Götz and Meurers’s summary, what I’m proposing is different. I have also just had a brief look at Carpenter’s book, and the account of junk slots seems to rely on feature structures being fully sort-resolved. The Delph-in formalism doesn’t do this.
        
        In the abstract of Ait-Kaci’s thesis, it says that it proposes “a model of computation which amounts to solving systems of simultaneous equations in a lattice of types”. I can see that this makes sense when all structures must be fully resolved. I am proposing a model of computation which amounts to unification. Feature structures are typed but can be underspecified (resolution is not necessary). The type hierarchy and type constraints together define the operations. One feature structure holds the operation (formalised as a computation type) and the other feature structure holds the input data. A computation type is always a subtype of a data type. Unifying the two structures gives the output, which is stored in a feature.
        
        In the example in our talk, one feature structure is simply `list-copy` (this is the computation type), while the other feature structure is a list (this is the input data). Unifying the two structures gives an output, which is stored under two features, NEW-LIST and END-LIST. (For the sake of uniformity so that the output is always under one feature, these paths could be changed to RESULT.LIST and RESULT.LAST, where RESULT is of type `diff-list`.)
        
        Carpenter describes junk slots as providing a “workspace”. With computation types, the type hierarchy itself provides the workspace. The only features are for the input and the output, and no junk slots are necessary.
      - Guy Emerson on 19. August 2020 at 16:57 said:
        
        Having now had time to carefully go over Carpenter’s account of list appends using junk slots following Aït-Kaci (1984), I can say with certainty that it is a distinct mechanism.
        
        Both involve recursive constraints, but they are in different places. With junk slots there is a recursion of the JUNK feature. With computation types, there is recursion in the actual data structure — in the case of list appends, recursion in the list itself. As a computation type is a subtype of a data type, recursion of the data type directly induces recursion of the computation type.
Manfred Sailer on 16. August 2020 at 2:17 said:

Hi! Thank you for the talk!
I would like to understand what makes you assume that the apparent in situ occurrence of “gde” in one of the last slides is extracted? Is it because it occurs preverbally? Is this an unbounded extraction? (“he where_i thinks [that you work _i]]”)

You also have multiple fronting of n-words (“nobody, nothing”, …) in Russian, right? Would these be dealt with with the same mechanism? Could you have one fronted wh-adverb and one fronted n-adverb? (“where who never wanted to work?”)

Reply ↓
- olzama on 16. August 2020 at 20:11 said:
  
  Thank you, Manfred, for watching and for the comment!
  
  (I will just post some data for now and will respond either during the QA or later).
  
  (1) On GDE dumaet chto ty rabotaesh?
  He where thinks that you work
  “What does he think is the place where you work?” OR “What is the place in which he is doing the thinking about the fact that you are working?
  
  (2) ?Gde kto nikogda ne hotel rabotat
  where who never not want to.work
  A question about the set of places and people such that each person never wanted to work in one of the places. A weird question. So the sentence does not sound great at all, but I think maybe it is possible.
  
  (3) Nikto nikogda nigde nikomu nichego ne sdelal.
  noone never nowhere to.noone nothing not did
  “Nobody has ever done anything to anyone anywhere”.
  
  Thanks for making me think about (3); there it is very clear that never and nowhere can go together without any coordination or anything. They sound much better together than “??where when”, in fact they sound absolutely fine. But is that fronting? Maybe, although, unlike wh-words, no-words can be very naturally sentence-final, for example.
  
  See you at the QA!
  
  Reply ↓
- olzama on 17. August 2020 at 18:49 said:
  
  I did not have time to address this during QA, so I will post some answers here.
  
  (1) ty gde rabotaesh? 2sg where work "Where do you work?" [rus]
  
  What makes me think *where* in (1) may not be in situ is basically the following: (i) analyzing it as in situ and therefore allowing head-final declarative rules to take wh words results in ambiguity which then needs to be fixed using additional features, and there are always new problems that I keep finding, which seem to require more and more additional patches; (ii) the original G&S analysis, even though it was for English, was intended to generalize to other languages, and for that analysis I think it is generally important that wh words do not show up in the nonhead daughter positions of declarative head-final rules; (iii) the minimalist literature insists that there is no such thing as optional fronting (see Bovskovic 2002, stjepanovic 2000, Mivsmavs 2015 and Bailyn 2015, inter alia). That is from the formal literature perspective. As for the data perspective, in (1) the 2SG pronoun is most certainly topic, so, I could easily see how it would make sense to topicalize it with an appropriate filler-head rule. But like I said, this is a topic for future exploration for me :).
  
  Reply ↓
  - olzama on 17. August 2020 at 18:53 said:
    
    I also do not think that the no-words are extracted and fronted:
    
    (2) *Nikto nikomu Ivan dumaet chto ne otdal deneg
    noone to.noone Ivan thinks that not gave money
    Intended: “Ivan thinks that nobody gave money back to anyone”
    
    So, in (3), this must just be Russian flexible word order.
    
    Reply ↓
Adam Przepiórkowski on 16. August 2020 at 14:56 said:

Hi, many thanks for an interesting talk!

As to adjunct extraction, I claimed once (in my first paper ever, in 1994) that Polish has exactly the same constraint, i.e. that it does not allow for multiple adjunct extraction. But I don’t think this is the right characterisation of the data – I think the constraint is on the inherent character of extracted phrases, not on their grammatical function status. This can be seen when one looks at verbs which take “adjunct-like” arguments, e.g., manner phrases (as in the case of verbs such as BEHAVE, WORD or TREAT) or locative phrases (as in RESIDE). Then, when one extracts an argument like that and a true adjunct (assuming one believes int he argument/adjunct distinction – I don’t), the result is also downgraded, e.g. (in Polish; RM = reflexive marker):

?? Kiedy jak się zachowywał? (Polish)
When how RM behaved.3SG.M
‘How has he behaved when?’

The other thing is that, whatever the nature of this constraint, it is not a constraint on multiple extraction, but on multiple filler realisation. When you extract multiple adjuncts and express them via a single coordinated filler, then the result is grammatical, also in Russian, I believe:

(4′) Kogda i gde my kupili… (Russian, based on your (4))
when and where we bought…

So it would be nice to see how your analysis extends to such constructions, and how this compares to the earlier HPSG analysis of such facts by Chaves and Paperno 2007 (and then how to reconcile this analysis with the fact that such lexico-semantic coordination – what they call hybrid coordination – is possible not only with extracted wh-elements, but also with in situ n-words and other series of pronominal quantifiers).

Anyway, apologies for the lengthy comment and many thanks for this talk! (Also for the technical details about list append without relations or difference lists!)

Reply ↓
- olzama on 16. August 2020 at 20:18 said:
  
  Thank you for watching, Adam, and for the comment!
  
  I will just post some data now, or rather, I will confirm (4′): (4′) is very natural in Russian.
  
  I also wanted to post this example which I came across on the web:
  
  (0) Ya ne znaju, gde kogda stavit zapyatye
  I not know where when put commas
  “I don’t know where and when to put commas”
  
  So, “where when” does exist, actually, at least judging from that example. But I think it is rare, and furthermore, I think in (0), the person does not actually mean “where and when” literally; I think they mean they do not know the rules of punctuation, so, it is more like “I don’t know how to use commas” kind of thing. There is also pseudocoordination sometimes (augmentative?)…
  
  Reply ↓
Rui Chaves on 17. August 2020 at 3:58 said:

Very interesting talk. So if I understood correctly, the DELPH-IN English grammar does not handle multiple FGDs, such as:

a. [Which problem]j don’t you know [who]i to talk to _i about _j?
b. Robin is someone [who]j I never know [what]i to say _i to _j.

Was this type of data that you mentioned you tested the English version of your rule / append on?

Second question: do you think that your account could scale to what I and Denis Paperno called Russian “hybrid coordination”?
https://web.stanford.edu/group/cslipublications/cslipublications/HPSG/2007/chaves-paperno.pdf
This is of course present in a number of slavic languages, and seems to be restricted in unusual ways, but I was wondering if you’d need a completely different mechanism, or you could perhaps just expand what you have.

Reply ↓
- Guy Emerson on 17. August 2020 at 11:03 said:
  
  There are two Delph-in English grammars worth mentioning here — the English Resource Grammar (ERG), and a smaller English grammar available in the Grammar Matrix. As far as I’m aware, neither deals with multiple gaps like the examples you’ve given. The ERG doesn’t use append-lists, and I don’t think the Grammar Matrix covers enough phenomena for these examples to be parsed out of the box (although the Grammar Matrix is constantly expanding and perhaps Olga will correct me on this!).
  
  I think you’re referring to the Grammar-Matrix-derived grammar, which I modified to test list appends? This was to test that the appends were working as I expected them to, and I was mainly looking at the semantic lists RELS, HCONS, and ICONS (which are simpler than the SLASH lists because you never take anything off). Olga then incorporated append lists into the Grammar Matrix itself. English is also one of the languages in her test suite for constituent questions, and perhaps she can say more about which phenomena are included and what is currently covered.
  
  A different mechanism wouldn’t be needed for hybrid coordination, but that doesn’t mean I think it would be easy! Coordination is already complicated, and I think the challenge would be a linguistic one (capturing the restrictions you mention) rather than a technical one (modifying our filler-gap rule so that multiple elements are taken off the SLASH list).
  
  Reply ↓

Head-Driven Phrase Structure Grammar

27th International conference August 17-19 2020, Berlin, Seattle, Buxtehude, whereever

Olga Zamaraeva & Guy Emerson (University of Washington & University of Cambridge): Multiple Question Fronting without Relational Constraints: An analysis of Russian as a basis for cross-linguistic modeling

16 thoughts on “Olga Zamaraeva & Guy Emerson (University of Washington & University of Cambridge): Multiple Question Fronting without Relational Constraints: An analysis of Russian as a basis for cross-linguistic modeling”

Leave a Reply Cancel reply