Olga Zamaraeva & Guy Emerson (University of Washington & University of Cambridge): Multiple Question Fronting without Relational Constraints: An analysis of Russian as a basis for cross-linguistic modeling

  1. Dear Olga, dear Guy,

    Thanks for the interesting talk and for prerecording it! I am wondering how powerful the append you describe is. Do you have a publication describing the details? I remember reading something exciting by Guy a while ago. It would be good to link it here for other readers who are interested.

    Can you use your append to get an arbitrary element from the middle of a list? I use [1] + < [2] > + [3] to combine a head with the element [2]. Petter claimed that this is impossible in the LKB and I believed this as well until recently. But if your append is a full append, it may actually be possible.

    Thanks and best wishes

    Stefan

    • Hi Stefan,

      Just quickly — a couple of talks by Guy are cited in the abstract (Emerson 2017 and 2019). Those are a bit dense though (they are just slides). I believe a publication on the details of the append is forthcoming (but he will tell you more).

      We can talk about selecting a specific element during the QA.

      • “Forthcoming” might be a little generous, as I haven’t written the paper yet… I am planning to write it, though.

        This approach to replacing relational constraints with “computation types” is very general. Because type constraints can apply recursively, it is Turing-complete. There are some more complicated examples in the grammar I wrote to accompany the 2019 talk that Olga mentioned, including logical operations (could be linguistically useful) and numerical operations (probably not linguistically useful, but an illustration that it’s very general):

        http://users.sussex.ac.uk/~johnca/summit-2019/wrapper-types.pdf
        http://users.sussex.ac.uk/~johnca/summit-2019/wrapper-type-grammar.tar.gz
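
        To give a flavour of what a “logical operation” can look like in this style, here is a minimal sketch of boolean conjunction as a computation type (my own toy example, not the actual types from the wrapper-type grammar linked above):

        ; Booleans are encoded as types.
        bool := *top*.
        true := bool.
        false := bool.

        ; The computation type is a subtype of bool; unifying it with a
        ; boolean resolves to one of the two subtypes below, whose
        ; constraint fills in RESULT.
        and-op := bool & [ ARG bool, RESULT bool ].

        ; true AND x = x
        true-and := and-op & true & [ ARG #x, RESULT #x ].

        ; false AND x = false
        false-and := and-op & false & [ RESULT false ].

        Unifying and-op with true gives true-and (their only common subtype), so RESULT is identified with ARG; unifying it with false gives false-and, so RESULT is false.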

        But unlike true relational constraints, all of the computation is part of the feature structure — this can lead to unintuitive behaviour if there ends up being a re-entrancy with these computation features. (This actually caused a problem for us when incorporating append-lists into the Grammar Matrix, because of the combination of lexical threading of SLASH, the analysis of adjuncts, and the analysis of coordination, all of which introduce re-entrancies of the SLASH lists.)

        Yes, it is possible to get an arbitrary element in the list. There are two ways that this could be arbitrary: if you mean an arbitrary but fully specified position (e.g. one rule asks for the 2nd element, another rule for the 3rd, etc.) then it would be possible to write a computation type that requires two inputs (a list and an integer) and gives two outputs (the element at that integer position, and the rest of the list with that element removed).
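
        As a sketch of how the fixed-position case might go (my own names and encoding; in particular, I encode the position as a unary numeral in the type hierarchy and make the computation a subtype of the numeral, so that the case split is type-driven):

        ; Positions as unary numerals.
        pos := *top*.
        zero := pos.
        succ := pos & [ PRED pos ].

        ; The computation type: two inputs (the carrier numeral and LIST),
        ; two outputs (the popped ITEM and the remainder REST-OUT).
        pop-at := pos & [ LIST list, ITEM *top*, REST-OUT list ].

        ; Position zero: pop the first element.
        pop-zero := pop-at & zero &
          [ LIST cons & [ FIRST #item, REST #rest ],
            ITEM #item,
            REST-OUT #rest ].

        ; Position n+1: keep the first element and recurse at position n.
        pop-succ := pop-at & succ &
          [ PRED pop-at & [ LIST #rest, ITEM #item, REST-OUT #r ],
            LIST cons & [ FIRST #f, REST #rest ],
            ITEM #item,
            REST-OUT cons & [ FIRST #f, REST #r ] ].

        A rule asking for, say, the 3rd element would then supply succ & [ PRED succ & [ PRED zero ] ] as the position.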

        Alternatively, if you mean an arbitrary and unspecified position (which is probably what you mean?), this is a little more complicated, but still possible. The operation is non-deterministic, because the possible outcomes (removing an element from each position in the list) cannot be expressed as a single feature structure. This can still be done using unary rules, and I’ve implemented this. A rule which requires non-deterministic computation uses a feature that triggers the unary rules. For example, the following rule (part of a grammar built on the Grammar Matrix and run using the LKB) is a head-comp rule where the complement can be an element of the COMPS list. After the unary rules have finished (recursively) applying, there will be one edge in the chart for each position in the list.

        basic-head-any-comp-phrase := basic-head-comp-phrase &
          [ SYNSEM.LOCAL.CAT.VAL.COMPS #new-comps,
            HEAD-DTR.SYNSEM.LOCAL.CAT.VAL.COMPS #old-comps,
            NON-HEAD-DTR.SYNSEM #synsem,
            ; The unary rules triggered by this feature remove one element
            ; from POP-INPUT at each possible position, binding the removed
            ; element to POP-OUTPUT-ITEM and the remainder to POP-OUTPUT-LIST.
            NONDETERMINISTIC [ POP-INPUT #old-comps,
                               POP-OUTPUT-LIST #new-comps,
                               POP-OUTPUT-ITEM #synsem ] ].

        tl;dr Anything’s possible but it could be difficult or unintuitive!

        • Hi Guy,
          From the little information on the technique that’s given in the talk, I assume that the append type and similar types are an adaptation of the junk slot encoding of relations in description logics / feature logics that came up in the 1980s and was extensively used in HPSG in the early 1990s. Am I correct in assuming that this is now a specific adaptation of this idea to the flavor of feature logic implemented in the LKB?
          More specifically, is there anything that’s crucially different from “classical junk slots”, e.g. the junk slot encoding of append in Aït-Kaci (1984)?

          • Thanks for the reference — I will try to have a look, although I haven’t been able to find it online…

            I’ve had a look at Götz and Meurers (1996), and from what it says there about the junk slot encoding of append, what I’m proposing is different. The junk slot encoding requires disjunctive types, which are not part of the DELPH-IN Joint Reference Formalism.

            I should clarify that this isn’t an *encoding* of an append operation, but an actual calculation, using only unification of typed feature structures.

          • Here’s the complete reference:
            Aït-Kaci, Hassan. 1984. A Lattice Theoretic Approach to Computation Based on a Calculus of Partially Ordered Type Structures. PhD thesis, University of Pennsylvania.
            As far as I recall the discussion of the 1990s, this thesis was generally assumed to be the source of the idea of encoding relations in feature logics. From the beginning, these structures were seen as encodings of computations. There were also manuscripts showing how different kinds of Turing Machines could be encoded in feature / description logics based on such techniques.
            Manfred Sailer and I wrote a joint MA thesis (yes, that was possible at the time) in which we wrote an HPSG linearization grammar with junk slots. It’s been a long time, but I recall that the particulars of the feature logic play an important role in what these structures mean and how close you can get them to true relations.
            For an LKB-style feature logic, it might matter that these structures can be seen as encoding knowledge about relations rather than the relations themselves (just a guess).

          • Sorry, I should have been clearer: I can’t find a copy of Aït-Kaci’s thesis, except behind a paywall.

            Based on Götz and Meurers’s summary, what I’m proposing is different. I have also just had a brief look at Carpenter’s book, and the account of junk slots seems to rely on feature structures being fully sort-resolved. The DELPH-IN formalism doesn’t do this.

            In the abstract of Aït-Kaci’s thesis, it says that it proposes “a model of computation which amounts to solving systems of simultaneous equations in a lattice of types”. I can see that this makes sense when all structures must be fully resolved. I am proposing a model of computation which amounts to unification. Feature structures are typed but can be underspecified (resolution is not necessary). The type hierarchy and type constraints together define the operations. One feature structure holds the operation (formalised as a computation type) and the other feature structure holds the input data. A computation type is always a subtype of a data type. Unifying the two structures gives the output, which is stored in a feature.

            In the example in our talk, one feature structure is simply `list-copy` (this is the computation type), while the other feature structure is a list (this is the input data). Unifying the two structures gives an output, which is stored under two features, NEW-LIST and END-LIST. (For the sake of uniformity so that the output is always under one feature, these paths could be changed to RESULT.LIST and RESULT.LAST, where RESULT is of type `diff-list`.)
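
            For readers who want to see this spelled out, here is a minimal sketch of how such a list-copy type could be set up — my own reconstruction from the description above, so the details may differ from the actual grammar:

            ; The computation type is a subtype of list, with two output features.
            list-copy := list &
              [ NEW-LIST list,   ; the copy of the input list
                END-LIST list ]. ; the (open) final REST of the copy

            ; Empty input: the copy is just the open tail.
            null-copy := list-copy & null &
              [ NEW-LIST #tail,
                END-LIST #tail ].

            ; Non-empty input: copy the first element and recurse on the rest.
            ; Unifying list-copy with cons resolves to cons-copy (their unique
            ; common subtype), so the computation applies recursively down the list.
            cons-copy := list-copy & cons &
              [ FIRST #first,
                REST list-copy & [ NEW-LIST #rest-copy,
                                   END-LIST #tail ],
                NEW-LIST cons & [ FIRST #first,
                                  REST #rest-copy ],
                END-LIST #tail ].

            If I understand the design correctly, append then falls out for free: unify list-copy with the first list and identify its END-LIST with the second list; NEW-LIST then holds the concatenation.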

            Carpenter describes junk slots as providing a “workspace”. With computation types, the type hierarchy itself provides the workspace. The only features are for the input and the output, and no junk slots are necessary.

          • Having now had time to carefully go over Carpenter’s account of list appends using junk slots following Aït-Kaci (1984), I can say with certainty that it is a distinct mechanism.

            Both involve recursive constraints, but they are in different places. With junk slots there is a recursion of the JUNK feature. With computation types, there is recursion in the actual data structure — in the case of list appends, recursion in the list itself. As a computation type is a subtype of a data type, recursion of the data type directly induces recursion of the computation type.
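
            To make the contrast concrete, here is roughly what the classical junk-slot append looks like, rendered in TDL-like notation (my own rendering; as noted above, choosing between the base case and the recursive case requires disjunction or sort resolution, so this is not valid DELPH-IN TDL):

            ; Classical junk-slot append: the recursion lives in the JUNK feature.
            append := *top* &
              [ LIST1 list, LIST2 list, LIST3 list ].

            ; Base case: LIST1 is empty, so LIST3 is just LIST2.
            append-null := append &
              [ LIST1 null, LIST2 #l, LIST3 #l ].

            ; Recursive case: copy the first element and recurse inside JUNK.
            append-cons := append &
              [ LIST1 cons & [ FIRST #f, REST #r1 ],
                LIST2 #l2,
                LIST3 cons & [ FIRST #f, REST #r3 ],
                JUNK append & [ LIST1 #r1, LIST2 #l2, LIST3 #r3 ] ].

            With computation types, by contrast, the recursive constraint sits on the REST of the list itself, so no separate workspace feature is needed.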

  2. Hi! Thank you for the talk!
    I would like to understand what makes you assume that the apparent in situ occurrence of “gde” in one of the last slides is extracted. Is it because it occurs preverbally? Is this an unbounded extraction? (“he where_i thinks [that you work _i]”)

    You also have multiple fronting of n-words (“nobody, nothing”, …) in Russian, right? Would these be handled by the same mechanism? Could you have one fronted wh-adverb and one fronted n-adverb? (“where who never wanted to work?”)

    • Thank you, Manfred, for watching and for the comment!

      (I will just post some data for now and will respond either during the QA or later).

      (1) On GDE dumaet chto ty rabotaesh?
      He where thinks that you work
      “What does he think is the place where you work?” OR “What is the place in which he is doing the thinking about the fact that you are working?”

      (2) ?Gde kto nikogda ne hotel rabotat
      where who never not want to.work
      A question about the set of places and people such that each person never wanted to work in one of the places. A weird question. So the sentence does not sound great at all, but I think maybe it is possible.

      (3) Nikto nikogda nigde nikomu nichego ne sdelal.
      noone never nowhere to.noone nothing not did
      “Nobody has ever done anything to anyone anywhere”.

      Thanks for making me think about (3); there it is very clear that “never” and “nowhere” can go together without any coordination or anything. They sound much better together than “??where when”; in fact, they sound absolutely fine. But is that fronting? Maybe, although, unlike wh-words, no-words can very naturally be sentence-final, for example.

      See you at the QA!

    • I did not have time to address this during the QA, so I will post some answers here.

      (1) ty gde rabotaesh?
      2sg where work
      "Where do you work?" [rus]

      What makes me think *where* in (1) may not be in situ is basically the following: (i) analyzing it as in situ, and therefore allowing head-final declarative rules to take wh-words, results in ambiguity which then needs to be fixed with additional features, and there are always new problems that I keep finding, which seem to require more and more patches; (ii) the original G&S analysis, even though it was for English, was intended to generalize to other languages, and for that analysis I think it is generally important that wh-words do not show up in the non-head daughter positions of declarative head-final rules; (iii) the minimalist literature insists that there is no such thing as optional fronting (see Bošković 2002, Stjepanović 2000, Mišmaš 2015 and Bailyn 2015, inter alia).

      That is from the formal literature perspective. As for the data, in (1) the 2SG pronoun is most certainly a topic, so I could easily see how it would make sense to topicalize it with an appropriate filler-head rule. But like I said, this is a topic for future exploration for me :)

      • I also do not think that the no-words are extracted and fronted:

        (2) *Nikto nikomu Ivan dumaet chto ne otdal deneg
        noone to.noone Ivan thinks that not gave money
        Intended: “Ivan thinks that nobody gave money back to anyone”

        So, in (3), this must just be Russian flexible word order.

  3. Hi, many thanks for an interesting talk!

    As to adjunct extraction: I claimed once (in my first paper ever, in 1994) that Polish has exactly the same constraint, i.e. that it does not allow multiple adjunct extraction. But I don’t think this is the right characterisation of the data – I think the constraint is on the inherent character of the extracted phrases, not on their grammatical function status. This can be seen when one looks at verbs which take “adjunct-like” arguments, e.g. manner phrases (as in the case of verbs such as BEHAVE, WORD or TREAT) or locative phrases (as in RESIDE). When one extracts an argument like that together with a true adjunct (assuming one believes in the argument/adjunct distinction – I don’t), the result is also downgraded, e.g. (in Polish; RM = reflexive marker):

    ?? Kiedy jak się zachowywał? (Polish)
    When how RM behaved.3SG.M
    ‘How has he behaved when?’

    The other thing is that, whatever the nature of this constraint, it is not a constraint on multiple extraction, but on multiple filler realisation. When you extract multiple adjuncts and express them via a single coordinated filler, then the result is grammatical, also in Russian, I believe:

    (4′) Kogda i gde my kupili… (Russian, based on your (4))
    when and where we bought…

    So it would be nice to see how your analysis extends to such constructions, and how this compares to the earlier HPSG analysis of such facts by Chaves and Paperno 2007 (and then how to reconcile this analysis with the fact that such lexico-semantic coordination – what they call hybrid coordination – is possible not only with extracted wh-elements, but also with in situ n-words and other series of pronominal quantifiers).

    Anyway, apologies for the lengthy comment and many thanks for this talk! (Also for the technical details about list append without relations or difference lists!)

    • Thank you for watching, Adam, and for the comment!

      I will just post some data now, or rather, I will confirm (4′): (4′) is very natural in Russian.

      I also wanted to post this example which I came across on the web:

      (0) Ya ne znaju, gde kogda stavit zapyatye
      I not know where when put commas
      “I don’t know where and when to put commas”

      So, “where when” does exist, actually, at least judging from that example. But I think it is rare, and furthermore, I think in (0), the person does not actually mean “where and when” literally; I think they mean they do not know the rules of punctuation, so, it is more like “I don’t know how to use commas” kind of thing. There is also pseudocoordination sometimes (augmentative?)…

  4. Very interesting talk. So if I understood correctly, the DELPH-IN English grammar does not handle multiple FGDs, such as:

    a. [Which problem]_j don’t you know [who]_i to talk to _i about _j?
    b. Robin is someone [who]_j I never know [what]_i to say _i to _j.

    Is this the type of data you mentioned you tested the English version of your rule / append on?

    Second question: do you think that your account could scale to what Denis Paperno and I called Russian “hybrid coordination”?
    https://web.stanford.edu/group/cslipublications/cslipublications/HPSG/2007/chaves-paperno.pdf
    This is of course present in a number of Slavic languages, and it seems to be restricted in unusual ways, but I was wondering whether you’d need a completely different mechanism, or whether you could perhaps just extend what you have.

    • There are two DELPH-IN English grammars worth mentioning here — the English Resource Grammar (ERG), and a smaller English grammar available in the Grammar Matrix. As far as I’m aware, neither deals with multiple gaps like the examples you’ve given. The ERG doesn’t use append-lists, and I don’t think the Grammar Matrix covers enough phenomena for these examples to be parsed out of the box (although the Grammar Matrix is constantly expanding, and perhaps Olga will correct me on this!).

      I think you’re referring to the Grammar-Matrix-derived grammar, which I modified to test list appends? This was to test that the appends were working as I expected them to, and I was mainly looking at the semantic lists RELS, HCONS, and ICONS (which are simpler than the SLASH lists because you never take anything off). Olga then incorporated append lists into the Grammar Matrix itself. English is also one of the languages in her test suite for constituent questions, and perhaps she can say more about which phenomena are included and what is currently covered.

      A different mechanism wouldn’t be needed for hybrid coordination, but that doesn’t mean I think it would be easy! Coordination is already complicated, and I think the challenge would be a linguistic one (capturing the restrictions you mention) rather than a technical one (modifying our filler-gap rule so that multiple elements are taken off the SLASH list).
