Corpora and Syntax

Authors: Detmar Meurers and Stefan Müller

Key Words: Corpus, Syntax, Annotation, Extraposition, Subjacency, Particle Verb, Fronting

This paper appeared in 2009 in Anke Lüdeling and Merja Kytö (eds.): Corpus Linguistics. An International Handbook, Handbücher zur Sprach- und Kommunikationswissenschaft, volume 29.2, Berlin: Mouton de Gruyter Verlag, pages 920–933.

Syntactic analysis connects empirical observations about language with theoretical generalizations and explanations. Depending on the perspective of the framework or individual researcher, syntactic research has emphasized the empirical or the theoretical aspect of the enterprise; but independent of the philosophical dispute between empiricism and rationalism about the nature of the connection between data and knowledge (cf., e.g., Markie, 2004), it is clear that neither aspect exists entirely without the other: observation of data is shaped by prior experience and current research questions, and data is needed for establishing or falsifying a theory. Leaving the philosophical dispute aside, we can thus ask how one can obtain data that is relevant for a particular theoretical issue. We address this question in this article by discussing how electronic corpora can be used in support of the creation and falsification of syntactic theories.

The paper provides three linguistic case studies exemplifying the use of a treebank for syntactic research. We discuss three phenomena of general interest for the architecture of grammar and show that a thorough empirical base is important for defining basic terms (particle verb, Satzglied), for constructing new linguistic analyses, and for constructing arguments to support or refute existing theories. We focus on the question of how to find the relevant data in corpora and organize the discussion by the increasing complexity of the query needed to obtain the desired types of examples.

The phenomena are:

  • Extraposition from NPs and Subjacency
  • Possible positions of the particles of particle verbs
  • Fronting as a constituent test

Draft of November 09, 2013:

See also Introspection vs. Corpus.