Corpora and norms: methodological problems


Truus De Wilde

Ghent University


Not all facts encountered in a language can be explained by the rules of that language. For example, there is no rule for the formation of Polish diminutives; they are formed according to "tradition", and no firm rules fix which suffix is used with which stem. Another example is word formation in Dutch. Compounding is a productive procedure, and in Dutch (as in German) compounds can easily be made ad hoc. It would be very difficult to state rules that make explicit which compounds are possible and which are not, and even more difficult to state when they can and cannot be formed.


Simply saying that a diminutive or a compound is made according to "tradition" is, however, vague and unsatisfying. There is a need for a category of linguistic occurrences that fall between the systematic, abstract Saussurean langue and the concrete, possibly asystematic parole: a category containing facts that are simply 'normal', i.e. neither strictly rule-governed nor purely ad hoc. The gap between langue and parole could be filled with the category 'norm'. My research aims to reintroduce this category into linguistics, drawing on the theory of Eugenio Coseriu, and to refine it.


My research also aims to show the relevance of norms for current linguistic research and to find out how Polish second-language learners use the principles of compounding in Dutch. Corpora will be a main instrument in this research. In this presentation I would like to clarify how corpus linguistics can be a crucial step towards norm linguistics, and what shortcomings corpora have.


I argue that corpora help to investigate what norms are, since 'normal' occurrences are obviously frequent. Frequency is easily traced in corpora: the frequency of compounds in a corpus of native speech can, for example, be compared with that in a corpus of second-language learners' speech. However, frequency is not the only parameter for active norms, and this is where the problem of working exclusively with corpora arises. It is also important to measure creativity in compounding by native speakers and second-language learners. Finding out whether second-language learners also apply norms in new contexts, and whether they do so in a way similar to native speakers, is crucial to evaluating norm usage. As will be shown, corpora may fail to provide this information.




Albrecht, Jörn et al. (eds.) (1988). Energeia und Ergon. Sprachliche Variation – Sprachgeschichte – Sprachtypologie. Studia in honorem Eugenio Coseriu. Tübingen: Narr.


Altenberg, Bengt and Sylviane Granger (eds.) (2002). Lexis in contrast: corpus-based approaches. Amsterdam: Benjamins.


Booij, G.E. and A. van Santen (1998). Morfologie. De woordstructuur van het Nederlands. Amsterdam: University Press.


Coseriu, Eugenio (1975). Sprachtheorie und allgemeine Sprachwissenschaft. 5 Studien. München: Wilhelm Fink.


Granger, Sylviane et al. (eds.) (2003). Corpus-based approaches to contrastive linguistics and translation studies. Amsterdam: Rodopi.


Grzegorczykowa, Renata, R. Laskowski and H. Wróbel (eds.) (1998). Morfologia. Warszawa: PWN.


Gumperz, John and Dell Hymes (eds.) (1972). Directions in sociolinguistics. The ethnography of communication. New York: Holt, Rinehart and Winston.


Klimaszewska, Zofia (1983). Diminutive und augmentative Ausdrucksmöglichkeiten des Niederländischen, Deutschen und Polnischen. Eine konfrontative Darstellung. Wrocław: Zakład Narodowy imienia Ossolińskich.


Morciniec, Norbert (1964). Die nominalen Wortzusammensetzungen in den westgermanischen Sprachen. Wrocław.


Naumann, Bernd (2000). Einführung in die Wortbildungslehre des Deutschen. Tübingen: Niemeyer.


Myers-Scotton, Carol (ed.) (1998). Codes and consequences. Choosing linguistic varieties. New York, Oxford: Oxford University Press.


Pounder, Amanda (2000). Processes and Paradigms in Word-Formation Morphology. Berlin, New York: Mouton de Gruyter.


Štekauer, Pavol (1998). An onomasiological theory of English word-formation. Amsterdam: Benjamins.


Silverstein, Michael (2003). Indexical order and the dialectics of sociolinguistic life. Language and Communication 23, 193-229.

