The English corpora
Variable 1: native/non-native | Variable 2: Learner/non-learner | Variable 3: (Presumed) proficiency level (label) | Brief description | Corpus label(s) used | Words (tokens) in corpus* |
Non-native English | ‘apprentice’ corpus | 1. Interm. | Polish intermediate EFL | | 92,712 |
Non-native English | ‘apprentice’ corpus | 2. Upp-Int. | Spanish (upper-) intermediate EFL | | 94,965 |
Non-native English | ‘apprentice’ corpus | 3. Advanced | Belgian-French advanced EFL | | 101,442 |
Non-native English | ‘apprentice’ corpus | 3. Advanced | Polish advanced EFL | | 107,990 |
Native English | ‘apprentice’ corpus | 4. College | British and American college learner English | | 106,255 |
Native English | ‘expert’ corpus | 5. Professional | British academic writing | | 97,914 |
Native English | ‘expert’ corpus | 5. Professional | British and American quality press | | 94,421 |
The native Polish corpora
Corpus label | Variable 2: Learner/non-learner | Variable 3: Proficiency | Brief description | Tokens |
POL-STUD | ‘apprentice’ corpus | college | college compositions | 103,382 |
POL-EXP | ‘expert’ corpus | professional | academic papers + quality-press articles | 101,348 |
* Counts were taken with the WordList facility, part of the WordSmith Tools 3.0 package. The tool was set to count hyphenated words as one word.
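The counting convention in the note above can be illustrated with a minimal sketch. It is not WordSmith's own tokenizer: the regular expression, the function names `count_tokens` and `count_tokens_split`, and the sample sentence are assumptions introduced here only to show how treating hyphenated words as single tokens changes a count.

```python
import re

# Assumption: a token is a run of letters or digits, optionally joined by
# internal hyphens. This only approximates the convention in the note and
# does not reproduce WordSmith's WordList behaviour.
HYPHEN_JOINED = re.compile(r"[^\W_]+(?:-[^\W_]+)*")
HYPHEN_SPLIT = re.compile(r"[^\W_]+")


def count_tokens(text: str) -> int:
    """Count tokens, treating a hyphenated word as one token."""
    return len(HYPHEN_JOINED.findall(text))


def count_tokens_split(text: str) -> int:
    """Count tokens, treating each part of a hyphenated word separately."""
    return len(HYPHEN_SPLIT.findall(text))


sample = "A well-known state-of-the-art method is error-free."
print(count_tokens(sample))        # 6
print(count_tokens_split(sample))  # 11
```

The gap between the two counts shows why the hyphenation setting has to be reported alongside corpus sizes: the same text yields different token totals depending on the convention.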