The following tagset was used
to tag PICLE and other corpora.
The accuracy of tag assignment may vary, especially for the
features, but the major wordclasses have been largely verified.
The full version of the TOSCA-ICLE Tagging Manual (from which the
following is an excerpt) may be requested by e-mail (supplied at my
discretion).
Przemysław Kaszubski
December 02, 2003
---------------------
TOSCA-ICLE Tagset
I. FULL INVENTORY OF TOSCA-ICLE TAGS
Most tags take the form WORDCLASS(feature1,feature2,...).
Features are described in more detail below.
The following list of 220 possible tag combinations does not contain ditto tags. In principle all multi-token units can be discontinuous. For this reason the feature ‘disc’ can occur in all ditto tags. As ditto tags do not occur in the list, tags with the discontinuity feature are absent, too.
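The WORDCLASS(feature1,feature2,...) format described above can be split mechanically into its parts. The following is a minimal sketch in Python, not part of the TOSCA-ICLE tools: the function name parse_tag and the regular expression are illustrative assumptions, and the pattern assumes an upper-case wordclass label with an optional bracketed feature list. Ditto tags and annotated lines (such as the UNTAG entry with its comment) are not handled.

```python
import re

# Assumed pattern (hypothetical, not from the manual): an upper-case wordclass
# label, optionally followed by a bracketed, comma-separated feature list,
# e.g. "ADJ(ge,pos)" or "GENM".
TAG_RE = re.compile(r"^([A-Z]+)(?:\(([^()]*)\))?$")


def parse_tag(tag):
    """Split a tag string into (wordclass, list_of_features)."""
    match = TAG_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"not a well-formed tag: {tag!r}")
    wordclass, feature_str = match.groups()
    # Tags without brackets (EXTHERE, GENM, MARKUP) carry no features.
    features = [f.strip() for f in feature_str.split(",")] if feature_str else []
    return wordclass, features
```

For example, parse_tag("VB(aux,do,pres,neg)") yields the wordclass VB with the features aux, do, pres and neg, while a tag with an unbalanced bracket raises ValueError, which makes the sketch usable as a quick well-formedness check on tagged output.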
1. Adjective
ADJ(ge,comp)
ADJ(ge,pos,edp)
ADJ(ge,pos,ingp)
ADJ(ge,pos)
ADJ(ge,sup)
ADJ(ord)
ADJ(ord,nomplu)
2. Adverb
ADV(connec)
ADV(ge,pos)
ADV(ge,comp)
ADV(ge,sup)
ADV(neg)
ADV(phras)
ADV(wh)
3. Article
ART(def)
ART(indef)
4. Conjunction
CONJUNC(coord)
CONJUNC(subord)
5. Existential 'There'
EXTHERE
6. Genitive marker
GENM
7. Miscellaneous
MISC(discourse)
MISC(foreign)
MISC(interjec)
MISC(suffix)
MISC(prefix)
8. Noun
N(plu,collect)
N(sing,collect)
N(number)
N(plu)
N(sing)
9. Nominal adjective
NADJ(comp)
NADJ(pos,edp)
NADJ(pos,ingp)
NADJ(pos)
NADJ(sup)
10. Numeral
NUM(card,plu)
NUM(card,sing)
NUM(frac,plu)
NUM(frac,sing)
NUM(hyph,plu)
NUM(hyph,sing)
NUM(mult)
NUM(ord,plu)
NUM(ord,sing)
11. Preposition
PREP(ge)
PREP(phras)
12. Proform
PROFM(conj)
PROFM(one,sing)
PROFM(one,plu)
PROFM(so,phrase)
PROFM(so,clause)
13. Pronoun
PRON(antit)
PRON(antit,procl)
PRON(ass)
PRON(cleft)
PRON(cleft,procl)
PRON(dem,number)
PRON(dem,plu)
PRON(dem,sing)
PRON(exclam)
PRON(inter)
PRON(inter,poss)
PRON(neg)
PRON(nonass)
PRON(nomposs,number)
PRON(nomposs,plu)
PRON(nomposs,sing)
PRON(one)
PRON(pers,number)
PRON(pers,plu)
PRON(pers,plu,encl)
PRON(pers,sing)
PRON(pers,sing,procl)
PRON(poss,number)
PRON(poss,plu)
PRON(poss,sing)
PRON(quant)
PRON(recip)
PRON(rel)
PRON(rel,poss)
PRON(self,plu)
PRON(self,sing)
PRON(such)
PRON(univ)
14. Particle
PRTCL(for)
PRTCL(to)
PRTCL(with)
15. Punctuation
PUNC(cbrack)
PUNC(colon)
PUNC(comma)
PUNC(cquo)
PUNC(dash)
PUNC(ellip)
PUNC(exm)
PUNC(obrack)
PUNC(oquo)
PUNC(other)
PUNC(per)
PUNC(qm)
PUNC(scolon)
16. Verb
16a. Auxiliary
VB(aux,do,imp)
VB(aux,do,imp,neg)
VB(aux,do,past)
VB(aux,do,past,neg)
VB(aux,do,pres)
VB(aux,do,pres,encl)
VB(aux,do,pres,neg)
VB(aux,do,pres,procl)
VB(aux,modal,edp)
VB(aux,modal,infin)
VB(aux,modal,ingp)
VB(aux,modal,past)
VB(aux,modal,past,encl)
VB(aux,modal,past,neg)
VB(aux,modal,pres)
VB(aux,modal,pres,encl)
VB(aux,modal,pres,neg)
VB(aux,modal,subjun)
VB(aux,modal,subjun,neg)
VB(aux,pass,edp)
VB(aux,pass,imp)
VB(aux,pass,infin)
VB(aux,pass,ingp)
VB(aux,pass,past)
VB(aux,pass,past,neg)
VB(aux,pass,pres)
VB(aux,pass,pres,encl)
VB(aux,pass,pres,neg)
VB(aux,pass,subjun)
VB(aux,pass,subjun,neg)
VB(aux,perf,edp)
VB(aux,perf,infin)
VB(aux,perf,infin,encl)
VB(aux,perf,ingp)
VB(aux,perf,past)
VB(aux,perf,past,encl)
VB(aux,perf,past,neg)
VB(aux,perf,pres)
VB(aux,perf,pres,encl)
VB(aux,perf,pres,neg)
VB(aux,perf,subjun)
VB(aux,prog,edp)
VB(aux,prog,infin)
VB(aux,prog,past)
VB(aux,prog,past,neg)
VB(aux,prog,pres)
VB(aux,prog,pres,encl)
VB(aux,prog,pres,neg)
VB(aux,prog,subjun)
VB(aux,prog,subjun,neg)
VB(aux,semi,edp)
VB(aux,semi,imp)
VB(aux,semi,infin)
VB(aux,semi,ingp)
VB(aux,semi,past)
VB(aux,semi,past,neg)
VB(aux,semi,pres)
VB(aux,semi,pres,ellipt)
VB(aux,semi,pres,encl)
VB(aux,semi,pres,neg)
VB(aux,semi,subjun)
VB(aux,semip,edp)
VB(aux,semip,imp)
VB(aux,semip,infin)
VB(aux,semip,ingp)
VB(aux,semip,past)
VB(aux,semip,pres)
VB(aux,semip,subjun)
16b. Lexical verb
VB(lex,cop,edp)
VB(lex,cop,imp)
VB(lex,cop,infin)
VB(lex,cop,ingp)
VB(lex,cop,past)
VB(lex,cop,past,neg)
VB(lex,cop,pres)
VB(lex,cop,pres,encl)
VB(lex,cop,pres,neg)
VB(lex,cop,subjun)
VB(lex,cop,subjun,neg)
VB(lex,cxtr,edp)
VB(lex,cxtr,imp)
VB(lex,cxtr,infin)
VB(lex,cxtr,ingp)
VB(lex,cxtr,past)
VB(lex,cxtr,pres)
VB(lex,cxtr,subjun)
VB(lex,ditr,edp)
VB(lex,ditr,imp)
VB(lex,ditr,infin)
VB(lex,ditr,ingp)
VB(lex,ditr,past)
VB(lex,ditr,pres)
VB(lex,ditr,subjun)
VB(lex,dimontr,edp)
VB(lex,dimontr,imp)
VB(lex,dimontr,infin)
VB(lex,dimontr,ingp)
VB(lex,dimontr,past)
VB(lex,dimontr,pres)
VB(lex,dimontr,subjun)
VB(lex,intr,edp)
VB(lex,intr,imp)
VB(lex,intr,infin)
VB(lex,intr,ingp)
VB(lex,intr,past)
VB(lex,intr,past,neg)
VB(lex,intr,pres)
VB(lex,intr,pres,encl)
VB(lex,intr,pres,neg)
VB(lex,intr,subjun)
VB(lex,montr,edp)
VB(lex,montr,imp)
VB(lex,montr,infin)
VB(lex,montr,ingp)
VB(lex,montr,past)
VB(lex,montr,past,encl)
VB(lex,montr,past,neg)
VB(lex,montr,pres)
VB(lex,montr,pres,encl)
VB(lex,montr,pres,neg)
VB(lex,montr,subjun)
17. Tags for extra-textual material
MARKUP
UNTAG [!used by me to tag nominal gerunds, PK!]
Features (90) appear in brackets after the wordclass label, in various sequences (see the full list of tags above). Each entry below gives the feature label, its meaning, and the wordclass(es) it combines with.
antit anticipatory it PRON
ass assertive PRON
aux auxiliary VB
card cardinal NUM
cbrack closing bracket PUNC
clause clause PROFM
cleft cleft it PRON
collect collective N
colon colon PUNC
comma comma PUNC
comp comparative ADJ; ADV; NADJ
conj conjoin PROFM
connec connective ADV
coord coordinating CONJUNC
cop copula VB
cquo closing quote PUNC
cxtr complex transitive VB
dash dash PUNC
def definite ART
dem demonstrative PRON
dimontr dimonotransitive VB
discourse discourse MISC
ditr ditransitive VB
do do VB
edp -ed participle ADJ; NADJ; VB
ellip ellipsis PUNC
ellipt elliptical VB
encl enclitic PRON; VB
exclam exclamatory PRON
exm exclamation mark PUNC
for particle for PRTCL
foreign foreign MISC
frac fractional NUM
ge general ADJ; ADV; PREP
hyph hyphenated NUM
imp imperative VB
indef indefinite ART
infin infinitive VB
ingp -ing participle ADJ; NADJ; VB
inter interrogative PRON
interjec interjection MISC
intr intransitive VB
lex lexical VB
modal modal VB
montr monotransitive VB
mult multiplicative NUM
neg negative ADV; PRON; VB
nomplu plural nominal ADJ
nomposs nominal possessive PRON
nonass non-assertive PRON
number number N; PRON
obrack opening bracket PUNC
one one PROFM; PRON
oquo opening quote PUNC
ord ordinal ADJ; NUM
other other PUNC
pass passive voice VB
past past tense VB
per period PUNC
perf perfective aspect VB
pers personal PRON
phras phrasal ADV; PREP
phrase phrase PROFM
plu plural N; NUM; PROFM; PRON
pos positive ADJ; ADV; NADJ
poss possessive PRON
prefix prefix MISC
pres present tense VB
procl proclitic PRON; VB
prog progressive aspect VB
qm question mark PUNC
quant quantifying PRON
recip reciprocal PRON
rel relative PRON
scolon semicolon PUNC
self -self / -selves PRON
semi semi VB
semip semi followed by -ing participle VB
sing singular N; NUM; PROFM; PRON
so so PROFM
subjun subjunctive VB
subord subordinating CONJUNC
such such PRON
suffix suffix MISC
sup superlative ADJ; ADV; NADJ
to to PRTCL
univ universal PRON
wh wh- ADV
with with PRTCL
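The feature list above pairs each feature with the wordclasses it can attach to, so it can serve as a lookup table for sanity-checking tags. The following is a minimal sketch in Python, assuming a tag has already been split into a wordclass and a list of features. The names FEATURE_CLASSES and unexpected_features are hypothetical, and the table holds only a small excerpt of the list, not all 90 features.

```python
# Excerpt of the feature list above: feature -> wordclasses it may attach to.
# Only a handful of entries are included here for illustration.
FEATURE_CLASSES = {
    "antit": {"PRON"},
    "aux": {"VB"},
    "comp": {"ADJ", "ADV", "NADJ"},
    "edp": {"ADJ", "NADJ", "VB"},
    "ge": {"ADJ", "ADV", "PREP"},
    "modal": {"VB"},
    "neg": {"ADV", "PRON", "VB"},
    "past": {"VB"},
    "pos": {"ADJ", "ADV", "NADJ"},
}


def unexpected_features(wordclass, features):
    """Return the features that the excerpt does not list for this wordclass.

    Features missing from the excerpt are also reported, so a complete
    table would be needed before treating the result as an error list.
    """
    return [f for f in features
            if wordclass not in FEATURE_CLASSES.get(f, set())]
```

With this excerpt, unexpected_features("ADJ", ["ge", "pos", "edp"]) returns an empty list, whereas unexpected_features("N", ["ge"]) returns ["ge"], since the list does not give N as a wordclass for ge.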