Deliberate mispronunciation in EFL e-dictionaries: integrating PDI with TTS
Włodzimierz Sobkowiak
PLM'2007
Contents:
1. TTS for EFL?
(a) feasible?
(b) examples
(c) simulating a foreign accent with TTS?
2. PDI – predicted difficulty
3. The 20 words – perceived difficulty
4. The 20 words – attested difficulty
5. PDI TTS in EFL MRDs
6. Synthesizing examples/definitions?
Abstract
The Phonetic Difficulty Index (PDI) is a quantitative and qualitative measure of how difficult a word is to pronounce for learners of a given L2 with a given L1. Specifically, in its current implementation (see http://ifa.amu.edu.pl/~swlodek/public.htm for bibliography), it assigns Polglish-sensitive tags to an English word-list or text: a numerical score (0-10 range) and difficulty codes (57 pronouncing problems). Applications of the current version of PDI range from evaluating pedagogical materials, such as texts, word-lists and dictionaries, in terms of phonetic difficulty, to generating word-lists that meet user-specified phonetic criteria for teaching, learning, testing and materials preparation.
One application of PDI which has not so far been considered is modeling learners' pronunciation of English lexical items by deliberately mispronouncing e-dictionary entries in ways characteristic of the given L1, in this case Polish, or, more accurately, Polglish, i.e. the Polish-English interlanguage of Polish learners of English as a foreign language (EFL). The rationale of this project is as follows. EFL learners often have problems perceiving the phonetic difference between their 'accented' pronunciation of a given lexical item and the native speaker model. The technique offered by contemporary e-dictionaries, which lets learners record their own pronunciation and compare it audially or visually with the recorded native model, may not work in this situation. Demonstrating an actual Polglish mispronunciation of the word alongside the correct native version, spoken in the same voice and keeping all the other phonetic variables constant, might be more useful. This has not been feasible so far in e-dictionaries: no professional native English speaker could be expected to persuasively mimic Polglish mispronunciation, not to mention the cost of such a procedure.
With PDI and Text-to-Speech synthesis (TTS) we have the two key technologies to make such believable mispronunciations possible. For each lexical entry in an e-dictionary, PDI identifies the expected Polglish mispronunciations, generates a mispronounced phonetic representation in orthographic or transcriptional form, and passes it on to the TTS module for conversion into audio. Both the model and the mispronunciation can then be produced as audio on the fly, with no need for prior recording with human speakers. The exact mispronunciation can be controlled down to minute phonetic detail to suit the proficiency level and phonetic idiosyncrasies of the user (as constructed by the user-modeling component of the dictionary) or the pedagogical agenda of the learner/teacher (for example, the amount of final obstruent voicing in English can be exaggerated).
1. TTS for EFL?
"Not only is (top-quality) synthesized speech intelligible and natural, but it can also actually function as a model of pronunciation. For example, Filoglossia, a CALL package with (modern) Greek as a foreign language, already employs TTS synthesis: http://www.ilsp.gr/filoglossia_plus_eng.html, and WordPilot from http://www.compulang.com/, also has this feature" (Sobkowiak 2003).
"ScanSoft's RealSpeak™ Word uses a ground-breaking approach to text-to-speech to achieve superb quality speech output from a dictionary of words and idioms, allowing language learners to hear how words should be accurately pronounced" (http://www.nuance.com/realspeak/word/).
"TTS applications can render many benefits to EFL students while making teachers job easier. I have found that my students have improved their pronunciation since I started using them in my classes, not to mention that they have become more autonomous" (González 2007).
Examples of ScanSoft and IvoSoftware synthesis (web downloads):
My name is Radek. I welcome all present in hall C-1 at professor Sobkoviak's lecture. My task is convincing you about the high level of * speech synthesis.
"My name is Radek" from ScanSoft (http://www.scansoft.com/realspeak/demo)
"My name is Radek" from IvoSoftware (http://www.ivosoftware.com/ivonaonline.php)
"Nazywam się Radek" from ScanSoft (http://www.scansoft.com/realspeak/demo)
"Nazywam się Radek" from IvoSoftware (http://www.ivosoftware.com/ivonaonline.php)
"Simulating a foreign accent of English by computer for didactic purposes is not a new idea. In 1997 Hyouk-Keun Kim created his Korean Accented English Pronunciation Simulator (http://english-korean.net/kaeps/index.html), rightly noticing that "Most adult ESL/EFL learners [...] do not recognize the problems of their English pronunciation", and that it might be a good idea to demonstrate these under computer control. Eventually a rule-based KAEPS system was set up, simulating "three types of English pronunciations in the IPA symbols: 1) a phoneme-based English pronunciation, 2) a desirable allophone-based American English pronunciation, and 3) a possible Korean accented English pronunciation". While Kim's system has never advanced beyond accented graphemic (i.e. IPA) representation, it would be easy enough to attach the IPA-to-speech engine to it. After all, most TTS systems use phonetic transcription at some stage of the synthesis process [...] An L1-sensitive TTS system would be able to dynamically adjust its parameters to realistically simulate spoken Polglish at these various stages of proficiency" (Sobkowiak 2003).
Proviso: "Deliberate mispronunciation" is ambiguous:
· There are 798 Google hits with this phrase, two <in title>: my paper and the following: "Try saying words in a way which will help you remember the way they are spelt. E.g. Wednesday say as Wed-nes-day, friend say as fry-end, people say as pea-op-le. If there is a word you always struggle with, try this method!" (http://www.school-portal.co.uk/GroupDownloadFile.asp?GroupId=81173&ResourceId=454160).
· Gahmen - Deliberate mispronunciation of the word "government". Used as a substitute for the actual word especially when criticising the government in written form to prevent possible sanctions against the author (http://en.wikipedia.org/wiki/Singlish_vocabulary).
· Verbage - /ver'b*j/ n. A deliberate misspelling and mispronunciation of {verbiage} that assimilates it to the word 'garbage'. More pejorative than 'verbiage' (http://www.anvari.org/fortune/Fortune_Big_T/2528_verbage-ver-b-j-n-a-deliberate-misspelling-and-mispronunciation-of-verbiage-that-assimilates-it-to-the-word-garbage.html).
2. PDI – predicted difficulty
The Phonetic Difficulty Index (PDI) is a quantitative and qualitative measure of how difficult a word is to pronounce for learners of a given L2 with a given L1. Specifically, in its current implementation (see http://ifa.amu.edu.pl/~swlodek/public.htm for bibliography), it assigns Polglish-sensitive tags to an English word-list or text: a numerical score (0-10 range) and difficulty codes (57 pronouncing problems).
Table 1. Example PDI codes with likely Polglish errors
phonetic difficulty code (PDI code) | likely Polglish error
b – <ur> in word | schwa, r?
s – <age_> in stem and not eɪdʒ_ | eɪdʒ
w – <ey_> in stem and not eɪ_ | eɪ
B – eə | j breaking, smoothing, schwa
E – ʌ | Polish a
J – short schwa | schwa quality
L – voiced apico-dental | d, z, v
N – final voiced obstruent | devoicing
O – pre-voiced dɪs- or mɪs- | z
Q – vowel overnasalization | Polish-like fully nasal vowels
V – glottal fricative h | Polish velar fricative x
X – word-final syllabic sonorants | schwa insertion
2 – more than 5 syllables | stress and articulation problems
7 – <ary_>/<ory_>/<ery_> in bisyllabic-plus stems | stress, vowel quality
9 – proper noun | graphophonemically irregular
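A few of the codes in Table 1 can be illustrated with a minimal Python sketch. It is not the PDI implementation, whose rules are not spelled out here: the function name `pdi_codes`, the simplified detection rules and the broad transcriptions supplied by hand are all assumptions made for illustration only.

```python
# Illustrative sketch only: five Table 1 codes detected with simplified
# rules. The function name, the rules and the hand-supplied broad
# transcriptions are assumptions, not the actual PDI implementation.

VOICED_OBSTRUENTS = set("bdgvzðʒ")


def pdi_codes(spelling: str, ipa: str) -> list[str]:
    """Return the Table 1 codes that these simplified rules detect."""
    codes = []
    if "ur" in spelling.lower():              # b: <ur> in word
        codes.append("b")
    if "ʌ" in ipa:                            # E: the vowel ʌ
        codes.append("E")
    if "ð" in ipa:                            # L: voiced apico-dental
        codes.append("L")
    if ipa and ipa[-1] in VOICED_OBSTRUENTS:  # N: final voiced obstruent
        codes.append("N")
    if "h" in ipa:                            # V: glottal fricative h
        codes.append("V")
    return codes


if __name__ == "__main__":
    samples = [("mother", "mʌðə"), ("survive", "səvaɪv"), ("taxi", "tæksi")]
    for spelling, ipa in samples:
        print(spelling, pdi_codes(spelling, ipa))
```

Orthographic codes such as b need only the spelling, while codes such as N or V need a transcription, which is why the sketch takes both as input.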
3. The 20 words – perceived difficulty (Sobkowiak 2000)
In a study originally written in 2000 and published on the web (http://ifa.amu.edu.pl/~swlodek/diffind2.pdf), 208 Polish students of English philology filled in a questionnaire concerning the perceived phonetic difficulty of twenty English words stratified on two dimensions: (a) a-priori rule-based assessment of phonetic difficulty and (b) word frequency rank. The words were, alphabetically: almost, appear, author, awkward, belief, carry, coloured, debate, defect, dissolve, kingdom, mother, oblige, relax, server, southern, survive, taxi, tired, youngster. A two-way ANOVA confirmed the significance of both main effects and their synergetic interaction, i.e. the perceived difficulty rating was affected by both the word's rule-based difficulty index and its frequency independently, as well as by their product.
4. The 20 words – attested difficulty (Sobkowiak and Ferlacka 2006)
In a recent study, Sobkowiak and Ferlacka (2006) tried to "calibrate the Phonetic Difficulty Index" empirically. The twenty English words of Sobkowiak 2000 were read in carrier sentences by 38 Polish learners of English aged 17-18. The sentences were definitions taken from the Macmillan English Dictionary for Advanced Learners, first edition (MEDAL1). A total of 617 word-readings yielded 1211 errors, for a grand mean of 1.96 phonetic errors per reader per word. The primary aim of that experiment was to verify empirically the intuitively arrived at lexico-phonetic difficulty judgements encapsulated in the PDI. Predictably, the PDI phonolapsological intuitions turned out to be taken from the academic EFL context, and as such showed little correlation with the actual errors made by Polish schoolchildren.
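The grand mean quoted above can be checked with a few lines of Python. The variable names are mine; the figures are those reported in the text. The count of 760 is simply what 38 readers times 20 words would give if every reader had read every word, which the 617 attested readings show was not the case.

```python
# Arithmetic check of the figures reported for the 2006 study.
readers, words = 38, 20
readings, errors = 617, 1211

possible_readings = readers * words          # upper bound if nothing were missing
mean_errors = errors / readings              # errors per attested word-reading
coverage = readings / possible_readings      # share of possible readings attested

print(possible_readings, round(mean_errors, 2), round(coverage, 2))
```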
A sample of 5 sentences/definitions from one learner (keywords bolded):
· Melt - if you melt into or against someone you relax as they hold you close in a romantic way.
· Hail - to signal a taxi or bus so that it stops for you.
· Defect - a fault in someone or something.
5. PDI TTS in EFL MRDs
Speech synthesis mechanisms can be tweaked to produce human-sounding audio output of an arbitrary phonemic/allophonic string, including deliberate mispronunciations illustrating selected interlanguage features. These can then be offered to the EFL e-dictionary user, suitably adjusted to their needs and wants. In Table 2 two mispronunciation versions are given, one containing the PDI-predicted error(s), the other showing the most common of the errors actually attested in the Sobkowiak & Ferlacka (2006) study (the actual transcription coding was made by Ferlacka). I am grateful to Mr. Dawid Pietrala for tweaking the Festival speech synthesis system to obtain phonolapsologically accented 'Polglish' speech. All sounds in the 'error' columns are to be interpreted as having basically Polish qualities, e.g. /a/ is Polish /a/, similarly for other vowels and consonants, e.g. /dʒ/ or /tʃ/. Spelling pronunciation and heavy phonetic transfer from Polish are obvious. The most phonetically interesting examples are shaded.
| word | correct (Festival) | PDI-predicted error | most common attested error |
1. | almost | | | |
2. | appear | | | |
3. | author | | | |
4. | awkward | | | |
5. | belief | --- | | |
6. | carry | --- | | |
7. | coloured | | | |
8. | debate | --- | | |
9. | defect | --- | | |
10. | dissolve | | | |
11. | kingdom | | | |
12. | mother | | | |
13. | oblige | | | |
14. | relax | --- | | |
15. | server | | | |
16. | southern | | | |
17. | survive | | | |
18. | taxi | --- | | |
19. | tired | | | |
20. | youngster | | | |
Figure 1. LDOCE3 audio playback screen: "Play Polglish pronunciation"?
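The step from a correct transcription to a 'Polglish' one, and on to audio, can be sketched roughly as follows. This is not the Festival setup used for the recordings above. The substitution rules are a small subset read off the "likely Polglish error" column of Table 1 (choosing d among the three options listed for the voiced apico-dental), the function names are hypothetical, and the synthesizer is passed in as a plain callable because no TTS interface is described here.

```python
# Rough sketch under stated assumptions: rewrite a broad transcription
# into a 'Polglish' one and hand both versions to some TTS callable.
# Rules and names are illustrative, not the system used in this study.
from typing import Callable, Tuple, TypeVar

Audio = TypeVar("Audio")

# Segment-for-segment replacements (Table 1 codes E, L, V).
SEGMENT_SUBSTITUTIONS = {"ʌ": "a", "ð": "d", "h": "x"}

# Word-final devoicing (Table 1 code N). Affricates such as dʒ would
# need multi-character handling, which this sketch does not attempt.
FINAL_DEVOICING = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def polglish(ipa: str) -> str:
    """Apply the substitutions, then devoice a final voiced obstruent."""
    out = "".join(SEGMENT_SUBSTITUTIONS.get(ch, ch) for ch in ipa)
    if out and out[-1] in FINAL_DEVOICING:
        out = out[:-1] + FINAL_DEVOICING[out[-1]]
    return out


def pronounce_pair(ipa: str, synthesize: Callable[[str], Audio]) -> Tuple[Audio, Audio]:
    """Return (model, mispronunciation) rendered by the same synthesizer."""
    return synthesize(ipa), synthesize(polglish(ipa))


if __name__ == "__main__":
    def fake_tts(s: str) -> str:  # stand-in for a real engine
        return f"<audio:{s}>"

    print(pronounce_pair("mʌðə", fake_tts))
```

Routing both versions through the same callable reflects the point made in the abstract: the model and the mispronunciation share one voice, so only the targeted segments differ.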
6. Synthesizing examples/definitions?
"The first ideas to audiolize EFL e-dictionary examples (but not definitions!), for instance, appeared long ago. In an overview of electronic learners' dictionaries, published in 1997, Perry dreamed: "Not only could the pronunciation of headwords and derivatives be given, but also the use of sound could be extended to cover some of the usage examples". With the recent introduction of recorded audio example sentences in LDOCE4 (http://www.longman.com/ldoce/about.html) there may be a distant glimmer of hope" (Sobkowiak 2006:81).
Some ScanSoft examples of synthesized MEDAL1 definitions:
· Youngster - a child or a young person.
· Run-down - so tired that you do not feel well.
· Oblige - to help someone by doing something that they have asked you to do.
· Double cream - thick cream that becomes almost solid when you mix it quickly.
· Hail - to signal a taxi or bus so that it stops for you.
· Pallbearer - someone who helps to carry a coffin at a funeral.
Bibliography
González, D. 2007. "Text-to-speech applications used in EFL contexts to enhance pronunciation". TESL-EJ 11.2.
Sobkowiak, W. 2000. "Rule-based and empirical rating of perceived phonetic difficulty of English words according to Polish learners: does frequency matter?" [published electronically: http://ifa.amu.edu.pl/~swlodek/diffind2.pdf].
Sobkowiak, W. 2003. "TTS in EFL CALL - some pedagogical considerations". Teaching English with Technology 3.4.
Sobkowiak, W. 2006. Phonetics of EFL dictionary definitions. Poznań: Wydawnictwo Poznańskie.
Sobkowiak, W. & W. Ferlacka. 2006. "Calibrating the Phonetic Difficulty Index". In W. Sobkowiak & E. Waniek-Klimczak (eds). 2006. Dydaktyka fonetyki języka obcego w Polsce. Konin: PWSZ w Koninie. 173-187. (Proceedings of the Phonetics in FLT 6 Conference in Mikorzyn, 8-10.5.2006)