New Article: Investigating English Language Teachers’ Beliefs and Stated Practices Regarding Bottom-up Processing Instruction for Listening in L2 English

An article based upon my MA dissertation has just been published in the Journal of Second Language Teaching & Research. It is Open Access, so you can read the full text, but here is a summary that is a bit longer than the abstract.

Language learners face difficulties in parsing what they hear into a meaningful message. There are still gaps in SLA research about how we do this and about how it can be taught, and I found nothing about bottom-up (phoneme- and syllable-level) listening. There is also not much research on what teachers do, or say that they do, so I wanted to find out about this.

Not much has changed since John Field (2008) said that “the Comprehension Approach” dominates how listening is taught. This was supported by Siegel (2014), who found that most teachers used comprehension-based activities.

Based on speech learning models (Best, 1995; Flege, 1995, 2007), it would be advisable to teach learners to discriminate between sounds (phonemes) that form part of the language being learned but not of their first language. Ongoing practice with variation in these sounds would also be helpful.

With words and grammar, there might be a psychological process of cueing (Ellis, 2006), which might also explain lexical priming and collocation. However, making things stand out appears to be key. This does not mean only teaching isolated, citation-form words, because that does not always carry over to listening skill acquisition (Bonk, 2000; Joyce, 2013). Instead, it needs to be balanced with listening to natural connected speech.

I wanted to find out whether teachers taught learners to decode single words and phrases and connected speech, and whether they addressed phonological differences between languages. I did this with a questionnaire distributed over Twitter, analysed the data in JASP, and carried out some exploratory analyses.

There is not a total absence of bottom-up instruction. Frequent teaching of stress corresponded with other bottom-up instruction. A minority of teachers in my sample drew on the phonology of their learners’ first languages, and using this knowledge correlated with regular instruction in single sounds (phonemes) and connected speech. However, this remained a minority activity: most teachers in the sample said they did not consider differences between first- and second-language phonology and were reluctant to teach, or did not regularly teach, the decoding of single sounds, words and phrases, though connected speech may be taught slightly more regularly.

References (in this summary)

Best, C. T. (1995) A direct realist view of cross-language speech perception. In Strange, W. (Ed.) Speech Perception and Linguistic Experience. Timonium, MD: York Press, pp. 171-206.

Bonk, W. J. (2000) Second Language Lexical Knowledge and Listening Comprehension, International Journal of Listening, 14:1, 14-31, DOI:10.1080/10904018.2000.10499033

Ellis, N. (2006) Language acquisition as rational contingency learning. Applied Linguistics 27, pp. 1-24

Field, J. (2008) Listening in the Language Classroom (ebook). Cambridge: CUP.

Flege, J. (1995) Second-language Speech Learning: Theory, Findings, and Problems. In Strange, W. (Ed) Speech Perception and Linguistic Experience: Issues in Cross-language research. Timonium, MD: York Press, pp. 229-273.

Flege, J. (2007) Language contact in bilingualism: Phonetic system interactions. In Cole, J. & Hualde, J. I. (Eds.), Laboratory Phonology 9. Berlin: Mouton de Gruyter, pp. 353-380.

JASP Team (2017). JASP [Computer software].

Joyce, P. (2013) Word Recognition Processing Efficiency as a Component of Second Language Listening, International Journal of Listening, 27:1, 13-24, DOI:10.1080/10904018.2013.732407

Siegel, J. (2014) Exploring L2 listening instruction: examinations of practice. ELT Journal, 68(1), pp. 22-30. doi:10.1093/elt/cct058

Publication: Duoethnography of Two EFL Teachers Developing Their Own Classroom Teaching Materials

My colleague Jon Steven and I have just had our article Duoethnography of Two EFL Teachers Developing Their Own Classroom Teaching Materials published in The Language Scholar (online first and it will be in the Autumn 2020 issue).

We talk about our motivations for developing our own materials and some of the challenges of materials development. We briefly cover working conditions, the commercial issues of coursebooks versus the classroom issues of teaching, originality, marketability, guarding against precarity, and the values and skills we aim to develop in learners.

It is open access, so anyone can read it. I certainly look forward to any comments or questions (and would be happy to pass any on to Jon as well).

Surveying university English language instructors’ development provision and expenses in the shift to online teaching

Hello. I have been thinking about the current English language teaching situation and the shift to online instruction around the world due to the global COVID-19 pandemic. If you are an English language instructor at a university, please consider taking the time to answer my questionnaire. I will write up a working paper based on the results within the month (with updates).

Link to Google Form.

First book chapter!

What seems like an age ago now but is likely less than a year ago, I was invited to write a chapter for a book on eikaiwa, or English conversation schools in Japan, edited by Daniel Hooper and Natasha Hashimoto:
Teacher Narratives from the Eikaiwa Classroom: Moving Beyond “McEnglish”. After reviews and rewrites it’s nearly ready. You can find the page for it here.

New Sounds 2019 Presentation

I gave a presentation at New Sounds 2019 at Waseda University on Saturday 31st August. The title was How Should We Approach Teaching to Facilitate Phoneme Acquisition in English as a Foreign Language? Here are the slides I used. I hope you find them useful.

Download PDF


How can we assess whether learners have generally acquired phonology?

I said that, ideally, we would run a test to see which phonemes can be perceived. Where this is not possible, we have to talk to our learners and check whether they understand what we say. This is particularly important for connected speech: an observation in Bonk (2000) was that his learners knew all the CVC words in his list but had problems perceiving them in a stream of connected speech.

How can we avoid orthography?

It is difficult. It requires teacher autonomy, and knowledgeable and willing administrators.


Bonk, W. (2000) Second Language Lexical Knowledge and Listening Comprehension, International Journal of Listening, 14(1) pp. 14-31. DOI: 10.1080/10904018.2000.10499033

References to the presentation are in the slide deck PDF.

There is a preprint to come. It will be linked here once it has been moderated at PsyArXiv.

Little Assistances

Here are some things that I have been doing during my MRes to save a bit of time and effort in the reading and research process.

Searchable Notebooks: Google Docs (electronic), Bullet Journal (paper)

Whenever I read a paper or a chapter, I log it in a Google Doc because I have an Android phone and tablet. I can make the documents available offline and can copy and paste bits from PDFs. I usually use Heading 1 for the field of study, Heading 2 for the full or partial APA citation, Heading 3 for the date, and normal text for the notes themselves. I often add comments for action items there as well. Another benefit is that I can copy and paste things across if they’re relevant to another project.

However, I’ve also started being more consistent in my use of the Bullet Journal method for notetaking. The indexing and task management are really useful, and it is great to get a sense of what has been done (and where to find the notes on that topic). The key is spending about ten minutes first thing in the morning and ten minutes in the evening reviewing and sorting out your daily logs.

Reference Manager: Zotero

Zotero is so useful for searching through all the PDFs and other files in my Google Drive; before it, I was using the aforementioned Google Doc as a notebook. I tried several reference managers on an older computer and they all failed miserably to install properly, with the Word extension for Zotero going weird, but I assume that was my old laptop. I have since got a new laptop and Zotero works like a dream. Some people much prefer Mendeley, but Zotero is open source, so even if it stops being actively updated it’s unlikely to become completely useless. The Firefox extension is really convenient, as is the desktop software’s ‘magic wand’ button, which completes the metadata from just a DOI, PMID or arXiv ID.

Jargon Cheat Sheet: Google Sheets

I decided to set up something a bit more systematic to help my poor memory cope with juggling several projects at once and the vast amount of abbreviations and jargon that linguistics can throw at us. I just started it today so it is definitely a work in progress. You could use Excel or something else, but Google Sheets is convenient for me just because I’m a cheapskate Android user.

I set up three columns in each sheet:

  1. Abbreviation/Jargon term
  2. (Sub)field (e.g. linguistics, neuroscience, psychology, SLA)
  3. Meaning (definition and/or example).

Hopefully a quick search will then find anything I have forgotten or want to confirm.

Notes on Construct Validity and Measurement in Applied Linguistics

This is intended primarily as a note for myself, and is very much a work in progress, but I thought that others might benefit. Also, if anyone commented, I would benefit. With the disclaimer out of the way, I will get to the point.

Basically, we have problems

In a pre-print, Flake & Fried (2019) make the point that measurement in psychology is very difficult to do in a valid way and, even worse, difficult to check for validity, because researchers underreport their decision-making processes. The reason this matters is that psychology and its sub-disciplines heavily influence applied linguistics/SLA.

While psychology attempts to work through its replication crisis, the main remedies seem to be pre-registered studies and greater transparency in reporting them. Flake and Fried (2019) choose to look at “Questionable Measurement Practices” (QMPs) as opposed to “Questionable Research Practices” (QRPs) (Banks et al., 2016; John, Loewenstein, & Prelec, 2012, in Flake & Fried, ibid.), such as HARKing (hypothesising after the results are known) (Kerr, 1998, in Flake & Fried, ibid.) and p-hacking (manipulating data or analyses until the p-value, the probability of obtaining results at least as extreme by chance alone, falls below the significance threshold) (Head et al., 2015).
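As an aside not in Flake and Fried’s paper, the multiple-comparisons mechanism that p-hacking exploits can be illustrated in a few lines of Python: if each of k independent tests of a true null hypothesis is run at α = .05, the chance of at least one spurious “significant” result grows quickly with k.

```python
# Familywise error rate: the probability of at least one false positive
# when k independent true-null tests are each run at alpha = .05.
def familywise_error_rate(k, alpha=0.05):
    """P(at least one p < alpha across k independent tests of true nulls)."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 10, 20):
    print(f"{k:2d} comparisons -> {familywise_error_rate(k):.2f}")
# Rates rise from 0.05 (one test) to about 0.64 (twenty tests).
```

This is why undisclosed flexibility in which measures or comparisons get reported undermines statistical conclusion validity, even when no single step looks dishonest.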

They go on to differentiate as follows:

“In the presence of QMPs, all four types of validity become difficult to evaluate… Statistical conclusion validity, which QRPs have largely focused on, captures whether conclusions from a statistical analysis are correct. It is difficult to evaluate when undisclosed measurement flexibility generates multiple comparisons in a statistical test, which could be exploited to obtain a desired result (i.e., QRPs). ”

(Flake & Fried, 2019, pp. 6-7)

Flake and Fried (2019) state that many QMPs are not carried out deliberately, but a major problem is the lack of transparency in decisions made during the measurement process, which hinders not only replicability but also checks of validity.

They advocate answering the questions in a checklist (Flake & Fried, 2019, p. 9) to reduce the possibility of QMPs arising.

I am quite certain that many applied linguistics students at master’s level and above have seen articles where statistics are reported but it is not clear why those particular statistics were chosen. Often these are blindly followed procedures of running ANOVA or ANCOVA in SPSS. I will go out on a limb and say that these problems are ignored as simply being how things are usually done.

However, how many of us have considered our controlled variables? For example, when running studies on phonological perception, are we explicit about the ranges of volume, fundamental frequency and formant frequencies, or about any noise-reduction processing? I have seen studies that make claims of generalisability, not just exploratory or preliminary studies, without controlling for these. If you are going to make such claims, I think there should be greater controls than in a study carried out primarily for oneself and shared because it could be informative for others. Of course, declaring the decision-making process and rationale ought to be necessary in both.

There’s an awful lot of talk about how language acquisition studies in classrooms are problematic due to individual differences acting as confounds. One way to increase validity and generalisability is to be explicit about the choices made regarding measurement and variables.


I took part in a Google Hangout hosted by Julia Strand. Some of the ideas discussed over that hour are bound to have wormed their way in and mingled with my own.


Flake, J. K., & Fried, E. I. (2019, January 17). Measurement Schmeasurement: Questionable Measurement Practices and How to Avoid Them. Preprint.

Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLoS Biology, 13(3), e1002106. doi:10.1371/journal.pbio.1002106

Plans for 2019

Thanks for dropping by. I ended 2018 with a last-gasp submission to New Sounds 2019, which links to my MRes research project on phoneme acquisition. Hopefully it gets accepted; it is a lot more scientific than anything I have been a part of so far, but it will be good to get out of my comfort zone somewhat.

I also have a full-time job starting in April, which I am looking forward to very much. I will be teaching first-year university students, so I am looking at articles on the transition between high school and university. I also want my students to make the most of the new self-access centre that will open at the university, so I am looking at self-access and autonomous learning, too. I am particularly interested in learners’ autonomous L2 listening, so hopefully I shall gain some more useful insights into this.

Further on into the year, I should be collecting data over a 13-week period. This should conclude the bulk of the pre-writing of my MRes dissertation.

Other than that, I do not yet know which classes I will teach at my part-time job, but I foresee making at least one corpus and doing some more work on essay writing, managing learner expectations, and enabling learners to assess their own abilities more accurately.

New pre-print about corpus-informed teaching

I put up a new pre-print on SocArxiv:

Creating a small corpus to inform materials design in an ongoing English for Specialist Purposes (ESP) course for Orthodontists and Orthodontic Assistants

In my work as a language teacher to a group of orthodontists and orthodontic treatment assistants, I wanted an analysis of orthodontic practitioner-to-patient discourse. Because access to authentic spoken discourse was too difficult to obtain due to ethical considerations, a small corpus was constructed to facilitate better informed form-focused instruction. Details of the typical forms found in the corpus are given, as is an overview of the corpus construction.

Rating learners’ pronunciation: how should it be done?

This goes into a bit more detail about phonetics than some people familiar with me might be comfortable with.

On Friday I went to Tokyo JALT’s monthly meeting (no link because I can’t find a permalink) to see three presentations on pronunciation (or more accurately, phonology, seeing as Alastair Graham-Marr covered both productive and receptive, listening skills). All three presenters, Kenichi Ohyama, Yukie Saito and Alastair Graham-Marr were interesting but there was one particular point that stuck with me from Yukie Saito’s presentation.

She was talking about rating pronunciation and how it had often been carried out by ‘native speaker’ raters. She also said that it was often carried out according to rater intuition on Likert scales of either ‘fluency’ (usually operationalised as speed of speech), ‘intelligibility’ (usually meaning phonemic conformity to a target community norm) or ‘comprehensibility’ (how easily raters understand speakers).

What else could work is a question that needs answering, not only to make work in applied linguistics more rigorous but also to make the assessment of pronunciation less arbitrary. I have an idea: audio corpora could be gathered from speakers in target communities, the phonemes run through Praat, and typical acceptable ranges for formant frequencies taken. Learners would then be rated for comprehensibility by proficient speakers, ideally from the target community, as well as run through Praat to check that their phonemes fall within the acceptable formant ranges. This data would then be triangulated and a value assigned based on both.

Now, I fully acknowledge that there are some major drawbacks to this. Gathering an audio corpus is a massive pain; running it all through Praat and gathering the data even more so; and doing the same with learners for assessment makes things yet more taxing. However, is it really better to rely on rater hunches and hope that every rater generally agrees? I don’t think so, because there is no construct that makes any of this less arbitrary, especially if assessment is done quickly. With the Praat data, there is at least some quantifiable evidence of whether, for example, a learner-produced /l/ conforms to that typically produced in the target community, and it would be triangulated with the rater data. It would also go some way to making the sometimes baffling assessment methodologies a bit more transparent, at least to other researchers.
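To sketch what the conformity check might look like once formant values have been extracted with Praat, here is a minimal Python illustration. The formant ranges and measurements below are hypothetical placeholders for what a corpus analysis would produce, not real target-community norms:

```python
# Hypothetical sketch: check learner formant measurements (in Hz) against
# acceptable ranges derived from a target-community audio corpus.
# In practice, the values would come from Praat formant analysis;
# the ranges here are illustrative, not real norms.

ACCEPTABLE_RANGES = {
    "/l/": {"F1": (240, 420), "F2": (900, 1400), "F3": (2400, 3100)},
}

def conforms(phoneme, measured, ranges=ACCEPTABLE_RANGES):
    """Return True if every measured formant falls within the corpus range."""
    return all(lo <= measured[formant] <= hi
               for formant, (lo, hi) in ranges[phoneme].items())

learner_l = {"F1": 350, "F2": 1100, "F3": 2700}
print(conforms("/l/", learner_l))  # within all three illustrative ranges
```

The quantitative verdict from a check like this would then be triangulated with the comprehensibility ratings from proficient-speaker raters, rather than replacing them.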