Waveform Model of Vowel Perception and Production
The Waveform Model of Vowel Perception and Production (Stokes, 2009) was discovered after visually reviewing over 20,000 waveforms and learning to read raw complex waveforms directly (Stokes, 1996). The Waveform Model of Vowels (WMV) organizes the American English vowels into categorical pairs defined by the number of F1 cycles per pitch period; the F2 value then provides the distinguishing cue within each categorical pair. Below is a short list of WMV achievements.
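The two-stage logic described above can be sketched in code: a cycle count (roughly F1 divided by F0) selects a categorical pair, and F2 then selects a member of that pair. The specific pairings, cycle counts, and the F2 boundary below are illustrative placeholders, not the published WMV values.

```python
# Hypothetical sketch of the WMV's two-stage classification.
# Stage 1: the number of F1 cycles per pitch period (approximated
#   here as round(F1 / F0)) assigns the vowel to a categorical pair.
# Stage 2: the F2 value distinguishes the two members of the pair.
# All pairings and thresholds are illustrative assumptions.

def f1_cycles_per_pitch_period(f1_hz: float, f0_hz: float) -> int:
    """Approximate F1 cycles in one pitch period as round(F1 / F0)."""
    return round(f1_hz / f0_hz)

def classify(f0_hz: float, f1_hz: float, f2_hz: float) -> str:
    # Illustrative pair table keyed by F1 cycles per pitch period;
    # each entry lists (low-F2 member, high-F2 member).
    pairs = {2: ("u", "i"), 3: ("o", "e"), 4: ("a", "ae")}
    cycles = f1_cycles_per_pitch_period(f1_hz, f0_hz)
    low_f2, high_f2 = pairs.get(cycles, ("?", "?"))
    # Placeholder F2 boundary (Hz) separating the pair members.
    return high_f2 if f2_hz > 1500.0 else low_f2
```

For example, a token with F0 = 120 Hz and F1 = 350 Hz yields about three F1 cycles per pitch period, selecting one pair; its F2 then picks the member of that pair.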
1) The WMV is the first model to explain vowel perception, production, and perceptual errors. A working model must be able to explain each of these facets (Klatt, 1988).
2) Human performance on the Hillenbrand et al. (1995) dataset was presented at an Acoustical Society of America conference (ASA, 2011).
3) Human performance on the Peterson and Barney (1952) dataset was presented at an ASA conference in 2014 (Stokes, 2014).
4) Human performance was achieved on streaming speech (the Hillenbrand et al. wav files). This work is being prepared for publication.
5) Recent work has focused on identifying concussions from h-vowel-d productions. Preliminary results were presented at an ASA conference in 2014 (Horner and Stokes, 2014). Since 2014, a total of 4,129 vowels from 45 concussion subjects and 840 vowels from 20 control subjects have been recorded for this project, which is the only federally recognized research to identify concussions from speech. Acoustic measurements have been taken every 6 milliseconds across every production, yielding over 150,000 rows of data.
6) The logic of the WMV has been successfully implemented in algorithms and has achieved human performance on the most cited datasets in the literature. As a model of cognition, the WMV is the first to be implemented in a working algorithm.
Although the WMV was published almost 10 years ago, it has been ignored. That time, however, has allowed the model to be validated and the programming refined to human performance. No other model has successfully described production or perception; by extension, no model of perceptual errors has even been possible. This is succinctly illustrated by the title of one presentation, "From speech signal to phonological features - A long way (60 years and counting)," given by Henning Reetz at the 164th Acoustical Society of America meeting in October 2012.
Dr. Reetz's presentation was made three years after the publication of the WMV. The WMV should be considered before another researcher prepares a "70 years and counting" presentation. I look forward to any discussion about perception, production, errors, or the working algorithm of cognition.
Over 25,000 raw complex waveforms of speech have been analyzed. The recordings were made under a range of conditions, including the following: produced in noise, under cognitive load, while intoxicated by alcohol, in a centrifuge at 2 to 6 G's of force, normally produced, whispered, in Spanish and Hindi, and over 3,800 recordings of impaired speech (due to concussion).