Speech recognition can be difficult and effortful for older adults, even for those with normal hearing. Declining frontal lobe cognitive control has been hypothesized to cause age-related speech recognition problems. This study examined age-related changes in frontal lobe function for 15 clinically normal hearing adults (21–75 years) when they performed a word recognition task that was made challenging by decreasing word intelligibility. Although there were no age-related changes in word recognition, there were age-related changes in the degree of activity within left middle frontal gyrus (MFG) and anterior cingulate (ACC) regions during word recognition. Older adults engaged left MFG and ACC regions when words were most intelligible compared to younger adults who engaged these regions when words were least intelligible. Declining gray matter volume within temporal lobe regions responsive to word intelligibility significantly predicted left MFG activity, even after controlling for total gray matter volume, suggesting that declining structural integrity of brain regions responsive to speech leads to the recruitment of frontal regions when words are easily understood.
This study examined how prelingually deafened children with cochlear implants combine visual information from lipreading with auditory cues in an open-set speech perception task. A secondary aim was to examine lexical effects on the recognition of words in isolation and in sentences. Fifteen children with cochlear implants served as participants in this study. Participants were administered two tests of spoken word recognition. The LNT assessed isolated word recognition in an auditory-only format. The AV-LNST assessed recognition of key words in sentences in visual-only, auditory-only, and audiovisual presentation formats. On each test, lexical characteristics of the stimulus items were controlled to assess the effects of lexical competition. The children were also administered a test of receptive vocabulary knowledge. The results revealed that recognition of key words was significantly influenced by presentation format. Audiovisual speech perception was best, followed by auditory-only and visual-only presentation, respectively. Lexical effects on spoken word recognition were evident for isolated words, but not when words were presented in sentences. Finally, there was a significant relationship between auditory-only and audiovisual word recognition and language knowledge. The results demonstrate that children with cochlear implants obtain significant benefit from audiovisual speech integration...
Much research has explored how spoken word recognition is influenced by the architecture and dynamics of the mental lexicon (e.g., Luce and Pisoni, 1998; McClelland and Elman, 1986). A more recent question is whether the processes underlying word recognition are unique to the auditory domain, or whether visually perceived (lipread) speech may also be sensitive to the structure of the mental lexicon (Auer, 2002; Mattys, Bernstein, and Auer, 2002). The current research was designed to test the hypothesis that both aurally and visually perceived spoken words are isolated in the mental lexicon as a function of their modality-specific perceptual similarity to other words. Lexical competition (the extent to which perceptually similar words influence recognition of a stimulus word) was quantified using metrics that are well-established in the literature, as well as a statistical method for calculating perceptual confusability based on the phi-square statistic. Both auditory and visual spoken word recognition were influenced by modality-specific lexical competition as well as stimulus word frequency. These findings extend the scope of activation-competition models of spoken word recognition and reinforce the hypothesis (Auer, 2002; Mattys et al....
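The lexical-competition metrics referenced in these abstracts are often operationalized via Luce and Pisoni's (1998) one-phoneme neighborhood rule: a word's neighbors are all lexical entries reachable by a single substitution, insertion, or deletion. A minimal sketch of that rule (the toy lexicon and the orthographic, rather than phonemic, transcriptions are illustrative only):

```python
def neighbors(word, lexicon):
    """Return words in `lexicon` differing from `word` by exactly one
    substitution, insertion, or deletion (the one-phoneme rule)."""
    def one_away(a, b):
        if abs(len(a) - len(b)) > 1:
            return False
        if len(a) == len(b):
            # Same length: exactly one substitution.
            return sum(x != y for x, y in zip(a, b)) == 1
        # Ensure `a` is the shorter string, then check a single insertion.
        if len(a) > len(b):
            a, b = b, a
        i = j = edits = 0
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                i += 1
                j += 1
            else:
                edits += 1
                if edits > 1:
                    return False
                j += 1  # skip the inserted symbol in the longer string
        return True
    return [w for w in lexicon if w != word and one_away(word, w)]

lexicon = ["cat", "bat", "cab", "cast", "at", "dog"]
print(neighbors("cat", lexicon))  # → ['bat', 'cab', 'cast', 'at']
```

Neighborhood density is then simply the length of this list; density-weighted measures additionally weight each neighbor by its frequency.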
Empirical work and models of visual word recognition have traditionally focused on group-level performance. Despite the emphasis on the prototypical reader, there is clear evidence that variation in reading skill modulates word recognition performance. In the present study, we examined differences between individuals who contributed to the English Lexicon Project (http://elexicon.wustl.edu), an online behavioral database containing nearly four million word recognition (speeded pronunciation and lexical decision) trials from over 1,200 participants. We observed considerable within- and between-session reliability across distinct sets of items, in terms of overall mean response time (RT), RT distributional characteristics, diffusion model parameters (Ratcliff, Gomez, & McKoon, 2004), and sensitivity to underlying lexical dimensions. This indicates reliably detectable individual differences in word recognition performance. In addition, higher vocabulary knowledge was associated with faster, more accurate word recognition performance, attenuated sensitivity to stimulus characteristics, and more efficient accumulation of information. Finally, in contrast to suggestions in the literature, we did not find evidence that individuals were trading off in their utilization of lexical and nonlexical information.
This paper reports the results of three projects concerned with auditory word recognition and the structure of the lexicon. The first project was designed to experimentally test several specific predictions derived from MACS, a simulation model of the Cohort Theory of word recognition. Using a priming paradigm, evidence was obtained for acoustic-phonetic activation in word recognition in three experiments. The second project describes the results of analyses of the structure and distribution of words in the lexicon using a large lexical database. Statistics about similarity spaces for high and low frequency words were applied to previously published data on the intelligibility of words presented in noise. Differences in identification were shown to be related to structural factors about the specific words and the distribution of similar words in their neighborhoods. Finally, the third project describes efforts at developing a new theory of word recognition known as Phonetic Refinement Theory. The theory is based on findings from human listeners and was designed to incorporate some of the detailed acoustic-phonetic and phonotactic knowledge that human listeners have about the internal structure of words and the organization of words in the lexicon...
Thirty years of research has uncovered the broad principles that characterize spoken word processing across listeners. However, there have been few systematic investigations of individual differences. Such an investigation could help refine models of word recognition by indicating which processing parameters are likely to vary, and could also have important implications for work on language impairment. The present study begins to fill this gap by relating individual differences in overall language ability to variation in online word recognition processes. Using the visual world paradigm, we evaluated online spoken word recognition in adolescents who varied in both basic language abilities and non-verbal cognitive abilities. Eye movements to target, cohort and rhyme objects were monitored during spoken word recognition, as an index of lexical activation. Adolescents with poor language skills showed fewer looks to the target and more fixations to the cohort and rhyme competitors. These results were compared to a number of variants of the TRACE model (McClelland & Elman, 1986) that were constructed to test a range of theoretical approaches to language impairment: impairments at sensory and phonological levels; vocabulary size, and generalized slowing. None were strongly supported...
The coordination of word-recognition and oculomotor processes during reading was evaluated in two eye-tracking experiments that examined how word skipping, where a word is not fixated during first-pass reading, is affected by the lexical status of a letter string in the parafovea and ease of recognizing that string. Ease of lexical recognition was manipulated through target-word frequency (Experiment 1) and through repetition priming between prime-target pairs embedded in a sentence (Experiment 2). Using the gaze-contingent boundary technique the target word appeared in the parafovea either with full preview or with transposed-letter (TL) preview. The TL preview strings were nonwords in Experiment 1 (e.g., bilnk created from the target blink), but were words in Experiment 2 (e.g., sacred created from the target scared). Experiment 1 showed greater skipping for high-frequency than low-frequency target words in the full preview condition but not in the TL preview (nonword) condition. Experiment 2 showed greater skipping for target words that repeated an earlier prime word than for those that did not, with this repetition priming occurring both with preview of the full target and with preview of the target’s TL neighbor word. However...
Recognizing speech in difficult listening conditions requires considerable focus of attention that is often demonstrated by elevated activity in putative attention systems, including the cingulo-opercular network. We tested the prediction that elevated cingulo-opercular activity provides word-recognition benefit on a subsequent trial. Eighteen healthy, normal-hearing adults (10 females; aged 20–38 years) performed word recognition (120 trials) in multi-talker babble at +3 and +10 dB signal-to-noise ratios during a sparse sampling functional magnetic resonance imaging (fMRI) experiment. Blood oxygen level-dependent (BOLD) contrast was elevated in the anterior cingulate cortex, anterior insula, and frontal operculum in response to poorer speech intelligibility and response errors. These brain regions exhibited significantly greater correlated activity during word recognition compared with rest, supporting the premise that word-recognition demands increased the coherence of cingulo-opercular network activity. Consistent with an adaptive control network explanation, general linear mixed model analyses demonstrated that increased magnitude and extent of cingulo-opercular network activity was significantly associated with correct word recognition on subsequent trials. These results indicate that elevated cingulo-opercular network activity is not simply a reflection of poor performance or error but also supports word recognition in difficult listening conditions.
Emotion influences most aspects of cognition and behavior, but emotional factors are conspicuously absent from current models of word recognition. The influence of emotion on word recognition has mostly been reported in prior studies on the automatic vigilance for negative stimuli, but the precise nature of this relationship is unclear. Various models of automatic vigilance have claimed that the effect of valence on response times is categorical, an inverted-U, or interactive with arousal. The present study used a sample of 12,658 words, and included many lexical and semantic control factors, to determine the precise nature of the effects of arousal and valence on word recognition. Converging empirical patterns observed in word-level and trial-level data from lexical decision and naming indicate that valence and arousal exert independent monotonic effects: Negative words are recognized more slowly than positive words, and arousing words are recognized more slowly than calming words. Valence explained about 2% of the variance in word recognition latencies, whereas the effect of arousal was smaller. Valence and arousal do not interact, but both interact with word frequency, such that valence and arousal exert larger effects among low-frequency words than among high-frequency words. These results necessitate a new model of affective word processing whereby the degree of negativity monotonically and independently predicts the speed of responding. This research also demonstrates that incorporating emotional factors...
The cognitive model of reading comprehension (RC) posits that RC results from the interaction between decoding and linguistic comprehension. Recently, the notion of decoding skill was expanded to include word recognition. In addition, some studies suggest that other skills, such as processing speed, could be integrated into this model, and have consistently indicated that this skill influences and is an important predictor of the main components of the model, such as vocabulary for comprehension and phonological awareness for word recognition. The present study evaluated the components of the RC model and predictive skills in children and adolescents with dyslexia. Forty children and adolescents (8–13 years) were divided into a Dyslexic Group (DG; 18 children, MA = 10.78, SD = 1.66) and a Control Group (CG; 22 children, MA = 10.59, SD = 1.86). All were students from the 2nd to 8th grade of elementary school, and the groups were equivalent in school grade, age, gender, and IQ. Oral and reading comprehension, word recognition, processing speed, picture naming, receptive vocabulary, and phonological awareness were assessed. There were no group differences in accuracy for oral and reading comprehension, phonological awareness, naming, or vocabulary scores. The DG performed worse than the CG in word recognition (general score and orthographic confusion items) and was slower in naming. The results corroborated the literature regarding word recognition and processing speed deficits in dyslexia. However...
Thesis (Ph. D.)--University of Rochester. Dept. of Computer Science, 2000. Simultaneously published in the Technical Report series.; The focus of this thesis is to improve the ability of a computational system to understand spoken utterances in a dialogue with a human. Available computational methods for word recognition do not perform as well on spontaneous speech in task-oriented dialogue as we would hope. Even a state-of-the-art recognizer achieves slightly worse than 70% word accuracy on spontaneous speech in a conversation focused on solving a specific problem. To address this problem, I explore novel methods for post-processing the output of a speech recognizer in order to correct errors. I adopt statistical techniques for modeling the noisy channel from the speaker to the listener in order to correct some of the errors introduced there. The statistical model accounts for frequent errors such as simple word/word confusions and short phrasal problems (one-to-many word substitutions and many-to-one word concatenations). To use the model, a search algorithm is employed to find the most likely correction of a given word sequence from the speech recognizer. The post-processor output contains fewer errors, thus making interpretation by downstream components...
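A noisy-channel post-processor of the kind described here scores each candidate correction by combining a language-model prior with a channel likelihood and keeps the argmax. The sketch below is a toy illustration of that decision rule, not the thesis's model: both probability tables are invented, and a real system would search over word sequences rather than single candidates.

```python
import math

# Invented unigram language model: P(intended word).
language_model = {"wreck": 0.2, "recognize": 0.5, "nice": 0.3}

# Invented channel model: P(recognizer output | intended word),
# including a many-to-one concatenation-style confusion.
channel = {
    ("wreck a nice", "recognize"): 0.3,
    ("recognize", "recognize"): 0.6,
    ("nice", "nice"): 0.9,
}

def correct(observed, candidates):
    """Return the candidate maximizing log P(w) + log P(observed | w).
    Unseen pairs get a tiny floor probability instead of zero."""
    def score(w):
        p_lm = language_model.get(w, 1e-9)
        p_chan = channel.get((observed, w), 1e-9)
        return math.log(p_lm) + math.log(p_chan)
    return max(candidates, key=score)

print(correct("wreck a nice", ["wreck", "recognize", "nice"]))  # → recognize
```

Because scores are log-probabilities, the floor value only matters relative to the other candidates; a real post-processor would smooth both models properly.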
In this thesis, we investigate the capacity of each cerebral hemisphere to use the visual information available during word recognition. It is generally agreed that the left hemisphere (LH) is better equipped for reading than the right hemisphere (RH). Indeed, the visuoperceptual mechanisms used in word recognition are located mainly in the LH (Cohen, Martinaud, Lemer et al., 2003). Since normal readers make optimal use of medium spatial frequencies (about 2.5-3 cycles per degree of visual angle) to recognize letters, the LH may process these frequencies better than the RH (Fiset, Gosselin, Blais, & Arguin, 2006). Moreover, studies of hemispheric lateralization usually use a paradigm in which stimuli are presented in the visual periphery. It has been proposed that the effect of visual eccentricity on word recognition is unequal across the hemifields. In particular, the first letter usually carries the most information for identifying a word, and it is also the most eccentric letter when a word is presented to the left visual hemifield (LVF), which can impair its identification independently of the RH's reading abilities. The objective of the first study is to determine the spatial-frequency spectrum used by the LH and the RH in word recognition. That of the second study is to explore the biases created by eccentricity and by the informative value of letters during divided-field presentation. First...
This research looked at conditions which result in the development of integrated letter code information in the acquisition of reading vocabulary. Thirty grade-three children of normal reading ability acquired new reading words in a Meaning Assigned task and a Letter Comparison task, and worked to increase skill for known reading words in a Copy task. The children were then assessed on their ability to identify the letters in these words. During the test, each stimulus word for each child was exposed for 100 msec, after which each child reported as many of the letters as he or she could. Familiar words, new words, and a single-letter identification task served as within-subject controls. Following this, subjects were assessed for word-meaning recall of the Meaning Assigned words and word reading times for words in all conditions. The results supported an episodic model of word recognition in which the overlap between the processing operations employed in encoding a word and those required when decoding it affected decoding performance. In particular, the Meaning Assigned and Copy tasks appeared to facilitate letter code accessibility and integration in new and familiar words, respectively. Performance in the Letter Comparison task...
This lexical decision study with eye tracking of Japanese two-kanji-character words investigated the order in which a whole two-character word and its morphographic constituents are activated in the course of lexical access, the relative contributions of the left and the right characters in lexical decision, the depth to which semantic radicals are processed, and how nonlinguistic factors affect lexical processes. Mixed-effects regression analyses of response times and subgaze durations (i.e., first-pass fixation time spent on each of the two characters) revealed joint contributions of morphographic units at all levels of the linguistic structure, with the magnitude and the direction of the lexical effects modulated by readers' locus of attention in a left-to-right preferred processing path. During the early time frame, character effects were larger in magnitude and more robust than radical and whole-word effects, regardless of the font size and the type of nonwords. Extending previous radical-based and character-based models, we propose a task/decision-sensitive character-driven processing model with a level-skipping assumption: Connections from the feature level bypass the lower radical level and link up directly to the higher character level.
In reading science, studies carried out with English-speaking participants reading in their native language have traditionally formed the basis for most theories and models. From the 1990s, however, there was a continuously growing conviction that this English-based research agenda alone would not lead to a universal science of reading.
A major reason why English is not a good basis for developing universally applicable theories and models of reading and reading acquisition is that English orthography is exceptionally inconsistent with regard to the relationship between letters and sounds (low grapheme-phoneme consistency). One consequence of this inconsistency is that reading English relies more on small linguistic units, whereas reading more consistent orthographies (such as German) relies more on large linguistic units (psycholinguistic grain size theory; Ziegler & Goswami, 2005), ultimately leading to a relative delay in reading acquisition for children learning English.
The research project presented here set out to deepen and broaden the understanding of this phenomenon. In three studies, both word processing and sentence processing in the consistent German and the inconsistent English orthography were investigated. Methodologically...
Spoken word recognition is thought to be achieved via competition in the mental lexicon between perceptually similar word forms. A review of the development and initial behavioral validations of computational models of visual spoken word recognition is presented, followed by a report of new empirical evidence. Specifically, a replication and extension of Mattys, Bernstein & Auer's (2002) study was conducted with 20 deaf participants who varied widely in speechreading ability. Participants visually identified isolated spoken words. Accuracy of visual spoken word recognition was influenced by the number of visually similar words in the lexicon and by the frequency of occurrence of the stimulus words. The results are consistent with the common view held within auditory word recognition that this task is accomplished via a process of activation and competition in which frequently occurring units are favored. Finally, future directions for visual spoken word recognition are discussed.
We used magnetoencephalography (MEG) to map the spatiotemporal evolution of cortical activity for visual word recognition. We show that for five-letter words, activity in the left hemisphere (LH) fusiform gyrus expands systematically in both the posterior
Synaptic plasticity appears to be a central aspect of the dynamics of neural networks. It concerns the physiological modifications of the synapse, whose consequence is a variation in the value of the synaptic weight. Information encoding is based on the precise timing of single spike events, that is, on the relative timing of pre- and post-synaptic spikes, on local synapse competition within a single neuron, and on global competition via lateral connections. In order to classify temporal sequences, we present in this paper how to use local Hebbian learning with spike-timing-dependent plasticity for unsupervised competitive learning, preserving self-organizing maps of spiking neurons. We present three variants of self-organizing maps (SOM) with a spike-timing-dependent Hebbian learning rule: the Leaky Integrator Neurons (LIN), the Spiking_SOM, and the recurrent Spiking_SOM (RSSOM) models. The case study for the proposed SOM variants is phoneme classification and speaker-independent word recognition in continuous speech.; Comment: 10 pages, 15 tables
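The spike-timing-dependent plasticity rule underlying such SOM variants is commonly written as an exponential window: a pre-before-post spike pair potentiates the synapse, a post-before-pre pair depresses it. A minimal sketch of that window; the constants (`a_plus`, `a_minus`, `tau`) are illustrative, not taken from the paper:

```python
import math

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Weight change for one pre/post spike pair (times in ms).
    Pre before post (dt > 0) -> potentiation, decaying exponentially
    with the interval; post before pre (dt < 0) -> depression."""
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    elif dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0

w = 0.5
w += stdp_dw(t_pre=10.0, t_post=15.0)   # causal pair: weight increases
w += stdp_dw(t_pre=30.0, t_post=22.0)   # anti-causal pair: weight decreases
print(round(w, 4))  # → 0.4974
```

Making `a_minus` slightly larger than `a_plus`, as here, is a common stabilizing choice: uncorrelated spike trains then depress the weight on average.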
We have benchmarked the maximum obtainable recognition accuracy on various word image datasets using manual segmentation and a currently available commercial OCR. We have developed a Matlab program, with a graphical user interface, for semi-automated pixel-level segmentation of word images, and we discuss the advantages of pixel-level annotation. We have covered five databases adding up to over 3600 word images, cropped from camera-captured scene, born-digital, and street-view images. We recognize the segmented word images using the trial version of Nuance Omnipage OCR. We also discuss how degradations introduced during acquisition, or inaccuracies introduced during the creation of word images, affect recognition of the word present in the image. Word images for different kinds of degradation, and correction for the slant and curvy nature of words, are also discussed. The word recognition rates obtained on the ICDAR 2003, Sign evaluation, Street view, Born-digital, and ICDAR 2011 datasets are 83.9%, 89.3%, 79.6%, 88.5%, and 86.7%, respectively.; Comment: 16 pages, 4 figures
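The recognition rates reported above are exact-match accuracies over word images. A sketch of that computation (case-insensitive matching is an assumption here; scene-text benchmarks differ in their normalization rules):

```python
def word_recognition_rate(predicted, ground_truth):
    """Fraction of word images whose OCR output exactly matches the
    annotation, compared case-insensitively."""
    assert len(predicted) == len(ground_truth)
    hits = sum(p.lower() == g.lower() for p, g in zip(predicted, ground_truth))
    return hits / len(ground_truth)

# Toy example: one OCR confusion ("exlt" for "exit") out of four images.
preds = ["Stop", "exlt", "Sale", "OPEN"]
truth = ["stop", "exit", "sale", "open"]
print(word_recognition_rate(preds, truth))  # → 0.75
```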
A prototype system for the transliteration of diacritics-less Arabic manuscripts at the sub-word, or part-of-Arabic-word (PAW), level is developed. The system is able to read sub-words of the input manuscript using a set of skeleton-based features. A variation of the system is also developed which reads archigraphemic Arabic manuscripts, which are dot-less, into archigrapheme transliterations. In order to reduce the complexity of the original, highly multiclass problem of sub-word recognition, it is redefined as a set of binary descriptor classifiers. The outputs of the trained binary classifiers are combined to generate the sequence of sub-word letters. SVMs are used to learn the binary classifiers. Two specific Arabic databases have been developed to train and test the system; one of them is a database of the Naskh style. The initial results are promising. The system could be trained on other scripts found in Arabic manuscripts.; Comment: 8 pages, 7 figures, 6 tables
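Combining binary descriptor classifiers back into letters can be sketched as nearest-code decoding over the descriptor bits. The codes and the Hamming-distance decision rule below are hypothetical illustrations, not the paper's actual descriptors (which are learned by SVMs over skeleton-based features):

```python
# Hypothetical 3-bit descriptor codes for a few letters; in the actual
# system each bit would be the output of one trained binary SVM.
codes = {
    "B": (1, 0, 0),
    "H": (0, 1, 0),
    "S": (0, 0, 1),
    "K": (1, 1, 0),
}

def decode(bits):
    """Map a tuple of binary-classifier outputs to the letter whose
    descriptor code is nearest in Hamming distance, so a single
    flipped bit can still decode to a sensible letter."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    return min(codes, key=lambda letter: hamming(codes[letter], bits))

print(decode((1, 0, 0)))  # → B (exact code)
print(decode((1, 1, 1)))  # → K (nearest code, one bit away)
```

Splitting one k-way decision into a handful of binary decisions is the standard way to shrink a highly multiclass problem; the trade-off is that decoding must tolerate inconsistent bit patterns, as the Hamming rule does here.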