Ministry of Education and Science of the Republic of Kazakhstan
Abai Kazakh National Pedagogical University
Askerbay Zaure
DIPLOMA PAPER
Innovation in teaching foreign languages
Specialization: 5B01900- “Foreign language: two foreign languages”
Almaty, 2013
Ministry of Education and Science of the Republic of Kazakhstan
Abai Kazakh National Pedagogical University
“The diploma paper is admitted to defense”
DIPLOMA PAPER
THEME: Innovation of methods in foreign language teaching
Specialization: 5B01900 – “Foreign language: two foreign languages”
Done by
Supervisor
Consultant
Almaty, 2013
CONTENT
INTRODUCTION…………………………………………………………………3
I. PRINCIPLES OF ASR TECHNOLOGY………………………………………3
1.1 Performance and design issues in speech applications……………………...9
1.2 Current trends in voice-interactive CALL…………………………………..10
1.3 Teaching linguistic structures and limited conversation……………………13
1.4 Future trends in voice-interactive CALL…………………………………...14
1.5 Defining and acquiring literacy in the age of information…………………16
1.6 Efficiency of using presentation techniques in teaching foreign languages…22
1.7 Presentation techniques directed at developing listening and reading skills…25
1.8 Using video in the foreign language classroom……………………………27
1.9 Motivation and cultural studies…………………………………………….29
II. USAGE OF MODERN TECHNOLOGY IN ENGLISH LESSONS…………
2.1 How to teach foreign languages (general remarks)………………………..33
2.2 Comparing instructed and natural settings for language learning………….33
2.3 Classroom comparisons…………………………………………………….37
2.4 Five principles for classroom teaching……………………………………..41
2.5 The principle of saying what you mean and meaning what you say………45
2.6 Teach what is teachable…………………………………………………….49
2.7 Grammar acquisition: Focusing on past tenses and conditionals…………..58
CONCLUSION………………………………………………………………….62
REFERENCES…………………………………………………………………
INTRODUCTION
During the past two decades, the exercise of spoken language skills has received increasing attention among educators. Foreign language curricula focus on productive skills with special emphasis on communicative competence. Students' ability to engage in meaningful conversational interaction in the target language is considered an important, if not the most important, goal of second language education. This shift of emphasis has generated a growing need for instructional materials that provide an opportunity for controlled interactive speaking practice outside the classroom.
Actuality of the research. With recent advances in multimedia technology, computer-aided language learning (CALL) has emerged as a tempting alternative to traditional modes of supplementing or replacing direct student-teacher interaction, such as the language laboratory or audio-tape-based self-study. The integration of sound, voice interaction, text, video, and animation has made it possible to create self-paced interactive learning environments that promise to enhance the classroom model of language learning significantly. A growing number of textbook publishers now offer educational software of some sort, and educators can choose among a large variety of different products. Yet, the practical impact of CALL in the field of foreign language education has been rather modest. Many educators are reluctant to embrace a technology that still seeks acceptance by the language teaching community as a whole (Kenning & Kenning, 1990).
The aims of the diploma work:
- to examine the role of teaching information while teaching a language;
- to define the role of different kinds of information in language lessons;
- to characterize the views of different linguists on the problem of using technology;
- to describe various types of information technology;
- to find out useful ways of using technology while teaching English.
The novelty of the work:
- the role of teaching information through games is described;
- the newest materials by linguists published on the Internet have been analyzed.
Practical significance of the work is that it can be used:
- in higher schools and in linguistic scientific circles, by teachers and philologists, as up-to-date material for writing research works dealing with the use of information technology;
- by teachers of English at schools, lyceums, and colleges as a practical manual for teaching with information technology.
Theoretical value of the work lies in the results and methods offered in it.
Methods of scientific investigation used within the work:
The main methods used in the work are the descriptive method (applied to grammar teaching methods), comparative analysis, and statistical analysis.
Linguists who have worked on the theme:
The works of Abbott G., Azar B. Sh., Horwitz E. K., Lee Su Kim, and others served as the basis for this qualification work.
Structure and content of the work:
The present diploma work consists of an introduction, two main parts (theoretical and practical), a conclusion, and references. The introduction, which includes three items, gives a brief description of the qualification work: its actuality, practical significance, and fields of application, as well as the role of games in language lessons. The main (theoretical) part discusses such problems as the adequacy of using innovative technology and its advantages. The practical part describes different types of methods of using technology and includes the worksheets needed for applying these methods.
I. PRINCIPLES OF INFORMATIONAL TECHNOLOGY
A number of reasons have been cited for the limited practical impact of computer-based language instruction. Among them are the lack of a unified theoretical framework for designing and evaluating CALL systems (Chapelle, 1997; Hubbard, 1988; Ng & Olivier, 1987); the absence of conclusive empirical evidence for the pedagogical benefits of computers in language learning (Chapelle, 1997; Dunkel, 1991; Salaberry, 1996); and finally, the current limitations of the technology itself (Holland, 1995; Warschauer, 1996).
The rapid technological advances of the 1980s have raised both the expectations and the demands placed on the computer as a potential learning tool. Educators and second language acquisition (SLA) researchers alike are now demanding intelligent, user-adaptive CALL systems that offer not only sophisticated diagnostic tools, but also effective feedback mechanisms capable of focusing the learner on areas that need remedial practice.
As Warschauer puts it, a computerized language teacher should be able to understand a user's spoken input and evaluate it not just for correctness but also for appropriateness. It should be able to diagnose a student's problems with pronunciation, syntax, or usage, and then intelligently decide among a range of options (e.g., repeating, paraphrasing, slowing down, correcting, or directing the student to background explanations). (Warschauer, 1996, p. 6)
Salaberry (1996) demands nothing short of a system capable of simulating the complex socio-communicative competence of a live tutor--in other words, the linguistic intelligence of a human--only to conclude that the attempt to create an "intelligent language tutoring system is a fallacy" (p. 11). The implicit argument runs: because speech technology isn't perfect, it is of no use at all. If it "cannot account for the full complexity of human language," why even bother modeling more constrained aspects of language use (Higgins, 1988, p. vii)? This sort of all-or-nothing reasoning seems symptomatic of much of the recent pedagogical literature on CALL. The quest for a theoretical grounding of CALL system design and evaluation (Chapelle, 1997) tends to lead to exaggerated expectations as to what the technology ought to accomplish. When these expectations are combined with little or no knowledge of the underlying technology, the inevitable result is disappointment [7, 112p.].
Consider the following four scenarios: (1) a court reporter produces a verbatim written transcript of courtroom proceedings; (2) a physician dictates patient information into a medical dictation device, which returns a written transcript; (3) a student reads aloud to an automated reading tutor, which listens and responds to the student's reading errors; (4) little Jimmy is asked to go and fetch a particular pair of shoes described to him.
At some level, all four scenarios involve speech recognition. An incoming speech signal elicits a response from a "listener." In the first two instances, the response consists of a written transcript of the spoken input, whereas in the latter two cases, an action is performed in response to a spoken command. In all four cases, the "success" of the voice interaction is relative to a given task as embodied in a set of expectations that accompany the input. The interaction succeeds when the response--by a machine or human "listener"--matches these expectations.
Recognizing and understanding human speech requires a considerable amount of linguistic knowledge: a command of the phonological, lexical, semantic, grammatical, and pragmatic conventions that constitute a language. The listener's command of the language must be "up" to the recognition task or else the interaction fails. Jimmy returns with the wrong items, because he cannot yet verbally discriminate between different kinds of shoes.
Likewise, the reading tutor would fail miserably at performing the court reporter's job or at transcribing medical patient information, just as the medical dictation device would be a poor choice for diagnosing a student's reading errors. The human court reporter, on the other hand--assuming he or she is an adult native speaker--would have no problem performing any of the tasks mentioned under (1) through (4). The linguistic competence of an adult native speaker covers a broad range of recognition tasks and communicative activities. Computers, by contrast, perform best when designed to operate in clearly circumscribed linguistic sub-domains [9, 117p.].
Humans and machines process speech in fundamentally different ways (Bernstein & Franco, 1996). Complex cognitive processes account for the human ability to associate acoustic signals with meanings and intentions. For a computer, on the other hand, speech is essentially a series of digital values. However, despite these differences, the core problem of speech recognition is the same for both humans and machines: namely, finding the best match between a given speech sound and its corresponding word string. Automatic speech recognition technology attempts to simulate and optimize this process computationally.
Since the early 1970s, a number of different approaches to ASR have been proposed and implemented, including Dynamic Time Warping, template matching, knowledge-based expert systems, neural nets, and Hidden Markov Modeling (HMM) (Levinson & Liberman, 1981; Weinstein, McCandless, Mondshein, & Zue, 1975; for a review, see Bernstein & Franco, 1996). HMM-based modeling applies sophisticated statistical and probabilistic computations to the problem of pattern matching at the sub-word level. The generalized HMM-based approach to speech recognition has proven an effective, if not the most effective, method for creating high-performance speaker-independent recognition engines that can cope with large vocabularies; the vast majority of today's commercial systems deploy this technique.
Therefore, we focus our technical discussion on an explanation of this technique.
An HMM-based speech recognizer consists of five basic components: (a) an acoustic signal analyzer which computes a spectral representation of the incoming speech; (b) a set of phone models (HMMs) trained on large amounts of actual speech data; (c) a lexicon for converting sub-word phone sequences into words; (d) a statistical language model or grammar network that defines the recognition task in terms of legitimate word combinations at the sentence level; (e) a decoder, which is a search algorithm for computing the best match between a spoken utterance and its corresponding word string.
Figure 1. Components of a speech recognition device
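For orientation, the following purely illustrative Python sketch (all names are invented and do not come from any real recognition engine) mirrors this five-component architecture: each attribute corresponds to one of the parts (a) through (e), and recognize() indicates where each component enters the recognition process.

# Structural sketch only: the five components of an HMM-based recognizer
# represented as plain Python attributes. In a real engine each part is a
# trained statistical model, not a toy Python object.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class SpeechRecognizer:
    # (a) signal analyzer: raw waveform -> sequence of acoustic feature frames
    signal_analyzer: Callable[[List[float]], List[List[float]]]
    # (b) phone models: one trained HMM per phone (placeholders here)
    phone_models: Dict[str, object]
    # (c) lexicon: orthographic word -> phonetic spelling
    lexicon: Dict[str, List[str]]
    # (d) language model: plausibility score for a candidate word sequence
    language_model: Callable[[List[str]], float]
    # (e) decoder: search for the word string that best matches the frames
    decoder: Callable[..., List[str]]

    def recognize(self, waveform: List[float]) -> List[str]:
        frames = self.signal_analyzer(waveform)
        return self.decoder(frames, self.phone_models,
                            self.lexicon, self.language_model)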
A. Signal Analysis
The first step in automatic speech recognition consists of analyzing the incoming speech signal. When a person speaks into an ASR device--usually through a high-quality noise-canceling microphone--the computer samples the analog input into a series of 16- or 8-bit values at a particular sampling frequency (ranging from 8 to 22 kHz). These values are grouped together in predetermined overlapping temporal intervals called "frames." These numbers provide a precise description of the speech signal's amplitude. In a second step, a number of acoustically relevant parameters such as energy, spectral features, and pitch information are extracted from the speech signal. During training, this information is used to model that particular portion of the speech signal. During recognition, this information is matched against the pre-existing model of the signal.
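To make this step concrete, here is a minimal illustrative sketch (in Python with the numpy library; the frame and hop durations are typical values, not fixed requirements) that cuts a sampled signal into overlapping frames and computes one simple acoustic parameter, the log energy, per frame. Real front ends extract richer spectral features, but they do so in the same frame-by-frame fashion.

# Frame a sampled waveform into short overlapping windows and compute
# one acoustic parameter (log energy) per frame.
import numpy as np

def frame_signal(samples, sample_rate=16000, frame_ms=25, hop_ms=10):
    frame_len = int(sample_rate * frame_ms / 1000)   # e.g., 400 samples
    hop_len = int(sample_rate * hop_ms / 1000)       # e.g., 160 samples
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, hop_len)]
    return np.array(frames)

def log_energy(frames):
    # One value per frame; a small floor avoids log(0) for silent frames.
    return np.log(np.maximum(np.sum(frames ** 2, axis=1), 1e-10))

# Example: one second of synthetic "speech" sampled at 16 kHz.
signal = np.random.randn(16000).astype(np.float32)
frames = frame_signal(signal)
print(frames.shape, log_energy(frames)[:3])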
B. Phone Models
Training a machine to recognize spoken language amounts to modeling the basic sounds of speech (phones). Automatic speech recognition strings together these models to form words. Recognizing an incoming speech signal involves matching the observed acoustic sequence with a set of HMM models. An HMM can model either phones or other sub-word units or it can model words or even whole sentences. Phones are either modeled as individual sounds--so-called monophones--or as phone combinations that model several phones and the transitions between them (biphones or triphones). After comparing the incoming acoustic signal with the HMMs representing the sounds of language, the system computes a hypothesis based on the sequence of models that most closely resembles the incoming signal. The HMM model for each linguistic unit (phone or word) contains a probabilistic representation of all the possible pronunciations for that unit--just as the model of the handwritten cursive b would have many different representations. Building HMMs--a process called training--requires a large amount of speech data of the type the system is expected to recognize.
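To illustrate the difference between monophone and context-dependent units, the short sketch below (the left-phone+right labeling convention follows common practice; the function name is invented) expands a word's phone sequence into triphone labels, each of which would correspond to one trained HMM.

# Expand a monophone sequence into context-dependent triphone labels of the
# form left-phone+right ("sil" marks silence at the word boundaries).
# Illustrative only; naming conventions differ between toolkits.
def to_triphones(phones, boundary="sil"):
    padded = [boundary] + list(phones) + [boundary]
    return [f"{padded[i-1]}-{padded[i]}+{padded[i+1]}"
            for i in range(1, len(padded) - 1)]

print(to_triphones(["k", "ae", "t"]))
# ['sil-k+ae', 'k-ae+t', 'ae-t+sil'] -- three triphone HMMs instead of
# the three monophone HMMs 'k', 'ae', 't'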
Large-vocabulary speaker-independent continuous dictation systems are typically trained on tens of thousands of read utterances by a cross-section of the population, including members of different dialect regions and age-groups. As a general rule, an automatic speech recognizer cannot correctly process speech that differs in kind from the speech it has been trained on.
This is why most commercial dictation systems, when trained on standard American English, perform poorly when encountering accented speech, whether by non-native speakers or by speakers of different dialects. We will return to this point in our discussion of voice-interactive CALL applications.
C. Lexicon
The lexicon, or dictionary, contains the phonetic spelling for all the words that are expected to be observed by the recognizer. It serves as a reference for converting the phone sequence determined by the search algorithm into a word. It must be carefully designed to cover the entire lexical domain in which the system is expected to perform. If the recognizer encounters a word it does not "know" (i.e., a word not defined in the lexicon), it will either choose the closest match or return an out-of-vocabulary recognition error.
Whether a recognition error is registered as a misrecognition or an out-of-vocabulary error depends in part on the vocabulary size. If, for example, the vocabulary is too small for an unrestricted dictation task--let's say less than 3K--the out-of-vocabulary errors are likely to be very high. If the vocabulary is too large, the chance of misrecognition errors increases because with more similar-sounding words, the confusability increases. The vocabulary size in most commercial dictation systems tends to vary between 5K and 60K.
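Conceptually, such a lexicon is a mapping from orthographic words to phonetic spellings. The toy sketch below (entries and names are invented for illustration) shows a lookup that either returns a word's phone sequence or signals an out-of-vocabulary error, mirroring the behavior described above.

# Toy pronunciation lexicon: word -> phonetic spelling. A real dictionary
# holds thousands of entries, often with several pronunciation variants.
LEXICON = {
    "bear":     ["b", "eh", "r"],
    "bare":     ["b", "eh", "r"],      # homophones share a phonetic spelling
    "attacked": ["ah", "t", "ae", "k", "t"],
    "him":      ["hh", "ih", "m"],
}

class OutOfVocabularyError(KeyError):
    pass

def phonetic_spelling(word):
    try:
        return LEXICON[word.lower()]
    except KeyError:
        # A real recognizer would instead pick the closest in-vocabulary
        # match or emit an out-of-vocabulary token at this point.
        raise OutOfVocabularyError(word)

print(phonetic_spelling("bear"))            # ['b', 'eh', 'r']
try:
    phonetic_spelling("wolverine")
except OutOfVocabularyError as err:
    print("out-of-vocabulary word:", err)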
D. The Language Model
The language model predicts the most likely continuation of an utterance on the basis of statistical information about the frequency with which word sequences occur on average in the language to be recognized. For example, the word sequence A bare attacked him will have a very low probability in any language model based on standard English usage, whereas the sequence A bear attacked him will have a higher probability of occurring.
Thus the language model helps constrain the recognition hypothesis produced on the basis of the acoustic decoding just as the context helps decipher an unintelligible word in a handwritten note. Like the HMMs, an efficient language model must be trained on large amounts of data, in this case texts collected from the target domain.
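The idea can be illustrated with a toy bigram model estimated from a few invented sentences: "A bear attacked him" receives a noticeably higher score than "A bare attacked him" simply because the pair "bear attacked" occurs in the training text while "bare attacked" does not.

# Toy bigram language model: probabilities are estimated from word-pair
# counts in a tiny invented corpus, and unseen pairs receive a small floor
# probability. Real models are trained on millions of words and smoothed
# far more carefully.
from collections import Counter

corpus = ("a bear attacked him . the bear ran away . "
          "his feet were bare . a bare hill stood there .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev, word, floor=1e-4):
    if bigrams[(prev, word)] == 0:
        return floor
    return bigrams[(prev, word)] / unigrams[prev]

def sentence_score(sentence):
    words = sentence.lower().split()
    score = 1.0
    for prev, word in zip(words, words[1:]):
        score *= bigram_prob(prev, word)
    return score

print(sentence_score("a bear attacked him"))   # comparatively high
print(sentence_score("a bare attacked him"))   # near zero: 'bare attacked' unseen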
In ASR applications with constrained lexical domain and/or simple task definition, the language model consists of a grammatical network that defines the possible word sequences to be accepted by the system without providing any statistical information. This type of design is suitable for CALL applications in which the possible word combinations and phrases are known in advance and can be easily anticipated (e.g., based on user data collected with a system pre-prototype). Because of the a priori constraining function of a grammar network, applications with clearly defined task grammars tend to perform at much higher accuracy rates than the quality of the acoustic recognition would suggest.
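For a CALL exercise with a small, predictable set of answers, such a grammar network may amount to little more than an explicit list of licensed word sequences. The hypothetical sketch below enumerates a handful of acceptable answers to a question such as "What did you do yesterday?" and rejects any recognition hypothesis that falls outside the task grammar.

# A minimal task grammar for a hypothetical CALL exercise: the recognizer is
# only expected to accept a small set of anticipated answers, so the grammar
# simply enumerates the legal word sequences instead of assigning statistics.
from itertools import product

SUBJECTS = ["i"]
VERBS = ["watched", "read", "played"]
OBJECTS = ["a film", "a book", "football"]

TASK_GRAMMAR = {f"{s} {v} {o}" for s, v, o in product(SUBJECTS, VERBS, OBJECTS)}

def accepted(hypothesis):
    """Return True if the recognized word string is licensed by the grammar."""
    return hypothesis.lower().strip() in TASK_GRAMMAR

print(accepted("I watched a film"))     # True
print(accepted("I visited my uncle"))   # False: outside the task grammar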
1.1 Performance and design issues in speech applications
For educators and developers interested in deploying ASR in CALL applications, perhaps the most important consideration is recognition performance: How good is the technology? Is it ready to be deployed in language learning? These questions cannot be answered except with reference to particular applications of the technology, and therefore touch on a key issue in ASR development: the issue of human-machine interface design.
As we recall, speech recognition performance is always domain specific--a machine can only do what it is programmed to do, and a recognizer with models trained to recognize business news dictation under laboratory conditions will be unable to handle spontaneous conversational speech transmitted over noisy telephone channels. The question that needs to be answered is therefore not simply "How good is ASR technology?" but rather, "What do we want to use it for?" and "How do we get it to perform the task?"
In the following section, we will address the issue of system performance as it relates to a number of successful commercial speech applications. By emphasizing the distinction between recognizer performance on the one hand--understood in terms of "raw" recognition accuracy--and system performance on the other, we suggest how the latter can be optimized within an overall design that takes into account not only the factors that affect recognizer performance as such, but also, and perhaps even more importantly, considerations of human-machine interface design [12, 34].