Format:
1 online resource (120 pages)
Edition:
Also available in print
ISBN:
9781598293098
Series Statement:
Synthesis Lectures on Speech and Audio Processing #4
Content:
In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model (HMM). A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error (MCE), and minimum phone/word error (MPE/MWE). The unification is presented, with rigorous mathematical analysis, in a common rational-function form. This common form enables the use of the growth transformation (GT, or extended Baum-Welch) optimization framework in discriminative learning of model parameters. In addition to the necessary background and tutorial material on the subject, we include technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete HMMs, and continuous-density HMMs (CDHMMs) in discriminative learning. Selected experimental results obtained firsthand by the authors are presented to show that discriminative learning can lead to superior speech recognition performance over conventional parameter learning. Details on major algorithmic implementation issues of practical significance are provided to enable practitioners to reduce the theory in the earlier part of the book directly to engineering practice.
Content:
Introduction and background -- What is discriminative learning? -- What is speech recognition? -- Roles of discriminative learning in speech recognition -- Background: basic probability distributions -- Background: basic optimization concepts and techniques -- Organization of the book -- Statistical speech recognition: a tutorial -- Language modeling -- Acoustic modeling and HMMs -- Discriminative learning: a unified objective function -- A unified discriminative training criterion -- MMI and its unified form -- MCE and its unified form -- Minimum phone/word error and its unified form -- Discussions and comparisons -- Discriminative learning algorithm for exponential-family distributions -- Exponential-family models for classification -- Construction of auxiliary functions -- GT learning for exponential-family distributions -- Estimation formulas for two exponential-family distributions -- Discriminative learning algorithm for hidden Markov model -- Estimation formulas for discrete HMM -- Estimation formulas for CDHMM -- Relationship with gradient-based methods -- Setting constant D for GT-based optimization -- Practical implementation of discriminative learning -- Computing Δγ(i, r, t) in growth-transform formulas -- Computing Δγ(i, r, t) using lattices -- Arbitrary exponent scaling in MCE implementation -- Arbitrary slope in defining MCE cost function -- Selected experimental results -- Experimental results on small ASR tasks (TIDIGITS) -- Telephony LV-ASR applications -- Epilogue -- Summary of book contents -- Summary of contributions -- Remaining theoretical issue and future direction
Note:
Description based upon print version of record
Discriminative Learning for Speech Recognition; ABSTRACT; Keywords; Contents; Chapter 1; 1.1 WHAT IS DISCRIMINATIVE LEARNING?; 1.2 WHAT IS SPEECH RECOGNITION?; 1.3 ROLES OF DISCRIMINATIVE LEARNING IN SPEECH RECOGNITION; 1.4 BACKGROUND: BASIC PROBABILITY DISTRIBUTIONS; 1.4.1 Multinomial Distribution; 1.4.2 Gaussian and Mixture-of-Gaussian Distributions; 1.4.3 Exponential-Family Distribution; 1.5 BACKGROUND: BASIC OPTIMIZATION CONCEPTS AND TECHNIQUES; 1.5.1 Basic Definitions; 1.5.2 Necessary and Sufficient Conditions for an Optimum; 1.5.3 Lagrange Multiplier Method for Constrained Optimization; 1.5.4 Gradient Descent Method; 1.5.5 Growth Transformation Method: Introduction; 1.6 ORGANIZATION OF THE BOOK; Chapter 2; 2.1 INTRODUCTION; 2.2 LANGUAGE MODELING; 2.3 ACOUSTIC MODELING AND HMMs; Chapter 3; 3.1 INTRODUCTION; 3.2 A UNIFIED DISCRIMINATIVE TRAINING CRITERION; 3.2.1 Notations; 3.2.2 The Central Result; 3.3 MMI AND ITS UNIFIED FORM; 3.3.1 Introduction to MMI Criterion; 3.3.2 Reformulation of the MMI Criterion Into Its Unified Form; 3.4 MCE AND ITS UNIFIED FORM; 3.4.1 Introduction to the MCE Criterion; 3.4.2 Reformulation of the MCE Criterion Into Its Unified Form; 3.5 MPE/MWE AND ITS UNIFIED FORM; 3.5.1 Introduction to the MPE/MWE Criterion; 3.5.2 Reformulation of the MPE/MWE Criterion Into Its Unified Form; 3.6 DISCUSSIONS AND COMPARISONS; 3.6.1 Discussion and Elaboration on the Unified Form; 3.6.2 Comparisons With Another Unifying Framework; Chapter 4; 4.1 EXPONENTIAL-FAMILY MODELS FOR CLASSIFICATION; 4.2 CONSTRUCTION OF AUXILIARY FUNCTIONS; 4.3 GT LEARNING FOR EXPONENTIAL-FAMILY DISTRIBUTIONS; 4.4 ESTIMATION FORMULAS FOR TWO EXPONENTIAL-FAMILY DISTRIBUTIONS; 4.4.1 Multinomial Distribution; 4.4.2 Multivariate Gaussian Distribution; Chapter 5; 5.1 ESTIMATION FORMULAS FOR DISCRETE HMM; 5.2 ESTIMATION FORMULAS FOR CDHMM; 5.3 RELATIONSHIP WITH GRADIENT-BASED METHODS; 5.4 SETTING CONSTANT D FOR GT-BASED OPTIMIZATION; 5.4.1 Existence Proof of Finite D in GT Updates for CDHMM; Chapter 6; 6.1 COMPUTING Δγ(i, r, t) IN GROWTH-TRANSFORM FORMULAS; 6.1.1 Product Form of C(s) (for MMI); 6.1.2 Summation Form of C(s) (MCE and MPE/MWE); 6.2 COMPUTING Δγ(i, r, t) USING LATTICES; 6.2.1 Computing Δγ(i, r, t) for MMI Involving Lattices; 6.2.2 Computing Δγ(i, r, t) for MPE/MWE Involving Lattices; 6.2.3 Computing Δγ(i, r, t) for MCE Involving Lattices; 6.3 ARBITRARY EXPONENT SCALING IN MCE IMPLEMENTATION; 6.4 ARBITRARY SLOPE IN DEFINING MCE COST FUNCTION; Chapter 7; 7.1 EXPERIMENTAL RESULTS ON SMALL ASR TASKS (TIDIGITS); 7.2 TELEPHONY LV-ASR APPLICATIONS; Chapter 8; 8.1 SUMMARY OF BOOK CONTENTS; 8.2 SUMMARY OF CONTRIBUTIONS; 8.3 REMAINING THEORETICAL ISSUE AND FUTURE DIRECTION; Major Symbols Used in the Book and Their Descriptions; Mathematical Notation; Bibliography; Author Biography
Mode of access: World Wide Web.
System requirements: Adobe Acrobat Reader.
Additional Edition:
ISBN 9781598293081
Additional Edition:
Print version: Discriminative Learning for Speech Recognition: Theory and Practice
Language:
English
Keywords:
Electronic books
DOI:
10.2200/S00134ED1V01Y200807SAP004