Digital speech transmission and enhancement

Title:

Author:

Vary, Peter, author.

ISBN:

9781119060970

9781119060994

9781119060987

Edition:

Second edition.

Physical Description:

1 online resource

Contents:

Cover -- Title Page -- Copyright -- Contents -- Preface -- Chapter 1 Introduction -- Chapter 2 Models of Speech Production and Hearing -- 2.1 Sound Waves -- 2.2 Organs of Speech Production -- 2.3 Characteristics of Speech Signals -- 2.4 Model of Speech Production -- 2.4.1 Acoustic Tube Model of the Vocal Tract -- 2.4.2 Discrete Time All-Pole Model of the Vocal Tract -- 2.5 Anatomy of Hearing -- 2.6 Psychoacoustic Properties of the Auditory System -- 2.6.1 Hearing and Loudness -- 2.6.2 Spectral Resolution -- 2.6.3 Masking -- 2.6.4 Spatial Hearing -- 2.6.4.1 Head-Related Impulse Responses and Transfer Functions -- 2.6.4.2 Law of The First Wavefront -- References -- Chapter 3 Spectral Transformations -- 3.1 Fourier Transform of Continuous Signals -- 3.2 Fourier Transform of Discrete Signals -- 3.3 Linear Shift Invariant Systems -- 3.3.1 Frequency Response of LSI Systems -- 3.4 The z-transform -- 3.4.1 Relation to Fourier Transform -- 3.4.2 Properties of the ROC -- 3.4.3 Inverse z-Transform -- 3.4.4 z-Transform Analysis of LSI Systems -- 3.5 The Discrete Fourier Transform -- 3.5.1 Linear and Cyclic Convolution -- 3.5.2 The DFT of Windowed Sequences -- 3.5.3 Spectral Resolution and Zero Padding -- 3.5.4 The Spectrogram -- 3.5.5 Fast Computation of the DFT: The FFT -- 3.5.6 Radix-2 Decimation-in-Time FFT -- 3.6 Fast Convolution -- 3.6.1 Fast Convolution of Long Sequences -- 3.6.2 Fast Convolution by Overlap-Add -- 3.6.3 Fast Convolution by Overlap-Save -- 3.7 Analysis-Modification-Synthesis Systems -- 3.8 Cepstral Analysis -- 3.8.1 Complex Cepstrum -- 3.8.2 Real Cepstrum -- 3.8.3 Applications of the Cepstrum -- 3.8.3.1 Construction of Minimum-Phase Sequences -- 3.8.3.2 Deconvolution by Cepstral Mean Subtraction -- 3.8.3.3 Computation of the Spectral Distortion Measure -- 3.8.3.4 Fundamental Frequency Estimation -- References.

5.3.4.1 The Bivariate Uniform Density -- 5.3.4.2 The Bivariate Gaussian Density -- 5.3.5 Functions of Two Random Variables -- 5.4 Probability and Information -- 5.4.1 Entropy -- 5.4.2 Kullback-Leibler Divergence -- 5.4.3 Cross-Entropy -- 5.4.4 Mutual Information -- 5.5 Multivariate Statistics -- 5.5.1 Multivariate Gaussian Distribution -- 5.5.2 Gaussian Mixture Models -- 5.6 Stochastic Processes -- 5.6.1 Stationary Processes -- 5.6.2 Auto-Correlation and Auto-Covariance Functions -- 5.6.3 Cross-Correlation and Cross-Covariance Functions -- 5.6.4 Markov Processes -- 5.6.5 Multivariate Stochastic Processes -- 5.7 Estimation of Statistical Quantities by Time Averages -- 5.7.1 Ergodic Processes -- 5.7.2 Short-Time Stationary Processes -- 5.8 Power Spectrum and its Estimation -- 5.8.1 White Noise -- 5.8.2 The Periodogram -- 5.8.3 Smoothed Periodograms -- 5.8.3.1 Non Recursive Smoothing in Time -- 5.8.3.2 Recursive Smoothing in Time -- 5.8.3.3 Log-Mel Filter Bank Features -- 5.8.4 Power Spectra and Linear Shift-Invariant Systems -- 5.9 Statistical Properties of Speech Signals -- 5.10 Statistical Properties of DFT Coefficients -- 5.10.1 Asymptotic Statistical Properties -- 5.10.2 Signal-Plus-Noise Model -- 5.10.3 Statistics of DFT Coefficients for Finite Frame Lengths -- 5.11 Optimal Estimation -- 5.11.1 MMSE Estimation -- 5.11.2 Estimation of Discrete Random Variables -- 5.11.3 Optimal Linear Estimator -- 5.11.4 The Gaussian Case -- 5.11.5 Joint Detection and Estimation -- 5.12 Non-Linear Estimation with Deep Neural Networks -- 5.12.1 Basic Network Components -- 5.12.1.1 The Perceptron -- 5.12.1.2 Convolutional Neural Network -- 5.12.2 Basic DNN Structures -- 5.12.2.1 Fully-Connected Feed-Forward Network -- 5.12.2.2 Autoencoder Networks -- 5.12.2.3 Recurrent Neural Networks -- 5.12.2.4 Time Delay, Wavenet, and Transformer Networks.

5.12.2.5 Training of Neural Networks -- 5.12.2.6 Stochastic Gradient Descent (SGD) -- 5.12.2.7 Adaptive Moment Estimation Method (ADAM) -- References -- Chapter 6 Linear Prediction -- 6.1 Vocal Tract Models and Short-Term Prediction -- 6.1.1 All-Zero Model -- 6.1.2 All-Pole Model -- 6.1.3 Pole-Zero Model -- 6.2 Optimal Prediction Coefficients for Stationary Signals -- 6.2.1 Optimum Prediction -- 6.2.2 Spectral Flatness Measure -- 6.3 Predictor Adaptation -- 6.3.1 Block-Oriented Adaptation -- 6.3.1.1 Auto-Correlation Method -- 6.3.1.2 Covariance Method -- 6.3.1.3 Levinson-Durbin Algorithm -- 6.3.2 Sequential Adaptation -- 6.4 Long-Term Prediction -- References -- Chapter 7 Quantization -- 7.1 Analog Samples and Digital Representation -- 7.2 Uniform Quantization -- 7.3 Non-uniform Quantization -- 7.4 Optimal Quantization -- 7.5 Adaptive Quantization -- 7.6 Vector Quantization -- 7.6.1 Principle -- 7.6.2 The Complexity Problem -- 7.6.3 Lattice Quantization -- 7.6.4 Design of Optimal Vector Code Books -- 7.6.5 Gain-Shape Vector Quantization -- 7.7 Quantization of the Predictor Coefficients -- 7.7.1 Scalar Quantization of the LPC Coefficients -- 7.7.2 Scalar Quantization of the Reflection Coefficients -- 7.7.3 Scalar Quantization of the LSF Coefficients -- References -- Chapter 8 Speech Coding -- 8.1 Speech-Coding Categories -- 8.2 Model-Based Predictive Coding -- 8.3 Linear Predictive Waveform Coding -- 8.3.1 First-Order DPCM -- 8.3.2 Open-Loop and Closed-Loop Prediction -- 8.3.3 Quantization of the Residual Signal -- 8.3.3.1 Quantization with Open-Loop Prediction -- 8.3.3.2 Quantization with Closed-Loop Prediction -- 8.3.3.3 Spectral Shaping of the Quantization Error -- 8.3.4 ADPCM with Sequential Adaptation -- 8.4 Parametric Coding -- 8.4.1 Vocoder Structures -- 8.4.2 LPC Vocoder -- 8.5 Hybrid Coding -- 8.5.1 Basic Codec Concepts.

8.5.1.1 Scalar Quantization of the Residual Signal -- 8.5.1.2 Vector Quantization of the Residual Signal -- 8.5.2 Residual Signal Coding: RELP -- 8.5.3 Analysis by Synthesis: CELP -- 8.5.3.1 Principle -- 8.5.3.2 Fixed Code Book -- 8.5.3.3 Long-Term Prediction, Adaptive Code Book -- 8.6 Adaptive Postfiltering -- 8.7 Speech Codec Standards: Selected Examples -- 8.7.1 GSM Full-Rate Codec -- 8.7.2 EFR Codec -- 8.7.3 Adaptive Multi-Rate Narrowband Codec (AMR-NB) -- 8.7.4 ITU-T/G.722: 7 kHz Audio Coding within 64 kbit/s -- 8.7.5 Adaptive Multi-Rate Wideband Codec (AMR-WB) -- 8.7.6 Codec for Enhanced Voice Services (EVS) -- 8.7.7 Opus Codec IETF RFC 6716 -- References -- Chapter 9 Concealment of Erroneous or Lost Frames -- 9.1 Concepts for Error Concealment -- 9.1.1 Error Concealment by Hard Decision Decoding -- 9.1.2 Error Concealment by Soft Decision Decoding -- 9.1.3 Parameter Estimation -- 9.1.3.1 MAP Estimation -- 9.1.3.2 MS Estimation -- 9.1.4 The A Posteriori Probabilities -- 9.1.4.1 The A Priori Knowledge -- 9.1.4.2 The Parameter Distortion Probabilities -- 9.1.5 Example: Hard Decision vs. Soft Decision -- 9.2 Examples of Error Concealment Standards -- 9.2.1 Substitution and Muting of Lost Frames -- 9.2.2 AMR Codec: Substitution and Muting of Lost Frames -- 9.2.3 EVS Codec: Concealment of Lost Packets -- 9.3 Further Improvements -- References -- Chapter 10 Bandwidth Extension of Speech Signals -- 10.1 BWE Concepts -- 10.2 BWE using the Model of Speech Production -- 10.2.1 Extension of the Excitation Signal -- 10.2.2 Spectral Envelope Estimation -- 10.2.2.1 Minimum Mean Square Error Estimation -- 10.2.2.2 Conditional Maximum A Posteriori Estimation -- 10.2.2.3 Extensions -- 10.2.2.4 Simplifications -- 10.2.3 Energy Envelope Estimation -- 10.3 Speech Codecs with Integrated BWE -- 10.3.1 BWE in the GSM Full-Rate Codec.

Summary:

DIGITAL SPEECH TRANSMISSION AND ENHANCEMENT

Enables readers to understand the latest developments in speech enhancement/transmission due to advances in computational power and device miniaturization.

The Second Edition of Digital Speech Transmission and Enhancement has been updated throughout to provide all the necessary details on the latest advances in the theory and practice in speech signal processing and its applications, including many new research results, standards, algorithms, and developments which have recently appeared and are on their way into state-of-the-art applications. Besides mobile communications, which constituted the main application domain of the first edition, speech enhancement for hearing instruments and man-machine interfaces has gained significantly more prominence in the past decade, and as such receives greater focus in this updated and expanded second edition.

Readers can expect to find information and novel methods on:

- Low-latency spectral analysis-synthesis, single-channel and dual-channel algorithms for noise reduction and dereverberation
- Multi-microphone processing methods, which are now widely used in applications such as mobile phones, hearing aids, and man-computer interfaces
- Algorithms for near-end listening enhancement, which provide a significantly increased speech intelligibility for users at the noisy receiving side of their mobile phone
- Fundamentals of speech signal processing, estimation and machine learning, speech coding, error concealment by soft decoding, and artificial bandwidth extension of speech signals

Digital Speech Transmission and Enhancement is a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology, and as such is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.

Notes:

John Wiley and Sons

Subject Terms:

Digital communications.

Transmission numérique.

Added Author:

Martin, Rainer,

Electronic Access:

https://onlinelibrary.wiley.com/doi/book/10.1002/9781119060970

Library	Material Type	Item Number	Call Number	Status/Due Date	Hold Status
Online Library	E-Book	598745-1001	TK5103.7 .V37 2024	Unknown	On shelf