Parametric time-frequency domain spatial audio

Author: Ville Pulkki; Symeon Delikaris-Manias; Archontis Politis
Publisher: Hoboken, NJ : John Wiley & Sons, Inc., [2018] ©2018
Edition/Format: eBook : Document : English
"A comprehensive guide that addresses the theory and practice of spatial audio. This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications."

Genre/Form: Electronic books
Additional Physical Format: Print version:
Parametric time-frequency domain spatial audio.
Hoboken, NJ, USA : John Wiley & Sons, Inc., 2018
(DLC) 2017020532
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Ville Pulkki; Symeon Delikaris-Manias; Archontis Politis
ISBN: 9781119252610 111925261X 1119252598 9781119252597
OCLC Number: 1031279282
Description: 1 online resource (1 volume) : illustrations
Contents:
List of Contributors
Preface
About the Companion Website

Part I Analysis and Synthesis of Spatial Sound

1 Time-Frequency Processing: Methods and Tools / Juha Vilkamo and Tom Backstrom
  1.1 Introduction
  1.2 Time-Frequency Processing
    1.2.1 Basic Structure
    1.2.2 Uniform Filter Banks
    1.2.3 Prototype Filters and Modulation
    1.2.4 A Robust Complex-Modulated Filter Bank, and Comparison with STFT
    1.2.5 Overlap-Add and Windowing
    1.2.6 Example Implementation of a Robust Filter Bank in Matlab
    1.2.7 Cascaded Filters
  1.3 Processing of Spatial Audio
    1.3.1 Stochastic Estimates
    1.3.2 Decorrelation
    1.3.3 Optimal and Generalized Solution for Spatial Sound Processing Using Covariance Matrices
  References

2 Spatial Decomposition by Spherical Array Processing / David Lou Alon and Boaz Rafaely
  2.1 Introduction
  2.2 Sound Field Measurement by a Spherical Array
  2.3 Array Processing and Plane-Wave Decomposition
  2.4 Sensitivity to Noise and Standard Regularization Methods
  2.5 Optimal Noise-Robust Design
    2.5.1 PWD Estimation Error Measure
    2.5.2 PWD Error Minimization
    2.5.3 R-PWD Simulation Study
  2.6 Spatial Aliasing and High Frequency Performance Limit
  2.7 High Frequency Bandwidth Extension by Aliasing Cancellation
    2.7.1 Spatial Aliasing Error
    2.7.2 AC-PWD Simulation Study
  2.8 High Performance Broadband PWD Example
    2.8.1 Broadband Measurement Model
    2.8.2 Minimizing Broadband PWD Error
    2.8.3 BB-PWD Simulation Study
  2.9 Summary
  2.10 Acknowledgment
  References

3 Sound Field Analysis Using Sparse Recovery / Craig T. Jin, Nicolas Epain, and Tahereh Noohi
  3.1 Introduction
  3.2 The Plane-Wave Decomposition Problem
    3.2.1 Sparse Plane-Wave Decomposition
    3.2.2 The Iteratively Reweighted Least-Squares Algorithm
  3.3 Bayesian Approach to Plane-Wave Decomposition
  3.4 Calculating the IRLS Noise-Power Regularization Parameter
    3.4.1 Estimation of the Relative Noise Power
  3.5 Numerical Simulations
  3.6 Experiment: Echoic Sound Scene Analysis
  3.7 Conclusions
  Appendix
  References

Part II Reproduction of Spatial Sound

4 Overview of Time-Frequency Domain Parametric Spatial Audio Techniques / Archontis Politis, Symeon Delikaris-Manias, and Ville Pulkki
  4.1 Introduction
  4.2 Parametric Processing Overview
    4.2.1 Analysis Principles
    4.2.2 Synthesis Principles
    4.2.3 Spatial Audio Coding and Up-Mixing
    4.2.4 Spatial Sound Recording and Reproduction
    4.2.5 Auralization of Measured Room Acoustics and Spatial Rendering of Room Impulse Responses
  References

5 First-Order Directional Audio Coding (DirAC) / Ville Pulkki, Archontis Politis, Mikko-Ville Laitinen, Juha Vilkamo, and Jukka Ahonen
  5.1 Representing Spatial Sound with First-Order B-Format Signals
  5.2 Some Notes on the Evolution of the Technique
  5.3 DirAC with Ideal B-Format Signals
  5.4 Analysis of Directional Parameters with Real Microphone Setups
    5.4.1 DOA Analysis with Open 2D Microphone Arrays
    5.4.2 DOA Analysis with 2D Arrays with a Rigid Baffle
    5.4.3 DOA Analysis in Underdetermined Cases
    5.4.4 DOA Analysis: Further Methods
    5.4.5 Effect of Spatial Aliasing and Microphone Noise on the Analysis of Diffuseness
  5.5 First-Order DirAC with Monophonic Audio Transmission
  5.6 First-Order DirAC with Multichannel Audio Transmission
    5.6.1 Stream-Based Virtual Microphone Rendering
    5.6.2 Evaluation of Virtual Microphone DirAC
    5.6.3 Discussion of Virtual Microphone DirAC
    5.6.4 Optimized DirAC Synthesis
    5.6.5 DirAC-Based Reproduction of Spaced-Array Recordings
  5.7 DirAC Synthesis for Headphones and for Hearing Aids
    5.7.1 Reproduction of B-Format Signals
    5.7.2 DirAC in Hearing Aids
  5.8 Optimizing the Time-Frequency Resolution of DirAC for Critical Signals
  5.9 Example Implementation
    5.9.1 Executing DirAC and Plotting Parameter History
    5.9.2 DirAC Initialization
    5.9.3 DirAC Runtime
    5.9.4 Simplistic Binaural Synthesis of Loudspeaker Listening
  5.10 Summary
  References

6 Higher-Order Directional Audio Coding / Archontis Politis and Ville Pulkki
  6.1 Introduction
  6.2 Sound Field Model
  6.3 Energetic Analysis and Estimation of Parameters
    6.3.1 Analysis of Intensity and Diffuseness in the Spherical Harmonic Domain
    6.3.2 Higher-Order Energetic Analysis
    6.3.3 Sector Profiles
  6.4 Synthesis of Target Setup Signals
    6.4.1 Loudspeaker Rendering
    6.4.2 Binaural Rendering
  6.5 Subjective Evaluation
  6.6 Conclusions
  References

7 Multi-Channel Sound Acquisition Using a Multi-Wave Sound Field Model / Oliver Thiergart and Emanuel Habets
  7.1 Introduction
  7.2 Parametric Sound Acquisition and Processing
    7.2.1 Problem Formulation
    7.2.2 Principal Estimation of the Target Signal
  7.3 Multi-Wave Sound Field and Signal Model
    7.3.1 Direct Sound Model
    7.3.2 Diffuse Sound Model
    7.3.3 Noise Model
  7.4 Direct and Diffuse Signal Estimation
    7.4.1 Estimation of the Direct Signal Ys(k, n)
    7.4.2 Estimation of the Diffuse Signal Yd(k, n)
  7.5 Parameter Estimation
    7.5.1 Estimation of the Number of Sources
    7.5.2 Direction of Arrival Estimation
    7.5.3 Microphone Input PSD Matrix
    7.5.4 Noise PSD Estimation
    7.5.5 Diffuse Sound PSD Estimation
    7.5.6 Signal PSD Estimation in Multi-Wave Scenarios
  7.6 Application to Spatial Sound Reproduction
    7.6.1 State of the Art
    7.6.2 Spatial Sound Reproduction Based on Informed Spatial Filtering
  7.7 Summary
  References

8 Adaptive Mixing of Excessively Directive and Robust Beamformers for Reproduction of Spatial Sound / Symeon Delikaris-Manias and Juha Vilkamo
  8.1 Introduction
  8.2 Notation and Signal Model
  8.3 Overview of the Method
  8.4 Loudspeaker-Based Spatial Sound Reproduction
    8.4.1 Estimation of the Target Covariance Matrix Cy
    8.4.2 Estimation of the Synthesis Beamforming Signals Ws
    8.4.4 Processing the Synthesis Signals (Wsx) to Obtain the Target Covariance Matrix Cy
    Spatial Energy Distribution
    8.4.5 Listening Tests
  8.5 Binaural-Based Spatial Sound Reproduction
    8.5.1 Estimation of the Analysis and Synthesis Beamforming Weight Matrices
    8.5.2 Diffuse-Field Equalization of HRTFs
    8.5.3 Adaptive Mixing and Decorrelation
    8.5.4 Subjective Evaluation
  8.6 Conclusions
  References

9 Source Separation and Reconstruction of Spatial Audio Using Spectrogram Factorization / Joonas Nikunen and Tuomas Virtanen
  9.1 Introduction
  9.2 Spectrogram Factorization
    9.2.1 Mixtures of Sounds
    9.2.2 Magnitude Spectrogram Models
    9.2.3 Complex-Valued Spectrogram Models
    9.2.4 Source Separation by Time-Frequency Filtering
  9.3 Array Signal Processing and Spectrogram Factorization
    9.3.1 Spaced Microphone Arrays
    9.3.2 Model for Spatial Covariance Based on Direction of Arrival
    9.3.3 Complex-Valued NMF with the Spatial Covariance Model
  9.4 Applications of Spectrogram Factorization in Spatial Audio
    9.4.1 Parameterization of Surround Sound: Upmixing by Time-Frequency Filtering
    9.4.2 Source Separation Using a Compact Microphone Array
    9.4.3 Reconstruction of Binaural Sound Through Source Separation
  9.5 Discussion
  9.6 Matlab Example
  References

Part III Signal-Dependent Spatial Filtering

10 Time-Frequency Domain Spatial Audio Enhancement / Symeon Delikaris-Manias and Pasi Pertila
  10.1 Introduction
  10.2 Signal-Independent Enhancement
  10.3 Signal-Dependent Enhancement
    10.3.1 Adaptive Beamformers
    10.3.2 Post-Filters
    10.3.3 Post-Filter Types
    10.3.4 Estimating Post-Filters with Machine Learning
    10.3.5 Post-Filter Design Based on Spatial Parameters
  References

11 Cross-Spectrum-Based Post-Filter Utilizing Noisy and Robust Beamformers / Symeon Delikaris-Manias and Ville Pulkki
  11.1 Introduction
  11.2 Notation and Signal Model
    11.2.1 Virtual Microphone Design Utilizing Pressure Microphones
  11.3 Estimation of the Cross-Spectrum-Based Post-Filter
    11.3.1 Post-Filter Estimation Utilizing Two Static Beamformers
    11.3.2 Post-Filter Estimation Utilizing a Static and an Adaptive Beamformer
    11.3.3 Smoothing Techniques
  11.4 Implementation Examples
    11.4.1 Ideal Conditions
    11.4.2 Prototype Microphone Arrays
  11.5 Conclusions and Further Remarks
  11.6 Source Code
  References

12 Microphone-Array-Based Speech Enhancement Using Neural Networks / Pasi Pertila
  12.1 Introduction
  12.2 Time-Frequency Masks for Speech Enhancement Using Supervised Learning
    12.2.1 Beamforming with Post-Filtering
    12.2.2 Overview of Mask Prediction
    12.2.3 Features for Mask Learning
    12.2.4 Target Mask Design
  12.3 Artificial Neural Networks
    12.3.1 Learning the Weights
    12.3.2 Generalization
    12.3.3 Deep Neural Networks
  12.4 Mask Learning: A Simulated Example
    12.4.1 Feature Extraction
    12.4.2 Target Mask Design
    12.4.3 Neural Network Training
    12.4.4 Results
  12.5 Mask Learning: A Real-World Example
    12.5.1 Brief Description of the Third CHiME Challenge Data
    12.5.2 Data Processing and Beamforming
    12.5.3 Description of Network Structure, Features, and Targets
    12.5.4 Mask Prediction Results and Discussion
    12.5.5 Speech Enhancement Results
  12.6 Conclusions
  12.7 Source Code
    12.7.1 Matlab Code for Neural-Network-Based Sawtooth Denoising Example
    12.7.2 Matlab Code for Phase Feature Extraction
  References

Part IV Applications

13 Upmixing and Beamforming in Professional Audio / Christof Faller
  13.1 Introduction
  13.2 Stereo-to-Multichannel Upmix Processor
    13.2.1 Product Description
    13.2.2 Considerations for Professional Audio and Broadcast
    13.2.3 Signal Processing
  13.3 Digitally Enhanced Shotgun Microphone
    13.3.1 Product Description
    13.3.2 Concept
    13.3.3 Signal Processing
    13.3.4 Evaluations and Measurements
  13.4 Surround Microphone System Based on Two Microphone Elements
    13.4.1 Product Description
    13.4.2 Concept
  13.5 Summary
  References

14 Spatial Sound Scene Synthesis and Manipulation for Virtual Reality and Audio Effects / Ville Pulkki, Archontis Politis, Tapani Pihlajamaki, and Mikko-Ville Laitinen
  14.1 Introduction
  14.2 Parametric Sound Scene Synthesis for Virtual Reality
    14.2.1 Overall Structure
    14.2.2 Synthesis of Virtual Sources
    14.2.3 Synthesis of Room Reverberation
    14.2.4 Augmentation of Virtual Reality with Real Spatial Recordings
    14.2.5 Higher-Order Processing
    14.2.6 Loudspeaker-Signal Bus
  14.3 Spatial Manipulation of Sound Scenes
    14.3.1 Parametric Directional Transformations
    14.3.2 Sweet-Spot Translation and Zooming
    14.3.3 Spatial Filtering
    14.3.4 Spatial Modulation
    14.3.5 Diffuse Field Level Control
    14.3.6 Ambience Extraction
    14.3.7 Spatialization of Monophonic Signals
  14.4 Summary
  References

15 Parametric Spatial Audio Techniques in Teleconferencing and Remote Presence / Anastasios Alexandridis, Despoina Pavlidi, Nikolaos Stefanakis, and Athanasios Mouchtaris
  15.1 Introduction and Motivation
  15.2 Background
  15.3 Immersive Audio Communication System (ImmACS)
    15.3.1 Encoder
    15.3.2 Decoder
  15.4 Capture and Reproduction of Crowded Acoustic Environments
    15.4.1 Sound Source Positioning Based on VBAP
    15.4.2 Non-Parametric Approach
    15.4.3 Parametric Approach
    15.4.4 Example Application
  15.5 Conclusions
  References

Index
Responsibility: edited by Ville Pulkki, Symeon Delikaris-Manias, and Archontis Politis.

Linked Data

Primary Entity

<> # Parametric time-frequency domain spatial audio
    a schema:Book, schema:MediaObject, schema:CreativeWork ;
    library:oclcnum "1031279282" ;
    library:placeOfPublication <> ;
    schema:about <> ; # Time-domain analysis
    schema:about <> ; # Signal processing
    schema:about <> ; # Surround-sound systems--Mathematical models
    schema:about <> ; # TECHNOLOGY & ENGINEERING / Electronics / General
    schema:about <> ;
    schema:bookFormat schema:EBook ;
    schema:datePublished "2018" ;
    schema:description "\"A comprehensive guide that addresses the theory and practice of spatial audio. This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming, covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed for such processing, and provides an overview to existing research. It also shows recent up-to-date projects and commercial applications built on top of the systems. Provides an in-depth presentation of the principles, past developments, state-of-the-art methods, and future research directions of spatial audio technologies. Includes contributions from leading researchers in the field. Offers MATLAB codes with selected chapters. An advanced book aimed at readers who are capable of digesting mathematical expressions about digital signal processing and sound field analysis, Parametric Time-frequency Domain Spatial Audio is best suited for researchers in academia and in the audio industry.\"--"@en ;
    schema:editor <> ; # Symeon Delikaris-Manias
    schema:editor <> ; # Ville Pulkki
    schema:editor <> ; # Archontis Politis
    schema:exampleOfWork <> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <> ;
    schema:name "Parametric time-frequency domain spatial audio"@en ;
    schema:productID "1031279282" ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:url <> ;
    schema:workExample <> ;
    schema:workExample <> ;
    wdrs:describedby <> ;

Related Entities

<> # Symeon Delikaris-Manias
    a schema:Person ;
    schema:familyName "Delikaris-Manias" ;
    schema:givenName "Symeon" ;
    schema:name "Symeon Delikaris-Manias" ;

<> # Archontis Politis
    a schema:Person ;
    schema:familyName "Politis" ;
    schema:givenName "Archontis" ;
    schema:name "Archontis Politis" ;

<> # Ville Pulkki
    a schema:Person ;
    schema:familyName "Pulkki" ;
    schema:givenName "Ville" ;
    schema:name "Ville Pulkki" ;

<> # Signal processing
    a schema:Intangible ;
    schema:name "Signal processing"@en ;

<> # Surround-sound systems--Mathematical models
    a schema:Intangible ;
    schema:name "Surround-sound systems--Mathematical models"@en ;

<> # TECHNOLOGY & ENGINEERING / Electronics / General
    a schema:Intangible ;
    schema:name "TECHNOLOGY & ENGINEERING / Electronics / General"@en ;

<> # Time-domain analysis
    a schema:Intangible ;
    schema:name "Time-domain analysis"@en ;

    rdfs:comment "URL of the first publisher" ;

    rdfs:comment "5 simultaneous users allowed through Safari Technical Books" ;

    a schema:ProductModel ;
    schema:isbn "1119252598" ;
    schema:isbn "9781119252597" ;

    a schema:ProductModel ;
    schema:isbn "111925261X" ;
    schema:isbn "9781119252610" ;

    a schema:CreativeWork ;
    rdfs:label "Parametric time-frequency domain spatial audio." ;
    schema:description "Print version:" ;
    schema:isSimilarTo <> ; # Parametric time-frequency domain spatial audio
