Multimodal scene understanding : algorithms, applications and deep learning

Author: Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino
Publisher: London ; San Diego, CA : Academic Press, [2019]
Edition/Format: eBook : Document : English

Details

Genre/Form: Electronic books
Additional Physical Format: Ebook version :
Print version: Multimodal scene understanding. London ; San Diego, CA : Academic Press, [2019] (OCoLC)1089504196
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Michael Ying Yang; Bodo Rosenhahn; Vittorio Murino
ISBN: 9780128173596 0128173599 9780128173589 0128173580
OCLC Number: 1109390062
Description: 1 online resource
Contents: Front Cover; Multimodal Scene Understanding; Copyright; Contents; List of Contributors; 1 Introduction to Multimodal Scene Understanding; 1.1 Introduction; 1.2 Organization of the Book; References; 2 Deep Learning for Multimodal Data Fusion; 2.1 Introduction; 2.2 Related Work; 2.3 Basics of Multimodal Deep Learning: VAEs and GANs; 2.3.1 Auto-Encoder; 2.3.2 Variational Auto-Encoder (VAE); 2.3.3 Generative Adversarial Network (GAN); 2.3.4 VAE-GAN; 2.3.5 Adversarial Auto-Encoder (AAE); 2.3.6 Adversarial Variational Bayes (AVB); 2.3.7 ALI and BiGAN; 2.4 Multimodal Image-to-Image Translation Networks; 2.4.1 Pix2pix and Pix2pixHD; 2.4.2 CycleGAN, DiscoGAN, and DualGAN; 2.4.3 CoGAN; 2.4.4 UNIT; 2.4.5 Triangle GAN; 2.5 Multimodal Encoder-Decoder Networks; 2.5.1 Model Architecture; 2.5.2 Multitask Training; 2.5.3 Implementation Details; 2.6 Experiments; 2.6.1 Results on NYUDv2 Dataset; 2.6.2 Results on Cityscape Dataset; 2.6.3 Auxiliary Tasks; 2.7 Conclusion; References; 3 Multimodal Semantic Segmentation: Fusion of RGB and Depth Data in Convolutional Neural Networks; 3.1 Introduction; 3.2 Overview; 3.2.1 Image Classification and the VGG Network; 3.2.2 Architectures for Pixel-level Labeling; 3.2.3 Architectures for RGB and Depth Fusion; 3.2.4 Datasets and Benchmarks; 3.3 Methods; 3.3.1 Datasets and Data Splitting; 3.3.2 Preprocessing of the Stanford Dataset; 3.3.3 Preprocessing of the ISPRS Dataset; 3.3.4 One-channel Normal Label Representation; 3.3.5 Color Spaces for RGB and Depth Fusion; 3.3.6 Hyper-parameters and Training; 3.4 Results and Discussion; 3.4.1 Results and Discussion on the Stanford Dataset; 3.4.2 Results and Discussion on the ISPRS Dataset; 3.5 Conclusion; References; 4 Learning Convolutional Neural Networks for Object Detection with Very Little Training Data; 4.1 Introduction; 4.2 Fundamentals; 4.2.1 Types of Learning; 4.2.2 Convolutional Neural Networks; 4.2.2.1 Artificial neuron; 4.2.2.2 Artificial neural network; 4.2.2.3 Training; 4.2.2.4 Convolutional neural networks; 4.2.3 Random Forests; 4.2.3.1 Decision tree; 4.2.3.2 Random forest; 4.3 Related Work; 4.4 Traffic Sign Detection; 4.4.1 Feature Learning; 4.4.2 Random Forest Classification; 4.4.3 RF to NN Mapping; 4.4.4 Fully Convolutional Network; 4.4.5 Bounding Box Prediction; 4.5 Localization; 4.6 Clustering; 4.7 Dataset; 4.7.1 Data Capturing; 4.7.2 Filtering; 4.8 Experiments; 4.8.1 Training and Test Data; 4.8.2 Classification; 4.8.3 Object Detection; 4.8.4 Computation Time; 4.8.5 Precision of Localizations; 4.9 Conclusion; Acknowledgment; References; 5 Multimodal Fusion Architectures for Pedestrian Detection; 5.1 Introduction; 5.2 Related Work; 5.2.1 Visible Pedestrian Detection; 5.2.2 Infrared Pedestrian Detection; 5.2.3 Multimodal Pedestrian Detection; 5.3 Proposed Method; 5.3.1 Multimodal Feature Learning/Fusion; 5.3.2 Multimodal Pedestrian Detection; 5.3.2.1 Baseline DNN model
Responsibility: edited by Michael Ying Yang, Bodo Rosenhahn, Vittorio Murino.

Abstract:

Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections - for example, KITTI benchmark (stereo+laser) - from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites will find this book to be very useful.

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/1109390062> # Multimodal scene understanding : algorithms, applications and deep learning
    a schema:Book, schema:MediaObject, schema:CreativeWork ;
    library:oclcnum "1109390062" ;
    library:placeOfPublication <http://id.loc.gov/vocabulary/countries/enk> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/9407219989#Topic/computer_vision> ; # Computer vision
    schema:about <http://dewey.info/class/006.3/e23/> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/9407219989#Topic/artificial_intelligence> ; # Artificial intelligence
    schema:about <http://experiment.worldcat.org/entity/work/data/9407219989#Topic/computational_intelligence> ; # Computational intelligence
    schema:about <http://experiment.worldcat.org/entity/work/data/9407219989#Topic/engineering> ; # Engineering
    schema:about <http://experiment.worldcat.org/entity/work/data/9407219989#Topic/algorithms> ; # Algorithms
    schema:bookFormat schema:EBook ;
    schema:datePublished "2019" ;
    schema:description "Front Cover; Multimodal Scene Understanding; Copyright; Contents; List of Contributors; 1 Introduction to Multimodal Scene Understanding; 1.1 Introduction; 1.2 Organization of the Book; References; 2 Deep Learning for Multimodal Data Fusion; 2.1 Introduction; 2.2 Related Work; 2.3 Basics of Multimodal Deep Learning: VAEs and GANs; 2.3.1 Auto-Encoder; 2.3.2 Variational Auto-Encoder (VAE); 2.3.3 Generative Adversarial Network (GAN); 2.3.4 VAE-GAN; 2.3.5 Adversarial Auto-Encoder (AAE); 2.3.6 Adversarial Variational Bayes (AVB); 2.3.7 ALI and BiGAN"@en ;
    schema:description "Multimodal Scene Understanding: Algorithms, Applications and Deep Learning presents recent advances in multi-modal computing, with a focus on computer vision and photogrammetry. It provides the latest algorithms and applications that involve combining multiple sources of information and describes the role and approaches of multi-sensory data and multi-modal deep learning. The book is ideal for researchers from the fields of computer vision, remote sensing, robotics, and photogrammetry, thus helping foster interdisciplinary interaction and collaboration between these realms. Researchers collecting and analyzing multi-sensory data collections - for example, KITTI benchmark (stereo+laser) - from different platforms, such as autonomous vehicles, surveillance cameras, UAVs, planes and satellites will find this book to be very useful."@en ;
    schema:editor <http://experiment.worldcat.org/entity/work/data/9407219989#Person/murino_vittorio> ; # Vittorio Murino
    schema:editor <http://experiment.worldcat.org/entity/work/data/9407219989#Person/rosenhahn_bodo> ; # Bodo Rosenhahn
    schema:editor <http://experiment.worldcat.org/entity/work/data/9407219989#Person/yang_michael_ying> ; # Michael Ying Yang
    schema:exampleOfWork <http://worldcat.org/entity/work/id/9407219989> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://worldcat.org/entity/work/data/9407219989#CreativeWork/> ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1089504196> ;
    schema:name "Multimodal scene understanding : algorithms, applications and deep learning"@en ;
    schema:productID "1109390062" ;
    schema:url <https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2035400> ;
    schema:url <https://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=5830073> ;
    schema:url <https://www.sciencedirect.com/science/book/9780128173589> ;
    schema:url <http://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9780128173596> ;
    schema:workExample <http://worldcat.org/isbn/9780128173589> ;
    schema:workExample <http://worldcat.org/isbn/9780128173596> ;
    umbel:isLike <http://bnb.data.bl.uk/id/resource/GBB9C9474> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/1109390062> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/9407219989#Person/murino_vittorio> # Vittorio Murino
    a schema:Person ;
    schema:familyName "Murino" ;
    schema:givenName "Vittorio" ;
    schema:name "Vittorio Murino" ;
    .

<http://experiment.worldcat.org/entity/work/data/9407219989#Person/rosenhahn_bodo> # Bodo Rosenhahn
    a schema:Person ;
    schema:familyName "Rosenhahn" ;
    schema:givenName "Bodo" ;
    schema:name "Bodo Rosenhahn" ;
    .

<http://experiment.worldcat.org/entity/work/data/9407219989#Person/yang_michael_ying> # Michael Ying Yang
    a schema:Person ;
    schema:familyName "Yang" ;
    schema:givenName "Michael Ying" ;
    schema:name "Michael Ying Yang" ;
    .

<http://experiment.worldcat.org/entity/work/data/9407219989#Topic/artificial_intelligence> # Artificial intelligence
    a schema:Intangible ;
    schema:name "Artificial intelligence"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/9407219989#Topic/computational_intelligence> # Computational intelligence
    a schema:Intangible ;
    schema:name "Computational intelligence"@en ;
    .

<http://worldcat.org/entity/work/data/9407219989#CreativeWork/>
    a schema:CreativeWork ;
    schema:description "Ebook version :" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1109390062> ; # Multimodal scene understanding : algorithms, applications and deep learning
    .

<http://worldcat.org/isbn/9780128173589>
    a schema:ProductModel ;
    schema:isbn "0128173580" ;
    schema:isbn "9780128173589" ;
    .

<http://worldcat.org/isbn/9780128173596>
    a schema:ProductModel ;
    schema:isbn "0128173599" ;
    schema:isbn "9780128173596" ;
    .

<http://www.worldcat.org/oclc/1089504196>
    a schema:CreativeWork ;
    rdfs:label "Multimodal scene understanding." ;
    schema:description "Print version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1109390062> ; # Multimodal scene understanding : algorithms, applications and deep learning
    .

