Hyperspectral Data Exploitation

Theory and Applications
By Chein-I Chang

John Wiley & Sons

Copyright © 2007 John Wiley & Sons, Ltd
All rights reserved.

ISBN: 978-0-471-74697-3


Chapter One

OVERVIEW

CHEIN-I CHANG

Remote Sensing Signal and Image Processing Laboratory, Department of Computer Science and Electrical Engineering, University of Maryland-Baltimore County, Baltimore, MD 21250

1.1. INTRODUCTION

Hyperspectral imaging has become a fast-growing technique in remote sensing image processing owing to recent advances in hyperspectral imaging technology. It uses as many as hundreds of contiguous spectral bands, expanding the capability of multispectral sensors, which use tens of discrete spectral bands. With such high spectral resolution, hyperspectral imaging sensors with very narrow diagnostic spectral bands can now uncover and extract many subtle objects and materials for detection, discrimination, classification, identification, recognition, and quantification. Many of its applications are yet to be explored. It has been common to think of hyperspectral imaging as a natural extension of multispectral imaging with band expansion, so that all techniques developed for multispectral imagery are considered readily applicable to hyperspectral imagery. Unfortunately, this intuitive interpretation may be somewhat misleading.

To understand the fundamental difference between multispectral and hyperspectral images from a data processing perspective, consider an example from mathematics: the difference between real analysis and complex analysis, where the variables are real in the former and complex in the latter. Since real variables can be regarded as the real parts of complex variables, many may be led to believe that real analysis is a special case of complex analysis, which is certainly not true. One piece of clear evidence is the derivative. In real analysis, the limit that defines a derivative can be approached from only two directions along the real line: from the left and from the right. In complex analysis, however, the limit can be approached along any curve in the complex plane. As a result, only partial derivatives in complex analysis can be considered a natural extension of derivatives in real analysis. When a function of a complex variable is differentiable in the complex plane, it is usually called totally differentiable or analytic, because it must satisfy the so-called Cauchy-Riemann equations.

This simple example offers a similar way to explain the key difference between multispectral and hyperspectral images. In the early days, multispectral imagery was used in remote sensing mainly for land cover/use classification in agriculture applications, disaster assessment and management, ecology, environmental monitoring, geology, geographical information systems (GIS), and so on. In these cases, low spectral resolution multispectral imagery may provide sufficient information for data analysis, and the techniques developed for multispectral image processing are primarily derived from traditional two-dimensional spatial domain-based image processing, which takes advantage of spatial correlation to perform various tasks. Compared to multispectral imagery, hyperspectral imagery offers two prominent improvements in data acquisition and collection: very fine spectral resolution and hundreds of spectral bands. It is these differences that distinguish hyperspectral imagery from multispectral imagery in their utility in many applications, as demonstrated by the chapters presented in this book.

1.2. ISSUES OF MIXED PIXELS AND SUBPIXELS

Because of its low spectral resolution, a multispectral image pixel may not carry information as rich as that of a hyperspectral image pixel. It must therefore rely on its surrounding pixels to provide spatial correlation and information that make up for the insufficient spectral information provided by a few discrete spectral bands. This may be one of the main reasons that the early development of multispectral image processing focused on spatial domain-based techniques. The issues of subpixels and mixed pixels usually arise with the very high spectral resolution of hyperspectral imagery; they have become crucial there but may not be critical to multispectral imagery. First of all, the targets or objects of interest are different. In multispectral imagery, land covers or patterns are often of major interest, so the techniques developed for multispectral image analysis generally perform pattern classification and recognition. By contrast, the objects of interest in hyperspectral imagery usually appear either mixed with a number of material substances or at the subpixel level, as targets embedded in a single pixel because they are smaller than the ground sampling distance (GSD). In both cases, these objects may not be identifiable a priori or by visual inspection. They are therefore generally regarded as insignificant targets, yet they are of major interest from an intelligence or information point of view. More specifically, in hyperspectral data exploitation the objects of particular interest are targets with small spatial presence and low probability of existence, in the form of either a mixed pixel or a subpixel.

Such targets may include special species in agriculture and ecology, toxic wastes in environmental monitoring, rare minerals in geology, drug trafficking and smuggling in law enforcement, military vehicles and landmines on battlefields, chemical/biological agents in bioterrorism, and weapon concealment and mass graves in intelligence gathering. Under such circumstances, they can be detected only at the mixed-pixel or subpixel level, and traditional spatial domain-based (i.e., literal) image processing techniques may be unsuitable, or ineffective even where they can be applied. A great challenge in extracting such targets is that they provide very limited spatial information and are generally difficult to visualize in the data. Therefore, the techniques developed for hyperspectral image analysis generally perform target-based detection, discrimination, classification, identification, recognition, and quantification, as opposed to pattern-based multispectral imaging techniques. Consequently, a direct extension of multispectral imaging techniques may not be applicable in hyperspectral data exploitation. To address this issue, an approach taken directly from a hyperspectral imagery point of view is highly desirable and may offer insights into the design and development of hyperspectral imaging algorithms: a single hyperspectral image pixel may by itself provide a wealth of spectral information for data processing, without appealing to spatial correlation with other sample pixels, given its limited spatial information.
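To make the notions of a mixed pixel and a subpixel target concrete, the following is a minimal sketch that assumes a linear mixing model, in which a pixel vector is the abundance-weighted sum of the signatures present in it. The five-band signatures, the 2 m target size, the 20 m GSD, and the helper names `mix` and `subpixel_fraction` are all hypothetical and chosen only for illustration.

```python
# Hypothetical 5-band reflectance signatures (illustrative values, not measured spectra).
grass = [0.05, 0.08, 0.06, 0.45, 0.50]
soil = [0.20, 0.25, 0.30, 0.35, 0.40]
target = [0.60, 0.10, 0.55, 0.10, 0.60]


def mix(signatures, abundances):
    """Linear mixture: band-wise sum of abundance-weighted signatures."""
    if abs(sum(abundances) - 1.0) > 1e-9:
        raise ValueError("abundance fractions are assumed to sum to one")
    n_bands = len(signatures[0])
    return [
        sum(a * s[b] for a, s in zip(abundances, signatures))
        for b in range(n_bands)
    ]


def subpixel_fraction(target_size_m, gsd_m):
    """Fraction of a square pixel's area covered by a square target."""
    return min(1.0, (target_size_m / gsd_m) ** 2)


# Mixed pixel: two materials share the pixel in comparable amounts.
mixed_pixel = mix([grass, soil], [0.7, 0.3])

# Subpixel target: a 2 m object inside a 20 m pixel covers 1% of its area.
frac = subpixel_fraction(2.0, 20.0)
subpixel_pixel = mix([target, grass], [frac, 1.0 - frac])

# Largest band-wise deviation of the subpixel pixel from pure background.
max_shift = max(abs(p - g) for p, g in zip(subpixel_pixel, grass))
print(mixed_pixel, frac, max_shift)
```

In this sketch the subpixel target changes the pixel vector by less than 0.01 in every band, which suggests why such a target leaves almost no trace for visual inspection or spatial analysis and must instead be sought in the spectral content of the single pixel.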

1.3. PIGEON-HOLE PRINCIPLE

The advent of hyperspectral imagery has changed the way we think of multispectral imagery because we now have hundreds of spectral bands available for our use. Thus, one major issue is how to use the spectral information provided by these hundreds of spectral bands effectively to perform target detection, discrimination, classification, and identification. This interesting issue can be addressed by the following well-known pigeon-hole principle in discrete mathematics.

Suppose that 13 pigeons fly into a dozen pigeon holes (nests). According to the pigeon-hole principle, at least one pigeon hole must accommodate at least two pigeons. Now, let L be the total number of spectral bands and p the number of target classes to be classified. A hyperspectral image pixel is actually an L-dimensional column vector. We interpret a pigeon hole as a spectral band and a pigeon as a target (or an object), so that one spectral band can be used to detect, discriminate, and classify one distinct target. With this interpretation, L spectral bands can be used to classify L different targets. Since hundreds of spectral bands are available in hyperspectral imagery, technically speaking, hundreds of spectrally distinct targets can also be classified and discriminated by these bands.

To make this idea work, three issues need to be addressed. The first is that the number of spectral bands must be greater than or equal to the number of targets to be classified, that is, L ≥ p. This seems always true for hyperspectral imagery but may not hold for multispectral imagery, where L < p can occur, as with three-band SPOT data that may contain more than three target substances. The first issue gives rise to a second, the well-known curse of dimensionality, that is, how to determine the value of p when L ≥ p. This has been a most difficult and challenging issue for any hyperspectral image analyst, since it is nearly impossible to know the exact value of p in real-world problems, and a value of p supplied by prior knowledge may not be reliable. In multivariate data analysis, the value of p can be estimated by the so-called intrinsic dimensionality (ID), defined as the minimum number of parameters needed to specify the data. However, this concept is only of theoretical interest, and no method for finding it has been proposed in the literature; a common strategy is trial and error. A similar problem is encountered in passive array processing, where the number of signal sources arriving at an array of sensors is of major interest. To estimate this number, two criteria, an information criterion (AIC) suggested by Akaike and the minimum description length (MDL) developed by Schwarz and Rissanen, have been used successfully. Unfortunately, a key assumption of these criteria is that the noise is independent and identically distributed, which is usually not valid for hyperspectral images, as shown in Chang and in Chang and Du. To cope with this dilemma, a new concept called virtual dimensionality (VD), coined by Chang, was recently proposed to estimate the number of spectrally distinct signatures in hyperspectral imagery. Its applications to hyperspectral data exploitation, such as linear spectral unmixing (Chapters 4-6 in this book), dimensionality reduction (Chapter 8 in this book), band selection (Chapters 9 and 10 in this book), and so on, are also reported in Chang.

The third and last issue is that once a spectral band has been used to accommodate one target, it cannot be used again to accommodate another distinct target. How do we make sure that this will not happen? One way is to perform orthogonal subspace projection (OSP), developed in Harsanyi and Chang, on the hyperspectral imagery so that no two or more distinct targets are accommodated by a single spectral band. In terms of the pigeon-hole principle, this means that no two pigeons are allowed to fly into a single pigeon hole (nest).

Once these three issues (L ≥ p, determination of p, and no two distinct target signatures accommodated by a single spectral band) are addressed, the idea of using the pigeon-hole principle for hyperspectral data exploitation becomes feasible. Most importantly, it provides an alternative approach that uses spectral bands as a means to perform detection, discrimination, classification, and identification without counting on spatial information or correlation. This is particularly important for targets that are small or insignificant because of their limited spatial presence and cannot be captured by spatial correlation or information. As a result, hyperspectral imaging techniques developed from this perspective are generally carried out on a pixel-by-pixel basis rather than on a spatial domain basis.
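The role of orthogonal subspace projection on a single pixel can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the full method: only one undesired signature u is considered, so the projector reduces to P = I - u u^T / (u^T u); the pixel is a noise-free linear mixture; and the division by d^T P d, which turns the score into an abundance estimate, is an added normalization. The signatures and the function names `project_out` and `osp_abundance` are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def project_out(u, r):
    """Apply P = I - u u^T / (u^T u) to r, removing the component of r along u."""
    scale = dot(u, r) / dot(u, u)
    return [ri - scale * ui for ri, ui in zip(r, u)]


def osp_abundance(d, u, r):
    """Annihilate u in the pixel r, then match against the desired signature d.

    Returns d^T P r / d^T P d, which equals the abundance of d when r is a
    noise-free mixture of d and u (an assumption of this sketch).
    """
    return dot(d, project_out(u, r)) / dot(d, project_out(u, d))


# Hypothetical 5-band signatures (illustrative values only).
d = [0.60, 0.10, 0.55, 0.10, 0.60]  # desired target signature
u = [0.05, 0.08, 0.06, 0.45, 0.50]  # undesired (background) signature

# A pixel in which the target occupies 20% and the background 80%.
r = [0.2 * di + 0.8 * ui for di, ui in zip(d, u)]

estimate = osp_abundance(d, u, r)         # close to 0.2
background_only = osp_abundance(d, u, u)  # close to 0.0
print(estimate, background_only)
```

Because the projection removes everything along u before the pixel is matched against d, the background contributes nothing to the score, and the computation uses only the spectral content of one pixel with no reference to its spatial neighbors.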

1.4. ORGANIZATION OF CHAPTERS IN THE BOOK

This book has 13 chapters contributed by researchers from various disciplinary areas whose expertise is in hyperspectral data exploitation. Each of these chapters addresses different problems caused by the above-mentioned issues. In particular, these 13 chapters are organized into three categories, Part I: Tutorials, Part II: Theory, and Part III: Applications.

1.4.1. Part I: Tutorials

The tutorials part consists of two tutorial chapters that review some basics of hyperspectral data exploitation, hyperspectral imaging systems, and algorithm design rationale for target detection and classification. Chapter 2 by Kerekes and Schott offers an excellent introduction to hyperspectral imaging systems, including two popular airborne hyperspectral imagers, the Airborne Visible/InfraRed Imaging Spectrometer (AVIRIS) and the Hyperspectral Digital Image Collection Experiment (HYDICE), and a satellite-operated imager, HYPERION. It is followed by Chapter 3 by Chang, which reviews matched filter-based target detection and classification algorithms.

1.4.2. Part II: Theory

The theory part is comprised of eight chapters that essentially address key issues in data modeling and representation by various approaches: linear mixing model (LMM) with deterministic endmembers (Chapter 4) and random endmembers (Chapters 5 and 6), endmember extraction (Chapter 7), dimensionality reduction (Chapter 8), band selection (Chapter 9), band partition (Chapter 10), and semisupervised support vector machines (Chapter 11).

Chapter 4 by Bowles and Gillis describes an optical real-time adaptive spectral identification system developed by the Naval Research Laboratory, known as ORASIS, which is a collection of algorithms that perform a series of tasks in sequence: exemplar set selection, basis selection, endmember selection, and spectral unmixing. While the endmembers considered in Chapter 4 for spectral unmixing are deterministic, Chapter 5 by Eismann and Stein develops a stochastic mixing model (SMM) to describe a statistical representation of hyperspectral data in which the endmembers are considered as random vectors with probability density functions described by finite Gaussian mixtures. As an alternative to the stochastic mixing model discussed in Chapter 5, Chapter 6 by Nascimento and Dias presents Independent Component Analysis (ICA) and Independent Factor Analysis (IFA) for spectral unmixing, where the abundance fractions of the endmembers used in the linear mixing model for the ICA/IFA are described by a mixture of Dirichlet densities, as opposed to the mixture of Gaussian densities assumed in the SMM in Chapter 5. Two common and key issues shared by Chapters 4-6 are (1) finding an appropriate set of endmembers to form a linear mixing model and (2) performing data dimensionality reduction to reduce computational complexity. To address the first issue, Chapter 7 by Winter revisits his well-known endmember extraction algorithm, the N-finder algorithm (N-FINDR), and further develops a new, improved version of the N-FINDR, called maximum volume transform (MVT). Chapter 8 by Jia and Richards addresses the second issue by investigating the representation of hyperspectral data to cope with the so-called curse of dimensionality, where feature extraction becomes a powerful and effective means to resolve this issue, using criteria such as the variance used by principal components analysis (PCA), or Fisher's ratio or the Rayleigh quotient used by Fisher's linear discriminant analysis (FLDA).
Another approach to address the issue of data dimensionality reduction is band selection. Chapter 9 by Shen develops an entropy-based genetic algorithm to select optimal band sets for spectral imaging systems including five existing multispectral imaging systems and further substantiates the utility of optimal band selection in target detection and material identification. As an alternative to band selection, Chapter 10 by Serpico et al. proposes an approach to band partition which is based on feature extraction/selection for a specific classification application. Finally, Chapter 11 by Bruzzone et al. improves a well-known supervised classifier, support vector machines (SVMs), by introducing semisupervised SVMs for classification of hyperspectral remote sensing images.

(Continues...)



Excerpted from Hyperspectral Data Exploitation by Chein-I Chang Copyright © 2007 by John Wiley & Sons, Ltd. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.