Huttenlocher, Daniel P.
Overview
Works:  28 works in 49 publications in 1 language and 447 library holdings 

Roles:  Author, Other 
Classifications:  TA1632, 621.367 
Most widely held works by Daniel P Huttenlocher
Object recognition by computer : the role of geometric constraints by William Eric Leifur Grimson (Book)
6 editions published in 1990 in English and held by 381 WorldCat member libraries worldwide
On planar point matching under affine transformation by John E Hopcroft (Book)
3 editions published in 1989 in English and held by 5 WorldCat member libraries worldwide
In general, the number of transformations will be much smaller, so we have developed an output-sensitive algorithm that runs in time O(n² log n + tm log n), where [formulas], and t depends on the number of transformations. The method relies on the affine properties that intersection points and length ratios along a line are preserved."
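The affine property named in this abstract, that length ratios along a line are preserved, can be checked numerically. The following is a minimal sketch, not the paper's matching algorithm; the function names, the sample points, and the particular affine map are illustrative assumptions.

```python
import math

def apply_affine(a, b, p):
    """Apply the affine map x -> A x + b to a 2D point p (A given as two rows)."""
    (a11, a12), (a21, a22) = a
    x, y = p
    return (a11 * x + a12 * y + b[0], a21 * x + a22 * y + b[1])

def length_ratio(p, q, r):
    """Ratio |pq| / |pr| for three collinear points p, q, r."""
    return math.dist(p, q) / math.dist(p, r)

# Hypothetical collinear points: q lies one quarter of the way from p to r.
p, q, r = (0.0, 0.0), (1.0, 2.0), (4.0, 8.0)

# An arbitrary invertible affine map (shear, scale, and translation).
A = ((2.0, 1.0), (0.5, 3.0))
b = (-1.0, 5.0)

before = length_ratio(p, q, r)
after = length_ratio(apply_affine(A, b, p),
                     apply_affine(A, b, q),
                     apply_affine(A, b, r))
print(before, after)  # the two ratios agree up to rounding
```

Distances and angles are generally changed by the map, but the ratio along the line is not, which is why it can serve as a matching cue.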
Exploiting sequential phonetic constraints in recognizing spoken words by Daniel P Huttenlocher (Book)
2 editions published in 1985 in English and held by 4 WorldCat member libraries worldwide
Machine recognition of spoken language requires developing more robust recognition algorithms. A recent study by Shipman and Zue suggests using partial descriptions of speech sounds to eliminate all but a handful of word candidates from a large lexicon. The current paper extends their work by investigating the power of partial phonetic descriptions for developing recognition algorithms. First, we demonstrate that sequences of manner of articulation classes are more reliable and provide more constraint than certain other classes. Alone, these results are of limited utility, due to the high degree of variability in natural speech. This variability is not uniform, however, as most modifications and deletions occur in unstressed syllables. Comparing the relative constraint provided by sounds in stressed versus unstressed syllables, we discover that the stressed syllables provide substantially more constraint. This indicates that recognition algorithms can be made more robust by exploiting the manner of articulation information in stressed syllables. Keywords: Natural constraints, Partial information, Word recognition, Speech recognition
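The idea of pruning a lexicon with partial phonetic descriptions can be illustrated with a toy example. This is a sketch under stated assumptions: the phoneme inventory, the manner classes in `MANNER`, and the seven-word `LEXICON` are invented for illustration and are not data from the paper.

```python
from collections import defaultdict

# Hypothetical, simplified mapping from phonemes to manner-of-articulation classes.
MANNER = {
    "p": "stop", "b": "stop", "t": "stop", "d": "stop", "k": "stop", "g": "stop",
    "s": "fric", "z": "fric", "f": "fric", "v": "fric",
    "m": "nasal", "n": "nasal",
    "l": "liquid", "r": "liquid",
    "a": "vowel", "e": "vowel", "i": "vowel", "o": "vowel", "u": "vowel",
}

# Hypothetical toy lexicon: word -> phoneme sequence.
LEXICON = {
    "pat": ["p", "a", "t"],
    "bad": ["b", "a", "d"],
    "sat": ["s", "a", "t"],
    "man": ["m", "a", "n"],
    "net": ["n", "e", "t"],
    "sun": ["s", "u", "n"],
    "lip": ["l", "i", "p"],
}

def cohorts(lexicon):
    """Group words that share the same sequence of manner classes."""
    groups = defaultdict(list)
    for word, phones in lexicon.items():
        key = tuple(MANNER[ph] for ph in phones)
        groups[key].append(word)
    return dict(groups)

groups = cohorts(LEXICON)
for key, words in groups.items():
    print(key, words)
```

Knowing only the manner-class sequence of an utterance narrows the candidates to one cohort; the smaller the cohorts, the more constraint the partial description provides.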
Finding convex edge groupings in an image by Daniel P Huttenlocher (Book)
1 edition published in 1990 in English and held by 4 WorldCat member libraries worldwide
From this local neighborhood a local convexity graph is constructed. This planar graph encodes which neighboring image edges could be part of a convex group. A cycle in the graph corresponds to a group of image edges that form a convex region. The structure of the graph guarantees that each image edge belongs to at most one such cycle, thus limiting the total number of groups to O(n) for n image edges. We have implemented the method and found that it is efficient in practice as well as in theory
The upper envelope of Voronoi surfaces and its applications by Daniel P Huttenlocher (Book)
2 editions published in 1991 in English and held by 4 WorldCat member libraries worldwide
Abstract: "Given a set S of sources (points or segments), we consider the surface that is the graph of the function [formula], for some metric p. This surface is closely related to the Voronoi diagram, Vor(S), of S under the metric p. The upper envelope of a set of these Voronoi surfaces, each defined for a different set of sources, can be used to solve a number of problems, including finding the minimum Hausdorff distance between two sets of points or line segments under translation, and determining the optimal placement of a site with respect to sets of utilities
Special issue on interpretation of 3D scenes (Book)
in English and held by 4 WorldCat member libraries worldwide
On dynamic Voronoi diagrams and the minimum Hausdorff distance for point sets under Euclidean motion in the plane by Daniel P Huttenlocher (Book)
2 editions published in 1992 in English and held by 4 WorldCat member libraries worldwide
Abstract: "We show that the dynamic Voronoi diagram of k sets of points in the plane, where each set consists of n points moving rigidly, has complexity O(n²k²λ_s(k)) for some fixed s, where λ_s(n) is the maximum length of an (n, s) Davenport-Schinzel sequence. This improves the result of Aonuma et al., who show an upper bound of O(n³k⁴ log* k) for the complexity of such Voronoi diagrams. We then apply this result to the problem of finding the minimum Hausdorff distance between two point sets in the plane under Euclidean motion. We show that this distance can be computed in time O((m + n)⁶ log(mn)), where the two sets contain m and n points respectively."
Comparing images using the Hausdorff distance under translation by Daniel P Huttenlocher (Book)
1 edition published in 1991 in English and held by 3 WorldCat member libraries worldwide
In practice the methods are both highly efficient and simple to implement. The computation is in many ways similar to binary correlation; however, it is more tolerant of perturbations in the locations of points because it measures proximity rather than exact superposition. We present a number of examples illustrating the operation of the approach, and compare it with correlation."
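The contrast drawn here between proximity and exact superposition can be made concrete with a brute-force sketch. It is not the efficient method the abstract refers to: the exhaustive search over integer translations, the function names, and the point sets are illustrative assumptions.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def translate(points, t):
    return [(x + t[0], y + t[1]) for x, y in points]

def best_translation(model, image, search):
    """Exhaustive search over integer translations in [-search, search]^2."""
    candidates = [(dx, dy)
                  for dx in range(-search, search + 1)
                  for dy in range(-search, search + 1)]
    return min(candidates, key=lambda t: hausdorff(translate(model, t), image))

def correlation(a, b):
    """Binary correlation: number of exactly coincident points."""
    return len(set(a) & set(b))

# Hypothetical data: the model shifted by (5, 2), every point perturbed by one pixel.
model = [(0, 0), (4, 0), (4, 3), (0, 3)]
image = [(5, 3), (10, 2), (9, 6), (4, 5)]

t = best_translation(model, image, search=8)
placed = translate(model, t)
print(t, hausdorff(placed, image), correlation(placed, image))
```

At the recovered shift no model point lands exactly on an image point, so the overlap count is zero, while the Hausdorff distance stays at one pixel. That is the tolerance to perturbation the abstract describes.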
Tracking nonrigid objects in complex scenes by Daniel P Huttenlocher (Book)
2 editions published in 1992 in English and held by 3 WorldCat member libraries worldwide
Abstract: "We consider the problem of tracking nonrigid objects moving in a complex scene. We describe a model-based tracking method, in which two-dimensional geometric models are used to localize an object in each frame of an image sequence. The basic idea is to decompose the image of a solid object moving in space into two components: a two-dimensional motion and a two-dimensional shape change. The motion component is factored out, and the shape change is represented by explicitly storing a sequence of two-dimensional models, one corresponding to each image frame. The major assumption underlying the method is that the two-dimensional shape of an object will change slowly from one frame to the next
Visually-guided navigation by comparing two-dimensional edge images by Daniel P Huttenlocher (Book)
1 edition published in 1994 in English and held by 3 WorldCat member libraries worldwide
Abstract: "We present a method for navigating a robot from an initial position to a specified landmark in its visual field, using a sequence of monocular images. The location of the landmark with respect to the robot is determined using the change in size and location of the landmark in the image, as a function of the motion of the robot. The landmark location is estimated after the first three images are taken, and this estimate is refined as the robot moves. The method can correct for errors in the robot motion, as well as navigate around obstacles. The obstacle avoidance is done using bump sensors, sonar and dead reckoning, rather than visual servoing. The method does not require prior calibration of the camera. We show some examples of the operation of the system."
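How a change in apparent size can locate a landmark without camera calibration can be seen in a simplified one-dimensional case. The sketch below assumes a pinhole camera and a robot that moves straight toward the landmark by a known distance; it is not the paper's estimator, which also uses the landmark's image location and several frames. The function name and the numbers are hypothetical.

```python
def distance_from_scale_change(size_before, size_after, forward_motion):
    """Distance to the landmark at the first image.

    Under a pinhole model, apparent size is proportional to 1 / distance, so
    size_before * d1 == size_after * (d1 - forward_motion). The focal length
    cancels, which is why no calibration is needed in this simplified case.
    """
    if size_after <= size_before:
        raise ValueError("landmark must appear larger after moving toward it")
    return forward_motion * size_after / (size_after - size_before)

# Hypothetical measurements: 40 px wide, then 50 px wide after moving 1.0 m forward.
d1 = distance_from_scale_change(40.0, 50.0, 1.0)
d2 = d1 - 1.0
print(d1, d2)
```

Small errors in the measured sizes have a large effect when the size change is small, which is one reason to keep refining the estimate as the robot moves.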
On invariants of sets of points or line segments under projection by Daniel P Huttenlocher (Book)
2 editions published in 1992 in English and held by 3 WorldCat member libraries worldwide
Abstract: "We consider the problem of computing invariant functions of the image of a set of points or line segments in R³ under projection. Such functions are in principle useful for machine vision systems, because they allow different images of a given geometric object to be described by an invariant 'key'. We show that if a geometric object consists of an arbitrary set of points or line segments in R³, and the object can undergo a general rotation, then there are no invariants of its image under projection. For certain constrained rotations, however, there are invariants (e.g., rotation about the viewing direction). Thus we precisely delimit the conditions for the existence or nonexistence of invariants of arbitrary sets of points or line segments under projection."
Comparing point sets under projection by Daniel P Huttenlocher (Book)
2 editions published in 1992 in English and held by 3 WorldCat member libraries worldwide
The basic issue is that for nearly all groups G, ~ is not an equivalence relation (does not have an underlying invariant function). Despite this fact, however, ~ does contain considerable geometric information. Thus we provide an algorithm for deciding whether P ~ Q that runs in time O(n³), where n is the cardinality of the sets P and Q."
A multiresolution technique for comparing images using the Hausdorff distance by Daniel P Huttenlocher (Book)
2 editions published in 1992 in English and held by 3 WorldCat member libraries worldwide
Abstract: "The Hausdorff distance measures the extent to which each point of a 'model' set lies near some point of an 'image' set and vice versa. In this paper we describe an efficient method of computing this distance, based on a multiresolution tessellation of the space of possible transformations of the model set. We focus on the case in which the model is allowed to translate and scale with respect to the image. This four-dimensional transformation space (two translation and two scale dimensions) is searched rapidly, while guaranteeing that no match will be missed. We present some examples of identifying an object in a cluttered scene, including cases where the object is partially hidden from view."
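A coarse-to-fine search that never misses a match can be sketched for the translation-only case. This is an illustrative simplification, not the paper's algorithm: scale is omitted, only the directed model-to-image distance is used, and the names `search_translations`, `tau`, and `cell` are assumptions. The pruning rule relies on the fact that shifting the model by a distance r changes the directed Hausdorff distance by at most r.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point of b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def search_translations(model, image, tau, cell, min_size=1.0):
    """Centers of finest-level cells whose translation brings every model
    point within tau of some image point.

    cell = (cx, cy, half): a square of translations centered at (cx, cy)
    with half-width half.
    """
    cx, cy, half = cell
    shifted = [(x + cx, y + cy) for x, y in model]
    d = directed_hausdorff(shifted, image)
    radius = half * math.sqrt(2)  # farthest translation in the cell from its center
    # No translation in this cell can score better than d - radius,
    # so the whole cell is discarded without visiting its children.
    if d > tau + radius:
        return []
    if 2 * half <= min_size:
        return [(cx, cy)] if d <= tau else []
    h = half / 2
    hits = []
    for sx in (-h, h):
        for sy in (-h, h):
            hits += search_translations(model, image, tau,
                                        (cx + sx, cy + sy, h), min_size)
    return hits

# Hypothetical data: the model shifted by (5, 2) plus one clutter point.
model = [(0, 0), (4, 0), (4, 3), (0, 3)]
image = [(5, 2), (9, 2), (9, 5), (5, 5), (1, 9)]

hits = search_translations(model, image, tau=1.0, cell=(0.0, 0.0, 8.0))
print(sorted(hits))
```

With this root cell the finest-level centers fall on half-integer translations, so the reported hits are the four centers surrounding the true shift (5, 2). The guarantee is relative to that finest grid: a cell is pruned only when no center inside it could pass the threshold.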
Recognizing 3D objects from 2D images : an error analysis by William Eric Leifur Grimson (Book)
1 edition published in 1992 in English and held by 3 WorldCat member libraries worldwide
Abstract: "Many recent object recognition systems use a small number of pairings of data and model features to compute the 3D transformation from a model coordinate frame into the sensor coordinate system. In the case of perfect image data, these systems seem to work well. With uncertain image data, however, the performance of such methods is less well understood. In this paper, we examine the effects of two-dimensional sensor uncertainty on the computation of three-dimensional model transformations. We use this analysis to bound the uncertainty in the transformation parameters, as well as the uncertainty associated with applying the transformation to map other model features into the image. We also examine the effects of the transformation uncertainty on the effectiveness of recognition methods."
Detecting moving objects with a moving camera by comparing edge contours by Daniel P Huttenlocher (Book)
1 edition published in 1994 in English and held by 2 WorldCat member libraries worldwide
Abstract: "This paper introduces a method for detecting moving objects in a monocular image sequence that is obtained using a moving camera. The method first estimates the motion of the edge contours in a given image frame, by recovering a transformation that best matches each edge contour with the edges in the subsequent frame. Any contour that is not well accounted for by a single transformation is split into subparts. The transformation of each edge contour together with the relative spatial locations of the contours is used to partition the image into regions with similar motions. Hypotheses about the locations of possible moving objects are then made based on these motion regions. One of the key aspects of the approach is that it is based on estimating the motion of entire edge contours, as opposed to recovering a velocity field that measures the motion of individual points. We present some examples for image sequences taken of animate objects using a handheld video camera."
Affine Matching With Bounded Sensor Error: Study of Geometric Hashing and Alignment by William Eric Leifur Grimson (Book)
3 editions published in 1991 in English and held by 2 WorldCat member libraries worldwide
Abstract: "Affine transformations of the plane have been used in a number of model-based recognition systems, in order to approximate the effects of perspective projection. The mathematics underlying these methods is for exact data, where there is no positional uncertainty in the measurement of feature points. In practice, various heuristics are used to adapt the methods to real data with uncertainty. In this paper, we provide a precise analysis of affine point matching under uncertainty. We obtain an expression for the range of affine-invariant values that are consistent with a given set of four points, where each data point lies in a disk of radius ε. This analysis reveals that the range of affine-invariant values depends on the actual xy-positions of the data points
Special issue on interpretation of 3D scenes (Book)
2 editions published in 1992 in English and held by 2 WorldCat member libraries worldwide
On the sensitivity of the Hough transform for object recognition by William Eric Leifur Grimson (Book)
2 editions published in 1988 in English and held by 2 WorldCat member libraries worldwide
Object recognition from sensory data involves, in part, determining the pose of a model with respect to a scene. A common method for finding an object's pose is the generalized Hough transform, which accumulates evidence for possible coordinate transformations in a parameter space whose axes are the quantized transformation parameters. Large clusters of similar transformations in that space are taken as evidence of a correct match. This article provides a theoretical analysis of the behavior of such methods. The authors derive bounds on the set of transformations consistent with each pairing of data and model features, in the presence of noise and occlusion in the image. They also provide bounds on the likelihood of false peaks in the parameter space, as a function of noise, occlusion, and tessellation effects. It is argued that blithely applying such methods to complex recognition tasks is a risky proposition, as the probability of false positives can be very high. Keywords: Two dimensional noise analysis.
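The voting scheme described here can be sketched for the simplest pose space, pure 2D translation. This is an assumption-laden illustration rather than the generalized Hough transform analyzed in the paper: rotation and scale are left out, and `hough_translation`, `bin_size`, and the point sets are hypothetical.

```python
from collections import Counter

def hough_translation(model, data, bin_size=1.0):
    """Accumulate votes for translations mapping a model point onto a data point.

    Every (model point, data point) pairing casts one vote in a quantized
    translation space; a well-filled bin is taken as a pose hypothesis.
    """
    votes = Counter()
    for mx, my in model:
        for dx, dy in data:
            key = (round((dx - mx) / bin_size), round((dy - my) / bin_size))
            votes[key] += 1
    return votes

# Hypothetical data: the model translated by (10, 7), plus two clutter points.
model = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 5)]
data = [(10, 7), (14, 7), (14, 10), (10, 10), (12, 12), (3, 1), (20, 4)]

votes = hough_translation(model, data)
(best_bin, count), = votes.most_common(1)
print(best_bin, count)
print(votes.most_common(3))  # incorrect pairings also accumulate votes
```

Every incorrect pairing still votes somewhere, so bins other than the true pose collect counts as well. With more clutter, noise, or coarser bins those spurious counts grow, which is the false-peak risk the abstract warns about.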
On the verification of hypothesized matches in model-based recognition by William Eric Leifur Grimson (Book)
2 editions published in 1989 in English and held by 2 WorldCat member libraries worldwide
In model-based recognition a number of ad hoc techniques are used to decide whether or not a match of data to a model is correct. Generally an empirically determined threshold is placed on the fraction of model features that must be matched. In this paper we present a more rigorous approach in which the conditions under which to accept a match are derived based on fundamental grounds. We obtain an expression that relates the probability of a match occurring at random to the fraction of model features that are accounted for by the match. This expression is a function of the number of model features, the number of image features, and a bound on the degree of sensor noise. One implication of our analysis is that a proper threshold for matching must vary with the number of model and data features. Thus, it is important to be able to set the threshold as a function of a particular matching problem, rather than setting a single threshold based on experimentation. We analyze some existing recognition systems and find that our method yields thresholds similar to the ones that were determined empirically for these systems, providing evidence of the validity of the technique.
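Why a fixed threshold cannot suit every matching problem can be illustrated with a deliberately crude probability model. The sketch below is not the expression derived in the paper. It assumes that image features are scattered uniformly, that each model feature is matched by chance independently, and that the chance probability is a capped union bound; all function names and parameter values are hypothetical.

```python
import math

def chance_probability(n, noise_radius, image_area):
    """Chance that a projected model feature lies within noise_radius of at
    least one of n uniformly scattered image features (union bound, capped)."""
    return min(1.0, n * math.pi * noise_radius ** 2 / image_area)

def random_match_tail(m, p, k):
    """P[at least k of m model features are matched by chance], assuming
    independent chance matches with probability p each (binomial tail)."""
    return sum(math.comb(m, i) * p ** i * (1 - p) ** (m - i)
               for i in range(k, m + 1))

def min_fraction(m, n, noise_radius, image_area, delta=1e-3):
    """Smallest fraction k/m whose probability of arising at random is below delta."""
    p = chance_probability(n, noise_radius, image_area)
    for k in range(m + 1):
        if random_match_tail(m, p, k) < delta:
            return k / m
    return None  # under this model, no fraction is safe enough

# Hypothetical problem sizes: 20 model features, a 512 x 512 image, 3 px noise bound.
sparse = min_fraction(m=20, n=50, noise_radius=3.0, image_area=512 * 512)
cluttered = min_fraction(m=20, n=500, noise_radius=3.0, image_area=512 * 512)
print(sparse, cluttered)
```

Even in this simplified model the required fraction rises as the image becomes more cluttered, in line with the abstract's point that the threshold should be set per matching problem.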
An Efficiently computable metric for comparing polygonal shapes by E. M Arkin (Book)
2 editions published between 1989 and 1991 in English and held by 1 WorldCat member library worldwide
Model-based recognition is concerned with comparing a shape A, which is stored as a model for some particular object, with a shape B, which is found to exist in an image. If A and B are close to being the same shape, then a vision system should report a match and return a measure of how good that match is. To be useful this measure should satisfy a number of properties, including: (1) it should be a metric, (2) it should be invariant under translation, rotation, and change-of-scale, (3) it should be reasonably easy to compute, and (4) it should match our intuition (i.e., answers should be similar to those that a person might give). We develop a method for comparing polygons that has these properties. The method works for both convex and nonconvex polygons and runs in time O(mn log mn) where m is the number of vertices in one polygon and n is the number of vertices in the other. Some examples are presented that show the method produces answers that are intuitively reasonable.
Related Identities
 Grimson, William Eric Leifur Other Author
 Lozano-Pérez, Tomás
 Lozano-Perez, Thomas
 Rucklidge, William J.
 Kleinberg, Jon M.
 Massachusetts Institute of Technology Artificial Intelligence Laboratory
 Kedem, Klara
 MASSACHUSETTS INST OF TECH CAMBRIDGE ARTIFICIAL INTELLIGENCE LAB
 Hopcroft, John E. 1939- Author
 Sharir, Micha
Alternative Names
Daniel P. Huttenlocher American university teacher
Daniel P. Huttenlocher Amerikaans hoogleraar
Daniel P. Huttenlocher US-amerikanischer Informatiker