Lafferty, John D.
Most widely held works by John D Lafferty
Grammatical trigrams : a probabilistic model of link grammar by John D Lafferty ( Book )
1 edition published in 1992 in English and held by 7 WorldCat member libraries worldwide
Abstract: "In this paper we present a new class of language models. This class derives from link grammar, a context-free formalism for the description of natural language. We describe an algorithm for determining maximum-likelihood estimates of the parameters of these models. The language models which we present differ from previous models based on stochastic context-free grammars in that they are highly lexical. In particular, they include the familiar n-gram models as a natural subclass. The motivation for considering this class is to estimate the contribution which grammar can make to reducing the relative entropy of natural language."
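The abstract notes that the familiar n-gram models arise as a natural subclass of these lexical grammatical models. A minimal sketch of that subclass, a maximum-likelihood bigram model (the toy corpus and estimator below are illustrative assumptions, not taken from the paper):

```python
from collections import Counter

# Maximum-likelihood bigram language model: P(w2 | w1) is estimated as
# count(w1, w2) / count(w1), with sentence-boundary markers added.
def train_bigram(corpus):
    unigrams = Counter()
    bigrams = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence + ["</s>"]
        unigrams.update(tokens[:-1])          # contexts only
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return {pair: c / unigrams[pair[0]] for pair, c in bigrams.items()}

model = train_bigram([["the", "dog", "barks"], ["the", "cat", "sleeps"]])
# model[("the", "dog")] == 0.5: "the" is followed by "dog" in one of two contexts.
```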
Inducing features of random fields by Stephen Della Pietra ( Book )
1 edition published in 1995 in English and held by 5 WorldCat member libraries worldwide
Abstract: "We present a technique for constructing random fields from a set of training samples. The learning paradigm builds increasingly complex fields by allowing potential functions, or features, that are supported by increasingly large subgraphs. Each feature has a weight that is trained by minimizing the Kullback-Leibler divergence between the model and the empirical distribution of the training data. A greedy algorithm determines how features are incrementally added to the field and an iterative scaling algorithm is used to estimate the optimal values of the weights. The random field models and techniques introduced in this paper differ from those common to much of the computer vision literature in that the underlying random fields are non-Markovian and have a large number of parameters that must be estimated. Relations to other learning approaches including decision trees and Boltzmann machines are given. As a demonstration of the method, we describe its application to the problem of automatic word classification in natural language processing."
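The iterative scaling step the abstract mentions can be sketched in its simplest (generalized iterative scaling, GIS) form: fit the weights of an exponential model p(x) ∝ exp(Σᵢ λᵢ fᵢ(x)) so that model feature expectations match given targets. The domain, features, and target expectations below are toy assumptions for illustration, not the paper's experiments.

```python
import math

# Toy GIS sketch. X is a tiny discrete domain; each feature is a 0/1
# function; `target` holds the empirical expectations to be matched.
X = [0, 1, 2, 3]
features = [lambda x: 1.0 if x % 2 == 0 else 0.0,   # "even"
            lambda x: 1.0 if x >= 2 else 0.0]       # "large"
target = [0.5, 0.7]
C = max(sum(f(x) for f in features) for x in X)     # GIS step-size constant

def expectations(lam):
    # Model feature expectations under p(x) ∝ exp(Σ_i lam_i f_i(x)).
    weights = [math.exp(sum(l * f(x) for l, f in zip(lam, features))) for x in X]
    Z = sum(weights)
    p = [w / Z for w in weights]
    return [sum(p[i] * f(x) for i, x in enumerate(X)) for f in features]

lam = [0.0, 0.0]
for _ in range(2000):
    ex = expectations(lam)
    # Multiplicative update: lam_i += (1/C) log(target_i / model_i).
    lam = [l + math.log(t / e) / C for l, t, e in zip(lam, target, ex)]

fitted = expectations(lam)   # converges to the targets
```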
A robust parsing algorithm for link grammars by Dennis Grinberg ( Book )
1 edition published in 1995 in English and held by 5 WorldCat member libraries worldwide
Abstract: "In this paper we present a robust parsing algorithm based on the link grammar formalism for parsing natural languages. Our algorithm is a natural extension of the original dynamic programming recognition algorithm which recursively counts the number of linkages between two words in the input sentence. The modified algorithm uses the notion of a null link in order to allow a connection between any pair of adjacent words, regardless of their dictionary definitions. The algorithm proceeds by making three dynamic programming passes. In the first pass, the input is parsed using the original algorithm which enforces the constraints on links to ensure grammaticality. In the second pass, the total cost of each substring of words is computed, where cost is determined by the number of null links necessary to parse the substring. The final pass counts the total number of parses with minimal cost. All of the original pruning techniques have natural counterparts in the robust algorithm. When used together with memoization, these techniques enable the algorithm to run efficiently with cubic worst-case complexity. We have implemented these ideas and tested them by parsing the Switchboard corpus of conversational English. This corpus comprises approximately three million words of text, corresponding to more than 150 hours of transcribed speech collected from telephone conversations restricted to 70 different topics. Although only a small fraction of the sentences in this corpus are 'grammatical' by standard criteria, the robust link grammar parser is able to extract relevant structure for a large portion of the sentences. We present the results of our experiments using this system, including the analyses of selected and random sentences from the corpus. We placed a version of the robust parser on the World Wide Web for experimentation. It can be reached at URL http://www.cs.cmu.edu/afs/cs.cmu.edu/project/link/www/robust.html.
In this version there are some limitations such as the maximum length of a sentence in words and the maximum amount of memory the parser can use."
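The second and third passes described in the abstract (minimal null-link cost, then a count of minimal-cost parses) can be folded into a single pass over a (cost, count) semiring. The sketch below shows only that semiring with toy values; it is an assumption about one way to organize the bookkeeping, not the paper's implementation.

```python
# (cost, count) pairs: "plus" keeps the cheaper alternative and sums
# counts on ties; "times" adds costs and multiplies counts. Chart
# entries built with these operations track both the minimal null-link
# cost of a span and how many parses achieve that cost.
def plus(a, b):
    if a[0] < b[0]:
        return a
    if b[0] < a[0]:
        return b
    return (a[0], a[1] + b[1])

def times(a, b):
    return (a[0] + b[0], a[1] * b[1])

# Two ways to build a span: combine sub-spans of cost 1 and 0 (with 2
# and 3 parses respectively), or a direct alternative of cost 1 with 4
# parses. Both alternatives cost 1, so the counts add: (1, 10).
best = plus(times((1, 2), (0, 3)), (1, 4))
```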
A derivation of the inside-outside algorithm from the EM algorithm by John D Lafferty ( Book )
2 editions published in 2000 in English and held by 4 WorldCat member libraries worldwide
Abstract: "This note is a technical supplement to . The purpose is to show how the Inside-Outside algorithm is a special case of the EM algorithm, and to derive the parameter update formulas."
Level spacings for SL(2,p) by John D Lafferty ( Book )
2 editions published in 1997 in English and held by 3 WorldCat member libraries worldwide
We investigate the eigenvalue spacing distributions for randomly generated 4-regular Cayley graphs on SL2(Fp) by numerically calculating their spectra. We present strong evidence that the distributions are Poisson and hence do not follow the Gaussian orthogonal ensemble. Among the Cayley graphs of SL2(Fp) we consider are the new expander graphs recently discovered by Y. Shalom. In addition, we use a Markov chain method to generate random 4-regular graphs and observe that the average eigenvalue spacings are closely approximated by the Wigner surmise.
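The spacing statistic in question can be sketched in a few lines: sort the spectrum, take consecutive gaps, rescale to unit mean, and compare the result against the Poisson density e^(-s) and the Wigner surmise (π/2)·s·exp(-πs²/4). The example eigenvalue list is an illustrative assumption, not data from the paper.

```python
import math

def normalized_spacings(eigs):
    # Consecutive gaps of the sorted spectrum, rescaled to mean 1.
    e = sorted(eigs)
    gaps = [b - a for a, b in zip(e, e[1:])]
    mean = sum(gaps) / len(gaps)
    return [g / mean for g in gaps]

def poisson_density(s):
    # Spacing density for uncorrelated (Poisson) eigenvalues.
    return math.exp(-s)

def wigner_surmise(s):
    # GOE spacing density approximation (level repulsion near s = 0).
    return (math.pi / 2) * s * math.exp(-math.pi * s * s / 4)

spacings = normalized_spacings([1.0, 2.0, 4.0, 8.0])  # gaps 1, 2, 4 rescaled
```

A histogram of `spacings` for a real spectrum would then be compared against the two densities.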
Basic methods of probabilistic context free grammars by Frederick Jelinek ( Book )
1 edition published in 1990 in English and held by 3 WorldCat member libraries worldwide
We introduce four classes of algorithms that handle PCFGs: (1) Computation of the total probability that a PCFG generates a given sentence; (2) Method of finding the most probable parse tree of a given sentence; (3) Estimation of probabilities of rewriting rules of a PCFG on the basis of a text corpus; (4) Computation of the probability that the PCFG produces a sentence having a given initial substring.
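Item (1), the total probability that a PCFG generates a sentence, is computed by the inside (CKY-style) dynamic program for a grammar in Chomsky normal form. A minimal sketch; the toy grammar below is an assumption for illustration, not from the paper:

```python
from collections import defaultdict

# Toy CNF PCFG: lexical rules A -> word and binary rules A -> B C.
lex = {("N", "time"): 0.5, ("N", "flies"): 0.5, ("V", "flies"): 1.0}
rules = {("S", "N", "V"): 1.0}

def inside_prob(words, start="S"):
    # chart[i][k][A] = total probability that A derives words[i:k].
    n = len(words)
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        for (A, word), p in lex.items():
            if word == w:
                chart[i][i + 1][A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):          # split point
                for (A, B, C), p in rules.items():
                    chart[i][k][A] += p * chart[i][j][B] * chart[j][k][C]
    return chart[0][n][start]

# inside_prob(["time", "flies"]) sums over all derivations of the sentence.
```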
A unification-grammar-based approach to the statistical analysis of English by Ezra W Black ( Book )
1 edition published in 1989 in English and held by 2 WorldCat member libraries worldwide
Abstract: "An overview is given of a system currently under development for the statistical analysis of natural language. An account of the stochastic training algorithm is given, together with a discussion of the underlying unification grammar of English, and a presentation of initial results."
Ordered Binary Decision Diagrams and Minimal Trellises by John D Lafferty ( Book )
1 edition published in 1998 in English and held by 2 WorldCat member libraries worldwide
Abstract: "Ordered binary decision diagrams (OBDDs) are graph-based data structures for representing Boolean functions. They have found widespread use in computer-aided design and in formal verification of digital circuits. Minimal trellises are graphical representations of error-correcting codes that play a prominent role in coding theory. This paper establishes a close connection between these two graphical models, as follows. Let C be a binary code of length n, and let f_C(x₁, ..., xₙ) be the Boolean function that takes the value 0 at x₁, ..., xₙ if and only if (x₁, ..., xₙ) ∈ C. Given this natural one-to-one correspondence between Boolean functions and binary codes, we prove that the minimal proper trellis for a code C with minimum distance d > 1 is isomorphic to the single-terminal OBDD for its Boolean indicator function f_C(x₁, ..., xₙ). Prior to this result, the extensive research during the past decade on binary decision diagrams -- in computer engineering -- and on minimal trellises -- in coding theory -- has been carried out independently. As outlined in this work, the realization that binary decision diagrams and minimal trellises are essentially the same data structure opens up a range of promising possibilities for transfer of ideas between these disciplines."
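The state-merging rule shared by both structures can be sketched by brute force: states at depth i of the minimal trellis (equivalently, nodes at level i of the OBDD for the indicator function) correspond to the distinct sets of valid suffixes ("futures") of length-i prefixes of codewords. The even-weight code and the brute-force profile below are illustrative assumptions, not the paper's construction.

```python
# Length-3 even-weight code (single parity check), minimum distance 2.
C = {"000", "011", "101", "110"}

def trellis_profile(code, n):
    # For each depth i, count the distinct future sets of length-i
    # prefixes; prefixes with equal futures merge into one state.
    profile = []
    for i in range(n + 1):
        futures = set()
        for word in code:
            futures.add(frozenset(w[i:] for w in code if w[:i] == word[:i]))
        profile.append(len(futures))
    return profile

# trellis_profile(C, 3) gives the state-space profile [1, 2, 2, 1]:
# one start state, two parity states in the middle, one final state.
```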
Duality and auxiliary functions for Bregman distances by Stephen Della Pietra ( Book )
2 editions published between 2001 and 2002 in English and held by 2 WorldCat member libraries worldwide
Abstract: "We formulate and prove a convex duality theorem for Bregman distances and present a technique based on auxiliary functions for deriving and proving convergence of iterative algorithms to minimize Bregman distance subject to linear constraints."
Automatic word classification using features of spellings by Thomas J. Watson IBM Research Center ( Book )
1 edition published in 1993 in English and held by 1 WorldCat member library worldwide
Boosting and maximum likelihood for exponential models by Guy Lebanon ( Book )
1 edition published in 2001 in English and held by 1 WorldCat member library worldwide
Abstract: "Recent research has considered the relationship between boosting and more standard statistical methods, such as logistic regression, concluding that AdaBoost is similar but somehow still very different from statistical methods in that it minimizes a different loss function. In this paper we derive an equivalence between AdaBoost and the dual of a convex optimization problem. In this setting, it is seen that the only difference between minimizing the exponential loss used by AdaBoost and maximum likelihood for exponential models is that the latter requires the model to be normalized to form a conditional probability distribution over labels; the two methods minimize the same Kullback-Leibler divergence objective function subject to identical feature constraints. In addition to establishing a simple and easily understood connection between the two methods, this framework enables us to derive new regularization procedures for boosting that directly correspond to penalized maximum likelihood. Experiments on UCI datasets, comparing exponential loss and maximum likelihood for parallel and sequential update algorithms, confirm our theoretical analysis, indicating that AdaBoost and maximum likelihood typically yield identical results as the number of features increases to allow the models to fit the training data."
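The contrast the abstract draws can be seen already in the per-example losses as a function of the margin m = y·f(x): AdaBoost minimizes the exponential loss exp(−m), while conditional maximum likelihood for the normalized (logistic) model minimizes log(1 + exp(−m)). The two agree as the margin grows, consistent with the observation that the methods yield similar results once the model fits the data. A minimal sketch:

```python
import math

def exp_loss(m):
    # AdaBoost's per-example exponential loss at margin m.
    return math.exp(-m)

def log_loss(m):
    # Negative log-likelihood of the logistic model at margin m.
    return math.log(1.0 + math.exp(-m))

# At m = 0 the losses differ (1 vs. log 2), but for large margins
# log(1 + e^{-m}) ~ e^{-m}, so the two objectives nearly coincide.
```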
Gibbs-Markov models by John D Lafferty ( )
1 edition published in 1996 in English and held by 1 WorldCat member library worldwide
A derivation of the inside-outside algorithm from the EM algorithm by International Business Machines Corporation ( Book )
1 edition published in 2000 in English and held by 1 WorldCat member library worldwide