## Details

Genre/Form: | Electronic books |
---|---|

Additional Physical Format: | Print version: |

Material Type: | Document |

Document Type: | Book, Computer File |

All Authors / Contributors: | Shay Cohen; Graeme Hirst |

ISBN: | 9781681735276 168173527X 9781681735283 1681735288 9781681735269 1681735261 |

OCLC Number: | 1110598699 |

Notes: | Part of: Synthesis digital library of engineering and computer science. Title from PDF title page (viewed on May 3, 2019). |

Description: | 1 online resource (1 PDF (xxxi, 311 pages)) : illustrations |

Contents: | 1. Preliminaries -- 1.1. Probability Measures -- 1.2. Random Variables -- 1.3. Conditional Distributions -- 1.4. Expectations of Random Variables -- 1.5. Models -- 1.6. Learning from Data Scenarios -- 1.7. Bayesian and Frequentist Philosophy (Tip of the Iceberg) -- 1.8. Summary -- 1.9. Exercises -- 2. Introduction -- 2.1. Overview : Where Bayesian Statistics and NLP Meet -- 2.2. First Example : The Latent Dirichlet Allocation Model -- 2.3. Second Example : Bayesian Text Regression -- 2.4. Conclusion and Summary -- 2.5. Exercises -- 3. Priors -- 3.1. Conjugate Priors -- 3.2. Priors Over Multinomial and Categorical Distributions -- 3.3. Non-Informative Priors -- 3.4. Conjugacy and Exponential Models -- 3.5. Multiple Parameter Draws in Models -- 3.6. Structural Priors -- 3.7. Conclusion and Summary -- 3.8. Exercises -- 4. Bayesian Estimation -- 4.1. Learning with Latent Variables : Two Views -- 4.2. Bayesian point estimation -- 4.3. Empirical Bayes -- 4.4. Asymptotic behavior of the posterior -- 4.5. Summary -- 4.6. Exercises -- 5. Sampling methods -- 5.1. MCMC algorithms : overview -- 5.2. NLP model structure for MCMC inference -- 5.3. Gibbs sampling -- 5.4. The Metropolis-Hastings algorithm -- 5.5. Slice sampling -- 5.6. Simulated annealing -- 5.7. Convergence of MCMC algorithms -- 5.8. Markov chain : basic theory -- 5.9. Sampling algorithms not in the MCMC realm -- 5.10. Monte Carlo integration -- 5.11. Discussion -- 5.12. Conclusion and summary -- 5.13. Exercises -- 6. Variational inference -- 6.1. Variational bound on marginal log-likelihood -- 6.2. Mean-field approximation -- 6.3. Mean-field variational inference algorithm -- 6.4. Empirical Bayes with variational inference -- 6.5. Discussion -- 6.6. Summary -- 6.7. Exercises -- 7. Nonparametric priors -- 7.1. The Dirichlet process : three views -- 7.2. Dirichlet process mixtures -- 7.3. The hierarchical Dirichlet process -- 7.4. The Pitman-Yor process -- 7.5. Discussion -- 7.6. Summary -- 7.7. Exercises -- 8. Bayesian grammar models -- 8.1. Bayesian hidden Markov models -- 8.2. Probabilistic context-free grammars -- 8.3. Bayesian probabilistic context-free grammars -- 8.4. Adaptor grammars -- 8.5. Hierarchical Dirichlet process PCFGs (HDP-PCFGs) -- 8.6. Dependency grammars -- 8.7. Synchronous grammars -- 8.8. Multilingual learning -- 8.9. Further reading -- 8.10. Summary -- 8.11. Exercises -- 9. Representation learning and neural networks -- 9.1. Neural networks and representation learning : why now? -- 9.2. Word embeddings -- 9.3. Neural networks -- 9.4. Modern use of neural networks in NLP -- 9.5. Tuning neural networks -- 9.6. Generative modeling with neural networks -- 9.7. Conclusion -- 9.8. Exercises -- A. Basic concepts -- A.1. Basic concepts in information theory -- A.2. Other basic concepts -- A.3. Basic concepts in optimization -- B. Distribution catalog -- B.1. The multinomial distribution -- B.2. The Dirichlet distribution -- B.3. The Poisson distribution -- B.4. The gamma distribution -- B.5. The multivariate normal distribution -- B.6. The Laplace distribution -- B.7. The logistic normal distribution -- B.8. The inverse Wishart distribution -- B.9. The Gumbel distribution. |

Series Title: | Synthesis lectures on human language technologies, #41.; Synthesis digital library of engineering and computer science. |

Responsibility: | Shay Cohen. |

### Abstract:

Natural language processing (NLP) went through a profound transformation in the mid-1980s when it shifted to make heavy use of corpora and data-driven techniques to analyze language. Since then, the use of statistical techniques in NLP has evolved in several ways. One such example of evolution took place in the late 1990s or early 2000s, when full-fledged Bayesian machinery was introduced to NLP. This Bayesian approach to NLP has come to accommodate various shortcomings in the frequentist approach and to enrich it, especially in the unsupervised setting, where statistical learning is done without target prediction examples. In this book, we cover the methods and algorithms that are needed to fluently read Bayesian learning papers in NLP and to do research in the area. These methods and algorithms are partially borrowed from both machine learning and statistics and are partially developed "in-house" in NLP. We cover inference techniques such as Markov chain Monte Carlo sampling and variational inference, Bayesian estimation, and nonparametric modeling. In response to rapid changes in the field, this second edition of the book includes a new chapter on representation learning and neural networks in the Bayesian context. We also cover fundamental concepts in Bayesian statistics such as prior distributions, conjugacy, and generative modeling. Finally, we review some of the fundamental modeling techniques in NLP, such as grammar modeling, neural networks and representation learning, and their use with Bayesian analysis.

## Tags

Add tags for "Bayesian analysis in natural language processing".
Be the first.