Machine Learning in Translation Corpora Processing

Author: Krzysztof Wołk
Publisher: Milton : Chapman and Hall/CRC, 2019.
Edition/Format: eBook : Document : English
Details

Genre/Form: Electronic books
Additional Physical Format: Print version:
Wolk, Krzysztof.
Machine Learning in Translation Corpora Processing.
Milton : Chapman and Hall/CRC, ©2019
Material Type: Document, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Krzysztof Wołk
ISBN: 9780429590771 0429590776 9780429197543 0429197543 9780429586897 0429586892 9780429588839 0429588836
OCLC Number: 1089525727
Notes: 4.9.3.4 Analogy-based method
Description: 1 online resource (281 pages)
Contents: Cover; Title Page; Copyright Page; Acknowledgements; Preface; Table of Contents; Abbreviations and Definitions; Overview; 1: Introduction; 1.1 Background and context; 1.1.1 The concept of cohesion; 1.2 Machine translation (MT); 1.2.1 History of statistical machine translation (SMT); 1.2.2 Statistical machine translation approach; 1.2.3 SMT applications and research trends; 2: Statistical Machine Translation and Comparable Corpora; 2.1 Overview of SMT; 2.2 Textual components and corpora; 2.2.1 Words; 2.2.2 Sentences; 2.2.3 Corpora; 2.3 Moses tool environment for SMT; 2.3.1 Tuning for quality; 2.3.2 Operation sequence model (OSM); 2.3.3 Minimum error rate training tool; 2.4 Aspects of SMT processing; 2.4.1 Tokenization; 2.4.2 Compounding; 2.4.3 Language models; 2.4.3.1 Out of vocabulary words; 2.4.3.2 N-gram smoothing methods; 2.4.4 Translation models; 2.4.4.1 Noisy channel model; 2.4.4.2 IBM models; 2.4.4.3 Phrase-based models; 2.4.5 Lexicalized reordering; 2.4.5.1 Word alignment; 2.4.6 Domain text adaptation; 2.4.6.1 Interpolation; 2.4.6.2 Adaptation of parallel corpora; 2.5 Evaluation of SMT quality; 2.5.1 Current evaluation metrics; 2.5.1.1 BLEU overview; 2.5.1.2 Other SMT metrics; 2.5.1.3 HMEANT metric; 2.5.1.3.1 Evaluation using HMEANT; 2.5.1.3.2 HMEANT calculation; 2.5.2 Statistical significance test; 3: State of the Art; 3.1 Current methods and results in spoken language translation; 3.2 Recent methods in comparable corpora exploration; 3.2.1 Native Yalign method; 3.2.2 A* algorithm for alignment; 3.2.3 Needleman-Wunsch algorithm; 3.2.4 Other alignment methods; 4: Author's Solutions to PL-EN Corpora Processing Problems; 4.1 Parallel data mining improvements; 4.2 Multi-threaded, tuned and GPU-accelerated Yalign; 4.2.1 Needleman-Wunsch algorithm with GPU optimization; 4.2.2 Comparison of alignment methods; 4.3 Tuning of Yalign method; 4.4 Minor improvements in mining for Wikipedia exploration; 4.5 Parallel data mining using other methods; 4.5.1 The pipeline of tools; 4.5.2 Analogy-based method; 4.6 SMT metric enhancements; 4.6.1 Enhancements to the BLEU metric; 4.6.2 Evaluation using enhanced BLEU metric; 4.7 Alignment and filtering of corpora; 4.7.1 Corpora used for alignment experiments; 4.7.2 Filtering and alignment algorithm; 4.7.3 Filtering results; 4.7.4 Alignment evaluation results; 4.8 Baseline system training; 4.9 Description of experiments; 4.9.1 Text alignment processing; 4.9.2 Machine translation experiments; 4.9.2.1 TED lectures translation; 4.9.2.1.1 Word stems and SVO word order; 4.9.2.1.2 Lemmatization; 4.9.2.1.3 Translation and translation parameter adaptation experiments; 4.9.2.2 Subtitles and EuroParl translation; 4.9.2.3 Medical texts translation; 4.9.2.4 Pruning experiments; 4.9.3 Evaluation of obtained comparable corpora; 4.9.3.1 Native Yalign method; 4.9.3.2 Improved Yalign method; 4.9.3.3 Parallel data mining using tool pipeline
Responsibility: Krzysztof Wolk.

Abstract:

This book reviews ways to improve statistical machine speech translation between Polish and English. Research has been conducted mostly on dictionary-based, rule-based, and syntax-based machine translation techniques. Most popular methodologies and tools are not well suited to the Polish language and therefore require adaptation, and language resources are lacking in parallel and monolingual data. The main objective of this volume is to develop an automatic and robust Polish-to-English translation system to meet specific translation requirements and to develop bilingual textual resources by mining comparable corpora.

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/1089525727> # Machine Learning in Translation Corpora Processing
    a schema:Book, schema:CreativeWork, schema:MediaObject ;
    library:oclcnum "1089525727" ;
    library:placeOfPublication <http://experiment.worldcat.org/entity/work/data/8899689037#Place/milton> ; # Milton
    schema:about <http://dewey.info/class/491.85802210285/e23/> ;
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/mathematics_arithmetic> ; # MATHEMATICS--Arithmetic
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/polish_language_machine_translating> ; # Polish language--Machine translating
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/computers_general> ; # COMPUTERS--General
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/machine_translating> ; # Machine translating
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/english_language_machine_translating> ; # English language--Machine translating
    schema:about <http://experiment.worldcat.org/entity/work/data/8899689037#Topic/computers_machine_theory> ; # COMPUTERS--Machine Theory
    schema:bookFormat schema:EBook ;
    schema:creator <http://experiment.worldcat.org/entity/work/data/8899689037#Person/wolk_krzysztof> ; # Krzysztof Wołk
    schema:datePublished "2019" ;
    schema:description "This book reviews ways to improve statistical machine speech translation between Polish and English. Research has been conducted mostly on dictionary-based, rule-based, and syntax-based, machine translation techniques. Most popular methodologies and tools are not well-suited for the Polish language and therefore require adaptation, and language resources are lacking in parallel and monolingual data. The main objective of this volume to develop an automatic and robust Polish-to-English translation system to meet specific translation requirements and to develop bilingual textual resources by mining comparable corpora."@en ;
    schema:description "Cover; Title Page; Copyright Page; Acknowledgements; Preface; Table of Contents; Abbreviations and Definitions; Overview; 1: Introduction; 1.1 Background and context; 1.1.1 The concept of cohesion; 1.2 Machine translation (MT); 1.2.1 History of statistical machine translation (SMT); 1.2.2 Statistical machine translation approach; 1.2.3 SMT applications and research trends; 2: Statistical Machine Translation and Comparable Corpora; 2.1 Overview of SMT; 2.2 Textual components and corpora; 2.2.1 Words; 2.2.2 Sentences; 2.2.3 Corpora; 2.3 Moses tool environment for SMT; 2.3.1 Tuning for quality"@en ;
    schema:exampleOfWork <http://worldcat.org/entity/work/id/8899689037> ;
    schema:genre "Electronic books"@en ;
    schema:inLanguage "en" ;
    schema:isSimilarTo <http://worldcat.org/entity/work/data/8899689037#CreativeWork/machine_learning_in_translation_corpora_processing> ;
    schema:name "Machine Learning in Translation Corpora Processing"@en ;
    schema:productID "1089525727" ;
    schema:publication <http://www.worldcat.org/title/-/oclc/1089525727#PublicationEvent/milton_chapman_and_hall_crc_2019> ;
    schema:publisher <http://experiment.worldcat.org/entity/work/data/8899689037#Agent/chapman_and_hall_crc> ; # Chapman and Hall/CRC
    schema:url <https://public.ebookcentral.proquest.com/choice/publicfullrecord.aspx?p=5720171> ;
    schema:url <https://www.taylorfrancis.com/books/e/9780429197543> ;
    schema:url <http://www.vlebooks.com/vleweb/product/openreader?id=none&isbn=9780429590771> ;
    schema:workExample <http://worldcat.org/isbn/9780429586897> ;
    schema:workExample <http://worldcat.org/isbn/9780429590771> ;
    schema:workExample <http://worldcat.org/isbn/9780429588839> ;
    schema:workExample <http://worldcat.org/isbn/9780429197543> ;
    wdrs:describedby <http://www.worldcat.org/title/-/oclc/1089525727> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/8899689037#Agent/chapman_and_hall_crc> # Chapman and Hall/CRC
    a bgn:Agent ;
    schema:name "Chapman and Hall/CRC" ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Person/wolk_krzysztof> # Krzysztof Wołk
    a schema:Person ;
    schema:familyName "Wołk" ;
    schema:givenName "Krzysztof" ;
    schema:name "Krzysztof Wołk" ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/computers_general> # COMPUTERS--General
    a schema:Intangible ;
    schema:name "COMPUTERS--General"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/computers_machine_theory> # COMPUTERS--Machine Theory
    a schema:Intangible ;
    schema:name "COMPUTERS--Machine Theory"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/english_language_machine_translating> # English language--Machine translating
    a schema:Intangible ;
    schema:name "English language--Machine translating"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/machine_translating> # Machine translating
    a schema:Intangible ;
    schema:name "Machine translating"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/mathematics_arithmetic> # MATHEMATICS--Arithmetic
    a schema:Intangible ;
    schema:name "MATHEMATICS--Arithmetic"@en ;
    .

<http://experiment.worldcat.org/entity/work/data/8899689037#Topic/polish_language_machine_translating> # Polish language--Machine translating
    a schema:Intangible ;
    schema:name "Polish language--Machine translating"@en ;
    .

<http://worldcat.org/entity/work/data/8899689037#CreativeWork/machine_learning_in_translation_corpora_processing>
    a schema:CreativeWork ;
    rdfs:label "Machine Learning in Translation Corpora Processing." ;
    schema:description "Print version:" ;
    schema:isSimilarTo <http://www.worldcat.org/oclc/1089525727> ; # Machine Learning in Translation Corpora Processing
    .

<http://worldcat.org/isbn/9780429197543>
    a schema:ProductModel ;
    schema:isbn "0429197543" ;
    schema:isbn "9780429197543" ;
    .

<http://worldcat.org/isbn/9780429586897>
    a schema:ProductModel ;
    schema:isbn "0429586892" ;
    schema:isbn "9780429586897" ;
    .

<http://worldcat.org/isbn/9780429588839>
    a schema:ProductModel ;
    schema:isbn "0429588836" ;
    schema:isbn "9780429588839" ;
    .

<http://worldcat.org/isbn/9780429590771>
    a schema:ProductModel ;
    schema:isbn "0429590776" ;
    schema:isbn "9780429590771" ;
    .

