Combination of a Probabilistic-Based and a Rule-Based Approach for Genealogical Record Linkage

Author: Pooja P Shah; Franz Kurfess
Publisher: [San Luis Obispo, Calif.] : [California Polytechnic State University], 2015. ©2015.
Dissertation: Thesis (M.S.)--California Polytechnic State University, 2015.
Edition/Format:   Thesis/dissertation : English
Details

Material Type: Thesis/dissertation, Internet resource
Document Type: Book, Internet Resource
All Authors / Contributors: Pooja P Shah; Franz Kurfess
OCLC Number: 953694866
Notes: Title from PDF title page (viewed on July 1, 2016).
Committee chair: Franz Kurfess, PhD
"February 2015."
Description: 1 online resource (x, 64 pages) : color illustrations
Other Titles: Cal Poly master's thesis--Computer Science.
Responsibility: by Pooja P. Shah.

Abstract:

Record linkage is the task of identifying records within one or more databases that refer to the same entity. Many approaches to record linkage exist, drawing on heuristic rules, mathematical models, Markov models, or machine learning. This thesis focuses on the application of record linkage to genealogical records within family trees. Today, large collections of genealogical records are stored in databases, which may contain multiple records that refer to a single individual. Resolving duplicate genealogical records can extend our knowledge of who has lived, and combining all records that refer to an individual yields more complete information about that person. Simple string matching is not a feasible way to identify duplicate records because of inconsistencies such as typographical errors, data entry errors, and missing data.

Record linkage algorithms fall into two broad categories: rule-based (heuristic) approaches and probabilistic approaches. The Cocktail Approach, presented by Shirley Ong Ai Pei, combines a probabilistic-based approach with a rule-based approach for record linkage. This thesis discusses a re-implementation of the Cocktail Approach and its application to genealogical records.
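
To make the idea of combining the two categories concrete, here is a minimal sketch in Python. It is not the thesis's implementation and not the Cocktail Approach itself, whose details this record does not describe. It assumes a Fellegi-Sunter-style sum of log-weights for the probabilistic part and a single hard rule on birth years for the rule-based part. The field names, the m/u probabilities, the similarity cutoff, and the match threshold are all hypothetical values chosen for illustration.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (illustrative, not taken from the thesis):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PROBS = {
    "given_name": (0.90, 0.05),
    "surname": (0.95, 0.02),
    "birth_place": (0.80, 0.10),
}


def agrees(a, b, cutoff=0.85):
    """Approximate string agreement, tolerant of small typographical errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= cutoff


def probabilistic_score(rec_a, rec_b):
    """Sum log2 agreement/disagreement weights; missing fields contribute 0."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if not a or not b:
            continue  # missing data: no evidence either way
        if agrees(a, b):
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


def violates_rules(rec_a, rec_b, max_year_gap=2):
    """Hypothetical rule: birth years more than max_year_gap apart cannot match."""
    ya, yb = rec_a.get("birth_year"), rec_b.get("birth_year")
    return ya is not None and yb is not None and abs(ya - yb) > max_year_gap


def is_match(rec_a, rec_b, threshold=5.0):
    """Apply the rule as a veto, then compare the score to a threshold."""
    if violates_rules(rec_a, rec_b):
        return False
    return probabilistic_score(rec_a, rec_b) >= threshold


a = {"given_name": "Johann", "surname": "Schmidt",
     "birth_place": "Berlin", "birth_year": 1850}
b = {"given_name": "Johan", "surname": "Schmidt", "birth_year": 1851}
c = dict(a, birth_year=1900)
print(is_match(a, b), is_match(a, c))
```

In this sketch the rule acts as a veto before scoring, so a pair with near-identical names is still rejected when the birth years are far apart. Other ways of combining the two parts are possible, for example using rules only to break ties near the threshold.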

Linked Data


Primary Entity

<http://www.worldcat.org/oclc/953694866> # Combination of a Probabilistic-Based and a Rule-Based Approach for Genealogical Record Linkage
    a schema:CreativeWork, bgn:Thesis, schema:Book ;
   bgn:inSupportOf "Thesis (M.S.)--California Polytechnic State University, 2015." ;
   library:oclcnum "953694866" ;
   library:placeOfPublication <http://id.loc.gov/vocabulary/countries/cau> ;
   rdfs:seeAlso <http://experiment.worldcat.org/entity/work/data/3536685353#CreativeWork/cal_poly_master_s_thesis_computer_science> ; # Cal Poly master's thesis--Computer Science.
   schema:about <http://experiment.worldcat.org/entity/work/data/3536685353#Topic/> ; #
   schema:author <http://experiment.worldcat.org/entity/work/data/3536685353#Person/shah_pooja_p> ; # Pooja P. Shah
   schema:contributor <http://experiment.worldcat.org/entity/work/data/3536685353#Person/kurfess_franz> ; # Franz Kurfess
   schema:datePublished "2015" ;
   schema:description "Record linkage is the task of identifying records within one or multiple databases that refer to the same entity. Currently, there exist many different approaches for record linkage. Some approaches incorporate the use of heuristic rules, mathematical models, Markov models, or machine learning. This thesis focuses on the application of record linkage to genealogical records within family trees. Today, large collections of genealogical records are stored in databases, which may contain multiple records that refer to a single individual. Resolving duplicate genealogical records can extend our knowledge on who has lived and more complete information can be constructed by combining all information referring to an individual. Simple string matching is not a feasible option for identifying duplicate records due to inconsistencies such as typographical errors, data entry errors, and missing data."@en ;
   schema:description "Record linkage algorithms can be classified under two broad categories, a rule-based or heuristic approach, or a probabilistic-based approach. The Cocktail Approach, presented by Shirley Ong Ai Pei, combines a probabilistic-based approach with a rule-based approach for record linkage. This thesis discusses a re-implementation and adoption of the Cocktail Approach to genealogical records."@en ;
   schema:exampleOfWork <http://worldcat.org/entity/work/id/3536685353> ;
   schema:inLanguage "en" ;
   schema:name "Combination of a Probabilistic-Based and a Rule-Based Approach for Genealogical Record Linkage"@en ;
   schema:productID "953694866" ;
   schema:url <http://digitalcommons.calpoly.edu/theses/1353/> ;
   wdrs:describedby <http://www.worldcat.org/title/-/oclc/953694866> ;
    .


Related Entities

<http://experiment.worldcat.org/entity/work/data/3536685353#CreativeWork/cal_poly_master_s_thesis_computer_science> # Cal Poly master's thesis--Computer Science.
    a schema:CreativeWork ;
   schema:name "Cal Poly master's thesis--Computer Science." ;
    .

<http://experiment.worldcat.org/entity/work/data/3536685353#Person/kurfess_franz> # Franz Kurfess
    a schema:Person ;
   schema:familyName "Kurfess" ;
   schema:givenName "Franz" ;
   schema:name "Franz Kurfess" ;
    .

<http://experiment.worldcat.org/entity/work/data/3536685353#Person/shah_pooja_p> # Pooja P. Shah
    a schema:Person ;
   schema:familyName "Shah" ;
   schema:givenName "Pooja P." ;
   schema:name "Pooja P. Shah" ;
    .

<http://www.worldcat.org/title/-/oclc/953694866>
    a genont:InformationResource, genont:ContentTypeGenericResource ;
   schema:about <http://www.worldcat.org/oclc/953694866> ; # Combination of a Probabilistic-Based and a Rule-Based Approach for Genealogical Record Linkage
   schema:dateModified "2017-09-02" ;
   void:inDataset <http://purl.oclc.org/dataset/WorldCat> ;
    .

