Automatic lexicon generation for unsupervised part-of-speech tagging using only unannotated text

Author: Dennis V Pereira
Publisher: [Blacksburg, Va. : University Libraries, Virginia Polytechnic Institute and State University, 2004]
Edition/Format: eBook : Document : Thesis/dissertation : State or province government publication : English

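Details

Material Type: Document, Thesis/dissertation, Government publication, State or province government publication, Internet resource
Document Type: Internet Resource, Computer File
All Authors / Contributors: Dennis V Pereira
OCLC Number: 56569925
Notes: Title from electronic submission form. Vita. Abstract.
Description: System requirements: PC, World Wide Web browser and PDF reader. Available electronically via Internet.
Responsibility: Dennis V. Pereira.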

Abstract:

With the growing number of textual resources available, the ability to understand them becomes critical. An essential first step in understanding these sources is the ability to identify the parts-of-speech in each sentence. The goal of this research is to propose, improve, and implement an algorithm capable of finding terms (words in a corpus) that are used in similar ways - a term categorizer. Such a term categorizer can be used to find a particular part-of-speech, e.g., nouns in a corpus, and generate a lexicon. The proposed work is not dependent on any external sources of information, such as dictionaries, and it shows a significant improvement (30%) over an existing method of categorization. More importantly, the proposed algorithm can be applied as a component of an unsupervised part-of-speech tagger, making it truly unsupervised, requiring only unannotated text. The algorithm is discussed in detail, along with its background and its performance. Experimentation shows that the proposed algorithm performs within 3% of the baseline, the Penn-TreeBank Lexicon.
