Book: Java Natural Language Processing (Reprint Edition) (English Edition)
Content
Table of Contents

Preface

Chapter 1: Introduction to NLP

What is NLP?

Why use NLP?

Why is NLP so hard?

Survey of NLP tools

Apache OpenNLP

Stanford NLP

LingPipe

GATE

UIMA

Overview of text processing tasks

Finding parts of text

Finding sentences

Finding people and things

Detecting Parts of Speech

Classifying text and documents

Extracting relationships

Using combined approaches

Understanding NLP models

Identifying the task

Selecting a model

Building and training the model

Verifying the model

Using the model

Preparing data

Summary

Chapter 2: Finding Parts of Text

Understanding the parts of text

What is tokenization?

Uses of tokenizers

Simple Java tokenizers

Using the Scanner class

Specifying the delimiter

Using the split method

Using the BreakIterator class

Using the StreamTokenizer class

Using the StringTokenizer class

Performance considerations with Java core tokenization

NLP tokenizer APIs

Using the OpenNLPTokenizer class

Using the SimpleTokenizer class

Using the WhitespaceTokenizer class

Using the TokenizerME class

Using the Stanford tokenizer

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using a pipeline

Using LingPipe tokenizers

Training a tokenizer to find parts of text

Comparing tokenizers

Understanding normalization

Converting to lowercase

Removing stopwords

Creating a StopWords class

Using LingPipe to remove stopwords

Using stemming

Using the Porter Stemmer

Stemming with LingPipe

Using lemmatization

Using the StanfordLemmatizer class

Using lemmatization in OpenNLP

Normalizing using a pipeline

Summary

Chapter 3: Finding Sentences

The SBD process

What makes SBD difficult?

Understanding SBD rules of LingPipe's HeuristicSentenceModel class

Simple Java SBDs

Using regular expressions

Using the BreakIterator class

Using NLP APIs

Using OpenNLP

Using the SentenceDetectorME class

Using the sentPosDetect method

Using the Stanford API

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using the StanfordCoreNLP class

Using LingPipe

Using the IndoEuropeanSentenceModel class

Using the SentenceChunker class

Using the MedlineSentenceModel class

Training a Sentence Detector model

Using the Trained model

Evaluating the model using the SentenceDetectorEvaluator class

Summary

Chapter 4: Finding People and Things

Why is NER difficult?

Techniques for name recognition

Lists and regular expressions

Statistical classifiers

Using regular expressions for NER

Using Java's regular expressions to find entities

Using LingPipe's RegExChunker class

Using NLP APIs

Using OpenNLP for NER

Determining the accuracy of the entity

Using other entity types

Processing multiple entity types

Using the Stanford API for NER

Using LingPipe for NER

Using LingPipe's name entity models

Using the ExactDictionaryChunker class

Training a model

Evaluating a model

Summary

Chapter 5: Detecting Parts of Speech

The tagging process

Importance of POS taggers

What makes POS difficult?

Using the NLP APIs

Using OpenNLP POS taggers

Using the OpenNLP POSTaggerME class for POS taggers

Using OpenNLP chunking

Using the POSDictionary class

Using Stanford POS taggers

Using Stanford MaxentTagger

Using the MaxentTagger class to tag textese

Using Stanford pipeline to perform tagging

Using LingPipe POS taggers

Using the HmmDecoder class with BestFirst tags

Using the HmmDecoder class with NBest tags

Determining tag confidence with the HmmDecoder class

Training the OpenNLP POSModel

Summary

Chapter 6: Classifying Texts and Documents

How classification is used

Understanding sentiment analysis

Text classifying techniques

Using APIs to classify text

Using OpenNLP

Training an OpenNLP classification model

Using DocumentCategorizerME to classify text

Using Stanford API

Using the ColumnDataClassifier class for classification

Using the Stanford pipeline to perform sentiment analysis

Using LingPipe to classify text

Training text using the Classified class

Using other training categories

Classifying text using LingPipe

Sentiment analysis using LingPipe

Language identification using LingPipe

Summary

Chapter 7: Using Parser to Extract Relationships

Relationship types

Understanding parse trees

Using extracted relationships

Extracting relationships

Using NLP APIs

Using OpenNLP

Using the Stanford API

Using the LexicalizedParser class

Using the TreePrint class

Finding word dependencies using the GrammaticalStructure class

Finding coreference resolution entities

Extracting relationships for a question-answer system

Finding the word dependencies

Determining the question type

Searching for the answer

Summary

Chapter 8: Combined Approaches

Preparing data

Using Boilerpipe to extract text from HTML

Using POI to extract text from Word documents

Using PDFBox to extract text from PDF documents

Pipelines

Using the Stanford pipeline

Using multiple cores with the Stanford pipeline

Creating a pipeline to search text

Summary

Index

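Content Recommendation

Natural language processing (NLP) is an important area of application development, and its relevance to solving contemporary problems will only grow. Demand for applications that are accessible through natural language, supported by NLP tasks, has increased significantly. This book by Reese, Java Natural Language Processing (Reprint Edition) (English Edition), shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. It introduces a range of NLP concepts in a way that readers can follow even without a background in statistical natural language processing.

Editor's Recommendation

If you are a Java programmer who wants to learn the fundamentals of natural language processing, this book is written for you. By working through Java Natural Language Processing (Reprint Edition) (English Edition) by Reese, you will be able to identify and apply NLP tasks to many common problems, and integrate them into your applications to handle more challenging ones. Readers should be familiar with Java or have Java software development experience.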

Title: Java Natural Language Processing (Reprint Edition) (English Edition)
Author: Reese (UK)
Publisher: Southeast University Press
Product code (ISBN): 9787564160883
Format: 16mo
Pages: 237
Edition: 1
Binding: Paperback
Word count: 318
Publication date: 2016-01-01
First edition date: 2016-01-01
Printing date: 2016-01-01
Audience: General public
Distribution scope: Public release
Distribution mode: Physical book
Weight: 0.42
CIP number: 2015256587
Chinese Library Classification: TP312JA
Printed sheets: 16.25
Impression: 1
Place of publication: Jiangsu
232
185
11
Medium: Book
Paper: Ordinary paper
Reprint version: Original edition
Publisher country: CN
Set: Single volume
Copyright provider: PACKT Publishing Ltd