Java Natural Language Processing (Reprint Edition, English), by Reese (UK), Southeast University Press

Preface

Chapter 1: Introduction to NLP

What is NLP?

Why use NLP?

Why is NLP so hard?

Survey of NLP tools

Apache OpenNLP

Stanford NLP

LingPipe

GATE

UIMA

Overview of text processing tasks

Finding parts of text

Finding sentences

Finding people and things

Detecting Parts of Speech

Classifying text and documents

Extracting relationships

Using combined approaches

Understanding NLP models

Identifying the task

Selecting a model

Building and training the model

Verifying the model

Using the model

Preparing data

Summary

Chapter 2: Finding Parts of Text

Understanding the parts of text

What is tokenization?

Uses of tokenizers

Simple Java tokenizers

Using the Scanner class

Specifying the delimiter

Using the split method

Using the BreakIterator class

Using the StreamTokenizer class

Using the StringTokenizer class

Performance considerations with Java core tokenization

NLP tokenizer APIs

Using the OpenNLPTokenizer class

Using the SimpleTokenizer class

Using the WhitespaceTokenizer class

Using the TokenizerME class

Using the Stanford tokenizer

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using a pipeline

Using LingPipe tokenizers

Training a tokenizer to find parts of text

Comparing tokenizers

Understanding normalization

Converting to lowercase

Removing stopwords

Creating a StopWords class

Using LingPipe to remove stopwords

Using stemming

Using the Porter Stemmer

Stemming with LingPipe

Using lemmatization

Using the StanfordLemmatizer class

Using lemmatization in OpenNLP

Normalizing using a pipeline

Summary

Chapter 3: Finding Sentences

The SBD process

What makes SBD difficult?

Understanding SBD rules of LingPipe's HeuristicSentenceModel class

Simple Java SBDs

Using regular expressions

Using the BreakIterator class

Using NLP APIs

Using OpenNLP

Using the SentenceDetectorME class

Using the sentPosDetect method

Using the Stanford API

Using the PTBTokenizer class

Using the DocumentPreprocessor class

Using the StanfordCoreNLP class

Using LingPipe

Using the IndoEuropeanSentenceModel class

Using the SentenceChunker class

Using the MedlineSentenceModel class

Training a Sentence Detector model

Using the Trained model

Evaluating the model using the SentenceDetectorEvaluator class

Summary

Chapter 4: Finding People and Things

Why NER is difficult?

Techniques for name recognition

Lists and regular expressions

Statistical classifiers

Using regular expressions for NER

Using Java's regular expressions to find entities

Using LingPipe's RegExChunker class

Using NLP APIs

Using OpenNLP for NER

Determining the accuracy of the entity

Using other entity types

Processing multiple entity types

Using the Stanford API for NER

Using LingPipe for NER

Using LingPipe's name entity models

Using the ExactDictionaryChunker class

Training a model

Evaluating a model

Summary

Chapter 5: Detecting Parts of Speech

The tagging process

Importance of POS taggers

What makes POS difficult?

Using the NLP APIs

Using OpenNLP POS taggers

Using the OpenNLP POSTaggerME class for POS taggers

Using OpenNLP chunking

Using the POSDictionary class

Using Stanford POS taggers

Using Stanford MaxentTagger

Using the MaxentTagger class to tag textese

Using Stanford pipeline to perform tagging

Using LingPipe POS taggers

Using the HmmDecoder class with BestFirst tags

Using the HmmDecoder class with NBest tags

Determining tag confidence with the HmmDecoder class

Training the OpenNLP POSModel

Summary

Chapter 6: Classifying Texts and Documents

How classification is used

Understanding sentiment analysis

Text classifying techniques

Using APIs to classify text

Using OpenNLP

Training an OpenNLP classification model

Using DocumentCategorizerME to classify text

Using Stanford API

Using the ColumnDataClassifier class for classification

Using the Stanford pipeline to perform sentiment analysis

Using LingPipe to classify text

Training text using the Classified class

Using other training categories

Classifying text using LingPipe

Sentiment analysis using LingPipe

Language identification using LingPipe

Summary

Chapter 7: Using Parser to Extract Relationships

Relationship types

Understanding parse trees

Using extracted relationships

Extracting relationships

Using NLP APIs

Using OpenNLP

Using the Stanford API

Using the LexicalizedParser class

Using the TreePrint class

Finding word dependencies using the GrammaticalStructure class

Finding coreference resolution entities

Extracting relationships for a question-answer system

Finding the word dependencies

Determining the question type

Searching for the answer

Summary

Chapter 8: Combined Approaches

Preparing data

Using Boilerpipe to extract text from HTML

Using POI to extract text from Word documents

Using PDFBox to extract text from PDF documents

Pipelines

Using the Stanford pipeline

Using multiple cores with the Stanford pipeline

Creating a pipeline to search text

Summary

Index

Book	Java Natural Language Processing (Reprint Edition, English)
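Content description: Natural language processing (NLP) is an important area of application development, and its relevance to solving contemporary problems will only grow. Demand for natural-language-accessible applications supported by NLP tasks has increased significantly. This book by Reese shows how to organize text automatically using techniques such as full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. It introduces a range of NLP concepts in a way that is understandable even without a background in statistical natural language processing.

Editor's recommendation: If you are a Java programmer who wants to learn the fundamentals of natural language processing, this book is written for you. By working through it you will be able to identify and apply NLP tasks to many common problems, and to integrate them into your applications to tackle more challenging ones. Readers should be familiar with, or have experience in, Java software development.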
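Title	Java Natural Language Processing (Reprint Edition, English)
Author	Reese (UK)
Publisher	Southeast University Press
ISBN	9787564160883
Format	16mo
Pages	237
Edition	1
Binding	Paperback
Word count	318
Publication date	2016-01-01
First edition date	2016-01-01
Printing date	2016-01-01
Text language	English
Audience	General public
Distribution scope	Public
Distribution mode	Print book
Weight	0.42
CIP number	2015256587
Chinese Library Classification	TP312JA
Printed sheets	16.25
Printing	1
Place of publication	Jiangsu
Length	232
Width	185
Height	11
Medium	Book
Paper	Standard paper
Phonetic annotation	No
Reprint version	Original edition
Publisher country	CN
Set	Single volume
Copyright provider	PACKT Publishing Ltd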