1. What is named entity recognition used for?

Named entity recognition (NER) is a basic task underlying many natural language processing applications, including information retrieval, knowledge graphs, machine translation, sentiment analysis, and question answering systems. For example, we use named entity recognition to automatically identify the entities in a user query and then link each entity to the corresponding node in the knowledge graph. The recognition accuracy therefore directly affects all follow-up work.

2. What are the difficulties of named entity recognition?

* Named entities are recognized differently in different domains or scenarios. Labeled corpora are usually limited to a few domains and are hard to apply to others. For example, a model trained on a news corpus and then tested on a social media corpus often fails to reach the expected results, because social media text contains many nonstandard words.
* Annotating named entities is expensive, so few labeled corpora are available for named entity recognition. How to learn a better model from less data, or with the help of corpora from similar tasks and large amounts of unlabeled text, is a new challenge for named entity recognition.
* In Chinese, the boundaries of characters are well defined, but the boundaries of words are fuzzy, which often causes semantic ambiguity. For example, the sentence "让人大吃一惊" ("it is astonishing") has two possible word segmentations, "让 / 人大 / 吃一惊" ("let / People's Congress / be surprised") and "让人 / 大吃一惊" ("let people / be greatly startled"), and the two give completely different meanings. Chinese named entity recognition is usually combined with Chinese word segmentation, shallow parsing, and other processes, so the accuracy of segmentation and parsing directly affects the quality of named entity recognition.
* The text to be recognized contains many out-of-vocabulary words, which are new entity words. New words keep appearing over time, and it is difficult to keep a vocabulary of them up to date.
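The word segmentation ambiguity described in the list above can be made concrete with a small sketch. It assumes the example sentence is 让人大吃一惊 ("it is astonishing") and uses a hypothetical five-word toy dictionary; the function name `segmentations` is illustrative and does not come from any segmentation library.

```python
def segmentations(text, vocab):
    """Return every way to split `text` into a sequence of words from `vocab`."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocab:
            # Keep this prefix as a word and segment the remainder recursively.
            for tail in segmentations(text[end:], vocab):
                results.append([head] + tail)
    return results


# Hypothetical toy dictionary, chosen only to reproduce the ambiguity.
vocab = {"让", "让人", "人大", "吃一惊", "大吃一惊"}

for seg in segmentations("让人大吃一惊", vocab):
    print(" / ".join(seg))
```

Both segmentations are valid with respect to the dictionary, so a dictionary alone cannot decide between them; a real segmenter needs context or a statistical model to pick one.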
3. Existing research

Summary of named entity recognition methods based on statistical models

4. CRF (Conditional Random Fields)

4.1 Introduction to conditional random fields

Comparison of four models

Given an observation sequence X, a CRF defines the conditional probability of a specific tag sequence Y.
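For reference, the usual textbook form of this probability for a linear-chain CRF is sketched below. The symbols are assumptions rather than the author's notation: f_k are feature functions over adjacent tags and the observations, \lambda_k are their weights, n is the sequence length, and Z(X) is the normalization term summed over all candidate tag sequences Y'.

```latex
P(Y \mid X) = \frac{1}{Z(X)}
  \exp\left( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(y_{i-1}, y_i, X, i) \right),
\qquad
Z(X) = \sum_{Y'} \exp\left( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(y'_{i-1}, y'_i, X, i) \right)
```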

4.2 Parameter estimation for CRF

4.3 Prediction
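For a linear-chain CRF, prediction means finding the highest-scoring tag sequence for a sentence, which is commonly done with the Viterbi algorithm. The following is a minimal sketch under stated assumptions, not the author's implementation: the per-token emission scores and tag-to-tag transition scores are taken as already computed from a trained model, and the tag set and all numbers are made up for illustration.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []               # backptrs[t-1][j] = best previous tag for tag j at t
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical scores for a three-token sentence (illustrative values only).
tags = ["O", "B-PER", "I-PER"]
emissions = [
    [0.1, 2.0, 0.0],
    [0.5, 0.0, 1.0],
    [2.0, 0.0, 0.5],
]
transitions = [
    [0.5, 0.0, -10.0],  # from O: I-PER directly after O is heavily penalized
    [0.0, -1.0, 1.0],   # from B-PER
    [0.0, -1.0, 0.5],   # from I-PER
]

print([tags[i] for i in viterbi(emissions, transitions)])
```

With these made-up scores the decoded sequence is B-PER, I-PER, O. The transition scores are what allow the decoder to rule out invalid tag orders such as I-PER directly after O, which a per-token classifier cannot do.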

 

5. Experiment

Dataset: 1998 People's Daily

         #sentence   #PER    #LOC    #ORG
train    46364       17615   36517   20571
test     4365        1973    2877    1331
