Attention-based lattice BiLSTM model for Chinese named entity recognition
First published: 2019-12-25
Abstract: The recently proposed Lattice LSTM model integrates word segmentation information into the long short-term memory (LSTM) network. However, word-level information in that model can only influence the characters that follow each character in the sequence, so segmentation information is not fully exploited. In addition, the character features extracted by the LSTM are passed to the conditional random field (CRF) layer with equal weights, so key semantic information receives little emphasis. To address these problems, this paper proposes a novel neural network model, an attention-based bidirectional lattice LSTM (Att-Lattice BiLSTM), that improves on the original lattice model. An extra information path from the end character of a word to its start character is added in the backward pass of the LSTM, so that word boundary information is integrated into both the start and end characters of the word during the bidirectional transfer, introducing word information more comprehensively. The model also seamlessly incorporates an attention mechanism to automatically capture the relatively important semantic features. In addition, two strategies are provided for aggregating the outputs of the bidirectional LSTM layers, so that semantic features are integrated effectively. Experimental results on four data sets show that the proposed model outperforms other state-of-the-art models.
Keywords: named entity recognition; deep learning; bidirectional long short-term memory; attention mechanism; lattice network
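The abstract's core idea of weighting per-character LSTM features before the CRF layer can be sketched with additive attention. This is a minimal NumPy illustration under assumed shapes and parameter names (`W`, `v` are hypothetical projection weights), not the authors' exact formulation:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_reweight(H, W, v):
    """Score each character's BiLSTM feature h_t with additive attention
    and rescale it, so salient characters carry more weight into the CRF.
    H: (T, d) hidden states; W: (d, d) projection; v: (d,) score vector."""
    scores = np.tanh(H @ W) @ v          # (T,) unnormalized attention scores
    alpha = softmax(scores)              # distribution over the T characters
    return alpha[:, None] * H, alpha     # reweighted features and the weights

rng = np.random.default_rng(0)
T, d = 5, 8                              # e.g. 5 characters, 8-dim features
H = rng.normal(size=(T, d))
W = rng.normal(size=(d, d))
v = rng.normal(size=d)
H_att, alpha = attention_reweight(H, W, v)
```

A real implementation would learn `W` and `v` jointly with the lattice BiLSTM; here they are random only to show the data flow.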