Main Architecture and Components of the Model: Input, Encoding, Graph Neural Network, Decoding, and Training
In natural language processing, named entity recognition (NER) is the task of identifying entities such as persons, organizations, and locations in text. The task is especially challenging for ancient languages such as ancient Chinese, whose context and semantics differ substantially from their modern counterparts. Addressing this challenge requires a carefully designed model architecture, which Figure 1 illustrates for the model we have developed.
Our model architecture consists of four layers: the input layer, the encoder layer, the graph neural network (GNN) layer, and the output layer. Each layer plays a distinct role in the model, from input processing to the final prediction of entity labels.
The input layer defines the components of the input for the ancient-Chinese NER task: the input sentence \(S\) consisting of individual characters \(c_i\), the entity labels \(Y\) to be predicted, the lexical sets \(L_s\) matched against the input sentence, and the global chapter information \(P\). Incorporating the lexical sets and the chapter information supplies the model with lexical knowledge and document-level context, respectively.
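As a concrete illustration, the sketch below shows one way these four input elements could be bundled per example. The field names, the toy sentence, the matched words, and the BIO tag scheme are our own assumptions for illustration, not specifics taken from the paper.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class NERInstance:
    """One training/inference example for ancient-Chinese NER (illustrative layout)."""
    chars: List[str]                  # input sentence S as characters c_1..c_n
    labels: List[str]                 # entity labels Y (BIO tags); empty at inference
    lexicon_matches: List[List[str]]  # L_s: dictionary words matched at each character position
    chapter: str                      # P: the chapter text providing global context

# A toy example with hypothetical values; the matched words and chapter text
# are placeholders rather than data from the paper.
example = NERInstance(
    chars=list("左丘明著春秋"),
    labels=["B-PER", "I-PER", "I-PER", "O", "B-BOOK", "I-BOOK"],
    lexicon_matches=[["左丘明"], ["左丘明"], ["左丘明"], [], ["春秋"], ["春秋"]],
    chapter="……",  # full chapter the sentence belongs to
)
```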
The encoder layer concatenates the input sentence with the chapter information and feeds the result into a pre-trained language model. In addition, the matched dictionary words are encoded with pre-trained word embeddings to enrich the semantic representation of the input text.
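A minimal sketch of this encoding step follows, assuming a HuggingFace-style pre-trained Chinese language model. The checkpoint name, the use of a sentence/chapter text pair, and the lexicon embedding sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Pre-trained language model for the character sequence; the checkpoint name is an
# assumption for illustration -- the paper does not specify the exact model here.
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

def encode_sentence_with_chapter(chars, chapter, max_len=512):
    """Concatenate the sentence with its chapter context and return contextual
    token representations from the pre-trained encoder."""
    sentence = "".join(chars)
    # Encode the pair (sentence, chapter); downstream, the sentence span would be
    # sliced out of the resulting hidden states.
    inputs = tokenizer(sentence, chapter, truncation=True,
                       max_length=max_len, return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state  # shape: (1, seq_len, hidden_size)

# Matched dictionary words are looked up in a separate pre-trained embedding table;
# the vocabulary size and dimension below are placeholders.
lexicon_embedding = torch.nn.Embedding(num_embeddings=50_000, embedding_dim=768)
```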
The graph neural network (GNN) layer integrates the matched lexical items and the chapter information into the sentence representation through graph construction and attention mechanisms. A graph attention network (GAT) captures the semantic relationships among characters, words, and chapter information while suppressing noise words, improving the overall performance of the model.
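The snippet below is a minimal single-head graph attention layer in the spirit of GAT, sketching how attention over a constructed character–word–chapter graph can down-weight uninformative neighbours. The node/edge layout, dimensions, and the single-head simplification are assumptions for illustration, not the paper's exact graph construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGATLayer(nn.Module):
    """Minimal single-head graph attention layer (GAT-style).
    Nodes would be sentence characters, matched lexicon words, and chapter
    information; edges connect each character to the items it relates to."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)   # shared node projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)    # attention scoring vector

    def forward(self, h, adj):
        # h:   (num_nodes, in_dim) node features
        # adj: (num_nodes, num_nodes) binary adjacency matrix of the constructed
        #      graph; self-loops should be included so every row has a neighbour.
        z = self.W(h)                                      # (N, out_dim)
        n = z.size(0)
        # Pairwise attention logits e_ij = LeakyReLU(a([z_i || z_j]))
        zi = z.unsqueeze(1).expand(n, n, -1)
        zj = z.unsqueeze(0).expand(n, n, -1)
        e = F.leaky_relu(self.a(torch.cat([zi, zj], dim=-1)).squeeze(-1))
        # Mask non-edges so unrelated (noisy) nodes cannot contribute, then
        # normalize attention over each node's neighbourhood.
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(e, dim=-1)
        return F.elu(alpha @ z)                            # (N, out_dim) updated node states
```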
Decoding and training predict the entity labels from the representations produced by the previous layers. A conditional random field (CRF) layer captures the dependencies between successive labels, and the model is trained with a cross-entropy objective function.
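A sketch of such a decoding head is shown below, assuming the pytorch-crf package for the CRF layer. The class and parameter names are illustrative, and the loss shown is the CRF's sequence-level negative log-likelihood, which is how we read the cross-entropy objective described above.

```python
import torch.nn as nn
from torchcrf import CRF  # pytorch-crf package; an assumption about the concrete library

class TaggingHead(nn.Module):
    """Projects fused token representations to label emissions and decodes with a CRF,
    which models dependencies between successive labels (e.g. I-PER must follow B-PER)."""

    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        self.emission = nn.Linear(hidden_dim, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def loss(self, features, tags, mask):
        # Negative log-likelihood of the gold tag sequence under the CRF, i.e. the
        # sequence-level cross-entropy between the gold labeling and the model.
        emissions = self.emission(features)
        return -self.crf(emissions, tags, mask=mask, reduction="mean")

    def decode(self, features, mask):
        # Viterbi decoding: returns the highest-scoring label sequence per sentence.
        emissions = self.emission(features)
        return self.crf.decode(emissions, mask=mask)
```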
In summary, Figure 1 details each component of our model across the training and inference phases. By combining lexical knowledge, chapter information, graph attention, and CRF decoding, the model delivers promising improvements in named entity recognition for ancient Chinese texts. The integration of these elements reflects the demands of NER in ancient languages and points to further avenues for research in natural language processing.