Sequence labeling tasks require computing a sentence representation for each word in a given sentence. A prevalent method adds a Bi-directional Long Short-Term Memory (BiLSTM) layer to enhance sequence structure information. However, empirical evidence (Li, 2020) suggests that the capacity of BiLSTM to produce sentence representations for sequence labeling tasks is inherently limited. This limitation arises primarily because each cell's complete sentence representation is formed by integrating fragments of past and future sentence representations. In this study, we observed that the entire sentence representation, found in both the first and last cells of BiLSTM, can supplement the individual sentence representation of each cell. Accordingly, we devised a global context mechanism to integrate the entire future and past sentence representations into each cell's sentence representation within the BiLSTM framework. With the BERT model incorporated within BiLSTM as a demonstration, we conducted exhaustive experiments on nine datasets for sequence labeling tasks, including named entity recognition (NER), part-of-speech (POS) tagging, and End-to-End Aspect-Based Sentiment Analysis (E2E-ABSA). We noted significant improvements in F1 scores and accuracy across all examined datasets.
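The idea of supplementing each cell with the entire sentence representation can be illustrated with a minimal sketch. It assumes the per-token forward and backward hidden states are already available as plain lists of floats, and it combines them by simple concatenation. The abstract does not specify how the global context is fused, so the concatenation and the name `add_global_context` are illustrative assumptions, not the mechanism proposed in the paper.

```python
from typing import List

Vector = List[float]


def add_global_context(forward: List[Vector], backward: List[Vector]) -> List[Vector]:
    """Append whole-sentence states to every token's BiLSTM state.

    forward[t]  : hidden state of the left-to-right LSTM at token t
    backward[t] : hidden state of the right-to-left LSTM at token t
    Fusion by concatenation is an assumption made for illustration only.
    """
    if not forward or len(forward) != len(backward):
        raise ValueError("forward and backward must be non-empty and equally long")

    # The forward LSTM has read the whole sentence only at the last token.
    full_forward = forward[-1]
    # The backward LSTM has read the whole sentence only at the first token.
    full_backward = backward[0]

    # Each token keeps its local states and gains both whole-sentence states.
    return [f + b + full_forward + full_backward for f, b in zip(forward, backward)]


# Toy example: three tokens, one-dimensional hidden states.
forward_states = [[0.1], [0.2], [0.3]]
backward_states = [[0.9], [0.8], [0.7]]
enriched = add_global_context(forward_states, backward_states)
print(enriched[1])  # [0.2, 0.8, 0.3, 0.9]
```

In this toy example the middle token, whose local states cover only part of the sentence in each direction, also receives the last forward state and the first backward state, which have each seen the full sentence.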