Lexical Substitution: Enhancing Language Generation with Word Replacement

Keywords

Lexical substitution, Contextual language models, BERT (Bidirectional Encoder Representations from Transformers), ELMo (Embeddings from Language Models), XLNet, Context-sensitive modeling, State-of-the-art (SOTA) results, Lexical resources, Paraphrasing, Sentiment analysis, Data augmentation, Rule-based methods, Word embedding-based methods, Machine learning-based methods, Context preservation, Grammatical accuracy, Syntactic organization, Natural language processing (NLP)

How to Cite

Lexical Substitution: Enhancing Language Generation with Word Replacement. (2023). Journal of Science-Innovative Research in Uzbekistan, 1(8), 120-124. https://universalpublishings.com/index.php/jsiru/article/view/2639

Abstract

Lexical substitution (LS) is a technique used in natural language processing (NLP) to replace words or phrases in a sentence while preserving the original meaning. Recent advancements in LS based on pretrained language models have shown promising results in suggesting suitable replacements for target words by considering their context. This article explores LS approaches using neural language models (LMs) and masked language models (MLMs) such as context2vec, ELMo, BERT, and XLNet. The study demonstrates the effectiveness of injecting target word information into these models and analyzes the semantic links between targets and substitutes. Lexical substitution plays a vital role in enhancing language generation models and is widely used in NLP tasks such as data augmentation, paraphrase generation, word sense induction, and text simplification. While earlier methods relied on manual lexical resources, the emergence of contextual language models like BERT, ELMo, and XLNet has revolutionized LS by incorporating contextual information and achieving state-of-the-art results. Ongoing research aims to develop more context-aware approaches to address the challenges of lexical substitution, ultimately advancing the capabilities of NLP systems.
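To make the masked-language-model approach described above concrete, the following is a minimal illustrative sketch of the plain masked-prediction baseline: the target word is replaced with the mask token and BERT ranks context-aware fillers. It assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint; the example sentence, the helper names, and the simple string replacement are illustrative choices, not the target-injection variants analyzed in the article.

```python
# Minimal sketch of MLM-based lexical substitution (assumes the Hugging Face
# `transformers` library and the public `bert-base-uncased` checkpoint).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

sentence = "The bright student solved the problem quickly."
target = "bright"

# Mask the target word so the model proposes context-aware replacements.
masked = sentence.replace(target, fill_mask.tokenizer.mask_token, 1)
candidates = fill_mask(masked, top_k=10)

# Keep substitutes other than the original word itself.
substitutes = [c["token_str"] for c in candidates if c["token_str"] != target]
print(substitutes)
```

Note that this baseline discards the target word entirely; the approaches discussed in the article additionally inject target word information so that substitutes stay semantically close to the original.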


References

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171-4186.

Melamud, O., Goldberger, J., & Dagan, I. (2016). context2vec: Learning Generic Context Embedding with Bidirectional LSTM. Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, 51-61.

Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep Contextualized Word Representations. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2227-2237.

Shardlow, M. (2014). A Survey of Automated Lexical Substitution. Journal of Artificial Intelligence Research, 51, 457-503.

Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q. V. (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. Advances in Neural Information Processing Systems, 32, 5753-5763.

https://aclanthology.org/2022.coling-1.362.pdf

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.