[1] J.E. Font, M.R. Costa-Jussa, Equalizing gender biases in neural machine translation with word embeddings techniques, arXiv preprint arXiv:1901.03116, (2019).
[2] R.A. Stein, P.A. Jaques, J.F. Valiati, An analysis of hierarchical text classification using word embeddings, Information Sciences, 471 (2019) 216-232.
[3] E. Biswas, K. Vijay-Shanker, L. Pollock, Exploring word embedding techniques to improve sentiment analysis of software engineering texts, in: Proceedings of the 16th International Conference on Mining Software Repositories, IEEE Press, 2019, pp. 68-78.
[4] F. Incitti, F. Urli, L. Snidaro, Beyond word embeddings: A survey, Information Fusion, 89 (2023) 418-436.
[5] J. Pennington, R. Socher, C.D. Manning, GloVe: Global vectors for word representation, in: Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1532-1543.
[6] T. Mikolov, I. Sutskever, K. Chen, G.S. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: Advances in neural information processing systems, 2013, pp. 3111-3119.
[7] D.S. Asudani, N.K. Nagwani, P. Singh, Impact of word embedding models on text analytics in deep learning environment: a review, Artificial Intelligence Review, (2023) 1-81.
[8] J. Qiang, F. Zhang, Y. Li, Y. Yuan, Y. Zhu, X. Wu, Unsupervised statistical text simplification using pre-trained language modeling for initialization, Frontiers of Computer Science, 17 (2023) 171303.
[9] J. Pennington, R. Socher, C.D. Manning, GloVe: Global vectors for word representation, in: Conference on Empirical Methods in Natural Language Processing, 2014, pp. 1532-1543.
[10] Y. Zhang, R. He, Z. Liu, K.H. Lim, L. Bing, An unsupervised sentence embedding method by mutual information maximization, arXiv preprint arXiv:2009.12061, (2020).
[11] B. Li, H. Zhou, J. He, M. Wang, Y. Yang, L. Li, On the sentence embeddings from pre-trained language models, arXiv preprint arXiv:2011.05864, (2020).
[12] B. Wang, C.C.J. Kuo, SBERT-WK: A sentence embedding method by dissecting BERT-based word models, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28 (2020) 2146-2157.
[13] J. Wieting, M. Bansal, K. Gimpel, K. Livescu, Towards universal paraphrastic sentence embeddings, arXiv preprint arXiv:1511.08198, (2015).
[14] R. Socher, E.H. Huang, J. Pennington, C.D. Manning, A.Y. Ng, Dynamic pooling and unfolding recursive autoencoders for paraphrase detection, in: Advances in neural information processing systems, 2011, pp. 801-809.
[15] B. Min, H. Ross, E. Sulem, A.P.B. Veyseh, T.H. Nguyen, O. Sainz, E. Agirre, I. Heintz, D. Roth, Recent advances in natural language processing via large pre-trained language models: A survey, arXiv preprint arXiv:2111.01243, (2021).
[16] S. Li, X. Puig, C. Paxton, Y. Du, C. Wang, L. Fan, T. Chen, D.-A. Huang, E. Akyürek, A. Anandkumar, Pre-trained language models for interactive decision-making, Advances in Neural Information Processing Systems, 35 (2022) 31199-31212.
[17] R.K. Kaliyar, A multi-layer bidirectional transformer encoder for pre-trained word embedding: a survey of BERT, in: 2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence), IEEE, 2020, pp. 336-340.
[18] M.S. Charikar, Similarity estimation techniques from rounding algorithms, in: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, ACM, 2002, pp. 380-388.
[19] G.E. Hinton, R.R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science, 313 (2006) 504-507.
[20] Y. Weiss, A. Torralba, R. Fergus, Spectral hashing, Advances in neural information processing systems, 21 (2008).
[21] Y. Li, F. Liu, Z. Du, D. Zhang, A simhash-based integrative features extraction algorithm for malware detection, Algorithms, 11 (2018) 124.
[22] J. Leskovec, A. Rajaraman, J.D. Ullman, Mining of massive datasets, Cambridge University Press, 2020.
[23] F. Hill, K. Cho, A. Korhonen, Learning distributed representations of sentences from unlabelled data, arXiv preprint arXiv:1602.03483, (2016).
[24] T. Mikolov, W.-t. Yih, G. Zweig, Linguistic regularities in continuous space word representations, in: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2013, pp. 746-751.
[25] O. Levy, Y. Goldberg, Linguistic regularities in sparse and explicit word representations, in: Proceedings of the eighteenth conference on computational natural language learning, 2014, pp. 171-180.
[26] S. Arora, Y. Li, Y. Liang, T. Ma, A. Risteski, Linear algebraic structure of word senses, with applications to polysemy, Transactions of the Association for Computational Linguistics, 6 (2018) 483-495.
[27] W. Blacoe, M. Lapata, A comparison of vector-based representations for semantic composition, in: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Association for Computational Linguistics, 2012, pp. 546-556.
[28] J. Mitchell, M. Lapata, Vector-based models of semantic composition, in: Proceedings of ACL-08: HLT, 2008, pp. 236-244.
[29] K.S. Tai, R. Socher, C.D. Manning, Improved semantic representations from tree-structured long short-term memory networks, arXiv preprint arXiv:1503.00075, (2015).
[30] R. Socher, B. Huval, C.D. Manning, A.Y. Ng, Semantic compositionality through recursive matrix-vector spaces, in: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Association for Computational Linguistics, 2012, pp. 1201-1211.
[31] R. Socher, A. Perelygin, J. Wu, J. Chuang, C.D. Manning, A. Ng, C. Potts, Recursive deep models for semantic compositionality over a sentiment treebank, in: Proceedings of the 2013 conference on empirical methods in natural language processing, 2013, pp. 1631-1642.
[32] Q. Le, T. Mikolov, Distributed representations of sentences and documents, in: International conference on machine learning, 2014, pp. 1188-1196.
[33] N. Kalchbrenner, E. Grefenstette, P. Blunsom, A convolutional neural network for modelling sentences, arXiv preprint arXiv:1404.2188, (2014).
[34] R. Kiros, Y. Zhu, R.R. Salakhutdinov, R. Zemel, R. Urtasun, A. Torralba, S. Fidler, Skip-thought vectors, in: Advances in neural information processing systems, 2015, pp. 3294-3302.
[35] A. Conneau, D. Kiela, H. Schwenk, L. Barrault, A. Bordes, Supervised learning of universal sentence representations from natural language inference data, arXiv preprint arXiv:1705.02364, (2017).
[36] S.R. Bowman, G. Angeli, C. Potts, C.D. Manning, A large annotated corpus for learning natural language inference, arXiv preprint arXiv:1508.05326, (2015).
[37] I. Goodfellow, Y. Bengio, A. Courville, Deep learning, MIT Press, 2016.
[38] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, Communications of the ACM, 60 (2017) 84-90.
[39] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is all you need, Advances in neural information processing systems, 30 (2017).
[40] T.J. Sejnowski, The unreasonable effectiveness of deep learning in artificial intelligence, Proceedings of the National Academy of Sciences, 117 (2020) 30033-30038.
[41] S. Lamsiyah, A. El Mahdaouy, B. Espinasse, S. El Alaoui Ouatik, An unsupervised method for extractive multi-document summarization based on centroid approach and sentence embeddings, Expert Systems with Applications, 167 (2021) 114152.
[42] M. Pagliardini, P. Gupta, M. Jaggi, Unsupervised learning of sentence embeddings using compositional n-gram features, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), 2018.
[43] M. Pagliardini, P. Gupta, M. Jaggi, Unsupervised learning of sentence embeddings using compositional n-gram features, arXiv preprint arXiv:1703.02507, (2017).
[44] S. Arora, Y. Liang, T. Ma, A simple but tough-to-beat baseline for sentence embeddings, in: International Conference on Learning Representations (ICLR), 2017.
[45] A. Roshanzamir, H. Aghajan, M. Soleymani Baghshah, Transformer-based deep neural network language models for Alzheimer’s disease risk assessment from targeted speech, BMC Medical Informatics and Decision Making, 21 (2021) 1-14.
[46] J. Lu, X. Zhan, G. Liu, X. Zhan, X. Deng, BSTC: A fake review detection model based on a pre-trained language model and convolutional neural network, Electronics, 12 (2023) 2165.
[47] Z. Dai, J. Callan, Deeper text understanding for IR with contextual neural language modeling, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2019, pp. 985-988.
[48] N. Azzouza, K. Akli-Astouati, R. Ibrahim, TwitterBERT: Framework for Twitter sentiment analysis based on pre-trained language model representations, Springer, pp. 428-437.
[49] H. Christian, D. Suhartono, A. Chowanda, K.Z. Zamli, Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging, Journal of Big Data, 8 (2021).
[50] V. Suresh, D.C. Ong, Using knowledge-embedded attention to augment pre-trained language models for fine-grained emotion recognition, in: 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), IEEE, 2021, pp. 1-8.
[51] L.K. Şenel, I. Utlu, V. Yücesoy, A. Koc, T. Cukur, Semantic structure and interpretability of word embeddings, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26 (2018) 1769-1779.
[52] J.J. Lastra-Díaz, J. Goikoetxea, M.A.H. Taieb, A. García-Serrano, M.B. Aouicha, E. Agirre, A reproducible survey on word embeddings and ontology-based methods for word similarity: Linear combinations outperform the state of the art, Engineering Applications of Artificial Intelligence, 85 (2019) 645-665.
[53] A. Bakarov, A survey of word embeddings evaluation methods, arXiv preprint arXiv:1801.09536, (2018).
[54] V. Lampos, B. Zou, I.J. Cox, Enhancing feature selection using word embeddings: The case of flu surveillance, in: Proceedings of the 26th International Conference on World Wide Web, 2017, pp. 695-704.
[55] P. Indyk, R. Motwani, Approximate nearest neighbors: towards removing the curse of dimensionality, in: Proceedings of the thirtieth annual ACM symposium on Theory of computing, ACM, 1998, pp. 604-613.
[56] M.S. Charikar, Similarity estimation techniques from rounding algorithms, in: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, ACM, 2002, pp. 380-388.
[57] B. Schölkopf, A. Smola, K.-R. Müller, Kernel principal component analysis, in: International conference on artificial neural networks, Springer, 1997, pp. 583-588.
[58] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using siamese BERT-networks, arXiv preprint arXiv:1908.10084, (2019).