Welcome to IgMin Research – an Open Access journal uniting Biology, Medicine, and Engineering. We’re dedicated to advancing global knowledge and fostering collaboration across scientific fields.


Engineering Group Mini Review, Article ID: igmin135

Revolutionizing Duplicate Question Detection: A Deep Learning Approach for Stack Overflow

Machine Learning. DOI: 10.61927/igmin135

Affiliation

    Harun Jamil, Department of Electronic Engineering, Jeju National University, Jeju-si, Jeju-do, 63243, Republic of Korea, Email: [email protected]


Abstract

This study presents a method for detecting duplicate questions in the Stack Overflow community, a challenging problem in natural language processing. The proposed method combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks: the CNN captures local features of the text, and the LSTM captures long-range dependencies. Word embeddings, specifically Word2Vec (from Google) and GloVe, provide the text representation. Experiments on the Stack Overflow dataset indicate that the approach is effective. Combining the CNN and LSTM models improves performance while streamlining preprocessing, which makes the method a viable option for duplicate question detection. Beyond Stack Overflow, the technique shows promise for other question-and-answer platforms as a way to find similar questions.
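To make the CNN-then-LSTM idea concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The abstract does not specify layer sizes, how the two networks are connected, or how a pair of questions is scored, so everything here is an assumption made for illustration: the dimensions, the random untrained weights, the toy embedding table standing in for Word2Vec or GloVe vectors, the omission of bias terms, and the cosine-similarity scoring in the hypothetical `duplicate_score` helper.

```python
import math
import random

# Illustrative sizes only; the abstract does not state the real ones.
EMB_DIM, N_FILTERS, KERNEL, HIDDEN = 8, 6, 2, 5
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Random, untrained weights; a trained model would learn these. Biases omitted.
CONV_W = rand_matrix(N_FILTERS, KERNEL * EMB_DIM)
LSTM_W = {gate: rand_matrix(HIDDEN, N_FILTERS + HIDDEN) for gate in "ifog"}

# Toy embedding table standing in for Word2Vec or GloVe vectors (assumption).
_embeddings = {}

def embed(token):
    if token not in _embeddings:
        _embeddings[token] = [rng.uniform(-1.0, 1.0) for _ in range(EMB_DIM)]
    return _embeddings[token]

def conv1d_relu(vectors):
    """Slide a window of KERNEL tokens over the sequence (local features)."""
    out = []
    for t in range(len(vectors) - KERNEL + 1):
        window = [x for vec in vectors[t:t + KERNEL] for x in vec]
        out.append([max(0.0, a) for a in matvec(CONV_W, window)])
    return out

def lstm_last_state(sequence):
    """Run a single LSTM layer over the sequence; return the final hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for x in sequence:
        z = x + h  # concatenate current input with previous hidden state
        i = [sigmoid(a) for a in matvec(LSTM_W["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(LSTM_W["f"], z)]    # forget gate
        o = [sigmoid(a) for a in matvec(LSTM_W["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(LSTM_W["g"], z)]  # candidate cell
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

def encode(question):
    """Embed tokens, extract local features with the CNN, summarize with the LSTM."""
    tokens = question.lower().split()
    return lstm_last_state(conv1d_relu([embed(t) for t in tokens]))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def duplicate_score(q1, q2):
    """Hypothetical scoring rule: cosine similarity of the two encodings."""
    return cosine(encode(q1), encode(q2))

if __name__ == "__main__":
    a = "how do I reverse a list in python"
    b = "what is the way to reverse a python list"
    print(duplicate_score(a, a), duplicate_score(a, b))
```

Because the weights are random, the scores carry no meaning beyond showing the data flow; in practice the weights would be trained on labeled duplicate and non-duplicate pairs, and a threshold or classifier layer would replace the raw cosine score.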

