Degree
Master of Science in Computer Science
Department
Department of Computer Science
School
School of Mathematics and Computer Science (SMCS)
Date of Submission
Fall 2022
Supervisor
Dr. Sajjad Haider, Professor, Department of Computer Science, School of Mathematics and Computer Science (SMCS)
Keywords
Question Answering System (QAS), Community Question Answering (CQA), Information Retrieval (IR), Bag of words (BoW), Word embeddings
Abstract
A Question Answering System (QAS), an advanced form of Information Retrieval (IR), returns comprehensive information as an answer to a question posed by a human. A newer variant, Community Question Answering (CQA), provides a platform where users share knowledge by asking and answering questions on any topic. Our research problem is to find, for any new query, similar questions that have already been asked in the forum. Existing methods rely on traditional Information Retrieval techniques based on the bag-of-words (BoW) model, which loses the semantics of the text. The main challenges are vocabulary mismatch between past and new questions, the length of questions, the relationship between questions and answers, and the ranking of matched questions. In particular, a new question can use different words from a previously asked question and still be related to it in semantics and context. We therefore run experiments comparing different text encoding and word embedding approaches, which vectorize the existing and new questions, and we calculate the similarity between the resulting vectors in vector space. Our results on the TrecQA dataset show higher performance than traditional IR methods. To scale this further, hashed embeddings could be computed so that embeddings can be compared faster on larger datasets. In the future, the proposed solution could be integrated into chatbots to find frequently asked questions (FAQs) on websites, and into public or internal enterprise search engines to find relevant questions from forums.
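The retrieval step described in the abstract (vectorize the stored questions and the new question, then rank by similarity in vector space) can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`encode_bow`, `cosine`, `rank_similar`) and the sample questions are hypothetical, and the encoder shown is the bag-of-words baseline rather than any of the embedding models evaluated in the project.

```python
import math
import re
from collections import Counter


def encode_bow(text):
    """Baseline encoder: lowercase word counts (a sparse bag-of-words vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty question matches nothing
    return dot / (norm_u * norm_v)


def rank_similar(new_question, past_questions, encode=encode_bow):
    """Return (score, question) pairs, most similar first."""
    query_vec = encode(new_question)
    scored = [(cosine(query_vec, encode(q)), q) for q in past_questions]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical forum questions, for illustration only.
past = [
    "How do I reset my account password?",
    "What is the best way to learn Python?",
    "How can I change my login credentials?",
]
ranking = rank_similar("How do I reset my password?", past)
for score, question in ranking:
    print(f"{score:.3f}  {question}")
```

In this sketch the third stored question, which is close in meaning to the query but shares few words with it, scores well below the first. That is the vocabulary-mismatch limitation of bag-of-words matching that the abstract describes. Under the same assumptions, an embedding-based encoder could be passed through the `encode` parameter, provided `cosine` is adapted to that encoder's vector type.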
Document Type
Restricted Access
Submission Type
Research Project
Recommended Citation
Mandhwani, Aveenash. "Community Question Answering – enhancing search quality and relevance." Unpublished graduate research project. Institute of Business Administration. 2022. https://ir.iba.edu.pk/research-projects-mscs/13
The full text of this document is only accessible to authorized users.